Shibo Hao(郝世博)
Ph.D. student at UCSD
Hello! I’m Shibo Hao, a Ph.D. candidate at UC San Diego, advised by Zhiting Hu. My research is funded by the Bloomberg Fellowship. I was a research scientist intern at the Meta FAIR lab, mentored by Yuandong Tian and Jason Weston. I received my B.S. in Computer Science from Peking University. I’m currently looking for a full-time job in industry.
My research goal is to push the boundaries of machine reasoning. My work includes training LLMs to reason in latent space (Coconut, Coconut-theory, Coconut-dynamics), with reinforcement learning (Guru, OREO, FoR), building a system-2 reasoning framework using world-model planning (Reasoning via Planning, LLM Reasoners, Pandora), and augmenting LLMs with external tools (ToolkenGPT).
*Bold indicates first or co-first authorship.
News
| Dec 2, 2025 | I’m attending NeurIPS 2025 in San Diego and would love to discuss full-time opportunities related to reasoning, agents, and post-training. |
|---|---|
| Dec 2, 2025 | We released CocoaBench, a new benchmark for general digital agents! |
| Sep 25, 2025 | Guru, our exploration of cross-domain RL for LLM reasoning, and Reasoning-by-Superposition, a theoretical perspective on Coconut, have been accepted to NeurIPS 2025. |
| Apr 14, 2025 | Coconut 🥥 is featured in Quanta Magazine! |
| Dec 21, 2024 | Introducing OREO (Offline REasoning Optimization) (arXiv, Twitter) |
| Dec 9, 2024 | Honored to receive the Bloomberg Data Science Ph.D. Fellowship! |
| Jul 10, 2024 | LLM Reasoners is accepted to the first Conference on Language Modeling (COLM 2024). |
| May 24, 2024 | Check out Pandora, our new work towards a general world model 🌎 |
| Nov 17, 2023 | ToolkenGPT is accepted to NeurIPS 2023 as an oral presentation and received the best paper award at SoCalNLP 2023 🎉! |
| Oct 25, 2023 | Reasoning via Planning (RAP) has been featured in the State of AI Report 2023. |
Selected publications
2025
- NeurIPS: Advances in Neural Information Processing Systems, 2025
- ACL: In Findings of the Association for Computational Linguistics: ACL 2025, 2025
- NeurIPS: Advances in Neural Information Processing Systems, 2025
- COLM
- Preprint: arXiv preprint arXiv:2509.23365, 2025
2024
- Preprint: arXiv preprint arXiv:2406.09455, 2024
- COLM: In Conference on Language Modeling (COLM), 2024. Also to appear at the Large Language Model (LLM) Agents workshop at ICLR 2024
2023
- NeurIPS
- EMNLP: In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023