Shibo Hao
Ph.D. student at UCSD
Hello! I’m Shibo Hao, a Ph.D. student at UC San Diego, advised by Zhiting Hu. My research is funded by the Bloomberg Fellowship. Previously, I received my B.S. in Computer Science from Peking University.
My research goal is to push the boundaries of machine reasoning. My work includes training LLMs to reason with reinforcement learning (Guru, OREO, FoR), exploring reasoning in latent space (Coconut, Coconut-theory, Coconut-dynamics), building a system-2 reasoning framework using world-model planning (Reasoning via Planning, Pandora, LLM Reasoners), and augmenting LLMs with external tools (ToolkenGPT).
News
| Date | News |
|---|---|
| Sep 25, 2025 | Guru, our exploration of cross-domain RL for LLM reasoning, and Reasoning by Superposition, a theoretical perspective on Coconut, are accepted to NeurIPS 2025. |
| Apr 14, 2025 | Coconut 🥥 is featured in Quanta Magazine! |
| Dec 21, 2024 | Introducing OREO (Offline REasoning Optimization) (arXiv, Twitter) |
| Dec 9, 2024 | Honored to receive the Bloomberg Data Science Ph.D. Fellowship! |
| Jul 10, 2024 | LLM Reasoners is accepted to the first Conference on Language Modeling (COLM 2024). |
| May 24, 2024 | Check out Pandora, our new work towards a general world model 🌎 |
| Nov 17, 2023 | ToolkenGPT is accepted to NeurIPS 2023 as an oral presentation and received the best paper award at SoCalNLP 2023 🎉! |
| Oct 25, 2023 | Reasoning via Planning (RAP) has been featured in State of AI Report 2023. |
Selected publications
2025
- NeurIPS: Advances in Neural Information Processing Systems, 2025
- ACL: Findings of the Association for Computational Linguistics: ACL 2025
- NeurIPS: Advances in Neural Information Processing Systems, 2025
- COLM
2024
- Preprint: arXiv preprint arXiv:2406.09455, 2024
- COLM: In Conference on Language Modeling (COLM), 2024. Also presented at the Large Language Model (LLM) Agents workshop at ICLR 2024.
2023
- NeurIPS
- EMNLP: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023