I am a Ph.D. candidate at UC San Diego, advised by Zhiting Hu. My research is funded by the Bloomberg Fellowship. I was a research scientist intern at Meta's FAIR lab, mentored by Yuandong Tian and Jason Weston. I received my B.S. in Computer Science from Peking University. I am currently seeking a full-time position in industry.
My research focuses on pushing the boundaries of machine reasoning in large language models. My work includes developing latent-space reasoning methods (Coconut, Coconut-Theory, Coconut-Dynamics), training LLMs to reason via reinforcement learning (Guru, OREO), building system-2 inference-time reasoning frameworks (RAP, LLM Reasoners), and developing better ways for LLM agents to interact with the world (ToolkenGPT, CocoaBench).
Bold indicates first or co-first authorship.
News
- Dec 2025 Released CocoaBench, a new benchmark for general digital agents.
- Sep 2025 Guru and Reasoning-by-Superposition both accepted to NeurIPS 2025.
- Apr 2025 Coconut featured in Quanta Magazine.
- Dec 2024 Honored to receive the Bloomberg Data Science Ph.D. Fellowship for 2024–2025.
- Dec 2024 Introducing OREO (Offline REasoning Optimization) for multi-step LLM reasoning.
- Jul 2024 LLM Reasoners accepted to the first Conference on Language Modeling (COLM 2024).
- May 2024 Check out Pandora, our new work towards a general world model.
- Nov 2023 ToolkenGPT accepted to NeurIPS 2023 as an oral presentation and received the Best Paper Award at SoCalNLP 2023.
- Oct 2023 RAP featured in the State of AI Report 2023.
Selected Publications
* equal contribution · View all publications →