Chujie Zheng 郑楚杰

I am a researcher on the Qwen Team and a final-year Ph.D. candidate at Tsinghua University, advised by Prof. Minlie Huang.

I am dedicated to building scalable, generalist AI systems. My expertise centers on data and algorithms, and I am also actively developing infrastructure skills.

  • My current work focuses on advancing the reasoning capabilities of Qwen models (e.g., QwQ) through large-scale reinforcement learning (RL), enabling them to tackle increasingly complex tasks.
  • My research interests also broadly include model architecture design, interpretability and dynamics, as well as trustworthiness and alignment.
  • Previously, I conducted systematic research on LLMs for social good, with a main focus on building emotional support systems.

You can find my CV here.

News

  • [03/2025] Release the QwQ-32B reasoning model [blog] [🤗 model] [🤗 demo]
  • [01/2025] Release Qwen2.5-Math-PRM-7B/72B for process supervision in mathematical reasoning [paper] [🤗 model]
  • [12/2024] Release ProcessBench for process supervision in mathematical reasoning [paper] [repo] [🤗 data]
  • [12/2024] Release the Yi-Lightning technical report [paper]
  • [11/2024] Release the QwQ-32B-Preview reasoning model [blog] [🤗 model] [🤗 demo]
  • [10/2024] Release Yi-Lightning, which ranks #6 on Chatbot Arena (as of 10/14/2024)
  • [05/2024] The DRO paper is accepted at ICML 2024 [paper]
  • [04/2024] Release the paper on model extrapolation (ExPO) for efficient LLM alignment [paper]
  • [01/2024] Release the paper on safety prompt optimization (DRO) for LLM safeguarding [paper]
  • [01/2024] The PriDe paper is accepted as a Spotlight (5%) at ICLR 2024 [paper]
  • [11/2023] Start my visiting research at UCLA, hosted by Nanyun (Violet) Peng
  • [09/2023] Release the paper on LLM debiasing (PriDe) in multiple-choice QA [paper]

Selected Projects

You can find my full paper list on Google Scholar.

  1. QwQ-32B: Embracing the Power of Reinforcement Learning
    Qwen Team
    [blog] [🤗 model] [🤗 demo]
  2. QwQ: Reflect Deeply on the Boundaries of the Unknown
    Qwen Team
    [blog] [🤗 model] [🤗 demo]
  3. ProcessBench: Identifying Process Errors in Mathematical Reasoning
    Chujie Zheng, Zhenru Zhang, Beichen Zhang, Runji Lin, Keming Lu, Bowen Yu, Dayiheng Liu, Jingren Zhou, Junyang Lin
    [paper] [repo] [🤗 data]
  4. The Lessons of Developing Process Reward Models in Mathematical Reasoning
    Zhenru Zhang, Chujie Zheng, Yangzhen Wu, Beichen Zhang, Runji Lin, Keming Lu, Bowen Yu, Dayiheng Liu, Jingren Zhou, Junyang Lin
    [paper] [🤗 model]
  5. Yi-Lightning Technical Report
    01.AI
    [paper]
  6. Model Extrapolation Expedites Alignment
    Chujie Zheng, Ziqi Wang, Heng Ji, Minlie Huang, Nanyun Peng
    [paper] [repo] [🤗 model] (130K+ downloads)
  7. On Prompt-Driven Safeguarding for Large Language Models
    Chujie Zheng, Fan Yin, Hao Zhou, Fandong Meng, Jie Zhou, Kai-Wei Chang, Minlie Huang, Nanyun Peng
    ICML 2024
    [paper] [repo]
  8. Large Language Models Are Not Robust Multiple Choice Selectors
    Chujie Zheng, Hao Zhou, Fandong Meng, Jie Zhou, Minlie Huang
    ICLR 2024 (Spotlight: 5%)
    [paper] [repo]
  9. Chat Templates for 🤗 HuggingFace Large Language Models
    Chujie Zheng
    GitHub Repository (600+ stars)
    [repo]

Education

  • Aug 2020 – Present. Ph.D. candidate in Computer Science and Technology, Tsinghua University. Advisor: Minlie Huang
  • Nov 2023 – Jun 2024. Visiting Researcher, University of California, Los Angeles. Host: Nanyun Peng
  • Aug 2016 – Jul 2020. B.Sc. in Foundational Mathematics and Physics, Tsinghua University

Work Experiences

  • Oct 2024 – Present. Researcher. Qwen Team, Alibaba Group
    • Built the QwQ series of reasoning models
    • Built ProcessBench and Qwen2.5-Math-PRM-7B/72B for process supervision in mathematical reasoning
  • Jul 2024 – Oct 2024. Research Intern. 01.AI
    • Built Yi-Lightning, which ranks #6 on Chatbot Arena (as of 10/14/2024)
  • Feb 2022 – Jun 2022. Research Intern. Baidu

Services

  • Area Chair: ACL (24/25), EMNLP (24/25), NAACL (25), ACL Rolling Review (24/25)
  • Reviewer: ICLR (25), NeurIPS (24/25), ICML (24), COLM (24/25), ACL (22/23), EMNLP (21/22), NAACL (24), EACL (23), ACL Rolling Review (21/22/23), CogSci (24), AAAI (22/23)

Awards and Honors

  • Comprehensive Merit Scholarship, Tsinghua University, 2021 – 2024
  • Outstanding Undergraduate, Tsinghua University, 2020
  • National Scholarship (Top 2/100), Ministry of Education of China, 2019