Chujie Zheng 郑楚杰

I am a researcher on the Qwen Team, Alibaba Group. I received my doctoral degree in Computer Science and Technology from Tsinghua University.

I am dedicated to building scalable, generalist AI systems. Specifically, I am interested in methodologies that can consistently and efficiently improve the intelligence and task-solving abilities of AI systems with increasing compute and data. My current work focuses on advancing the reasoning capabilities of the Qwen models (e.g., Qwen3, QwQ) and developing large-scale reinforcement learning (RL) approaches.

My research interests also broadly span model architecture, interpretability, safety, and alignment. Previously, I conducted extensive research on LLMs for social good, with a main focus on building emotional support systems.

You can find my CV here.

News

  • [06/2025] Release the paper on driving reasoning RL with high-entropy tokens (80/20 Rule) [paper]
  • [05/2025] Release the paper on modeling world preference (WorldPM) [paper]
  • [05/2025] Release the Qwen3 technical report [paper]
  • [05/2025] The ExPO, ProcessBench, and Qwen2.5-Math-PRM papers are accepted to ACL 2025
  • [04/2025] Release the Qwen3 series foundation models [blog] [model] [chat]
  • [03/2025] Release the QwQ-32B reasoning model [blog] [model] [demo]
  • [02/2025] Release the SuperGPQA benchmark for comprehensive LLM evaluation [paper] [data]
  • [01/2025] Release the Qwen2.5-Math-PRM models for process supervision in mathematical reasoning [paper] [model]
  • [12/2024] Release the ProcessBench benchmark for process supervision in mathematical reasoning [paper] [repo] [data]
  • [12/2024] Release the Yi-Lightning technical report [paper]
  • [11/2024] Release the QwQ-32B-Preview reasoning model [blog] [model] [demo]
  • [10/2024] Release the Yi-Lightning foundation model
  • [05/2024] The DRO paper is accepted to ICML 2024
  • [04/2024] Release the paper on model extrapolation (ExPO) for efficient LLM alignment [paper]
  • [01/2024] Release the paper on safety prompt optimization (DRO) for LLM safeguarding [paper]
  • [01/2024] The PriDe paper is accepted to ICLR 2024 as Spotlight (5%)
  • [11/2023] Start my visiting research at UCLA, hosted by Nanyun (Violet) Peng
  • [09/2023] Release the paper on LLM debiasing (PriDe) in multiple-choice QA [paper]

Recent Projects

You can find my full paper list on Google Scholar.

  1. Qwen3 Technical Report
    Qwen Team
    [paper] [model] [chat]
  2. QwQ-32B: Embracing the Power of Reinforcement Learning
    Qwen Team
    [blog] [model] [demo]
  3. QwQ: Reflect Deeply on the Boundaries of the Unknown
    Qwen Team
    [blog] [model] [demo]

Education

  • Aug 2020 – Jun 2025. Ph.D. in Computer Science and Technology, Tsinghua University. Advisor: Minlie Huang
  • Nov 2023 – Jun 2024. Visiting Researcher. University of California, Los Angeles. Host: Nanyun Peng
  • Aug 2016 – Jul 2020. B.Sc. in Mathematics and Physics, Tsinghua University

Work Experience

  • Oct 2024 – Present. Researcher. Qwen Team, Alibaba Group
    • Built the Qwen3 series foundation models
    • Built the QwQ series reasoning models
    • Built ProcessBench and Qwen2.5-Math-PRM for process supervision in mathematical reasoning
  • Jul 2024 – Oct 2024. Research Intern. 01.AI
    • Built the Yi-Lightning foundation model
  • Feb 2022 – Jun 2022. Research Intern. Baidu

Services

  • Area Chair: ACL (24/25), EMNLP (24/25), NAACL (25), ACL Rolling Review (24/25)
  • Reviewer: ICLR (25), NeurIPS (24/25), ICML (24), COLM (24/25), ACL (22/23), EMNLP (21/22), NAACL (24), EACL (23), ACL Rolling Review (21/22/23), CogSci (24), AAAI (22/23)

Awards and Honors

  • Outstanding Graduate, DCST, Tsinghua University, 2025
  • Outstanding Undergraduate, Tsinghua University, 2020
  • National Scholarship (Top 2/100), Ministry of Education of China, 2019