Chujie Zheng

I am a researcher on the Qwen Team, where I study and scale reinforcement learning for the next-generation Qwen models.

Here is my CV.

News

  • [04/2026] Release the Qwen3.6 series updates [blog]
  • [02/2026] Release the Qwen3.5 series foundation models [blog] [model]
  • [12/2025] Release the paper on recipes for stable large-scale RL training [paper]
  • [07/2025] Release the Group Sequence Policy Optimization (GSPO) algorithm for large-scale MoE RL training [paper]
  • [07/2025] Release the Qwen3-2507 series updates
  • [05/2025] Release the Qwen3 technical report [paper]
  • [04/2025] Release the Qwen3 series foundation models [blog] [model]
  • [03/2025] Release the QwQ-32B reasoning model [blog] [model]
  • [01/2025] Release the Qwen2.5-Math-PRM models for process supervision in mathematical reasoning [paper] [model]
  • [12/2024] Release the ProcessBench benchmark for process supervision in mathematical reasoning [paper] [repo] [data]
  • [11/2024] Release the QwQ-32B-Preview reasoning model [blog] [model]

Recent Projects

  1. Qwen3.6-Plus: Towards Real World Agents
    Qwen Team
    [blog]
  2. Qwen3.5: Towards Native Multimodal Agents
    Qwen Team
    [blog] [model]
  3. Stabilizing Reinforcement Learning with LLMs: Formulation and Practices
    Chujie Zheng, Kai Dang, Bowen Yu, Mingze Li, Huiqiang Jiang, Junrong Lin, Yuqiong Liu, Hao Lin, Chencan Wu, Feng Hu, An Yang, Jingren Zhou, Junyang Lin
    [paper]
  4. Group Sequence Policy Optimization
    Chujie Zheng, Shixuan Liu, Mingze Li, Xiong-Hui Chen, Bowen Yu, Chang Gao, Kai Dang, Yuqiong Liu, Rui Men, An Yang, Jingren Zhou, Junyang Lin
    [paper]
  5. Qwen3: Think Deeper, Act Faster
    Qwen Team
    [blog] [model]
  6. QwQ-32B: Embracing the Power of Reinforcement Learning
    Qwen Team
    [blog] [model]
  7. QwQ: Reflect Deeply on the Boundaries of the Unknown
    Qwen Team
    [blog] [model]