Chujie Zheng 郑楚杰

Welcome! I am Chujie Zheng, a final-year Ph.D. candidate in the CoAI Group at Tsinghua University, advised by Prof. Minlie Huang. I was a visiting scholar at PlusLab at UCLA, hosted by Prof. Nanyun (Violet) Peng. You can find my CV here.

I have a broad research interest in building efficient, scalable, and trustworthy AI systems, with a current focus on LLM alignment (preprint, ICLR 2024 Spotlight, ICML 2024, ACL 2023). My research goal is to advance and oversee AI systems with minimal human intervention and to ensure they work responsibly and transparently.

I have conducted extensive research on LLMs for social good, with a main focus on building LLMs for emotional support (ACL 2021, ACL 2023 Findings). I have also built a series of popular NLP datasets, including ChID, KdConv, ESConv, and CDConv.

I maintain the GitHub repository of chat templates for 🤗 LLMs, which has received 500+ stars.

News

  • [10/2024] The Chat Template repository received 500+ stars [repo]
  • [10/2024] Yi-Lightning, which I contributed to during my internship at 01.AI, ranks #6 on Chatbot Arena and #3 in the Math category (as of 10/14/2024)! Huge congrats to the team! 🍻
  • [06/2024] Our ExPO paper is accepted at the Models of Human Feedback for AI Alignment Workshop @ ICML 2024 [paper]
  • [06/2024] My Google Scholar profile reached 1000+ citations!
  • [05/2024] Our uploaded ExPO-enhanced LLMs [paper] received 10K+ downloads in 2 weeks on 🤗 HuggingFace
  • [05/2024] Our DRO paper is accepted at ICML 2024 [paper]
  • [04/2024] Arxived our paper on LLM alignment via model extrapolation (ExPO) [paper]
  • [03/2024] Our DRO paper is accepted for Oral presentation (5%) at the Secure and Trustworthy LLM Workshop @ ICLR 2024 [paper]
  • [02/2024] Released a GitHub repository of chat templates for HuggingFace LLMs [repo]
  • [01/2024] Arxived our paper on safety prompt optimization (DRO) for safeguarding LLMs [paper]
  • [01/2024] Our PriDe paper is accepted for Spotlight presentation (5%) at ICLR 2024 [paper]
  • [11/2023] Started my visiting research at UCLA, hosted by Nanyun Peng
  • [09/2023] Arxived our paper on debiasing LLMs (PriDe) in MCQ evaluation [paper]

Selected Projects

  1. Weak-to-Strong Extrapolation Expedites Alignment
    Chujie Zheng, Ziqi Wang, Heng Ji, Minlie Huang, Nanyun Peng
    MHFAIA Workshop @ ICML 2024 (70K+ downloads) (Cited by DeepMind, Oxford, etc.)
    [paper] [repo] [🤗 HuggingFace]
  2. On Prompt-Driven Safeguarding for Large Language Models
    Chujie Zheng, Fan Yin, Hao Zhou, Fandong Meng, Jie Zhou, Kai-Wei Chang, Minlie Huang, Nanyun Peng
    ICML 2024 || SeT LLM Workshop @ ICLR 2024 (Oral: 5%) (Cited by Anthropic, MIT, etc.)
    [paper] [repo]
  3. Chat Templates for 🤗 HuggingFace Large Language Models
    Chujie Zheng
    GitHub Repository (500+ stars)
    [repo]
  4. Large Language Models Are Not Robust Multiple Choice Selectors
    Chujie Zheng, Hao Zhou, Fandong Meng, Jie Zhou, Minlie Huang
    ICLR 2024 (Spotlight: 5%) (Cited by Meta’s LLaMA-3, DeepMind, etc.)
    [paper] [repo]
  5. Click: Controllable Text Generation with Sequence Likelihood Contrastive Learning
    Chujie Zheng, Pei Ke, Zheng Zhang, Minlie Huang
    Findings of ACL 2023 (Early work on preference optimization. Completed in 2022)
    [paper] [repo]
  6. AugESC: Dialogue Augmentation with Large Language Models for Emotional Support Conversation
    Chujie Zheng, Sahand Sabour, Jiaxin Wen, Zheng Zhang, Minlie Huang
    Findings of ACL 2023 (Early work on LLM-based data synthesis. Completed in 2021)
    [paper] [repo]
  7. Towards Emotional Support Dialog Systems
    Siyang Liu*, Chujie Zheng*, Orianna Demasi, Sahand Sabour, Yu Li, Zhou Yu, Yong Jiang, Minlie Huang (*: Equal contribution)
    ACL-IJCNLP 2021 (Oral)
    [paper] [repo]

You can find my full paper list on Google Scholar.

Education

  • Aug 2020 - Present. Ph.D. candidate in Computer Science and Technology, Tsinghua University. Advisor: Minlie Huang
  • Nov 2023 - Jun 2024. Visiting Scholar, UCLA. Host: Nanyun (Violet) Peng
  • Aug 2016 - Jul 2020. B.Sc. in Foundational Mathematics and Physics, Tsinghua University. Major GPA: 3.98/4.00 (ranking 2/59)

Work Experience

  • Oct 2024 - Present. Research Intern. Qwen Post-training Team, Alibaba Cloud
  • Jul 2024 - Oct 2024. Research Intern. AI Alignment Team, 01.AI
    • Improved Yi’s mathematical reasoning ability via reward model enhancement
    • Yi-Lightning ranks #6 on Chatbot Arena and #3 in Math Category (as of 10/14/2024)
  • Feb 2022 - Jun 2022. Research Intern. General Dialogue Team, Baidu
    • Improved PLATO’s dialogue consistency via building contradiction detectors

Services

  • Area Chair: ACL (24), EMNLP (24), NAACL (25), ACL Rolling Review (24)
  • Reviewer: ICLR (25), NeurIPS (24), ICML (24), COLM (24), ACL (22/23), EMNLP (21/22), NAACL (24), EACL (23), ACL Rolling Review (21/22/23), CogSci (24), AAAI (22/23)

Awards and Honors

  • Schlumberger Scholarship, Tsinghua University, 2023
  • Comprehensive Merit Scholarship, Tsinghua University, 2022/2021
  • Chi-Sun YEH (叶企孙) Scholarship (Top 5/100), Department of Physics, Tsinghua University, 2020
  • Outstanding Undergraduate, Tsinghua University, 2020
  • China National Scholarship (Top 2/100), 2019
  • Comprehensive Merit Scholarship, Tsinghua University, 2018

Talks

  • Jul 2024, AI Tlite Think Tank, WAIC 2024. Towards Efficient LLM Alignment
  • Jun 2024, AI Time. On Prompt-Driven Safeguarding for Large Language Models (ICML 2024) [video]
  • Feb 2024, AI Time. Large Language Models Are Not Robust Multiple Choice Selectors (ICLR 2024 Spotlight) [video]
  • Nov 2022, Shanghai AI Lab. Towards Well-behaved Dialogue Systems
  • Jul 2021, AI Time. Approaches of Empathy Expression and Emotional Support in Dialogue Systems (ACL 2021) [video]
  • Nov 2020, Biendata & PaperWeekly. Difference-aware Knowledge Selection for Knowledge-grounded Conversation Generation (Findings of EMNLP 2020) [video]
  • Jul 2020, AI Time. KdConv: A Chinese Multi-domain Dialogue Dataset Towards Multi-turn Knowledge-driven Conversation (ACL 2020) [video]