Hi there, I’m Zerui Cheng (程泽瑞), a Ph.D. candidate at Princeton University advised by Prof. Pramod Viswanath. I was a student researcher at ByteDance Seed (contributor to Seed 2.0 Pro) and Tencent Hunyuan (contributor to Hy3 Preview). Before Princeton, I completed my B.Eng. in Computer Science in the Yao Class at Tsinghua University, graduating summa cum laude and receiving the prestigious Yao Award.
My research focuses on two areas, Evaluation of LLMs and Agents and Synthetic Data, both of which I believe are foundational to the goal of self-evolving agents for long-horizon tasks.
LLM evaluation is critical for two reasons. First, for frontier AI labs, evaluation defines and formalizes the tasks, goals, and benchmarks that guide the direction of AI progress. Second, for the broader public, it helps users understand which models are best suited for different use cases. At its core, the philosophy of LLM evaluation concerns how we define tasks, goals, and rubrics, as well as how we ensure that evaluation results and leaderboards are unbiased, robust, and trustworthy.
Synthetic data is equally important because human expertise is ultimately scarce, difficult to scale, and unable to keep pace with the rapid development of AI. For AI systems to continue improving in the future, they must increasingly learn to evolve themselves. The most foundational step toward this vision is providing AI with the “fuel” it needs to progress: data. In the long run, this data should not only be consumed by AI, but also proposed, generated, and refined by AI itself.
My research has been featured in venues including Nature, NeurIPS, ICLR, ICML, COLM, AAAI, ACM CCS, EuroSys, and IEEE Transactions on Networking, and has contributed to the technical whitepapers of high-profile startups including Sentient, Kite AI, and PolyHedra.
Beyond research, I am a member of the Competitive Programming Hall of Fame. I served as the President of the Yao Class Students’ Congress during my undergraduate years, and I was once a contestant on the TV show Super Brain (《最强大脑》, Season 10, Jiangsu Satellite TV).
I’m always open to research and industry collaborations, especially around LLM evaluation, synthetic data, and frontier AI systems. Feel free to contact me and chat!
Ph.D. student (2023 - now)
Electrical and Computer Engineering, Princeton University
B.Eng. in Computer Science (2019 - 2023)
Yao Class, the Institute for Interdisciplinary Information Sciences (IIIS), Tsinghua University
Two papers (FrontierCS, the Generalization Spectrum) accepted to ICML 2026!
One paper (ValueMine) accepted to the journal IEEE Transactions on Networking!
Two first-authored papers done at ByteDance Seed are online now!
Three papers accepted at various venues this month!
One paper (HLE) accepted to Nature!
One paper (TAO) accepted to EuroSys 2026!
One paper (AutoCode) accepted to ICLR 2026!
CAIA accepted to AAAI 2026 AI4Finance and selected for oral presentation (top 10%)!
Two papers (LiveCodeBench Pro, PeerBench) accepted to NeurIPS 2025!
For the most recent updates, please refer to my Google Scholar profile. Here are some selected publications.
OML: Open, Monetizable, Loyal AI (2024, NeurIPS 2025 Lock-LLM)
zkBridge (ACM CCS 2022)
VeRA: Verified Reasoning Data Augmentation at Scale
CAIA: Crypto AI Agent Benchmark
LiveCodeBench Pro (NeurIPS 2025) - Comprehensive, hard, and contamination-free code generation benchmark
Humanity’s Last Exam (2025) - Ultimate test for AI capabilities