-
Learning What to Learn: Curriculum Curation for Test-Time Agent Learning
Qizheng Zhang*, Sherry Ruan*, Shubhangi Upasani, Fenglu Hong, Changxiu Ji, Changran Hu, Bo Li, Hanchen Li, Kunle Olukotun
ICLR Workshop on AI with Recursive Self-Improvement, 2026
-
Test-Time Adaptation via Many-Shot Prompting: Benefits, Limits, and Pitfalls
Shubhangi Upasani, Chen Wu, Jay Rainton, Changran Hu, Qizheng Zhang, Bo Li, Urmish Thakker
ICLR Workshop on AI with Recursive Self-Improvement, 2026
-
Adaptation of Agentic AI: A Survey of Post-Training, Memory, and Skills
Pengcheng Jiang, Jiacheng Lin, ..., Qizheng Zhang, ..., Jiawei Han (35 authors in total)
arXiv preprint, 2025
-
FrontierCS: Evolving Challenges for Evolving Intelligence
Qiuyang Mang, Wenhao Chai, ..., Qizheng Zhang, ..., Alvin Cheung (52 authors in total)
arXiv preprint, 2025
-
EvicPress: Joint KV-Cache Compression and Eviction for Efficient LLM Serving
Shaoting Feng, Yuhan Liu, Xiaokun Chen, Hanchen Li, Samuel Shen, Kuntai Du, Zhuohan Gu, Rui Zhang, Yuyang Huang, Yihua Cheng, Jiayi Yao, Qizheng Zhang, Ganesh Ananthanarayanan, Junchen Jiang
arXiv preprint, 2025
-
Continuum: Efficient and Robust Multi-Turn LLM Agent Scheduling with KV Cache Time-to-Live
Hanchen Li, Qiuyang Mang, Runyuan He, Qizheng Zhang, Huanzhi Mao, Xiaokun Chen, Alvin Cheung, Joseph Gonzalez, Ion Stoica
arXiv preprint, 2025
-
A Benchmark of Expert-Level Academic Questions to Assess AI Capabilities
Center for AI Safety, Scale AI, HLE Contributors Consortium
Nature, 2026
-
FlowRL: Matching Reward Distributions for LLM Reasoning
Xuekai Zhu, Daixuan Cheng, Dinghuai Zhang, Hengli Li, Kaiyan Zhang, Che Jiang, Youbang Sun, Ermo Hua, Yuxin Zuo, Xingtai Lv, Qizheng Zhang, Lin Chen, Fanghao Shao, Bo Xue, Yunchong Song, Zhenjie Yang, Ganqu Cui, Ning Ding, Jianfeng Gao, Xiaodong Liu, Bowen Zhou, Hongyuan Mei, Zhouhan Lin
International Conference on Learning Representations (ICLR), 2026
-
Agentic Bridge Framework: Closing the Gap Between Agentic Capability and Performance Benchmarks
Yun Du, Rubens Lacouture, Qizheng Zhang, Genghan Zhang, Tian Zhao, Kunle Olukotun
NeurIPS Workshop on Machine Learning for Systems, 2025
-
LowRA: Accurate and Efficient LoRA Fine-Tuning of LLMs under 2 Bits
Zikai Zhou, Qizheng Zhang, Hermann Kumbong, Kunle Olukotun
International Conference on Machine Learning (ICML), 2025
-
Agentic Plan Caching: Test-Time Memory for Fast and Cost-Efficient LLM Agents
Qizheng Zhang, Michael Wornow, Kunle Olukotun
Conference on Neural Information Processing Systems (NeurIPS), 2025
-
CacheBlend: Fast Large Language Model Serving for RAG with Cached Knowledge Fusion
Jiayi Yao, Hanchen Li, Yuhan Liu, Siddhant Ray, Yihua Cheng, Qizheng Zhang, Kuntai Du, Shan Lu, Junchen Jiang
ACM European Conference on Computer Systems (EuroSys), 2025
EuroSys Best Paper Award
-
CacheGen: KV Cache Compression and Streaming for Fast Large Language Model Serving
Yuhan Liu, Hanchen Li, Yihua Cheng, Siddhant Ray, Yuyang Huang, Qizheng Zhang, Kuntai Du, Jiayi Yao, Shan Lu, Ganesh Ananthanarayanan, Michael Maire, Henry Hoffmann, Ari Holtzman, Junchen Jiang
ACM Special Interest Group on Data Communication (SIGCOMM), 2024
-
Caravan: Practical Online Learning of In-Network ML Models with Labeling Agents
Qizheng Zhang, Ali Imran, Enkeleda Bardhi, Tushar Swamy, Nathan Zhang, Muhammad Shahbaz, Kunle Olukotun
USENIX Symposium on Operating Systems Design and Implementation (OSDI), 2024
SRC JUMP 2.0 Best Paper Award
-
The Dataflow Abstract Machine Simulator Framework
Nathan Zhang, Rubens Lacouture, Gina Sohn, Paul Mure, Qizheng Zhang, Fredrik Kjolstad, Kunle Olukotun
ACM/IEEE International Symposium on Computer Architecture (ISCA), 2024
ISCA Distinguished Artifact Award
-
GRACE: Loss-Resilient Real-Time Video through Neural Codecs
Yihua Cheng, Ziyi Zhang, Hanchen Li, Anton Arapin, Yue Zhang, Qizheng Zhang, Yuhan Liu, Kuntai Du, Xu Zhang, Francis Y. Yan, Amrita Mazumdar, Nick Feamster, Junchen Jiang
USENIX Symposium on Networked Systems Design and Implementation (NSDI), 2024
-
OneAdapt: Fast Adaptation for Deep Learning Applications via Backpropagation
Kuntai Du, Yuhan Liu, Yitian Hao, Qizheng Zhang, Haodong Wang, Yuyang Huang, Ganesh Ananthanarayanan, Junchen Jiang
ACM Symposium on Cloud Computing (SoCC), 2023
-
Optimizing Real-Time Video Experience with Data Scalable Codec
Hanchen Li*, Yihua Cheng*, Ziyi Zhang, Qizheng Zhang, Anton Arapin, Nick Feamster, Amrita Mazumdar
ACM SIGCOMM Workshop on Emerging Multimedia Systems (EMS), 2023
-
AccMPEG: Optimizing Video Encoding for Video Analytics
Kuntai Du, Qizheng Zhang, Anton Arapin, Haodong Wang, Zhengxu Xia, Junchen Jiang
Conference on Machine Learning and Systems (MLSys), 2022
-
Understanding the Potential of Server-Driven Edge Video Analytics
Qizheng Zhang, Kuntai Du, Neil Agarwal, Ravi Netravali, Junchen Jiang
ACM International Workshop on Mobile Computing Systems and Applications (HotMobile), 2022
-
Server-Driven Video Streaming for Deep Learning Inference
Kuntai Du*, Ahsan Pervaiz*, Xin Yuan, Aakanksha Chowdhery, Qizheng Zhang, Henry Hoffmann, Junchen Jiang
ACM Special Interest Group on Data Communication (SIGCOMM), 2020