-
Adaptation of Agentic AI
Pengcheng Jiang, Jiacheng Lin, Zhiyi Shi, Zifeng Wang, Luxi He, Yichen Wu, Ming Zhong, Peiyang Song, Qizheng Zhang, Heng Wang, Xueqiang Xu, Hanwen Xu, Pengrui Han, Dylan Zhang, Jiashuo Sun, Chaoqi Yang, Kun Qian, Tian Wang, Changran Hu, Manling Li, Quanzheng Li, Hao Peng, Sheng Wang, Jingbo Shang, Chao Zhang, Jiaxuan You, Liyuan Liu, Pan Lu, Yu Zhang, Heng Ji, Yejin Choi, Dawn Song, Jimeng Sun, Jiawei Han
arXiv preprint, 2025
-
FrontierCS: Evolving Challenges for Evolving Intelligence
Qiuyang Mang, Wenhao Chai, Zhifei Li, Huanzhi Mao, Shang Zhou, Alexander Du, Hanchen Li, Shu Liu, Edwin Chen, Yichuan Wang, Xieting Chu, Zerui Cheng, Yuan Xu, Tian Xia, Zirui Wang, Tianneng Shi, Jianzhu Yao, Yilong Zhao, Qizheng Zhang, Charlie Ruan, Zeyu Shen, Kaiyuan Liu, Runyuan He, Dong Xing, Zerui Li, Zirong Zeng, Yige Jiang, Lufeng Cheng, Ziyi Zhao, Youran Sun, Wesley Zheng, Meiyuwang Zhang, Ruyi Ji, Xuechang Tu, Zihan Zheng, Zexing Chen, Kangyang Zhou, Zhaozi Wang, Jingbang Chen, Aleksandra Korolova, Peter Henderson, Pramod Viswanath, Vijay Ganesh, Saining Xie, Zhuang Liu, Dawn Song, Sewon Min, Ion Stoica, Joseph E. Gonzalez, Jingbo Shang, Alvin Cheung
arXiv preprint, 2025
-
EvicPress: Joint KV-Cache Compression and Eviction for Efficient LLM Serving
Shaoting Feng, Yuhan Liu, Xiaokun Chen, Hanchen Li, Samuel Shen, Kuntai Du, Zhuohan Gu, Rui Zhang, Yuyang Huang, Yihua Cheng, Jiayi Yao, Qizheng Zhang, Ganesh Ananthanarayanan, Junchen Jiang
arXiv preprint, 2025
-
Continuum: Efficient and Robust Multi-Turn LLM Agent Scheduling with KV Cache Time-to-Live
Hanchen Li, Qiuyang Mang, Runyuan He, Qizheng Zhang, Huanzhi Mao, Xiaokun Chen, Alvin Cheung, Joseph Gonzalez, Ion Stoica
arXiv preprint, 2025
-
A Benchmark of Expert-Level Academic Questions to Assess AI Capabilities
Center for AI Safety, Scale AI, HLE Contributors Consortium
Nature, 2026
-
FlowRL: Matching Reward Distributions for LLM Reasoning
Xuekai Zhu, Daixuan Cheng, Dinghuai Zhang, Hengli Li, Kaiyan Zhang, Che Jiang, Youbang Sun, Ermo Hua, Yuxin Zuo, Xingtai Lv, Qizheng Zhang, Lin Chen, Fanghao Shao, Bo Xue, Yunchong Song, Zhenjie Yang, Ganqu Cui, Ning Ding, Jianfeng Gao, Xiaodong Liu, Bowen Zhou, Hongyuan Mei, Zhouhan Lin
International Conference on Learning Representations (ICLR), 2026
-
Agentic Bridge Framework: Closing the Gap Between Agentic Capability and Performance Benchmarks
Yun Du, Rubens Lacouture, Qizheng Zhang, Genghan Zhang, Tian Zhao, Kunle Olukotun
NeurIPS Workshop on Machine Learning for Systems, 2025
-
LowRA: Accurate and Efficient LoRA Fine-Tuning of LLMs under 2 Bits
Zikai Zhou, Qizheng Zhang, Hermann Kumbong, Kunle Olukotun
International Conference on Machine Learning (ICML), 2025
-
CacheBlend: Fast Large Language Model Serving for RAG with Cached Knowledge Fusion
Jiayi Yao, Hanchen Li, Yuhan Liu, Siddhant Ray, Yihua Cheng, Qizheng Zhang, Kuntai Du, Shan Lu, Junchen Jiang
ACM European Conference on Computer Systems (EuroSys), 2025
EuroSys Best Paper Award
-
CacheGen: KV Cache Compression and Streaming for Fast Large Language Model Serving
Yuhan Liu, Hanchen Li, Yihua Cheng, Siddhant Ray, Yuyang Huang, Qizheng Zhang, Kuntai Du, Jiayi Yao, Shan Lu, Ganesh Ananthanarayanan, Michael Maire, Henry Hoffmann, Ari Holtzman, Junchen Jiang
ACM Special Interest Group on Data Communication (SIGCOMM), 2024
-
The Dataflow Abstract Machine Simulator Framework
Nathan Zhang, Rubens Lacouture, Gina Sohn, Paul Mure, Qizheng Zhang, Fredrik Kjolstad, Kunle Olukotun
ACM/IEEE International Symposium on Computer Architecture (ISCA), 2024
ISCA Distinguished Artifact Award
-
GRACE: Loss-Resilient Real-Time Video through Neural Codecs
Yihua Cheng, Ziyi Zhang, Hanchen Li, Anton Arapin, Yue Zhang, Qizheng Zhang, Yuhan Liu, Kuntai Du, Xu Zhang, Francis Y. Yan, Amrita Mazumdar, Nick Feamster, Junchen Jiang
USENIX Symposium on Networked Systems Design and Implementation (NSDI), 2024
-
OneAdapt: Fast Adaptation for Deep Learning Applications via Backpropagation
Kuntai Du, Yuhan Liu, Yitian Hao, Qizheng Zhang, Haodong Wang, Yuyang Huang, Ganesh Ananthanarayanan, Junchen Jiang
ACM Symposium on Cloud Computing (SoCC), 2023
-
Optimizing Real-Time Video Experience with Data Scalable Codec
Hanchen Li*, Yihua Cheng*, Ziyi Zhang, Qizheng Zhang, Anton Arapin, Nick Feamster, Amrita Mazumdar
ACM SIGCOMM Workshop on Emerging Multimedia Systems (EMS), 2023
-
AccMPEG: Optimizing Video Encoding for Video Analytics
Kuntai Du, Qizheng Zhang, Anton Arapin, Haodong Wang, Zhengxu Xia, Junchen Jiang
Conference on Machine Learning and Systems (MLSys), 2022
-
Understanding the Potential of Server-Driven Edge Video Analytics
Qizheng Zhang, Kuntai Du, Neil Agarwal, Ravi Netravali, Junchen Jiang
ACM International Workshop on Mobile Computing Systems and Applications (HotMobile), 2022
-
Server-Driven Video Streaming for Deep Learning Inference
Kuntai Du*, Ahsan Pervaiz*, Xin Yuan, Aakanksha Chowdhery, Qizheng Zhang, Henry Hoffmann, Junchen Jiang
ACM Special Interest Group on Data Communication (SIGCOMM), 2020