About me
Welcome to Wei-Lin Chiang’s page!
- I am a CS PhD student at UC Berkeley SkyLab, working with Prof. Ion Stoica.
- My research focuses on building evaluation systems for large language models. I’m currently co-leading the Chatbot Arena project, a live LLM leaderboard at LMSYS.org.
- I am honored to have received the a16z Open Source AI Grant for our work on Chatbot Arena, Vicuna, and FastChat.
- Check out our work at LMSYS.org on advancing open LLM research! Or find my random thoughts on X :)
News
- [2024.04] New blog: Arena Hard – a pipeline for producing LLM benchmarks from live data.
- [2024.04] I’ll be giving a talk at the Berkeley LLM Meetup.
- [2024.03] We’ve published the Chatbot Arena technical report.
Projects
- Chatbot Arena: a live platform for evaluating LLMs by human preference
Our platform has served millions of users and gathered over 700K user votes; our LLM leaderboard has been widely cited by AI leaders and researchers, including Jeff Dean, Andrej Karpathy, Greg Brockman, and the Stanford HAI annual report
| Paper | Blog | Website |
- FastChat: a multi-model serving system for large language models
FastChat is an open-source system powering Chatbot Arena and has gained a strong developer community (over 30K GitHub stars and 200+ contributors)
- LLM Judge: model-based evaluation for LLM chatbots
Our LLM benchmark MT-Bench has been widely adopted by leading model developers (e.g., Mistral, HuggingFace, Databricks) and was recently upgraded to Arena Hard
| Paper | Code |
- Vicuna: one of the first open LLMs demonstrating multi-turn ChatGPT capability
The model has received over 5 million downloads and 1000+ citations
| Blog | Weights |
- SkyPilot: an intercloud system for running AI and batch jobs on any cloud
Supports 10+ major clouds; adopted by leading AI startups such as Mistral and Covariant
| GitHub | Paper |
- Cluster-GCN: one of the first scalable methods for training large and deep GNNs
Our method has been widely adopted in academia and industry (see DGL, PyTorch Geometric integration, Stanford CS224W)
| GitHub | Paper |
Work Experience
- Intern@Amazon, Seattle (May 2021 - Aug. 2021)
Contrastive learning for information extraction on semi-structured webpages
- Intern@Google Research, Mountain View (Dec. 2018 - Mar. 2019)
Efficient algorithms for training large and deep GCN models.
Cluster-GCN paper, code
- Intern@Alibaba Group, Hangzhou (July 2017 - Sept. 2017)
Distributed ML algorithms on Alibaba’s parameter server (KunPeng)
- Intern@Microsoft Research Asia, Beijing (Dec. 2016 - Feb. 2017)
Distributed training for deep learning frameworks
- Intern@Microsoft, Redmond (July 2016 - Oct. 2016)
Large-scale ML algorithms on Microsoft’s distributed platform (REEF)
Publications (full list on Google Scholar)
- Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference
Wei-Lin Chiang*, Lianmin Zheng*, Ying Sheng, Anastasios Nikolas Angelopoulos, Tianle Li, Dacheng Li, Hao Zhang, Banghua Zhu, Michael Jordan, Joseph E. Gonzalez, Ion Stoica (*equal contribution)
arXiv preprint
- LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset
Lianmin Zheng*, Wei-Lin Chiang*, Ying Sheng, Tianle Li, Siyuan Zhuang, Zhanghao Wu, Yonghao Zhuang, Zhuohan Li, Zi Lin, Eric P. Xing, Joseph E. Gonzalez, Ion Stoica, Hao Zhang (*equal contribution)
ICLR 2024
- LLM-Assisted Code Cleaning for Training Accurate Code Generators
Naman Jain, Tianjun Zhang, Wei-Lin Chiang, Joseph E. Gonzalez, Koushik Sen, Ion Stoica
ICLR 2024
- Rethinking Benchmark and Contamination for Language Models with Rephrased Samples
Shuo Yang*, Wei-Lin Chiang*, Lianmin Zheng*, Joseph E. Gonzalez, Ion Stoica (*equal contribution)
arXiv preprint
- Judging LLM-as-a-judge with MT-Bench and Chatbot Arena
Lianmin Zheng*, Wei-Lin Chiang*, Ying Sheng*, Siyuan Zhuang, Zhanghao Wu, Yonghao Zhuang, Zi Lin, Zhuohan Li, Dacheng Li, Eric Xing, Hao Zhang, Joseph Gonzalez, Ion Stoica (*equal contribution)
NeurIPS 2023 Datasets and Benchmarks Track
- Vicuna: An Open-Source Chatbot Impressing GPT-4 with 90%* ChatGPT Quality
Wei-Lin Chiang, Zhuohan Li, Zi Lin, Ying Sheng, Zhanghao Wu, Hao Zhang, Lianmin Zheng, Siyuan Zhuang, Yonghao Zhuang, Joseph Gonzalez, Ion Stoica, Eric Xing (alphabetical order)
Blog post, model weights
- Can’t Be Late: Optimizing Spot Instance Savings under Deadlines
Zhanghao Wu, Wei-Lin Chiang, Zongheng Yang, Eric Friedman, Scott Shenker, Ion Stoica.
NSDI 2024 (Outstanding Paper Award)
- SkyPilot: An Intercloud Broker for Sky Computing
Zongheng Yang, Zhanghao Wu, Michael Luo, Wei-Lin Chiang, Romil Bhardwaj, Woosuk Kwon, Siyuan Zhuang, Frank Sifei Luan, Gautam Mittal, Scott Shenker, Ion Stoica
USENIX NSDI 2023
- Balsa: Learning a Query Optimizer Without Expert Demonstrations
Zongheng Yang, Wei-Lin Chiang+, Sifei Luan+, Gautam Mittal, Michael Luo, Ion Stoica. (+ equal contribution)
ACM SIGMOD 2022
- Manifold Identification for Ultimately Communication-Efficient Distributed Optimization
Yu-Sheng Li, Wei-Lin Chiang, and Ching-pei Lee.
International Conference on Machine Learning (ICML), 2020
- Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks [code, dataset (Amazon2M)]
Wei-Lin Chiang, Xuanqing Liu, Si Si, Yang Li, Samy Bengio, and Cho-Jui Hsieh.
ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD), 2019 (Oral) slides, poster
- Preconditioned Conjugate Gradient Methods in Truncated Newton Frameworks for Large-scale Linear Classification [supplement & code. Implementation available in LIBLINEAR after version 2.20.]
Chih-Yang Hsia, Wei-Lin Chiang, and Chih-Jen Lin.
Asian Conference on Machine Learning (ACML), 2018 (Best paper award) slides, poster
- Limited-memory Common-directions Method for Distributed L1-regularized Linear Classification [supplement & code. Implementation available in Distributed LIBLINEAR.]
Wei-Lin Chiang, Yu-Sheng Li, Ching-pei Lee, and Chih-Jen Lin.
SIAM International Conference on Data Mining (SDM), 2018 slides, poster
- Parallel Dual Coordinate Descent Method for Large-scale Linear Classification in Multi-core Environments [supplement, code. Implementation available in Multi-core LIBLINEAR.]
Wei-Lin Chiang, Mu-Chu Lee, and Chih-Jen Lin.
ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD), 2016 poster
- Fast Matrix-vector Multiplications for Large-scale Logistic Regression on Shared-memory Systems [supplement, code. Implementation available in Multi-core LIBLINEAR.]
Mu-Chu Lee, Wei-Lin Chiang, and Chih-Jen Lin.
IEEE International Conference on Data Mining (ICDM), 2015 slides