About me

Welcome to Wei-Lin Chiang’s page!

I am a CS PhD student at UC Berkeley SkyLab, working with Prof. Ion Stoica.
My research focuses on building evaluation systems for large language models. I’m currently co-leading the Chatbot Arena project, a live LLM leaderboard at LMSYS.org.
I am honored to receive the a16z Open Source AI Grant for our work on Chatbot Arena, Vicuna, and FastChat.
Check out our work on advancing open LLM research at LMSYS.org! Or find my random thoughts on X :)

News

[2024.04] New blog: Arena Hard – a pipeline for building LLM benchmarks from live data.
[2024.04] I’ll be giving a talk at the Berkeley LLM Meetup.
[2024.03] We’ve published the technical report of Chatbot Arena.

Projects

Chatbot Arena: a live platform for evaluating LLMs by human preference
Our platform has served millions of users and gathered over 700K user votes; our LLM leaderboard has been widely cited by AI leaders and researchers, including Jeff Dean, Andrej Karpathy, and Greg Brockman, as well as by the Stanford HAI annual report
| Paper | Blog | Website |
FastChat: a multi-model serving system for large language models
FastChat is an open-source system powering Chatbot Arena and has gained a strong developer community (over 30K GitHub stars and 200+ contributors)
LLM Judge: model-based evaluation for LLM chatbots
Our LLM benchmark MT-Bench has been widely adopted by leading model developers (e.g., Mistral, HuggingFace, Databricks) and was recently upgraded to Arena Hard
| Paper | Code |
Vicuna: one of the first open LLMs demonstrating multi-turn ChatGPT capability
The model has received over 5 million downloads and 1000+ citations
| Blog | Weights |
SkyPilot: an intercloud system for running AI and batch jobs on any cloud
Supports 10+ major clouds; adopted by leading AI startups such as Mistral and Covariant
| GitHub | Paper |
Cluster-GCN: one of the first scalable methods for training large and deep GNNs
Our method has been widely adopted in academia and industry (see DGL, PyTorch Geometric integration, Stanford CS224W)
| GitHub | Paper |

Work Experience

Intern@Amazon, Seattle (May 2021 - Aug. 2021)
Contrastive learning for information extraction on semi-structured webpages
Intern@Google Research, Mountain View (Dec. 2018 - Mar. 2019)
Efficient algorithms for training large and deep GCN models.
Cluster-GCN paper, code
Intern@Alibaba Group, Hangzhou (July 2017 - Sept. 2017)
Distributed ML algorithms on Alibaba’s parameter server (KunPeng)
Intern@Microsoft Research Asia, Beijing (Dec. 2016 - Feb. 2017)
Distributed training for deep learning frameworks
Intern@Microsoft, Redmond (July 2016 - Oct. 2016)
Large-scale ML algorithms on Microsoft’s distributed platform (REEF)

Publications (full list on Google Scholar)

Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference
Wei-Lin Chiang*, Lianmin Zheng*, Ying Sheng, Anastasios Nikolas Angelopoulos, Tianle Li, Dacheng Li, Hao Zhang, Banghua Zhu, Michael Jordan, Joseph E. Gonzalez, Ion Stoica (*equal contribution)
arXiv preprint
LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset
Lianmin Zheng*, Wei-Lin Chiang*, Ying Sheng, Tianle Li, Siyuan Zhuang, Zhanghao Wu, Yonghao Zhuang, Zhuohan Li, Zi Lin, Eric P. Xing, Joseph E. Gonzalez, Ion Stoica, Hao Zhang (*equal contribution)
ICLR 2024
LLM-Assisted Code Cleaning for Training Accurate Code Generators
Naman Jain, Tianjun Zhang, Wei-Lin Chiang, Joseph E. Gonzalez, Koushik Sen, Ion Stoica
ICLR 2024
Rethinking Benchmark and Contamination for Language Models with Rephrased Samples
Shuo Yang*, Wei-Lin Chiang*, Lianmin Zheng*, Joseph E. Gonzalez, Ion Stoica (*equal contribution)
arXiv preprint
Judging LLM-as-a-judge with MT-Bench and Chatbot Arena
Lianmin Zheng*, Wei-Lin Chiang*, Ying Sheng*, Siyuan Zhuang, Zhanghao Wu, Yonghao Zhuang, Zi Lin, Zhuohan Li, Dacheng Li, Eric Xing, Hao Zhang, Joseph Gonzalez, Ion Stoica (*equal contribution)
NeurIPS 2023 Datasets and Benchmarks Track
Vicuna: An Open-Source Chatbot Impressing GPT-4 with 90%* ChatGPT Quality
Wei-Lin Chiang, Zhuohan Li, Zi Lin, Ying Sheng, Zhanghao Wu, Hao Zhang, Lianmin Zheng, Siyuan Zhuang, Yonghao Zhuang, Joseph Gonzalez, Ion Stoica, Eric Xing (alphabetical order)
Blog post, model weights
Can’t Be Late: Optimizing Spot Instance Savings under Deadlines
Zhanghao Wu, Wei-Lin Chiang, Zongheng Yang, Eric Friedman, Scott Shenker, Ion Stoica.
NSDI 2024 (Outstanding Paper Award)
SkyPilot: An Intercloud Broker for Sky Computing
Zongheng Yang, Zhanghao Wu, Michael Luo, Wei-Lin Chiang, Romil Bhardwaj, Woosuk Kwon, Siyuan Zhuang, Frank Sifei Luan, Gautam Mittal, Scott Shenker, Ion Stoica
USENIX NSDI 2023
Balsa: Learning a Query Optimizer Without Expert Demonstrations
Zongheng Yang, Wei-Lin Chiang⁺, Sifei Luan⁺, Gautam Mittal, Michael Luo, Ion Stoica. (+ equal contribution)
ACM SIGMOD 2022
Manifold Identification for Ultimately Communication-Efficient Distributed Optimization
Yu-Sheng Li, Wei-Lin Chiang, and Ching-pei Lee.
International Conference on Machine Learning (ICML), 2020
Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks [code, dataset (Amazon2M)]
Wei-Lin Chiang, Xuanqing Liu, Si Si, Yang Li, Samy Bengio, and Cho-Jui Hsieh.
ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD), 2019 (Oral) slides, poster
Preconditioned Conjugate Gradient Methods in Truncated Newton Frameworks for Large-scale Linear Classification [supplement & code. Implementation available in LIBLINEAR after version 2.20.]
Chih-Yang Hsia, Wei-Lin Chiang, and Chih-Jen Lin.
Asian Conference on Machine Learning (ACML), 2018 (Best paper award) slides, poster
Limited-memory Common-directions Method for Distributed L1-regularized Linear Classification [supplement & code. Implementation available in Distributed LIBLINEAR.]
Wei-Lin Chiang, Yu-Sheng Li, Ching-pei Lee, and Chih-Jen Lin.
SIAM International Conference on Data Mining (SDM), 2018 slides, poster
Parallel Dual Coordinate Descent Method for Large-scale Linear Classification in Multi-core Environments [supplement, code. Implementation available in Multi-core LIBLINEAR.]
Wei-Lin Chiang, Mu-Chu Lee, and Chih-Jen Lin.
ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD), 2016 poster
Fast Matrix-vector Multiplications for Large-scale Logistic Regression on Shared-memory Systems [supplement, code. Implementation available in Multi-core LIBLINEAR.]
Mu-Chu Lee, Wei-Lin Chiang, and Chih-Jen Lin.
IEEE International Conference on Data Mining (ICDM), 2015 slides