About me
Welcome to Wei-Lin Chiang’s page!
- I am a PhD student at UC Berkeley SkyLab, working with Prof. Ion Stoica.
- My research focuses on building evaluation systems for AI. I’m currently working on the Chatbot Arena project, a crowdsourced AI evaluation platform at LMSYS.org.
- Our work on Chatbot Arena, Vicuna, and FastChat has been recognized with an a16z Open Source AI Grant. Check out our blogs at LMSYS.org or follow our updates on X.
News
- [2024.09] Launched: RedTeam Arena
- [2024.08] New blog post: decoupling style and substance in Chatbot Arena
- [2024.06] Launched: Multimodal Arena
- [2024.05] We hosted a Kaggle competition for human preference prediction
- [2024.04] Arena-Hard and BenchBuilder: a data curation pipeline for LLM benchmarks
- [2024.03] Released: technical report on Chatbot Arena
Projects
- Chatbot Arena: A Crowdsourced AI Evaluation Platform
Our website has served millions of users, collecting over one million user votes for the leaderboard. We are honored to be recognized by industry leaders and researchers including Jeff Dean, Andrej Karpathy, and Greg Brockman.
| Paper | Blog | Website |
- LLM Judge: Automating LLM Evaluation
We are developing automated evaluation methods for LLMs, such as the MT-Bench and Arena-Hard benchmarks.
| Paper | Code |
- FastChat: Multi-Model Serving Framework
FastChat is an open-source system powering Chatbot Arena and has gained a strong developer community (over 30K GitHub stars and 200+ contributors).
- Vicuna: High-Quality LLM Chatbot
Vicuna has been downloaded over 8 million times with 1000+ citations.
| Blog | Weights |
- SkyPilot: An Intercloud System for AI and Batch Jobs
| GitHub | Paper |
- Cluster-GCN: Scalable Training for Large GNNs
Widely integrated into platforms such as DGL and PyTorch Geometric.
| GitHub | Paper |
Work Experience
- Intern@Amazon, Seattle (May 2021 - Aug. 2021)
Contrastive learning for information extraction on semi-structured webpages
- Intern@Google Research, Mountain View (Dec. 2018 - Mar. 2019)
Developed algorithms for training large and deep GCN models.
Cluster-GCN paper, code
- Intern@Alibaba Group, Hangzhou (July 2017 - Sept. 2017)
Distributed ML algorithms on Alibaba’s parameter server (KunPeng)
- Intern@Microsoft Research Asia, Beijing (Dec. 2016 - Feb. 2017)
Distributed training for deep learning frameworks
- Intern@Microsoft, Redmond (July 2016 - Oct. 2016)
Large-scale ML algorithms on Microsoft’s distributed platform (REEF)
Publications (full list on Google Scholar)
- Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference
Wei-Lin Chiang*, Lianmin Zheng*, Ying Sheng, Anastasios Nikolas Angelopoulos, Tianle Li, Dacheng Li, Hao Zhang, Banghua Zhu, Michael Jordan, Joseph E. Gonzalez, Ion Stoica (*equal contribution)
arXiv preprint
- LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset
Lianmin Zheng*, Wei-Lin Chiang*, Ying Sheng, Tianle Li, Siyuan Zhuang, Zhanghao Wu, Yonghao Zhuang, Zhuohan Li, Zi Lin, Eric P. Xing, Joseph E. Gonzalez, Ion Stoica, Hao Zhang (*equal contribution)
ICLR 2024
- LLM-Assisted Code Cleaning for Training Accurate Code Generators
Naman Jain, Tianjun Zhang, Wei-Lin Chiang, Joseph E. Gonzalez, Koushik Sen, Ion Stoica
ICLR 2024
- Rethinking Benchmark and Contamination for Language Models with Rephrased Samples
Shuo Yang*, Wei-Lin Chiang*, Lianmin Zheng*, Joseph E. Gonzalez, Ion Stoica (*equal contribution)
arXiv preprint
- Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
Lianmin Zheng*, Wei-Lin Chiang*, Ying Sheng*, Siyuan Zhuang, Zhanghao Wu, Yonghao Zhuang, Zi Lin, Zhuohan Li, Dacheng Li, Eric Xing, Hao Zhang, Joseph Gonzalez, Ion Stoica (*equal contribution)
NeurIPS 2023 Datasets and Benchmarks Track
- Vicuna: An Open-Source Chatbot Impressing GPT-4 with 90%* ChatGPT Quality
Wei-Lin Chiang, Zhuohan Li, Zi Lin, Ying Sheng, Zhanghao Wu, Hao Zhang, Lianmin Zheng, Siyuan Zhuang, Yonghao Zhuang, Joseph Gonzalez, Ion Stoica, Eric Xing (alphabetical order)
Blog post, model weights
- Can’t Be Late: Optimizing Spot Instance Savings under Deadlines
Zhanghao Wu, Wei-Lin Chiang, Zongheng Yang, Eric Friedman, Scott Shenker, Ion Stoica.
NSDI 2024 (Outstanding Paper Award)
- SkyPilot: An Intercloud Broker for Sky Computing
Zongheng Yang, Zhanghao Wu, Michael Luo, Wei-Lin Chiang, Romil Bhardwaj, Woosuk Kwon, Siyuan Zhuang, Frank Sifei Luan, Gautam Mittal, Scott Shenker, Ion Stoica
USENIX NSDI 2023
- Balsa: Learning a Query Optimizer Without Expert Demonstrations
Zongheng Yang, Wei-Lin Chiang+, Sifei Luan+, Gautam Mittal, Michael Luo, Ion Stoica. (+ equal contribution)
ACM SIGMOD 2022
- Manifold Identification for Ultimately Communication-Efficient Distributed Optimization
Yu-Sheng Li, Wei-Lin Chiang, and Ching-pei Lee.
International Conference on Machine Learning (ICML), 2020
- Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks [code, dataset (Amazon2M)]
Wei-Lin Chiang, Xuanqing Liu, Si Si, Yang Li, Samy Bengio, and Cho-Jui Hsieh.
ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD), 2019 (Oral) slides, poster
- Preconditioned Conjugate Gradient Methods in Truncated Newton Frameworks for Large-scale Linear Classification [supplement & code. Implementation available in LIBLINEAR after version 2.20.]
Chih-Yang Hsia, Wei-Lin Chiang, and Chih-Jen Lin.
Asian Conference on Machine Learning (ACML), 2018 (Best Paper Award) slides, poster
- Limited-memory Common-directions Method for Distributed L1-regularized Linear Classification [supplement & code. Implementation available in Distributed LIBLINEAR.]
Wei-Lin Chiang, Yu-Sheng Li, Ching-pei Lee, and Chih-Jen Lin.
SIAM International Conference on Data Mining (SDM), 2018 slides, poster
- Parallel Dual Coordinate Descent Method for Large-scale Linear Classification in Multi-core Environments [supplement, code. Implementation available in Multi-core LIBLINEAR.]
Wei-Lin Chiang, Mu-Chu Lee, and Chih-Jen Lin.
ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD), 2016 poster
- Fast Matrix-vector Multiplications for Large-scale Logistic Regression on Shared-memory Systems [supplement, code. Implementation available in Multi-core LIBLINEAR.]
Mu-Chu Lee, Wei-Lin Chiang, and Chih-Jen Lin.
IEEE International Conference on Data Mining (ICDM), 2015 slides