About me

Welcome to Wei-Lin Chiang’s page!

News

  • [2024.04] New blog: Arena Hard – a pipeline for building LLM benchmarks from live data.
  • [2024.04] I’ll be giving a talk at the Berkeley LLM Meetup.
  • [2024.03] We’ve published the technical report of Chatbot Arena.

Projects

  • Chatbot Arena: a live platform for evaluating LLMs by human preference
    Our platform has served millions of users and gathered over 700K user votes. Our LLM leaderboard has been widely cited by AI leaders and researchers, including Jeff Dean, Andrej Karpathy, and Greg Brockman, and in the Stanford HAI annual report
    | Paper | Blog | Website |
  • FastChat: a multi-model serving system for large language models
    FastChat is the open-source system powering Chatbot Arena and has gained a strong developer community (over 30K GitHub stars and 200+ contributors)
  • LLM Judge: model-based evaluation for LLM chatbots
    Our LLM benchmark MT-Bench has been widely adopted by leading model developers (e.g., Mistral, HuggingFace, Databricks) and was recently upgraded to Arena Hard
    | Paper | Code |
  • Vicuna: one of the first open LLMs demonstrating multi-turn ChatGPT capability
    The model has received over 5 million downloads and 1000+ citations
    | Blog | Weights |
  • SkyPilot: an intercloud system for running AI and batch jobs on any cloud
    Supports 10+ major clouds; adopted by leading AI startups such as Mistral and Covariant
    | GitHub | Paper |
  • Cluster-GCN: one of the first scalable methods for training large and deep GNNs
    Our method has been widely adopted in academia and industry (see DGL, PyTorch Geometric integration, Stanford CS224W)
    | GitHub | Paper |

Work Experience

  • Intern@Amazon, Seattle (May 2021 - Aug. 2021)
    Contrastive learning for information extraction on semi-structured webpages
  • Intern@Google Research, Mountain View (Dec. 2018 - Mar. 2019)
    Efficient algorithms for training large and deep GCN models.
    Cluster-GCN paper, code
  • Intern@Alibaba Group, Hangzhou (July 2017 - Sept. 2017)
    Distributed ML algorithms on Alibaba’s parameter server (KunPeng)
  • Intern@Microsoft Research Asia, Beijing (Dec. 2016 - Feb. 2017)
    Distributed training for deep learning frameworks
  • Intern@Microsoft, Redmond (July 2016 - Oct. 2016)
    Large-scale ML algorithms on Microsoft’s distributed platform (REEF)

Publications (full list on Google Scholar)

  1. Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference
    Wei-Lin Chiang*, Lianmin Zheng*, Ying Sheng, Anastasios Nikolas Angelopoulos, Tianle Li, Dacheng Li, Hao Zhang, Banghua Zhu, Michael Jordan, Joseph E. Gonzalez, Ion Stoica (*equal contribution)
    arXiv preprint
  2. LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset
    Lianmin Zheng*, Wei-Lin Chiang*, Ying Sheng, Tianle Li, Siyuan Zhuang, Zhanghao Wu, Yonghao Zhuang, Zhuohan Li, Zi Lin, Eric P. Xing, Joseph E. Gonzalez, Ion Stoica, Hao Zhang (*equal contribution)
    ICLR 2024
  3. LLM-Assisted Code Cleaning for Training Accurate Code Generators
    Naman Jain, Tianjun Zhang, Wei-Lin Chiang, Joseph E. Gonzalez, Koushik Sen, Ion Stoica
    ICLR 2024
  4. Rethinking Benchmark and Contamination for Language Models with Rephrased Samples
    Shuo Yang*, Wei-Lin Chiang*, Lianmin Zheng*, Joseph E. Gonzalez, Ion Stoica (*equal contribution)
    arXiv preprint
  5. Judging LLM-as-a-judge with MT-Bench and Chatbot Arena
    Lianmin Zheng*, Wei-Lin Chiang*, Ying Sheng*, Siyuan Zhuang, Zhanghao Wu, Yonghao Zhuang, Zi Lin, Zhuohan Li, Dacheng Li, Eric Xing, Hao Zhang, Joseph Gonzalez, Ion Stoica (*equal contribution)
    NeurIPS 2023 Datasets and Benchmarks Track
  6. Vicuna: An Open-Source Chatbot Impressing GPT-4 with 90%* ChatGPT Quality
    Wei-Lin Chiang, Zhuohan Li, Zi Lin, Ying Sheng, Zhanghao Wu, Hao Zhang, Lianmin Zheng, Siyuan Zhuang, Yonghao Zhuang, Joseph Gonzalez, Ion Stoica, Eric Xing (alphabetical order)
    Blog post, model weights
  7. Can’t Be Late: Optimizing Spot Instance Savings under Deadlines
    Zhanghao Wu, Wei-Lin Chiang, Zongheng Yang, Eric Friedman, Scott Shenker, Ion Stoica.
    NSDI 2024 (Outstanding Paper Award)
  8. SkyPilot: An Intercloud Broker for Sky Computing
    Zongheng Yang, Zhanghao Wu, Michael Luo, Wei-Lin Chiang, Romil Bhardwaj, Woosuk Kwon, Siyuan Zhuang, Frank Sifei Luan, Gautam Mittal, Scott Shenker, Ion Stoica
    USENIX NSDI 2023
  9. Balsa: Learning a Query Optimizer Without Expert Demonstrations
    Zongheng Yang, Wei-Lin Chiang+, Sifei Luan+, Gautam Mittal, Michael Luo, Ion Stoica. (+ equal contribution)
    ACM SIGMOD 2022
  10. Manifold Identification for Ultimately Communication-Efficient Distributed Optimization
    Yu-Sheng Li, Wei-Lin Chiang, and Ching-pei Lee.
    International Conference on Machine Learning (ICML), 2020
  11. Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks [code, dataset (Amazon2M)]
    Wei-Lin Chiang, Xuanqing Liu, Si Si, Yang Li, Samy Bengio, and Cho-Jui Hsieh.
    ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD), 2019 (Oral) slides, poster
  12. Preconditioned Conjugate Gradient Methods in Truncated Newton Frameworks for Large-scale Linear Classification [supplement & code. Implementation available in LIBLINEAR after version 2.20.]
    Chih-Yang Hsia, Wei-Lin Chiang, and Chih-Jen Lin.
    Asian Conference on Machine Learning (ACML), 2018 (Best paper award) slides, poster
  13. Limited-memory Common-directions Method for Distributed L1-regularized Linear Classification [supplement & code. Implementation available in Distributed LIBLINEAR.]
    Wei-Lin Chiang, Yu-Sheng Li, Ching-pei Lee, and Chih-Jen Lin.
    SIAM International Conference on Data Mining (SDM), 2018 slides, poster
  14. Parallel Dual Coordinate Descent Method for Large-scale Linear Classification in Multi-core Environments [supplement, code. Implementation available in Multi-core LIBLINEAR.]
    Wei-Lin Chiang, Mu-Chu Lee, and Chih-Jen Lin.
    ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD), 2016 poster
  15. Fast Matrix-vector Multiplications for Large-scale Logistic Regression on Shared-memory Systems [supplement, code. Implementation available in Multi-core LIBLINEAR.]
    Mu-Chu Lee, Wei-Lin Chiang, and Chih-Jen Lin.
    IEEE International Conference on Data Mining (ICDM), 2015 slides