Evaluating Large Language Models: Challenges and Methods
- Room 119B, Philadelphia Convention Center, PA, USA
- 2-6pm, February 25, 2025

Presenters

- Jindong Wang, William & Mary, jwang80 (at) wm.edu
- Kaijie Zhu, UC Santa Barbara, kaijiezhu (at) ucsb.edu
- Linyi Yang, Westlake University, yanglinyi (at) westlake.edu.cn
- Yue Feng, University of Birmingham, yueyuef (at) outlook.com
- Yue Zhang, Westlake University, zhangyue (at) westlake.edu.cn

Contents at a glance

Slides are available for download: Google Drive
Partial video recording

- Background and challenges
- Evaluation approaches
- Adversarial and OOD robustness
- Dynamic evaluation
- Reasoning
- Safety
- User evaluation
- Evaluation in social science
- Conclusion

Acknowledgement
