CS321M
AI Measurement Science
Course Description
Artificial Intelligence (AI) measurement science provides frameworks and methodologies for evaluating, benchmarking, and understanding AI systems. As AI systems become increasingly powerful and are deployed in high-stakes domains, the need for rigorous measurement has grown. Current measurement approaches are often ad hoc, lack theoretical grounding, and fail to connect to real-world use cases. This has led to a measurement crisis characterized by benchmark saturation, inconsistent evaluation methodologies, and difficulty in making valid claims about AI capabilities. This course covers the foundations of AI measurement science from first principles and outlines connections to the growing literature on the topic. Topics include: validity theory as applied to AI evaluation, focusing on content, criterion, construct, external, and consequential validity; psychometric models for AI measurement, including item response theory and latent variable models; scaling laws and intervention effects, including predicting the impacts of data, compute, and architecture choices; synthetic data generation for evaluation and its implications; and governance and policy considerations around AI measurement. This is a graduate-level course. By the end of the course, students should be able to understand, implement, and critique state-of-the-art AI measurement approaches and be ready to conduct research on these topics.
Grading Basis
ROP - Letter or Credit/No Credit
Min
3
Max
3
Course Repeatable for Degree Credit?
No
Course Component
Lecture
Enrollment Optional?
No
Programs
CS321M is a completion requirement for:
- (from the following course set: )
- (from the following course set: )
- (from the following course set: )