CS321M

AI Measurement Science

Computer Science ENGR - School of Engineering

Course Description

Artificial Intelligence (AI) measurement science provides frameworks and methodologies for evaluating, benchmarking, and understanding AI systems. As AI systems become increasingly powerful and are deployed in high-stakes domains, the need for rigorous measurement approaches has grown. Current measurement approaches are often ad hoc, lack theoretical grounding, and fail to connect to real-world use cases. This has led to a measurement crisis characterized by benchmark saturation, inconsistent evaluation methodologies, and difficulty in making valid claims about AI capabilities. This course covers the foundations of AI measurement science from first principles and outlines connections to the growing literature on the topic. Topics include: validity theory as applied to AI evaluation, focusing on content, criterion, construct, external, and consequential validity; psychometric models for AI measurement, including item response theory and latent variable models; scaling laws and intervention effects, predicting the impacts of data, compute, and architecture choices; synthetic data generation for evaluation and its implications; and governance and policy considerations around AI measurement. This is a graduate-level course. By the end of the course, students should be able to understand, implement, and critique state-of-the-art AI measurement approaches and be ready to conduct research on these topics.

Grading Basis

ROP - Letter or Credit/No Credit

Units

Min

Max

Course Repeatable for Degree Credit?

Components

Course Component

Lecture

Enrollment Optional?

Programs

CS321M is a completion requirement for:

(from the following course set: )