
CS246

Mining Massive Data Sets

Computer Science ENGR - School of Engineering

Course Description

The availability of massive datasets is revolutionizing science and industry. This course covers data mining and machine learning algorithms for analyzing very large amounts of data. Topics include: big data systems (Hadoop, Spark); link analysis (PageRank, spam detection); similarity search (locality-sensitive hashing, shingling, min-hashing); stream data processing; recommender systems; analysis of social-network graphs; association rules; dimensionality reduction (UV, SVD, and CUR decompositions); algorithms for large-scale mining (clustering, nearest-neighbor search); large-scale machine learning (decision tree ensembles); multi-armed bandits; computational advertising. Prerequisites: at least one of CS107 or CS145.

Grading Basis

ROP - Letter or Credit/No Credit

Min Units

3

Max Units

4

Course Repeatable for Degree Credit?

No

Course Component

Lecture

Enrollment Optional?

No

This course has been approved for the following WAYS

Formal Reasoning (FR)

Does this course satisfy the University Language Requirement?

No

Programs

CS246 is a completion requirement for:
  • (from the following course set: )