STAT260 - Robust Statistics

Fall 2019

Instructor: Jacob Steinhardt (jsteinhardt@berkeley)
Lectures: T/Th 12:30-2 (Evans 332)
Office Hours: F 11-12 (Evans 325)

Syllabus: link

IMPORTANT: If you plan to take the class, sign up here to be added to the class mailing list. Students who don’t sign up by the end of the second week of instruction may be dropped from the class.

Prerequisites

There are no formal prerequisites, but this class will be fast-paced and will assume mathematical maturity.

Lecture Notes

Updated periodically: link (last updated 11/21/2019)

Please e-mail typos/corrections to me (jsteinhardt@berkeley with a dot edu at the end).

See here if you prefer notes broken up on a per-lecture basis.

Problem Sets

Problem Set 1 (due September 19th in class; corrected 09/12/2019) tex source
Problem Set 2 (due October 8th in class) tex source
Problem Set 3 (due October 24th in class) tex source
Problem Set 4 (due November 12th in class) tex source
Problem Set 5 (due December 11th via e-mail) tex source

Schedule

Lecture 1: Overview and 1D Robust Estimation (Feedback form)
Lecture 2: Minimum Distance Functionals and Resilience (Feedback form)
Lecture 3: Concentration Inequalities (Feedback form)
Lecture 4: Bounding Suprema via Concentration Inequalities (Feedback form)
Lecture 5: Finite-Sample Analysis via Generalized KS Distance (Feedback form)
Lecture 6: Finite-Sample Analysis via Expanding the Destination Set (Feedback form)
Lecture 7: Truncated Moments and Ledoux-Talagrand (Feedback form)
Lecture 8: Efficient Algorithms: Projecting onto Maximum Eigenvector (Feedback form)
Lecture 9: Approximation Oracles and Grothendieck’s Inequality (Feedback form)
Lecture 10: Semidefinite Programming and Sum-of-Squares (Feedback form)
Lecture 11: Sum-of-Squares and Poincaré Inequality (Feedback form)
Lecture 12: Resilience Beyond Mean Estimation (Feedback form)
Lecture 13: Resilience For Linear Regression
Lecture 14: Efficient Algorithms for Linear Regression
Lecture 15: Resilience for Wasserstein Distances
Lecture 16: Wasserstein Resilience for Moment Estimation and Linear Regression
Lecture 17: Test-Time Robustness
Lecture 18: Adversarial Training and Certified Robustness
Lecture 19: Randomized Smoothing
Lecture 20: Domain Adaptation under Covariate Shift
Lecture 21: Doubly-Robust Estimators and Semi-Parametric Estimation
Lecture 22: Partial Specification for Linear Regression
Lecture 23: Partial Specification and Agnostic Clustering
Lecture 24: Agnostic Clustering via Resilience
Lecture 25: Efficient Clustering via SVD + k-means
Lecture 26: Transfer Learning via Dimensionality Reduction
Lecture 27: TBD

Feedback form for Lecture 13 onwards.

Supplementary Reading List

Jerry Li is teaching a class on similar topics.

Robust Learning: Information Theory and Algorithms (Jacob Steinhardt’s thesis)
Concentration of Measure (lecture notes by Terence Tao)
Alternate reference: Concentration Inequalities (notes by Boucheron, Lugosi, and Bousquet)
Generalized Resilience and Robust Statistics (Zhu, Jiao, Steinhardt)
Principled Approaches to Robust Machine Learning and Beyond (Jerry Li’s thesis)
Probability Bounds (John Duchi; contains exposition on Ledoux-Talagrand)
Approximating the Cut-Norm via Grothendieck’s Inequality (Alon and Naor)
Better Agnostic Clustering via Relaxed Tensor Norms (Kothari and Steinhardt)
Ricci curvature of Markov chains on metric spaces (Ollivier; relation between Poincaré inequalities and Markov chain convergence)
Concentration Inequalities: A Nonasymptotic Theory of Independence (Boucheron, Lugosi, and Massart; good general survey that contains discussion of Poincaré inequalities)
Provable Defenses against Adversarial Examples via the Convex Outer Adversarial Polytope (Eric Wong and Zico Kolter)
Training Verified Learners with Learned Verifiers (Krishnamurthy Dvijotham et al.)
Semidefinite relaxations for certifying robustness to adversarial examples (Aditi Raghunathan et al.)