# STAT260 - Robust Statistics

Fall 2019

**Instructor:** Jacob Steinhardt (jsteinhardt@berkeley)

**Lectures:** T/Th 12:30-2 (Evans 332)

**Office Hours:** F 11-12 (Evans 325)

Syllabus: link

**IMPORTANT:** If you plan to take the class, sign up here to be added to the class mailing list. Students who don’t sign up by the end of the second week of instruction may be dropped from the class.

### Prerequisites

No formal requirements, but this class will be fast-paced and assume mathematical maturity.

### Lecture Notes

Updated periodically: link (last update 11/21/2019)

Please e-mail typos/corrections to me (jsteinhardt@berkeley with a dot edu at the end).

See here if you prefer notes broken up on a per-lecture basis.

### Problem Sets

Problem Set 1 (due September 19th in class; corrected 09/12/2019) tex source

Problem Set 2 (due October 8th in class) tex source

Problem Set 3 (due October 24th in class) tex source

Problem Set 4 (due November 12th in class) tex source

Problem Set 5 (due December 11th via e-mail) tex source

### Schedule

Lecture 1: Overview and 1D Robust Estimation (Feedback form)

Lecture 2: Minimum Distance Functionals and Resilience (Feedback form)

Lecture 3: Concentration Inequalities (Feedback form)

Lecture 4: Bounding Suprema via Concentration Inequalities (Feedback form)

Lecture 5: Finite-Sample Analysis via Generalized KS Distance (Feedback form)

Lecture 6: Finite-Sample Analysis via Expanding the Destination Set (Feedback form)

Lecture 7: Truncated Moments and Ledoux-Talagrand (Feedback form)

Lecture 8: Efficient Algorithms: Projecting onto Maximum Eigenvector (Feedback form)

Lecture 9: Approximation Oracles and Grothendieck’s Inequality (Feedback form)

Lecture 10: Semidefinite Programming and Sum-of-Squares (Feedback form)

Lecture 11: Sum-of-Squares and Poincaré Inequality (Feedback form)

Lecture 12: Resilience Beyond Mean Estimation (Feedback form)

Lecture 13: Resilience for Linear Regression

Lecture 14: Efficient Algorithms for Linear Regression

Lecture 15: Resilience for Wasserstein Distances

Lecture 16: Wasserstein Resilience for Moment Estimation and Linear Regression

Lecture 17: Test-Time Robustness

Lecture 18: Adversarial Training and Certified Robustness

Lecture 19: Randomized Smoothing

Lecture 20: Domain Adaptation under Covariate Shift

Lecture 21: Doubly-Robust Estimators and Semi-Parametric Estimation

Lecture 22: Partial Specification for Linear Regression

Lecture 23: Partial Specification and Agnostic Clustering

Lecture 24: Agnostic Clustering via Resilience

Lecture 25: Efficient Clustering via SVD + k-means

Lecture 26: Transfer Learning via Dimensionality Reduction

Lecture 27: TBD

Feedback form for Lecture 13 onwards.

### Supplementary Reading List

Jerry Li is teaching a class on similar topics.

Robust Learning: Information Theory and Algorithms (Jacob Steinhardt’s thesis)

Concentration of Measure (lecture notes by Terence Tao)

Alternate reference: Concentration Inequalities (notes by Boucheron, Lugosi, and Bousquet)

Generalized Resilience and Robust Statistics (Zhu, Jiao, Steinhardt)

Principled Approaches to Robust Machine Learning and Beyond (Jerry Li’s thesis)

Probability Bounds (John Duchi; contains exposition on Ledoux-Talagrand)

Approximating the Cut-Norm via Grothendieck’s Inequality (Alon and Naor)

Better Agnostic Clustering via Relaxed Tensor Norms (Kothari and Steinhardt)

Ricci curvature of Markov chains on metric spaces (Ollivier; relation between Poincaré inequalities and Markov chain convergence)

*Concentration inequalities: A nonasymptotic theory of independence* (Boucheron, Lugosi, and Massart; good general survey that contains discussion of Poincaré inequalities)

Provable Defenses against Adversarial Examples via the Convex Outer Adversarial Polytope (Eric Wong and Zico Kolter)

Training Verified Learners with Learned Verifiers (Krishnamurthy Dvijotham et al.)

Semidefinite relaxations for certifying robustness to adversarial examples (Aditi Raghunathan et al.)