Publications

2022

  • How would the viewer feel? estimating wellbeing from video scenarios. Mantas Mazeika, Eric Tang, Andy Zou, Steven Basart, Jun Shern Chan, Dawn Song, David Forsyth, Jacob Steinhardt, Dan Hendrycks. Advances in Neural Information Processing Systems (NeurIPS), 2022. [bib] [paper]
  • Capturing failures of large language models via human cognitive biases. Erik Jones, Jacob Steinhardt. Advances in Neural Information Processing Systems (NeurIPS), 2022. [bib] [paper]
  • A3D: studying pretrained representations with programmable datasets. Ye Wang, Norman Mu, Daniele Grandi, Nicolas Savva, Jacob Steinhardt. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2022. [bib] [paper]
  • Forecasting future world events with neural networks. Andy Zou, Tristan Xiao, Ryan Jia, Joe Kwon, Mantas Mazeika, Richard Li, Dawn Song, Jacob Steinhardt, Owain Evans, Dan Hendrycks. Advances in Neural Information Processing Systems (NeurIPS), 2022. [bib] [paper]
  • Auditing visualizations: transparency methods struggle to detect anomalous behavior. Jean-Stanislas Denain, Jacob Steinhardt. arXiv, 2022. [bib] [paper]
  • Supply-side equilibria in recommender systems. Meena Jagadeesan, Nikhil Garg, Jacob Steinhardt. arXiv, 2022. [bib] [paper]
  • More than a toy: random matrix models predict how real-world neural representations generalize. Alexander Wei, Wei Hu, Jacob Steinhardt. International Conference on Machine Learning (ICML), 2022. [bib] [paper]
  • Predicting out-of-distribution error with the projection norm. Yaodong Yu, Zitong Yang, Alexander Wei, Yi Ma, Jacob Steinhardt. International Conference on Machine Learning (ICML), 2022. [bib] [paper]
  • Describing differences between text distributions with natural language. Ruiqi Zhong, Charlie Snell, Dan Klein, Jacob Steinhardt. International Conference on Machine Learning (ICML), 2022. [bib] [paper]
  • The effects of reward misspecification: mapping and mitigating misaligned models. Alexander Pan, Kush Bhatia, Jacob Steinhardt. International Conference on Learning Representations (ICLR), 2022. [bib] [paper]
  • PixMix: dreamlike pictures comprehensively improve safety measures. Dan Hendrycks, Andy Zou, Mantas Mazeika, Leonard Tang, Dawn Song, Jacob Steinhardt. Computer Vision and Pattern Recognition (CVPR), 2022. [bib] [paper] [code]
  • Stronger data poisoning attacks break data sanitization defenses. Pang Wei Koh, Jacob Steinhardt, Percy Liang. Machine Learning, 2022. [bib] [paper] [code]
  • Scaling out-of-distribution detection for real-world settings. Dan Hendrycks, Steven Basart, Mantas Mazeika, Mohammadreza Mostajabi, Jacob Steinhardt, Dawn Song. International Conference on Machine Learning (ICML), 2022. [bib] [paper] [data]

2021

    2020

    2019

    2018

    2017

    2016

    2015

    2014

    2012

    2011

    2010

    2009

    • On coloring the odd-distance graph. Jacob Steinhardt. Electronic Journal of Combinatorics, 2009. [bib] [paper]

    2007