Jacob Steinhardt

Assistant Professor

Department of Statistics

UC Berkeley

I am an Assistant Professor of Statistics and EECS at UC Berkeley, where I am part of BAIR and CLIMB. I am also Founder & CEO of Transluce, a non-profit research lab building open, scalable technology for understanding frontier AI systems.

My research focuses on ensuring machine learning systems are understood by and aligned with humans. The basic problem is that ML models are complex systems that often produce unintended consequences. For instance, ML systems tend to exploit errors in the reward function, leading to unintended behavior that often gets worse as models get bigger. The problem compounds once ML systems interact with each other or with humans, which can lead to strategic incentives and other intrasystem goals.

To tackle this problem, one approach is to understand not just the outputs of neural networks but also their latent activations, which represent the computational process used to generate outputs. By understanding this process, we can hopefully modify it to be more aligned with human intent.

Another approach is to enable humans to better understand complex systems. We have built several systems that consume large datasets and summarize their properties in natural language. More generally, ML models could help humans with important but difficult tasks such as understanding the long-term consequences of an action, automatically discovering failures in an ML or computer system, or predicting future world events.

I seek students who are technically strong, broad-minded, and want to improve the world through their research. I particularly value creative thinkers and curious empiricists who are excited to chart new approaches to the field.

As a graduate student, I was very fortunate to be advised by Percy Liang. During my post-doc year, I worked at OpenAI and Open Philanthropy. I like ultimate frisbee, powerlifting, and indoor bouldering.

Current Ph.D. students and post-docs

I am also fortunate to collaborate with many students whom I do not directly advise, as can be seen from my publications page.

Former Ph.D. students and post-docs

Essays

For more recent writing, see my blog.

  • AI Alignment Research Overview (October 2019) link
  • Research as a Stochastic Decision Process (December 2018) link
  • Long-Term and Short-Term Challenges to Ensuring the Safety of AI Systems (June 2015) link
  • The Power of Noise (June 2014) link
  • A Fervent Defense of Frequentist Statistics (February 2014) link
  • Beyond Bayesians and Frequentists (October 2012) link