# Blog

## New Blog Location

Published:

I’ve switched the host of my blog to Ghost, so my blog is now located at https://bounded-regret.ghost.io/.

Feedly users can subscribe via RSS here. Or click the bottom-right subscribe button on this page to get e-mail notifications of new posts.

## How Much Do Recommender Systems Drive Polarization?

Published:

Polarization caused by social media is seen by many as an important societal problem, which also overlaps with AI alignment (since social media recommendations come from ML algorithms). I have personally begun directing some of my research to recommender alignment, which has gotten me curious about the extent to which polarization is actually driven by social media. This blog post is the first in a series that summarizes my current take-aways. I’ll start (in this post) by looking at aggregate trends in polarization, then connect them with micro-level data on Facebook feeds in later posts.

I started out feeling that most polarization probably comes from social media. As I read more, my views have shifted: I think there’s pretty good evidence that other sources, including cable news, have historically driven a lot of polarization (see DellaVigna and Kaplan (2006) and Martin and Yurukoglu (2017)), and that we would be highly polarized even without social media. In addition, most readers of this post (and myself) are “extremely online”, and probably intuitively overestimate the impact of social media on a typical American. However, it is possible that social media has further accelerated polarization to an important degree, but the data are too noisy to provide strong evidence either way.

## Economic AI Safety

Published:

There is a growing fear that algorithmic recommender systems, such as Facebook, Youtube, Netflix, and Amazon, are having negative effects on society, for instance by manipulating users into behaviors that they wouldn’t endorse (e.g. getting them addicted to feeds, leading them to form polarized opinions, recommending false but convincing content).

Some common responses to this fear are to advocate for privacy or to ask that users have greater agency over what they see. I argue below that neither of these will solve the problem, and that the problem may get far worse in the future. I then speculate on alternative solutions based on audits and “information marketplaces”, and discuss their limitations.

## Film Study for Research

Published:

Research ability, like most tasks, is a trainable skill. However, while PhD students and other researchers spend a lot of time doing research, we often don’t spend enough time training our research abilities in order to improve. For many researchers, aside from taking classes and reading papers, most of our training is implicit, through doing research and interacting with mentors (usually a single mentor–our PhD advisor or research manager). By analogy, we are like basketball players who somehow made it to the NBA, and are now hoping that simply playing basketball games will be enough to keep improving.

Drawing on this analogy, I want to talk about two habits that are ubiquitous among elite athletes, that have analogs in research that I feel are underutilized. Those who do pursue these habits as PhD students often improve quickly as researchers.

## Donations for 2019/2020

Published:

Each year I aim to donate around 10% of my income. In 2019, I fell behind on this, probably due to the chaos of COVID-19 (but really this was just an embarassing logistical failure on my part). I’ve recently, finally, finished processing donations for 2019 and 2020. In this post I write about my decisions, in case they are useful to others; see also here for a past write-up from 2016.

## Measurement, Optimization, and Take-off Speed

Published:

In machine learning, we are obsessed with datasets and metrics: progress in areas as diverse as natural language understanding, object recognition, and reinforcement learning is tracked by numerical scores on agreed-upon benchmarks. Despite this, I think we focus too little on measurement—that is, on ways of extracting data from machine learning models that bears upon important hypotheses. This might sound paradoxical, since benchmarks are after all one way of measuring a model. However, benchmarks are a very narrow form of measurement, and I will argue below that trying to measure pretty much anything you can think of is a good mental move that is heavily underutilized in machine learning. I’ll argue this in three ways:

1. Historically, more measurement has almost always been a great move, not only in science but also in engineering and policymaking.
2. Philosophically, measurement has many good properties that bear upon important questions in ML.
3. In my own research, just measuring something and seeing what happened has often been surprisingly fruitful.

## Sets with Small Intersection

Published:

Suppose that we want to construct subsets $S_1, \ldots, S_m \subseteq \{1,\ldots,n\}$ with the following properties:

1.  $S_i \geq k$ for all $i$
2.  $S_i \cap S_j \leq 1$ for all $i \neq j$

The goal is to construct as large a family of such subsets as possible (i.e., to make $m$ as large as possible). If $k \geq 2\sqrt{n}$, then up to constants it is not hard to show that the optimal number of sets is $\frac{n}{k}$ (that is, the trivial construction with all sets disjoint is essentially the best we can do).

Here I am interested in the case when $k \ll \sqrt{n}$. In this case I claim that we can substantially outperform the trivial construction: we can take $m = \Omega(n^2 / k^3)$. The proof is a very nice application of the asymmetric Lovasz Local Lemma. (Readers can refresh their memory here on what the asymmetric LLL says.)

Published:

I’ve spent much of the last few days reading various ICML papers and I find there’s a few pieces of feedback that I give consistently across several papers. I’ve collated some of these below. As a general note, many of these are about local style rather than global structure; I think that good local style probably contributes substantially more to readability than global structure and is in general under-rated. I’m in general pretty willing to break rules about global structure (such as even having a conclusion section in the first place! though this might cause reviewers to look at your paper funny), but not to break local stylistic rules without strong reasons.

• Be precise. This isn’t about being pedantic, but about maximizing information content. Choose your words carefully so that you say what you mean to say. For instance, replace “performance” with “accuracy” or “speed” depending on what you mean.
• Be concise. Most of us write in an overly wordy style, because it’s easy to and no one drilled it out of us. Not only does wordiness decrease readability, it wastes precious space if you have a page limit.
• Avoid complex sentence structure. Most research is already difficult to understand and digest; there’s no reason to make it harder by having complex run-on sentences.
• Use consistent phrasing. In general prose, we’re often told to refer to the same thing in different ways to avoid boring the reader, but in technical writing this will lead to confusion. Hopefully your actual results are interesting enough that the reader doesn’t need to be entertained by your large vocabulary.

## Model Mis-specification and Inverse Reinforcement Learning

Published:

In my previous post, “Latent Variables and Model Mis-specification”, I argued that while machine learning is good at optimizing accuracy on observed signals, it has less to say about correctly inferring the values for unobserved variables in a model. In this post I’d like to focus in on a specific context for this: inverse reinforcement learning (Ng et al. 2000, Abeel et al. 2004, Ziebart et al. 2008, Ho et al 2016), where one observes the actions of an agent and wants to infer the preferences and beliefs that led to those actions. For this post, I am pleased to be joined by Owain Evans, who is an active researcher in this area and has co-authored an online book about building models of agents (see here in particular for a tutorial on inverse reinforcement learning and inverse planning).

Owain and I are particularly interested in inverse reinforcement learning (IRL) because it has been proposed (most notably by Stuart Russell) as a method for learning human values in the context of AI safety; among other things, this would eventually involve learning and correctly implementing human values by artificial agents that are much more powerful, and act with much broader scope, than any humans alive today. While we think that overall IRL is a promising route to consider, we believe that there are also a number of non-obvious pitfalls related to performing IRL with a mis-specified model. The role of IRL in AI safety is to infer human values, which are represented by a reward function or utility function. But crucially, human values (or human reward functions) are never directly observed.

## Linear algebra fact

Published:

Here is interesting linear algebra fact: let $A$ be an $n \times n$ matrix and $u$ be a vector such that $u^{\top}A = \lambda u^{\top}$. Then for any matrix $B$, $u^{\top}((A-B)(\lambda I - B)^{-1}) = u^{\top}$.

The proof is just basic algebra: $u^{\top}(A-B)(\lambda I - B)^{-1} = (\lambda u^{\top} - u^{\top}B)(\lambda I - B)^{-1} = u^{\top}(\lambda I - B)(\lambda I - B)^{-1} = u^{\top}$.

Why care about this? Let’s imagine that $A$ is a (not necessarily symmetric) stochastic matrix, so $1^{\top}A = 1^{\top}$. Let $A-B$ be a low-rank approximation to $A$ (so $A-B$ consists of all the large singular values, and $B$ consists of all the small singular values). Unfortunately since $A$ is not symmetric, this low-rank approximation doesn’t preserve the eigenvalues of $A$ and so we need not have $1^{\top}(A-B) = 1^{\top}$. The $(I-B)^{-1}$ can be thought of as a “correction” term such that the resulting matrix is still low-rank, but we’ve preserved one of the eigenvectors of $A$.

## Prékopa–Leindler inequality

Published:

Consider the following statements:

1. The shape with the largest volume enclosed by a given surface area is the $n$-dimensional sphere.
2. A marginal or sum of log-concave distributions is log-concave.
3. Any Lipschitz function of a standard $n$-dimensional Gaussian distribution concentrates around its mean.

What do these all have in common? Despite being fairly non-trivial and deep results, they all can be proved in less than half of a page using the Prékopa–Leindler inequality.

(I won’t show this here, or give formal versions of the statements above, but time permitting I will do so in a later blog post.)

## Latent Variables and Model Mis-specification

Published:

Machine learning is very good at optimizing predictions to match an observed signal — for instance, given a dataset of input images and labels of the images (e.g. dog, cat, etc.), machine learning is very good at correctly predicting the label of a new image. However, performance can quickly break down as soon as we care about criteria other than predicting observables. There are several cases where we might care about such criteria:

• In scientific investigations, we often care less about predicting a specific observable phenomenon, and more about what that phenomenon implies about an underlying scientific theory.
• In economic analysis, we are most interested in what policies will lead to desirable outcomes. This requires predicting what would counterfactually happen if we were to enact the policy, which we (usually) don’t have any data about.
• In machine learning, we may be interested in learning value functions which match human preferences (this is especially important in complex settings where it is hard to specify a satisfactory value function by hand). However, we are unlikely to observe information about the value function directly, and instead must infer it implicitly. For instance, one might infer a value function for autonomous driving by observing the actions of an expert driver.

## Individual Project Fund: Further Details

Published:

In my post on where I plan to donate in 2016, I said that I would set aside \$2000 for funding promising projects that I come across in the next year: The idea behind the project fund is … [to] give in a low-friction way on scales that are too small for organizations like Open Phil to think about. Moreover, it is likely good for me to develop a habit of evaluating projects I come across and thinking about whether they could benefit from additional money (either because they are funding constrained, or to incentivize an individual who is on the fence about carrying the project out). Finally, if this effort is successful, it is possible that other EAs will start to do this as well, which could magnify the overall impact. I think there is some danger that I will not be able to allocate the \$2000 in the next year, in which case any leftover funds will go to next year’s donor lottery.

In this post I will give some further details about this fund. My primary goal is to give others an idea of what projects I am likely to consider funding, so that anyone who thinks they might be a good fit for this can get in contact with me. (I also expect many of the best opportunities to come from people that I meet in person but don’t necessarily read this blog, so I plan to actively look for projects throughout the year as well.)

I am looking to fund or incentivize projects that meet several of the criteria below:

Published: