## Overabundant Information and Learning Traps, joint with Xiaosheng Mu

Last updated: Feb. 15, 2018

**Abstract**. We develop a model of learning from overabundant information: Agents have access to many sources of information, where observation of all sources is not necessary in order to learn the payoff-relevant unknown. Short-lived agents sequentially choose to acquire a signal realization from the best source for them. All signal realizations are public. Our main results characterize two starkly different possible long-run outcomes, and the conditions under which each obtains: (1) efficient information aggregation, where signal acquisitions eventually achieve the highest possible speed of learning; (2) "learning traps," where the community gets stuck using a suboptimal set of sources and learns inefficiently slowly. A simple property of the correlation structure separates these two possibilities. In both regimes, we characterize which sources are observed in the long run and how often.

## Optimal Myopic Information Acquisition, joint with Xiaosheng Mu and Vasilis Syrgkanis

Last updated: April 10, 2018

**Abstract**. We consider the problem of optimal information acquisition from many correlated information sources. Each period, the DM jointly takes an action and allocates a fixed number of observations across the available sources. His payoff depends on the actions taken and on an unknown state. In a canonical setting--jointly normal information sources--we show that the optimal dynamic information acquisition rule proceeds myopically after finitely many periods. If signals are acquired in large blocks each period, then the optimal rule turns out to be myopic from period 1. These results demonstrate the possibility of robust and "simple" optimal information acquisition, and simplify the analysis of dynamic information acquisition in a widely used informational environment.
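As a rough illustration of what a myopic rule looks like in a jointly normal setting, here is a minimal sketch. It rests on several assumptions not stated in the abstract: one observation per period, each source `i` reporting a linear signal `c_i · theta` plus independent unit-variance noise, and a payoff-relevant state `w · theta`, with "myopic" taken to mean choosing the source that most reduces the next-period posterior variance of that state. All function names are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def mat_vec(m, v):
    return [dot(row, v) for row in m]


def update(cov, c, noise_var=1.0):
    """Posterior covariance of theta after one signal c . theta + noise.

    Standard Gaussian update: cov - (cov c)(cov c)' / (c' cov c + noise_var).
    """
    sc = mat_vec(cov, c)
    denom = dot(c, sc) + noise_var
    n = len(c)
    return [[cov[i][j] - sc[i] * sc[j] / denom for j in range(n)]
            for i in range(n)]


def posterior_variance(cov, w):
    """Variance of the assumed payoff-relevant state w . theta."""
    return dot(w, mat_vec(cov, w))


def myopic_choice(cov, sources, w):
    """Index of the source whose signal minimizes next-period variance."""
    return min(range(len(sources)),
               key=lambda i: posterior_variance(update(cov, sources[i]), w))


def run_myopic(cov, sources, w, periods):
    """Apply the myopic rule for a number of periods; return counts and variance."""
    counts = [0] * len(sources)
    for _ in range(periods):
        i = myopic_choice(cov, sources, w)
        counts[i] += 1
        cov = update(cov, sources[i])
    return counts, posterior_variance(cov, w)


if __name__ == "__main__":
    prior = [[1.0, 0.0], [0.0, 1.0]]
    sources = [[1.0, 0.0], [0.0, 1.0]]  # source 0 is informative about w . theta
    print(run_myopic(prior, sources, [1.0, 0.0], periods=3))
```

With an independent prior and a payoff-relevant state that loads only on the first coordinate, the rule samples the first source every period. Correlated priors or overlapping sources are where the choice becomes less obvious.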

## Games of Incomplete Information Played By Statisticians

Last updated: Feb. 28, 2018

**Abstract**. This paper proposes a foundation for heterogeneous beliefs in games, in which disagreement arises not because players observe different information, but because they learn from common information in different ways. Players may be misspecified, and may moreover be misspecified about how others learn. The key assumption is that players nevertheless have some common understanding of how to interpret the data; formally, players have common certainty in the predictions of a *class* of learning rules. The common prior assumption is nested as the special case in which this class is a singleton. The main results characterize which rationalizable actions and Nash equilibria can be predicted when agents observe a finite quantity of data, and how much data is needed to predict various solutions. The number of observations needed depends on the degree of strictness of the solution and on the speed of common learning.

## Inference of Preference Heterogeneity from Choice Data (R&R at Journal of Economic Theory)

Last updated: Jan. 16, 2018

**Abstract**. Suppose that an analyst observes inconsistent choices from either a single decision-maker, or a population of agents. Can the analyst determine whether this inconsistency arises from choice error (imperfect maximization of a single preference) or from preference heterogeneity (deliberate maximization of multiple preferences)? I model choice data as generated from a perturbation of a "sparse" random utility model, whose support is a small number of underlying preferences. I show that (a) simultaneously minimizing the number of inferred preferences and the number of unexplained observations can exactly recover the number of underlying preferences with high probability; (b) simultaneously minimizing the richness of the set of preferences and the number of unexplained observations can exactly recover the choice implications of the underlying preferences with high probability.

## The Theory is Predictive, But Is It Complete? An Application to Human Perception of Randomness, joint with Jon Kleinberg and Sendhil Mullainathan

Last updated: July 15, 2017

**Abstract**. When testing a theory, we should ask not just whether its predictions match what we see in the data, but also about its "completeness": how much of the predictable variation in the data does the theory capture? Defining completeness is conceptually challenging, but we show how methods based on machine learning can provide tractable measures of completeness. We also identify a model domain -- the human perception and generation of randomness -- where measures of completeness can be feasibly analyzed; from these measures we discover there is significant structure in the problem that existing theories have yet to capture.
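The abstract does not state a formula, so the following is only an illustrative sketch under an assumed definition: completeness as the share of the achievable reduction in prediction error that the theory attains, where "achievable" runs from a naive baseline down to the best machine-learned predictor. The function name and its three error inputs are hypothetical.

```python
def completeness(naive_error, theory_error, best_error):
    """Fraction of the predictable improvement captured by a theory.

    Assumed definition (not taken from the abstract):
        (naive_error - theory_error) / (naive_error - best_error)
    where best_error is the out-of-sample error of a flexible ML predictor,
    used as a stand-in for the lowest attainable error.
    """
    achievable = naive_error - best_error
    if achievable <= 0:
        raise ValueError("best_error must be below naive_error")
    return (naive_error - theory_error) / achievable


if __name__ == "__main__":
    # A theory that closes half the gap between the naive and best predictors.
    print(completeness(naive_error=0.5, theory_error=0.375, best_error=0.25))
```

Under this definition a value near 1 means the theory leaves little predictable variation unexplained, while a value near 0 means it barely improves on the naive baseline.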

## Predicting and Understanding Initial Play, joint with Drew Fudenberg

Last updated: April 6, 2018

**Abstract**. We take a machine learning approach to the problem of predicting initial play in strategic-form games, with the goal of uncovering new regularities in play and improving the predictions of existing theories. The analysis is implemented on data from previous laboratory experiments, and also a new data set of 200 games played on Mechanical Turk. We first use learning algorithms to train prediction rules based on a large set of game features. Examination of the games where our algorithm predicts play correctly, but the existing models do not, leads us to introduce a risk aversion parameter that we find significantly improves predictive accuracy. Second, we augment existing empirical models by using play in a set of training games to predict how the models' parameters vary across new games. This modified approach generates better out-of-sample predictions, and provides insight into how and why the parameters vary. These methodologies are not special to the problem of predicting play in games, and may be useful in other contexts.