Concepts of existential catastrophe

Hilary Greaves (University of Oxford)

GPI Working Paper No. 8-2023, forthcoming in The Monist

The notion of existential catastrophe is increasingly appealed to in discussions of risk management around emerging technologies, but it is not completely clear what this notion amounts to. Here, I provide an opinionated survey of the space of plausibly useful definitions of existential catastrophe. Inter alia, I discuss whether to define existential catastrophe in ex post or ex ante terms, whether an ex ante definition should be in terms of loss of expected value or loss of potential, and what kind of probabilities should be involved in any appeal to expected value.

Other working papers

The Shutdown Problem: An AI Engineering Puzzle for Decision Theorists – Elliott Thornley (Global Priorities Institute, University of Oxford)

I explain and motivate the shutdown problem: the problem of designing artificial agents that (1) shut down when a shutdown button is pressed, (2) don’t try to prevent or cause the pressing of the shutdown button, and (3) otherwise pursue goals competently. I prove three theorems that make the difficulty precise. These theorems suggest that agents satisfying some innocuous-seeming conditions will often try to prevent or cause the pressing of the shutdown button, even in cases where it’s costly to do so. I end by noting that…

The scope of longtermism – David Thorstad (Global Priorities Institute, University of Oxford)

Longtermism holds roughly that in many decision situations, the best thing we can do is what is best for the long-term future. The scope question for longtermism asks: how large is the class of decision situations for which longtermism holds? Although longtermism was initially developed to describe the situation of…

Cassandra’s Curse: A second tragedy of the commons – Philippe Colo (ETH Zurich)

This paper studies why scientific forecasts regarding exceptional or rare events generally fail to trigger adequate public response. I consider a game of contribution to a public bad. Prior to the game, I assume contributors receive non-verifiable expert advice regarding uncertain damages. In addition, I assume that the expert cares only about social welfare. Under mild assumptions, I show that no information transmission can happen at equilibrium when the number of contributors…