The Shutdown Problem: An AI Engineering Puzzle for Decision Theorists

Elliott Thornley (Global Priorities Institute, University of Oxford)

GPI Working Paper No. 10-2024, forthcoming in Philosophical Studies

I explain and motivate the shutdown problem: the problem of designing artificial agents that (1) shut down when a shutdown button is pressed, (2) don’t try to prevent or cause the pressing of the shutdown button, and (3) otherwise pursue goals competently. I prove three theorems that make the difficulty precise. These theorems suggest that agents satisfying some innocuous-seeming conditions will often try to prevent or cause the pressing of the shutdown button, even in cases where it’s costly to do so. I end by noting that these theorems can guide our search for solutions to the problem.

Other working papers

Quadratic Funding with Incomplete Information – Luis M. V. Freitas (Global Priorities Institute, University of Oxford) and Wilfredo L. Maldonado (University of Sao Paulo)

Quadratic funding is a public good provision mechanism that satisfies desirable theoretical properties, such as efficiency under complete information, and has been gaining popularity in practical applications. We evaluate this mechanism in a setting of incomplete information regarding individual preferences, and show that this result only holds under knife-edge conditions. We also estimate the inefficiency of the mechanism in a variety of settings and show, in particular, that inefficiency increases…

Egyptology and Fanaticism – Hayden Wilkinson (Global Priorities Institute, University of Oxford)

Various decision theories share a troubling implication. They imply that, for any finite amount of value, it would be better to wager it all for a vanishingly small probability of some greater value. Counterintuitive as it might be, this fanaticism has seemingly compelling independent arguments in its favour. In this paper, I consider perhaps the most prima facie compelling such argument: an Egyptology argument (an analogue of the Egyptology argument from population ethics). …

Meaning, medicine and merit – Andreas Mogensen (Global Priorities Institute, Oxford University)

Given the inevitability of scarcity, should public institutions ration healthcare resources so as to prioritize those who contribute more to society? Intuitively, we may feel that this would be somehow inegalitarian. I argue that the egalitarian objection to prioritizing treatment on the basis of patients’ usefulness to others is best thought…