The Shutdown Problem: An AI Engineering Puzzle for Decision Theorists
Elliott Thornley (Global Priorities Institute, University of Oxford)
GPI Working Paper No. 10-2024, forthcoming in Philosophical Studies
I explain and motivate the shutdown problem: the problem of designing artificial agents that (1) shut down when a shutdown button is pressed, (2) don’t try to prevent or cause the pressing of the shutdown button, and (3) otherwise pursue goals competently. I prove three theorems that make the difficulty precise. These theorems suggest that agents satisfying some innocuous-seeming conditions will often try to prevent or cause the pressing of the shutdown button, even in cases where it’s costly to do so. I end by noting that these theorems can guide our search for solutions to the problem.
Other working papers
Economic inequality and the long-term future – Andreas T. Schmidt (University of Groningen) and Daan Juijn (CE Delft)
Why, if at all, should we object to economic inequality? Some central arguments – the argument from decreasing marginal utility for example – invoke instrumental reasons and object to inequality because of its effects…
Moral uncertainty and public justification – Jacob Barrett (Global Priorities Institute, University of Oxford) and Andreas T Schmidt (University of Groningen)
Moral uncertainty and disagreement pervade our lives. Yet we still need to make decisions and act, both in individual and political contexts. So, what should we do? The moral uncertainty approach provides a theory of what individuals morally ought to do when they are uncertain about morality…
Is In-kind Kinder than Cash? The Impact of Money vs Food Aid on Social Emotions and Aid Take-up – Samantha Kassirer, Ata Jami, & Maryam Kouchaki (Northwestern University)
There has been widespread endorsement from the academic and philanthropic communities on the new model of giving cash to those in need. Yet the recipient’s perspective has mostly been ignored. The present research explores how food-insecure individuals feel and respond when offered either monetary or food aid from a charity. Our results reveal that individuals are less likely to accept money than food aid from charity because receiving money feels relatively more shameful and relatively less socially positive. Since many…