What power-seeking theorems do not show
David Thorstad (Vanderbilt University)
GPI Working Paper No. 27-2024
Recent years have seen increasing concern that artificial intelligence may soon pose an existential risk to humanity. One leading ground for concern is that artificial agents may be power-seeking, aiming to acquire power and, in the process, disempowering humanity. A range of power-seeking theorems aim to give formal articulation to the idea that artificial agents are likely to be power-seeking. I argue that leading theorems face five challenges, then draw lessons from this result.
Other working papers
Moral uncertainty and public justification – Jacob Barrett (Global Priorities Institute, University of Oxford) and Andreas T Schmidt (University of Groningen)
Moral uncertainty and disagreement pervade our lives. Yet we still need to make decisions and act, both in individual and political contexts. So, what should we do? The moral uncertainty approach provides a theory of what individuals morally ought to do when they are uncertain about morality…
Philosophical considerations relevant to valuing continued human survival: Conceptual Analysis, Population Axiology, and Decision Theory – Andreas Mogensen (Global Priorities Institute, University of Oxford)
Many think that human extinction would be a catastrophic tragedy, and that we ought to do more to reduce extinction risk. There is less agreement on exactly why. If some catastrophe were to kill everyone, that would obviously be horrific. Still, many think the deaths of billions of people don’t exhaust what would be so terrible about extinction. After all, we can be confident that billions of people are going to die – many horribly and before their time – if humanity does not go extinct. …
The paralysis argument – William MacAskill and Andreas Mogensen (Global Priorities Institute, University of Oxford)
Given plausible assumptions about the long-run impact of our everyday actions, we show that standard non-consequentialist constraints on doing harm entail that we should try to do as little as possible in our lives. We call this the Paralysis Argument. After laying out the argument, we consider and respond to…