What power-seeking theorems do not show

David Thorstad (Vanderbilt University)

GPI Working Paper No. 27-2024

Recent years have seen increasing concern that artificial intelligence may soon pose an existential risk to humanity. One leading ground for concern is that artificial agents may be power-seeking, aiming to acquire power and in the process disempowering humanity. A range of power-seeking theorems seek to give formal articulation to the idea that artificial agents are likely to be power-seeking. I argue that leading theorems face five challenges, then draw lessons from this result.

Other working papers

Strong longtermism and the challenge from anti-aggregative moral views – Karri Heikkinen (University College London)

Greaves and MacAskill (2019) argue for strong longtermism, according to which, in a wide class of decision situations, the option that is ex ante best, and the one we ex ante ought to choose, is the option that makes the very long-run future go best. One important aspect of their argument is the claim that strong longtermism is compatible with a wide range of ethical assumptions, including plausible non-consequentialist views. In this essay, I challenge this claim…

Moral uncertainty and public justification – Jacob Barrett (Global Priorities Institute, University of Oxford) and Andreas T Schmidt (University of Groningen)

Moral uncertainty and disagreement pervade our lives. Yet we still need to make decisions and act, both in individual and political contexts. So, what should we do? The moral uncertainty approach provides a theory of what individuals morally ought to do when they are uncertain about morality…

Time Bias and Altruism – Leora Urim Sung (University College London)

We are typically near-future biased, being more concerned with our near future than our distant future. This near-future bias can be directed at others too, being more concerned with their near future than their distant future. In this paper, I argue that, because we discount the future in this way, beyond a certain point in time, we morally ought to be more concerned with the present well- being of others than with the well-being of our distant future selves. It follows that we morally ought to sacrifice…