What power-seeking theorems do not show

David Thorstad (Vanderbilt University)

GPI Working Paper No. 27-2024

Recent years have seen increasing concern that artificial intelligence may soon pose an existential risk to humanity. One leading ground for concern is that artificial agents may be power-seeking, aiming to acquire power and in the process disempowering humanity. A range of power-seeking theorems seek to give formal articulation to the idea that artificial agents are likely to be power-seeking. I argue that leading theorems face five challenges, then draw lessons from this result.

Other working papers

The weight of suffering – Andreas Mogensen (Global Priorities Institute, University of Oxford)

How should we weigh suffering against happiness? This paper highlights the existence of an argument from intuitively plausible axiological principles to the striking conclusion that in comparing different populations, there exists some depth of suffering that cannot be compensated for by any measure of well-being. In addition to a number of structural principles, the argument relies on two key premises. The first is the contrary of the so-called Reverse Repugnant Conclusion…

Existential risk and growth – Leopold Aschenbrenner (Columbia University)

Human activity can create or mitigate risks of catastrophes, such as nuclear war, climate change, pandemics, or artificial intelligence run amok. These could even imperil the survival of human civilization. What is the relationship between economic growth and such existential risks? In a model of directed technical change, with moderate parameters, existential risk follows a Kuznets-style inverted U-shape. …

Maximal cluelessness – Andreas Mogensen (Global Priorities Institute, University of Oxford)

I argue that many of the priority rankings that have been proposed by effective altruists seem to be in tension with apparently reasonable assumptions about the rational pursuit of our aims in the face of uncertainty. The particular issue on which I focus arises from recognition of the overwhelming importance…