What power-seeking theorems do not show

David Thorstad (Vanderbilt University)

GPI Working Paper No. 27-2024

Recent years have seen increasing concern that artificial intelligence may soon pose an existential risk to humanity. One leading ground for concern is that artificial agents may be power-seeking, aiming to acquire power and in the process disempowering humanity. A range of power-seeking theorems seek to give formal articulation to the idea that artificial agents are likely to be power-seeking. I argue that leading theorems face five challenges, then draw lessons from this result.

Other working papers

Against Willing Servitude: Autonomy in the Ethics of Advanced Artificial Intelligence – Adam Bales (Global Priorities Institute, University of Oxford)

Some people believe that advanced artificial intelligence systems (AIs) might, in the future, come to have moral status. Further, humans might be tempted to design such AIs that they serve us, carrying out tasks that make our lives better. This raises the question of whether designing AIs with moral status to be willing servants would problematically violate their autonomy. In this paper, I argue that it would in fact do so.

Economic growth under transformative AI – Philip Trammell (Global Priorities Institute, Oxford University) and Anton Korinek (University of Virginia)

Industrialized countries have long seen relatively stable growth in output per capita and a stable labor share. AI may be transformative, in the sense that it may break one or both of these stylized facts. This review outlines the ways this may happen by placing several strands of the literature on AI and growth within a common framework. We first evaluate models in which AI increases output production, for example via increases in capital’s substitutability for labor…

Estimating long-term treatment effects without long-term outcome data – David Rhys Bernard (Paris School of Economics)

Estimating long-term impacts of actions is important in many areas but the key difficulty is that long-term outcomes are only observed with a long delay. One alternative approach is to measure the effect on an intermediate outcome or a statistical surrogate and then use this to estimate the long-term effect. …