What power-seeking theorems do not show
David Thorstad (Vanderbilt University)
GPI Working Paper No. 27-2024
Recent years have seen increasing concern that artificial intelligence may soon pose an existential risk to humanity. One leading ground for concern is that artificial agents may be power-seeking, aiming to acquire power and in the process disempowering humanity. A range of power-seeking theorems seek to give formal articulation to the idea that artificial agents are likely to be power-seeking. I argue that leading theorems face five challenges, then draw lessons from this result.
Other working papers
Do not go gentle: why the Asymmetry does not support anti-natalism – Andreas Mogensen (Global Priorities Institute, Oxford University)
According to the Asymmetry, adding lives that are not worth living to the population makes the outcome pro tanto worse, but adding lives that are well worth living to the population does not make the outcome pro tanto better. It has been argued that the Asymmetry entails the desirability of human extinction. However, this argument rests on a misunderstanding of the kind of neutrality attributed to the addition of lives worth living by the Asymmetry. A similar misunderstanding is shown to underlie Benatar’s case for anti-natalism.
Against Willing Servitude: Autonomy in the Ethics of Advanced Artificial Intelligence – Adam Bales (Global Priorities Institute, University of Oxford)
Some people believe that advanced artificial intelligence systems (AIs) might, in the future, come to have moral status. Further, humans might be tempted to design such AIs that they serve us, carrying out tasks that make our lives better. This raises the question of whether designing AIs with moral status to be willing servants would problematically violate their autonomy. In this paper, I argue that it would in fact do so.
Longtermist institutional reform – Tyler M. John (Rutgers University) and William MacAskill (Global Priorities Institute, Oxford University)
There is a vast number of people who will live in the centuries and millennia to come. Even if homo sapiens survives merely as long as a typical species, we have hundreds of thousands of years ahead of us. And our future potential could be much greater than that again: it will be hundreds of millions of years until the Earth is sterilized by the expansion of the Sun, and many trillions of years before the last stars die out. …