What power-seeking theorems do not show
David Thorstad (Vanderbilt University)
GPI Working Paper No. 27-2024
Recent years have seen increasing concern that artificial intelligence may soon pose an existential risk to humanity. One leading ground for concern is that artificial agents may be power-seeking, aiming to acquire power and in the process disempowering humanity. A range of power-seeking theorems seek to give formal articulation to the idea that artificial agents are likely to be power-seeking. I argue that leading theorems face five challenges, then draw lessons from this result.
Other working papers
Are we living at the hinge of history? – William MacAskill (Global Priorities Institute, Oxford University)
In the final pages of On What Matters, Volume II, Derek Parfit comments: ‘We live during the hinge of history… If we act wisely in the next few centuries, humanity will survive its most dangerous and decisive period… What now matters most is that we avoid ending human history.’ This passage echoes Parfit’s comment, in Reasons and Persons, that ‘the next few centuries will be the most important in human history’. …
Existential Risk and Growth – Philip Trammell (Global Priorities Institute and Department of Economics, University of Oxford) and Leopold Aschenbrenner
Technologies may pose existential risks to civilization. Though accelerating technological development may increase the risk of anthropogenic existential catastrophe per period in the short run, two considerations suggest that a sector-neutral acceleration decreases the risk that such a catastrophe ever occurs. First, acceleration decreases the time spent at each technology level. Second, since a richer society is willing to sacrifice more for safety, optimal policy can yield an “existential risk Kuznets curve”; acceleration…
When should an effective altruist donate? – William MacAskill (Global Priorities Institute, Oxford University)
Effective altruism is the use of evidence and careful reasoning to work out how to maximize positive impact on others with a given unit of resources, and the taking of action on that basis. It’s a philosophy and a social movement that is gaining considerable steam in the philanthropic world. For example,…