What power-seeking theorems do not show
David Thorstad (Vanderbilt University)
GPI Working Paper No. 27-2024
Recent years have seen increasing concern that artificial intelligence may soon pose an existential risk to humanity. One leading ground for concern is that artificial agents may be power-seeking, aiming to acquire power and, in the process, disempowering humanity. A range of power-seeking theorems seek to give formal articulation to the idea that artificial agents are likely to be power-seeking. I argue that leading theorems face five challenges, then draw lessons from this result.
Other working papers
Longtermist political philosophy: An agenda for future research – Jacob Barrett (Global Priorities Institute, University of Oxford) and Andreas T. Schmidt (University of Groningen)
We set out longtermist political philosophy as a research field. First, we argue that the standard case for longtermism is more robust when applied to institutions than to individual action. This motivates “institutional longtermism”: when building or shaping institutions, positively affecting the value of the long-term future is a key moral priority. Second, we briefly distinguish approaches to pursuing longtermist institutional reform along two dimensions: such approaches may be more targeted or more broad, and more urgent or more patient.
A Fission Problem for Person-Affecting Views – Elliott Thornley (Global Priorities Institute, University of Oxford)
On person-affecting views in population ethics, the moral import of a person’s welfare depends on that person’s temporal or modal status. These views typically imply that – all else equal – we’re never required to create extra people, or to act in ways that increase the probability of extra people coming into existence. In this paper, I use Parfit-style fission cases to construct a dilemma for person-affecting views: either they forfeit their seeming-advantages and face fission analogues…
Are we living at the hinge of history? – William MacAskill (Global Priorities Institute, Oxford University)
In the final pages of On What Matters, Volume II, Derek Parfit comments: ‘We live during the hinge of history… If we act wisely in the next few centuries, humanity will survive its most dangerous and decisive period… What now matters most is that we avoid ending human history.’ This passage echoes Parfit’s comment, in Reasons and Persons, that ‘the next few centuries will be the most important in human history’. …