What power-seeking theorems do not show
David Thorstad (Vanderbilt University)
GPI Working Paper No. 27-2024
Recent years have seen increasing concern that artificial intelligence may soon pose an existential risk to humanity. One leading ground for concern is that artificial agents may be power-seeking, aiming to acquire power and in the process disempowering humanity. A range of power-seeking theorems seek to give formal articulation to the idea that artificial agents are likely to be power-seeking. I argue that leading theorems face five challenges, then draw lessons from this result.
Other working papers
Moral uncertainty and public justification – Jacob Barrett (Global Priorities Institute, University of Oxford) and Andreas T Schmidt (University of Groningen)
Moral uncertainty and disagreement pervade our lives. Yet we still need to make decisions and act, both in individual and political contexts. So, what should we do? The moral uncertainty approach provides a theory of what individuals morally ought to do when they are uncertain about morality…
Choosing the future: Markets, ethics and rapprochement in social discounting – Antony Millner (University of California, Santa Barbara) and Geoffrey Heal (Columbia University)
This paper provides a critical review of the literature on choosing social discount rates (SDRs) for public cost-benefit analysis. We discuss two dominant approaches, the first based on market prices, and the second based on intertemporal ethics. While both methods have attractive features, neither is immune to criticism. …
Existential risks from a Thomist Christian perspective – Stefan Riedener (University of Zurich)
Let’s say with Nick Bostrom that an ‘existential risk’ (or ‘x-risk’) is a risk that ‘threatens the premature extinction of Earth-originating intelligent life or the permanent and drastic destruction of its potential for desirable future development’ (2013, 15). There are a number of such risks: nuclear wars, developments in biotechnology or artificial intelligence, climate change, pandemics, supervolcanos, asteroids, and so on (see e.g. Bostrom and Ćirković 2008). …