What power-seeking theorems do not show

David Thorstad (Vanderbilt University)

GPI Working Paper No. 27-2024

Recent years have seen increasing concern that artificial intelligence may soon pose an existential risk to humanity. One leading ground for concern is that artificial agents may be power-seeking, aiming to acquire power and in the process disempowering humanity. A range of power-seeking theorems seek to give formal articulation to the idea that artificial agents are likely to be power-seeking. I argue that leading theorems face five challenges, then draw lessons from this result.

Other working papers

Economic inequality and the long-term future – Andreas T. Schmidt (University of Groningen) and Daan Juijn (CE Delft)

Why, if at all, should we object to economic inequality? Some central arguments – the argument from decreasing marginal utility for example – invoke instrumental reasons and object to inequality because of its effects…

Consequentialism, Cluelessness, Clumsiness, and Counterfactuals – Alan Hájek (Australian National University)

According to a standard statement of objective consequentialism, a morally right action is one that has the best consequences. More generally, given a choice between two actions, one is morally better than the other just in case the consequences of the former action are better than those of the latter. (These are not just the immediate consequences of the actions, but the long-term consequences, perhaps until the end of history.) This account glides easily off the tongue—so easily that…

Time discounting, consistency and special obligations: a defence of Robust Temporalism – Harry R. Lloyd (Yale University)

This paper defends the claim that mere temporal proximity always and without exception strengthens certain moral duties, including the duty to save – call this view Robust Temporalism. Although almost all other moral philosophers dismiss Robust Temporalism out of hand, I argue that it is prima facie intuitively plausible, and that it is analogous to a view about special obligations that many philosophers already accept…