Shutdownable Agents through POST-Agency

Elliott Thornley (Global Priorities Institute, University of Oxford)

GPI Working Paper No. 5-2025

Many fear that future artificial agents will resist shutdown. I present an idea – the POST-Agents Proposal – for ensuring that doesn’t happen. I propose that we train agents to satisfy Preferences Only Between Same-Length Trajectories (POST). I then prove that POST – together with other conditions – implies Neutrality+: the agent maximizes expected utility, ignoring the probability distribution over trajectory-lengths. I argue that Neutrality+ keeps agents shutdownable and allows them to be useful.
