Shutdownable Agents through POST-Agency
Elliott Thornley (Global Priorities Institute, University of Oxford)
GPI Working Paper No. 5-2025
Many fear that future artificial agents will resist shutdown. I present an idea – the POST-Agents Proposal – for ensuring that doesn’t happen. I propose that we train agents to satisfy Preferences Only Between Same-Length Trajectories (POST). I then prove that POST – together with other conditions – implies Neutrality+: the agent maximizes expected utility, ignoring the probability distribution over trajectory-lengths. I argue that Neutrality+ keeps agents shutdownable and allows them to be useful.