On two arguments for Fanaticism

Jeffrey Sanford Russell (University of Southern California)

GPI Working Paper No. 17-2021, published in Noûs

Should we make significant sacrifices to ever-so-slightly lower the chance of extremely bad outcomes, or to ever-so-slightly raise the chance of extremely good outcomes? Fanaticism says yes: for every bad outcome, there is a tiny chance of extreme disaster that is even worse, and for every good outcome, there is a tiny chance of an enormous good that is even better. I consider two related recent arguments for Fanaticism: Beckstead and Thomas’s argument from strange dependence on space and time, and Wilkinson’s Indology argument. While both arguments are instructive, neither is persuasive. In fact, the general principles that underwrite the arguments (a separability principle in the first case, and a reflection principle in the second) are inconsistent with Fanaticism. In both cases, though, it is possible to rehabilitate arguments for Fanaticism based on restricted versions of those principles. The situation is unstable: plausible general principles tell against Fanaticism, but restrictions of those same principles (with strengthened auxiliary assumptions) support Fanaticism. All of the consistent views that emerge are very strange.

Other working papers

Existential risk and growth – Leopold Aschenbrenner (Columbia University)

Human activity can create or mitigate risks of catastrophes, such as nuclear war, climate change, pandemics, or artificial intelligence run amok. These could even imperil the survival of human civilization. What is the relationship between economic growth and such existential risks? In a model of directed technical change, with moderate parameters, existential risk follows a Kuznets-style inverted U-shape. …

Crying wolf: Warning about societal risks can be reputationally risky – Lucius Caviola (Global Priorities Institute, University of Oxford) et al.

Society relies on expert warnings about large-scale risks like pandemics and natural disasters. Across ten studies (N = 5,342), we demonstrate people’s reluctance to warn about unlikely but large-scale risks because they are concerned about being blamed for being wrong. In particular, warners anticipate that if the risk doesn’t occur, they will be perceived as overly alarmist and responsible for wasting societal resources. This phenomenon appears in the context of natural, technological, and financial risks…

Shutdownable Agents through POST-Agency – Elliott Thornley (Global Priorities Institute, University of Oxford)

Many fear that future artificial agents will resist shutdown. I present an idea – the POST-Agents Proposal – for ensuring that doesn’t happen. I propose that we train agents to satisfy Preferences Only Between Same-Length Trajectories (POST). I then prove that POST – together with other conditions – implies Neutrality+: the agent maximizes expected utility, ignoring the probability distribution over trajectory-lengths. I argue that Neutrality+ keeps agents shutdownable and allows them to be useful.