Against Anti-Fanaticism

Christian Tarsney (Population Wellbeing Initiative, University of Texas at Austin)

GPI Working Paper No. 15-2023, published in Philosophy and Phenomenological Research

Should you be willing to forego any sure good for a tiny probability of a vastly greater good? Fanatics say you should, anti-fanatics say you should not. Anti-fanaticism has great intuitive appeal. But, I argue, these intuitions are untenable, because satisfying them in their full generality is incompatible with three very plausible principles: acyclicity, a minimal dominance principle, and the principle that any outcome can be made better or worse. This argument against anti-fanaticism can be turned into a positive argument for a weak version of fanaticism, but only from significantly more contentious premises. In combination, these facts suggest that those who find fanaticism counterintuitive should favor not anti-fanaticism, but an intermediate position that permits agents to have incomplete preferences that are neither fanatical nor anti-fanatical.

Other working papers

In Defence of Moderation – Jacob Barrett (Vanderbilt University)

A decision theory is fanatical if it says that, for any sure thing of getting some finite amount of value, it would always be better to almost certainly get nothing while having some tiny probability (no matter how small) of getting sufficiently more finite value. Fanaticism is extremely counterintuitive; common sense requires a more moderate view. However, a recent slew of arguments purport to vindicate it, claiming that moderate alternatives to fanaticism are sometimes similarly counterintuitive, face a powerful continuum argument…

Imperfect Recall and AI Delegation – Eric Olav Chen (Global Priorities Institute, University of Oxford), Alexis Ghersengorin (Global Priorities Institute, University of Oxford) and Sami Petersen (Department of Economics, University of Oxford)

A principal wants to deploy an artificial intelligence (AI) system to perform some task. But the AI may be misaligned and aim to pursue a conflicting objective. The principal cannot restrict its options or deliver punishments. Instead, the principal is endowed with the ability to impose imperfect recall on the agent. The principal can then simulate the task and obscure whether it is real or part of a test. This allows the principal to screen misaligned AIs during testing and discipline their behaviour in deployment. By increasing the…

Evolutionary debunking and value alignment – Michael T. Dale (Hampden-Sydney College) and Bradford Saad (Global Priorities Institute, University of Oxford)

This paper examines the bearing of evolutionary debunking arguments—which use the evolutionary origins of values to challenge their epistemic credentials—on the alignment problem, i.e. the problem of ensuring that highly capable AI systems are properly aligned with values. Since evolutionary debunking arguments are among the best empirically-motivated arguments that recommend changes in values, it is unsurprising that they are relevant to the alignment problem. However, how evolutionary debunking arguments…