A bargaining-theoretic approach to moral uncertainty
Owen Cotton-Barratt (Future of Humanity Institute, University of Oxford), Hilary Greaves (Global Priorities Institute, University of Oxford)
GPI Working Paper No. 2-2023, published in the Journal of Moral Philosophy
This paper explores a new approach to the problem of decision under relevant moral uncertainty. We treat the case of an agent making decisions in the face of moral uncertainty on the model of bargaining theory, as if the decision-making process were one of bargaining among different internal parts of the agent, with different parts committed to different moral theories. The resulting approach contrasts interestingly with the extant “maximise expected choiceworthiness” and “my favourite theory” approaches, in several key respects. In particular, it seems somewhat less prone than the MEC approach to ‘fanaticism’: allowing decisions to be dictated by a theory in which the agent has extremely low credence, if the relative stakes are high enough. Overall, however, we tentatively conclude that the MEC approach is superior to a bargaining-theoretic approach.
Other working papers
Heuristics for clueless agents: how to get away with ignoring what matters most in ordinary decision-making – David Thorstad and Andreas Mogensen (Global Priorities Institute, Oxford University)
Even our most mundane decisions have the potential to significantly impact the long-term future, but we are often clueless about what this impact may be. In this paper, we aim to characterize and solve two problems raised by recent discussions of cluelessness, which we term the Problems of Decision Paralysis and the Problem of Decision-Making Demandingness. After reviewing and rejecting existing solutions to both problems, we argue that the way forward is to be found in the distinction between procedural and substantive rationality…
The Shutdown Problem: An AI Engineering Puzzle for Decision Theorists – Elliott Thornley (Global Priorities Institute, University of Oxford)
I explain and motivate the shutdown problem: the problem of designing artificial agents that (1) shut down when a shutdown button is pressed, (2) don’t try to prevent or cause the pressing of the shutdown button, and (3) otherwise pursue goals competently. I prove three theorems that make the difficulty precise. These theorems suggest that agents satisfying some innocuous-seeming conditions will often try to prevent or cause the pressing of the shutdown button, even in cases where it’s costly to do so. I end by noting that…
Three mistakes in the moral mathematics of existential risk – David Thorstad (Global Priorities Institute, University of Oxford)
Longtermists have recently argued that it is overwhelmingly important to do what we can to mitigate existential risks to humanity. I consider three mistakes that are often made in calculating the value of existential risk mitigation: focusing on cumulative risk rather than period risk; ignoring background risk; and neglecting population dynamics. I show how correcting these mistakes pushes the value of existential risk mitigation substantially below leading estimates, potentially low enough to…