A non-identity dilemma for person-affecting views
Elliott Thornley (Global Priorities Institute, University of Oxford)
GPI Working Paper No. 6-2024
Person-affecting views state that (in cases where all else is equal) we’re permitted but not required to create people who would enjoy good lives. In this paper, I present an argument against every possible variety of person-affecting view. The argument is a dilemma over trilemmas. Narrow person-affecting views imply a trilemma in a case that I call ‘Expanded Non-Identity.’ Wide person-affecting views imply a trilemma in a case that I call ‘Two-Shot Non-Identity.’ One plausible practical upshot of my argument is as follows: we individuals and our governments should be doing more to reduce the risk of human extinction this century.
Other working papers
Towards shutdownable agents via stochastic choice – Elliott Thornley (Global Priorities Institute, University of Oxford), Alexander Roman (New College of Florida), Christos Ziakas (Independent), Leyton Ho (Brown University), and Louis Thomson (University of Oxford)
Some worry that advanced artificial agents may resist being shut down. The Incomplete Preferences Proposal (IPP) is an idea for ensuring that this does not happen. A key part of the IPP is using a novel ‘Discounted Reward for Same-Length Trajectories (DReST)’ reward function to train agents to (1) pursue goals effectively conditional on each trajectory-length (be ‘USEFUL’), and (2) choose stochastically between different trajectory-lengths (be ‘NEUTRAL’ about trajectory-lengths). In this paper, we propose…
It Only Takes One: The Psychology of Unilateral Decisions – Joshua Lewis (New York University) et al.
Sometimes, one decision can guarantee that a risky event will happen. For instance, it only took one team of researchers to synthesize and publish the horsepox genome, thus imposing its publication even though other researchers might have refrained for biosecurity reasons. We examine cases where everybody who can impose a given event has the same goal but different information about whether the event furthers that goal. …
Will AI Avoid Exploitation? – Adam Bales (Global Priorities Institute, University of Oxford)
A simple argument suggests that we can fruitfully model advanced AI systems using expected utility theory. According to this argument, an agent will need to act as if maximising expected utility if they’re to avoid exploitation. Insofar as we should expect advanced AI to avoid exploitation, it follows that we should expect advanced AI to act as if maximising expected utility. I spell out this argument more carefully and demonstrate that it fails, but show that the manner of its failure is instructive…