The Significance, Persistence, Contingency Framework
William MacAskill, Teruji Thomas (Global Priorities Institute, University of Oxford) and Aron Vallinder (Forethought Foundation for Global Priorities Research)
GPI Technical Report No. T1-2022
The world, considered from beginning to end, combines many different features, or states of affairs, that contribute to its value. The value of each feature can be factored into its significance—its average value per unit time—and its persistence—how long it lasts. Sometimes, though, we want to ask a further question: how much of the feature’s value can be attributed to a particular agent’s decision at a particular point in time (or to some other originating event)? In other words, to what extent is the feature’s value contingent on the agent’s choice? For this, we must also look at the counterfactual: how would things have turned out otherwise?
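The factoring described above can be illustrated with a minimal sketch. It assumes, for illustration only, that a feature's value is the product of its significance and its persistence, and that contingency is the fraction of that value that would be lost in the counterfactual. The class `Feature`, the function `contingency`, and all numbers are hypothetical and are not taken from the report.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    """A feature of the world (hypothetical illustration)."""
    significance: float  # average value per unit time
    persistence: float   # how long the feature lasts, in the same time units

    @property
    def value(self) -> float:
        # Assumed factoring: total value = significance * persistence.
        return self.significance * self.persistence


def contingency(actual: Feature, counterfactual: Feature) -> float:
    """Fraction of the actual value attributable to the choice.

    Assumed definition: (actual value - counterfactual value) / actual value.
    """
    if actual.value == 0:
        raise ValueError("contingency is undefined when the actual value is zero")
    return (actual.value - counterfactual.value) / actual.value


# Illustrative numbers: the choice extends the feature from 10 to 50 time units.
actual = Feature(significance=2.0, persistence=50.0)
counterfactual = Feature(significance=2.0, persistence=10.0)

con = contingency(actual, counterfactual)
attributable_value = actual.significance * actual.persistence * con
```

Under these assumptions, the value attributable to the choice is the product of the three factors, which by construction equals the difference between the actual and counterfactual values.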