Will AI Avoid Exploitation?
Adam Bales (Global Priorities Institute, University of Oxford)
GPI Working Paper No. 16-2023, published in Philosophical Studies
A simple argument suggests that we can fruitfully model advanced AI systems using expected utility theory. According to this argument, an agent will need to act as if maximising expected utility if they’re to avoid exploitation. Insofar as we should expect advanced AI to avoid exploitation, it follows that we should expected advanced AI to act as if maximising expected utility. I spell out this argument more carefully and demonstrate that it fails, but show that the manner of its failure is instructive: in exploring the argument, we gain insight into how to model advanced AI systems.
Other working papers
The unexpected value of the future – Hayden Wilkinson (Global Priorities Institute, University of Oxford)
Various philosophers accept moral views that are impartial, additive, and risk-neutral with respect to betterness. But, if that risk neutrality is spelt out according to expected value theory alone, such views face a dire reductio ad absurdum. If the expected sum of value in humanity’s future is undefined—if, e.g., the probability distribution over possible values of the future resembles the Pasadena game, or a Cauchy distribution—then those views say that no real-world option is ever better than any other. And, as I argue…
Doomsday and objective chance – Teruji Thomas (Global Priorities Institute, Oxford University)
Lewis’s Principal Principle says that one should usually align one’s credences with the known chances. In this paper I develop a version of the Principal Principle that deals well with some exceptional cases related to the distinction between metaphysical and epistemic modality. I explain how this principle gives a unified account of the Sleeping Beauty problem and chance-based principles of anthropic reasoning…
Estimating long-term treatment effects without long-term outcome data – David Rhys Bernard (Paris School of Economics)
Estimating long-term impacts of actions is important in many areas but the key difficulty is that long-term outcomes are only observed with a long delay. One alternative approach is to measure the effect on an intermediate outcome or a statistical surrogate and then use this to estimate the long-term effect. …