Will AI Avoid Exploitation?

Adam Bales (Global Priorities Institute, University of Oxford)

GPI Working Paper No. 16-2023, published in Philosophical Studies

A simple argument suggests that we can fruitfully model advanced AI systems using expected utility theory. According to this argument, an agent will need to act as if maximising expected utility if they’re to avoid exploitation. Insofar as we should expect advanced AI to avoid exploitation, it follows that we should expected advanced AI to act as if maximising expected utility. I spell out this argument more carefully and demonstrate that it fails, but show that the manner of its failure is instructive: in exploring the argument, we gain insight into how to model advanced AI systems.

Other working papers

The asymmetry, uncertainty, and the long term – Teruji Thomas (Global Priorities Institute, Oxford University)

The Asymmetry is the view in population ethics that, while we ought to avoid creating additional bad lives, there is no requirement to create additional good ones. The question is how to embed this view in a complete normative theory, and in particular one that treats uncertainty in a plausible way. After reviewing…

Future Suffering and the Non-Identity Problem – Theron Pummer (University of St Andrews)

I present and explore a new version of the Person-Affecting View, according to which reasons to do an act depend wholly on what would be said for or against this act from the points of view of particular individuals. According to my view, (i) there is a morally requiring reason not to bring about lives insofar as they contain suffering (negative welfare), (ii) there is no morally requiring reason to bring about lives insofar as they contain happiness (positive welfare), but (iii) there is a permitting reason to bring about lives insofar as they…

Prediction: The long and the short of it – Antony Millner (University of California, Santa Barbara) and Daniel Heyen (ETH Zurich)

Commentators often lament forecasters’ inability to provide precise predictions of the long-run behaviour of complex economic and physical systems. Yet their concerns often conflate the presence of substantial long-run uncertainty with the need for long-run predictability; short-run predictions can partially substitute for long-run predictions if decision-makers can adjust their activities over time. …