Will AI Avoid Exploitation?

Adam Bales (Global Priorities Institute, University of Oxford)

GPI Working Paper No. 16-2023, published in Philosophical Studies

A simple argument suggests that we can fruitfully model advanced AI systems using expected utility theory. According to this argument, an agent will need to act as if maximising expected utility if they’re to avoid exploitation. Insofar as we should expect advanced AI to avoid exploitation, it follows that we should expected advanced AI to act as if maximising expected utility. I spell out this argument more carefully and demonstrate that it fails, but show that the manner of its failure is instructive: in exploring the argument, we gain insight into how to model advanced AI systems.

Other working papers

‘The only ethical argument for positive 𝛿’? – Andreas Mogensen (Global Priorities Institute, Oxford University)

I consider whether a positive rate of pure intergenerational time preference is justifiable in terms of agent-relative moral reasons relating to partiality between generations, an idea I call ​discounting for kinship​. I respond to Parfit’s objections to discounting for kinship, but then highlight a number of apparent limitations of this…

The case for strong longtermism – Hilary Greaves and William MacAskill (Global Priorities Institute, University of Oxford)

A striking fact about the history of civilisation is just how early we are in it. There are 5000 years of recorded history behind us, but how many years are still to come? If we merely last as long as the typical mammalian species…

The end of economic growth? Unintended consequences of a declining population – Charles I. Jones (Stanford University)

In many models, economic growth is driven by people discovering new ideas. These models typically assume either a constant or growing population. However, in high income countries today, fertility is already below its replacement rate: women are having fewer than two children on average. It is a distinct possibility — highlighted in the recent book, Empty Planet — that global population will decline rather than stabilize in the long run. …