Will AI Avoid Exploitation?

Adam Bales (Global Priorities Institute, University of Oxford)

GPI Working Paper No. 16-2023, published in Philosophical Studies

A simple argument suggests that we can fruitfully model advanced AI systems using expected utility theory. According to this argument, an agent will need to act as if maximising expected utility if they’re to avoid exploitation. Insofar as we should expect advanced AI to avoid exploitation, it follows that we should expected advanced AI to act as if maximising expected utility. I spell out this argument more carefully and demonstrate that it fails, but show that the manner of its failure is instructive: in exploring the argument, we gain insight into how to model advanced AI systems.

Other working papers

Can an evidentialist be risk-averse? – Hayden Wilkinson (Global Priorities Institute, University of Oxford)

Two key questions of normative decision theory are: 1) whether the probabilities relevant to decision theory are evidential or causal; and 2) whether agents should be risk-neutral, and so maximise the expected value of the outcome, or instead risk-averse (or otherwise sensitive to risk). These questions are typically thought to be independent – that our answer to one bears little on our answer to the other. …

The freedom of future people – Andreas T Schmidt (University of Groningen)

What happens to liberal political philosophy, if we consider not only the freedom of present but also future people? In this article, I explore the case for long-term liberalism: freedom should be a central goal, and we should often be particularly concerned with effects on long-term future distributions of freedom. I provide three arguments. First, liberals should be long-term liberals: liberal arguments to value freedom give us reason to be (particularly) concerned with future freedom…

Ethical Consumerism – Philip Trammell (Global Priorities Institute and Department of Economics, University of Oxford)

I study a static production economy in which consumers have not only preferences over their own consumption but also external, or “ethical”, preferences over the supply of each good. Though existing work on the implications of external preferences assumes price-taking, I show that ethical consumers generically prefer not to act even approximately as price-takers. I therefore introduce a near-Nash equilibrium concept that generalizes the near-Nash equilibria found in literature on strategic foundations of general equilibrium…