AI alignment vs AI ethical treatment: Ten challenges

Adam Bradley (Lingnan University) and Bradford Saad (Global Priorities Institute, University of Oxford)

GPI Working Paper No. 19-2024

A morally acceptable course of AI development should avoid two dangers: creating unaligned AI systems that pose a threat to humanity and mistreating AI systems that merit moral consideration in their own right. This paper argues that these two dangers interact and that if we create AI systems that merit moral consideration, simultaneously avoiding both of these dangers would be extremely challenging. While our argument is straightforward and supported by a wide range of pretheoretical moral judgments, it has far-reaching moral implications for AI development. Although the most obvious way to avoid the tension between alignment and ethical treatment would be to avoid creating AI systems that merit moral consideration, this option may be unrealistic and is perhaps fleeting. So, we conclude by offering some suggestions for other ways of mitigating mistreatment risks associated with alignment.
