AI alignment vs AI ethical treatment: Ten challenges

Adam Bradley (Lingnan University) and Bradford Saad (Global Priorities Institute, University of Oxford)

GPI Working Paper No. 19-2024

A morally acceptable course of AI development should avoid two dangers: creating unaligned AI systems that pose a threat to humanity and mistreating AI systems that merit moral consideration in their own right. This paper argues these two dangers interact and that if we create AI systems that merit moral consideration, simultaneously avoiding both of these dangers would be extremely challenging. While our argument is straightforward and supported by a wide range of pretheoretical moral judgments, it has far-reaching moral implications for AI development. Although the most obvious way to avoid the tension between alignment and ethical treatment would be to avoid creating AI systems that merit moral consideration, this option may be unrealistic and is perhaps fleeting. So, we conclude by offering some suggestions for other ways of mitigating mistreatment risks associated with alignment.

Other working papers

A Fission Problem for Person-Affecting Views – Elliott Thornley (Global Priorities Institute, University of Oxford)

On person-affecting views in population ethics, the moral import of a person’s welfare depends on that person’s temporal or modal status. These views typically imply that – all else equal – we’re never required to create extra people, or to act in ways that increase the probability of extra people coming into existence. In this paper, I use Parfit-style fission cases to construct a dilemma for person-affecting views: either they forfeit their seeming-advantages and face fission analogues…

Should longtermists recommend hastening extinction rather than delaying it? – Richard Pettigrew (University of Bristol)

Longtermism is the view that the most urgent global priorities, and those to which we should devote the largest portion of our current resources, are those that focus on ensuring a long future for humanity, and perhaps sentient or intelligent life more generally, and improving the quality of those lives in that long future. The central argument for this conclusion is that, given a fixed amount of a resource that we are able to devote to global priorities, the longtermist’s favoured interventions have…

Philosophical considerations relevant to valuing continued human survival: Conceptual Analysis, Population Axiology, and Decision Theory – Andreas Mogensen (Global Priorities Institute, University of Oxford)

Many think that human extinction would be a catastrophic tragedy, and that we ought to do more to reduce extinction risk. There is less agreement on exactly why. If some catastrophe were to kill everyone, that would obviously be horrific. Still, many think the deaths of billions of people don’t exhaust what would be so terrible about extinction. After all, we can be confident that billions of people are going to die – many horribly and before their time – if humanity does not go extinct. …