Andreas Mogensen | The good news about just saving
Global Priorities Seminar
ANDREAS MOGENSEN (00:10) So I take it that the central question in the theory of Intergenerational Justice can be posed roughly as follows. Most of the people who are ever going to live will probably live in our future, long after you and I are dead. Over them, we hold immense unchecked power. What would it mean for us to exercise that power justly? For concreteness, I'm going to focus on the question of how much we should save or invest on behalf of future generations. That was the question that principally concerned Rawls in his work on Intergenerational Justice. So our generation inherits from its forebears and from nature a range of goods and capital, and what we're endowed with will produce a range of goods and services. And the question we should ask ourselves is, how much of that can we justly take for ourselves? And how little can we justly leave for those who will come after? In addressing this question Rawls was, in large part, attempting to respond to the utilitarian tradition in growth economics, pioneered by Frank Ramsey in his 1928 paper, A Mathematical Theory of Saving. So given suitable simplifying assumptions, it can be shown that, absent pure time preference, an intergenerational utilitarian social welfare function is maximized only if the current generation saves the vast majority of total output. So Kenneth Arrow, in this 1996 working paper that I'm citing, discusses a model where the optimal rate of saving requires us to set aside at least two thirds of economic output on behalf of future people. Arrow describes this as an incredible and unacceptable strain on the present generation. Now, in spite of Arrow’s dismay, the view that the welfare of the vast multitude of potential future people should be the dominating concern of present people has been gaining ground with non-consequentialists, including philosophers at GPI.
(01:57) So in introducing his own just savings principle it's kind of notable that Rawls expresses this hope that we should be able to avoid this sort of extreme. So Rawls has this view where saving is required by justice only insofar as it's necessary to bring about a state of society where the material base is sufficient to establish effective just institutions and saving beyond that is not required. However, I think Rawls’ attempts to work out a theory of intergenerational justice are widely viewed as having failed. So the theory of just saving that he put forward in the first edition of A Theory of Justice is one that he himself would later reject and the revised view that he offers in Political Liberalism and in Justice as Fairness, whilst an improvement is, I think, somewhat difficult to interpret. And it's been argued to fail to support the kind of modest saving schedule that Rawls hoped to derive by [inaudible 02:50] in a 2003 paper in Philosophical Studies, I think.
(02:54) So my aim in this talk is to do my best to help to try to reinvigorate the study of Intergenerational Justice from the perspective of Rawlsian social contract theory, building in particular, on the work of the later Rawls. And so in particular, I want to try to address two key conceptual challenges that I think remain as yet unsolved. The first is to explain the motivation of the contracting parties with respect to whether and how much to save. And the second is to account for the significance of variable population outcomes. The thought here is that the choice of saving schedule across generations determines the size and/or composition of the total population of everyone who ever lived. I think these are basic philosophical issues that need to be addressed before you can really begin to derive a just saving principle, a concrete one in earnest. Only once all this is resolved, can we really say whether the demands of intergenerational justice are more like the modest demands that Rawls imagined or more like the exacting obligations that contemporary consequentialist longtermists want to promote?
(03:56) And so I'm just going to assume throughout this talk, what I take to be the core assumption of Rawlsian social contract theory stated using Rawlsian jargon. It's the idea that Justice in Ideal theory corresponds to what would be chosen in the Original Position from behind the Veil of Ignorance stated with somewhat less jargon, is that thought that principles of justice can be derived by asking what rational mutually disinterested persons would collectively choose as the norms governing their basic political institutions when deprived of knowledge about how they individually would fare under the terms of their collective agreements, but know that everyone will adhere to the agreement. Now, that assumption is obviously highly controversial. It's alleged shortcomings are legion and my aim is not going to be to try to defend social contract theory as a general political philosophy, because that will take, you know, a completely separate series of talks.
(04:45) My aim is just going to be to try to work out some foundational questions that are relevant to its application to the problem of intergenerational justice, without which I think a final accounting would be premature. And I'm going to start by just summarizing Rawls’ own thinking on this issue and its development over the course of his career, because as I said, I want to try to build on the view of the later Rawls. So let's see how we got there. So this issue of justice between generations is addressed in sections 44 to 46 of A Theory of Justice. And Rawls acknowledges upfront that the attempt to specify a Just Saving Principle is going to pose significant problems for his contractualist framework. And in some sense, the most straightforward and the deepest obstacle arises just from the fact that, on the one hand, Rawls imagines the parties in this Original Position as attempting to win for themselves the highest index of primary social goods. In addition, he stipulates that they know that they all belong to the same generation, the so-called present time of entry interpretation of the contract. And from these assumptions alone, it would seem to follow that nothing will be saved. So Rawls elaborates, he says, “unless they care, at least for their immediate successors, there's no reason for them to agree to undertake any saving whatever. Either earlier generations have saved or they have not. There is nothing the parties can do to affect it.” Now, to avoid this implication of a zero rate of saving, we could try to remove the present time of entry interpretation. We could remove this assumption that the parties know that they all belong to the same generation and we could try instead to imagine a contract between generations. But Rawls does reject that approach. He says that to think of the Original Position as a general assembly which includes at one moment everyone who will live at some time would stretch fantasy too far.
Now, he doesn't really elaborate on why he thinks the imagination will be stretched beyond breaking. But nonetheless, I think grounds for thinking that this idea of a contract between generations is philosophically suspect are ready to hand. So note, in particular, that we might think any principles agreed upon by an assembly of generations will affect the number and composition of the total population of everyone who will ever live. But in that case, who exactly is it that is being placed behind the Veil of Ignorance and asked to choose principles?
(07:05) But we can't coherently conceive of an assembly of individuals choosing among options, if the choice of some of those options would mean that some of those people never existed in the first place. On the other hand, there might be no one beyond the members of the first generation, whose existence does not depend at all on what schedule of saving might be chosen by this assembly of generations. But of course, if they're the only people behind the Veil of Ignorance, then we don't have a contract between generations. So rather than appealing to this idea of a contract between everyone who will ever live, Rawls originally tried to avoid the implication of zero rate of saving by retaining the present time of entry interpretation, but rejecting the assumption that the parties are just trying to maximize their own holdings of social primary goods. So he says that it “seems best to preserve the present time of entry interpretation and therefore to adjust the motivation condition. The parties are regarded as representing family lines, say, with ties of sentiment between successive generations.” And so those ties of sentiments are supposed to ensure that the members of the present generation want to save something on behalf of the people who come after.
(08:10) Now this solution is widely viewed as a failure. To highlight only the most obvious problem, the appeal to ties of sentiment seems to be an ad hoc adjustment of the motivation condition. And in his later work Rawls would suggest that the just savings principle is to be chosen on a different basis. It's to be chosen without adjusting the motivation condition, but now with the constraint that the parties must want what they choose to have been chosen or to have been observed by all previous generations. So he puts it as follows. He says “the correct principle is that which the member of any generation (and so all generations) would adopt as the one their generation is to follow and as the principle that they would want preceding generations to have followed (and later generations to follow), no matter how far back (or forward) in time.” And Rawls notes that a view of this kind was first suggested in a 1976 paper by Jane English. English, in that paper, argues that Rawls was simply mistaken to suppose that the assumption of self-interest posed an insuperable obstacle to the derivation of a positive rate of saving, even given the present time of entry interpretation. English argues that, since the derivation of a just saving principle forms part of ideal theory, “the choosers in the original position should assume that other generations save according to just principles too. Then selecting a saving principle would not be contrary to their self-interest.”
(09:35) So I think that while the new foundation that's proposed for the just savings principle avoids the early Rawls’ ad hoc reliance on ties of sentiment, it arguably sacrifices the nice clarity and straightforwardness of the original. I think it's not altogether clear how we're now to conceive of the choices that are to be made by the parties in the original position with respect to whether and how much to save. We’re to think of them as choosing a rate of saving that they will themselves adopt, subject to this constraint that they must want previous generations to have adopted it as well. But how exactly do we conceive of this constraint and its role in shaping the choices of the parties? So one possibility we could consider is to imagine that the choice of a just savings principle is now conceived as a kind of two-step procedure. We might think that in the first step, the menu of principles that's going to be part of the second stage is first filtered based on whether saving at the indicated rate is something the parties would want previous generations to have followed. In the second step, the parties are then presented with that filtered menu of just saving principles, from among which they choose self-interestedly. But I'm not entirely clear how to fill out the details of this picture. For example, what sense of want is at issue at the first stage? Should we think that this initial filtering is picking out the rate of saving that the parties would most prefer previous generations to have followed? If that's the case, won’t only a very high rate of saving be left on the menu, implying implausibly strict demands for accumulation on behalf of future people? Furthermore, we can ask what justifies the application of this kind of filter or this kind of constraint, given there’s no similar filter in Rawls’ theory that's imposed on choices of intragenerational principles.
So isn't this constraint, this filter as we might conceive of it, just another ad hoc modification of the contract framework?
(11:28) In a 2009 paper Attas suggests that this filter or this constraint can be justified by appeal to the formal universality constraint otherwise imposed by Rawls on Principles of Justice. So outlining that constraint Rawls says that principles of justice must hold for everyone in virtue of their being moral persons. Moreover, a principle is ruled out if it would be self-contradictory, or self-defeating, for everyone to act upon it. Similarly, should a principle be reasonable to follow only when others conform to a different one, it is also inadmissible. However, it's not at all clear that that universality constraint is going to rule out, say, a principle which requires no saving on the part of the present generation, although that's precisely what the kind of filter or constraint that we're imagining is designed to eliminate. So consider the constraint that a principle may not be chosen if it would be reasonable to act on it only if others conform to a different one. That, I think, does not rule out the choice of a zero rate of saving. After all, Rawls’ original argument that self-interested agents will choose this principle was a dominance argument. The thought was that whatever level of saving previous generations had chosen, it is better for us not to save.
(12:42) Another set of problems arises from the fact that the social rate of saving is plausibly viewed as identity-affecting, i.e. it changes which persons will come to exist. So the thought here, as I've already expressed, is that depending on what rate of saving is followed in previous generations, the people in those generations would conceive children at different times, sometimes or often with different partners. And after suitably many generations, the populations associated with different candidate just saving principles might therefore well have no individuals in common. Taking this into account, would the parties not therefore want previous generations to have followed whatever schedule of saving behaviour is necessary for them to be born at all? For example, if the menu of candidate principles includes an option that prescribes whatever pattern of saving behaviour actually obtains, we might think the parties would want previous generations to have acted in accordance with this principle, since they would otherwise almost certainly not exist. On the other hand, it seems kind of absurd to suppose that whatever pattern of saving behaviour actually obtained is ipso facto just. Now, it could be said that we should imagine the parties as deciding what savings schedule they would want previous generations to have followed on the assumption that they would have existed regardless of how much previous generations had saved. However, that assumption is false for the reasons I've just noted, and Rawls understandably insists that the parties “should not reason from false premises.” So I think a different solution is required.
(14:20) So, in my view, the right way of explaining the motivation of the contracting parties with respect to whether and how much to save, and the way to solve the problems I've just noted, is to assume that the parties in the original position obey evidential decision theory, as opposed to causal decision theory. So in this part of the talk, I'm going to explain the nature of this contrast. This is mainly intended for people in political philosophy who might not have very much prior familiarity with decision theory. Those of you who do have prior knowledge of contemporary controversies in decision theory might prefer to mentally switch off for about five minutes or so. In any case, here it goes. So very roughly, EDT instructs rational agents to choose the option they would most like to learn they will choose, whereas CDT instructs them to choose the act that can be expected to cause the best outcome. That's just a very rough gloss. So to illustrate, let's consider the classic Newcomb problem. We'll imagine that a highly reliable predictor has placed either $0 or $1,000,000 in an opaque box. And that box is presented to you alongside a transparent box. And you can see what the transparent box contains. It contains $1,000. Here are your options: you can either take both boxes, we'll call this ‘two boxing’, or you can take only the opaque box. So the catch here is that the predictor has placed $1,000,000 in the opaque box if and only if she predicted that you would ‘one box’, that you would only take the opaque box. So choosing to ‘one box’ does not causally affect whether the opaque box contains $0 or $1,000,000; the predictor has already put the money in there or not. The [inaudible 15:56] to one box nonetheless provides very strong evidence that you are now a millionaire.
For this reason EDT may recommend that you take this option and in particular, it will recommend that you ‘one box’ in the situation where you are certain that the predictor has correctly predicted your behaviour.
(16:13) Causal decision theory by contrast is going to recommend that you ‘two box’ in the Newcomb Problem. On the standard interpretation, it does so even when you're certain that the predictor has correctly predicted what you will do. As we recall CDT instructs you to choose the act that can be expected to cause the best outcome. In the Newcomb problem that means you ‘two box’. That leaves you $1,000 richer regardless of whether the opaque box contains zero or a million dollars, and there is nothing that you can do now to affect its contents. So the EDT/CDT contrast is therefore in part a disagreement about when dominance reasoning is appropriate. It's clear that dominance reasoning can sometimes be used in ways that should be considered pathological. For example, I shouldn't reason as follows: either I will develop lung cancer or I won't. Well, if I'm going to develop lung cancer anyway, I would prefer to smoke and if I'm not going to develop lung cancer ever, then I also prefer to smoke. So being a smoker is preferable no matter what and I should start immediately. There's something wrong with this line of argument and intuitively the problem is that whether or not I'll develop lung cancer is not independent of whether I'm now going to choose to start smoking. But, independent in what sense?
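As an illustration that is not part of the talk, the Newcomb reasoning described above can be sketched numerically. The dollar amounts are those given in the talk; the predictor accuracy of 0.99 and the unconditional prior of 0.5 are assumptions chosen only for the example, and the function names are hypothetical.

```python
# Toy Newcomb problem: EDT weights outcomes by Pr(state | act),
# CDT weights them by the unconditional Pr(state).
# Accuracy and prior values below are illustrative assumptions.

MILLION = 1_000_000
THOUSAND = 1_000
ACTS = ("one_box", "two_box")


def payoff(act: str, box_full: bool) -> int:
    """Money received for an act, given what the opaque box contains."""
    opaque = MILLION if box_full else 0
    return opaque + (THOUSAND if act == "two_box" else 0)


def edt_value(act: str, accuracy: float) -> float:
    """Expected payoff using the conditional probability Pr(box full | act)."""
    p_full = accuracy if act == "one_box" else 1.0 - accuracy
    return p_full * payoff(act, True) + (1.0 - p_full) * payoff(act, False)


def cdt_value(act: str, prior_full: float) -> float:
    """Expected payoff using the unconditional probability Pr(box full)."""
    return prior_full * payoff(act, True) + (1.0 - prior_full) * payoff(act, False)


edt_choice = max(ACTS, key=lambda a: edt_value(a, accuracy=0.99))
cdt_choice = max(ACTS, key=lambda a: cdt_value(a, prior_full=0.5))
print("EDT recommends:", edt_choice)
print("CDT recommends:", cdt_choice)
```

Whatever unconditional prior is supplied, the CDT value of two-boxing exceeds that of one-boxing by exactly $1,000, which is the dominance reasoning in numerical form.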
(17:21) According to CDT, it's causal independence that matters. Dominance reasoning is appropriate if and only if the relevant states are individuated so that which state is actual is causally independent of my choice. According to EDT by contrast, it's evidential independence that matters. Dominance reasoning is appropriate if and only if the relevant states are individuated so that my choice provides no evidence about which state is actual. We can bring out the contrast with greater formal precision by stating a formalization of EDT and CDT using a framework proposed originally by Lewis. So we have S as our set of possible worlds. We have capital A as the set of actions available to the agent, and we introduce this idea of a K-partition of S as being a partition of S such that each cell in the partition fully specifies the causal dependence of each possible outcome on each act in the set of acts. Then if we have such a K-partition of S, if Pr is the agent’s rational credence function and u is her utility function, then an act a is rationally permissible according to EDT if and only if there is no other act a’ such that this inequality here is satisfied.
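The inequalities shown on the slides are not reproduced in the transcript. On a standard Lewis-style formulation they would presumably read roughly as follows, with K ranging over the cells of the K-partition; the notation u(a ∧ K) for the utility of performing a in cell K is an assumption, not taken from the talk. An act a is impermissible when some alternative a' satisfies the relevant inequality.

```latex
% EDT: weight each cell by its probability conditional on the act
\sum_{K} \Pr(K \mid a')\, u(a' \wedge K) \;>\; \sum_{K} \Pr(K \mid a)\, u(a \wedge K)

% CDT: weight each cell by its unconditional probability
\sum_{K} \Pr(K)\, u(a' \wedge K) \;>\; \sum_{K} \Pr(K)\, u(a \wedge K)
```

The two conditions differ only in whether the probability of each cell is conditioned on the act being evaluated.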
(18:33) However, the same act is rationally permissible according to CDT if and only if there's no a’ such that this slightly different inequality is satisfied. The key thing to note here is that CDT is asking us to compute expected utilities using the unconditional prior probabilities of the cells in the partition, whereas EDT is asking you to compute the expected utilities using the relevant conditional probabilities. Now, the fact that EDT and CDT are typically formalized using expected utility theory may suggest to some that the distinction between these theories only matters in contexts in which the agent can be assumed to act on the basis of a unique probability assignment relative to which they maximize expected utility. And of course, Rawlsians are adamant that that's not how one should decide in the Original Position. However, the EDT/CDT distinction has a more general significance and it needn't make that kind of assumption about the agent's behaviour, quite apart from the point that we've already noted, that the different theories support different kinds of dominance reasoning.
(19:34) The obvious way to see this is as follows: we could consider agents whose beliefs are modeled not by a single probability function, but rather by a set of probability functions, a spread of credence, a so-called representor, corresponding to the different, say, chance hypotheses or chance functions that are left open by the evidence. And we can assume, furthermore, that these agents act in accordance with the so-called MMEU or Maximin EU principle, which instructs them to choose the option with the greatest minimum expected utility relative to the probability functions in their representor. In order to compute that minimum expected utility for any given action, well, you can use either CDT or EDT, depending on whether the expectation is calculated using unconditional prior probabilities or conditional probabilities. So the proposal that the EDT/CDT distinction may be relevant in thinking about what the parties would choose in the Original Position is not going to ask us to assume that the parties are maximizing expected utility relative to a unique probability distribution. It's compatible with the thought that the doxastic states of the parties should adequately reflect the very deep uncertainty under which they must choose, and with the assumption that they will choose in a manner expressive of the exercise of great caution.
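To make the point concrete, here is a minimal sketch (not from the talk) of the MMEU rule applied with either conditional or unconditional probabilities. The `Credence` class, the function names, and all the numbers in the two-member representor are hypothetical; the Newcomb payoffs are reused only because they were introduced earlier.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Credence:
    """One probability function in the representor (illustrative structure)."""
    prior: Dict[str, float]                   # Pr(state)
    conditional: Dict[str, Dict[str, float]]  # Pr(state | act)


def expected_utility(act: str, cred: Credence,
                     utility: Dict[Tuple[str, str], float], theory: str) -> float:
    """EDT uses Pr(state | act); CDT uses the unconditional Pr(state)."""
    probs = cred.conditional[act] if theory == "EDT" else cred.prior
    return sum(p * utility[(act, state)] for state, p in probs.items())


def mmeu_choice(acts: List[str], representor: List[Credence],
                utility: Dict[Tuple[str, str], float], theory: str) -> str:
    """Pick the act whose minimum expected utility over the representor is greatest."""
    def worst_case(act: str) -> float:
        return min(expected_utility(act, c, utility, theory) for c in representor)
    return max(acts, key=worst_case)


def newcomb_credence(accuracy: float, prior_full: float) -> Credence:
    """Build one credence function for the Newcomb case (assumed parameters)."""
    return Credence(
        prior={"full": prior_full, "empty": 1 - prior_full},
        conditional={
            "one_box": {"full": accuracy, "empty": 1 - accuracy},
            "two_box": {"full": 1 - accuracy, "empty": accuracy},
        },
    )


UTILITY = {
    ("one_box", "full"): 1_000_000, ("one_box", "empty"): 0,
    ("two_box", "full"): 1_001_000, ("two_box", "empty"): 1_000,
}
# A two-member representor: the agent is unsure how reliable the predictor is.
REPRESENTOR = [newcomb_credence(0.9, 0.3), newcomb_credence(0.99, 0.7)]
ACTS = ["one_box", "two_box"]

edt_pick = mmeu_choice(ACTS, REPRESENTOR, UTILITY, "EDT")
cdt_pick = mmeu_choice(ACTS, REPRESENTOR, UTILITY, "CDT")
print(edt_pick, cdt_pick)
```

The same maximin rule yields different recommendations depending only on which probabilities feed the expectation, so the contrast does not presuppose a unique probability assignment.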
(20:46) Okay, so I'm now going to talk about the significance of the EDT/CDT distinction in the context of intergenerational justice. So the decision theorists who mentally switched off for five minutes, you can rejoin us now. And I think in order to grasp why this matters, it helps us to return to a comment that was made by Jane English, which I quoted earlier. So English says the choosers in the original position should assume that other generations save according to just principles too. Then selecting a saving principle would not be contrary to their self-interest. There seems to be some logic in this, but how are we to square English’s remarks with Rawls’ early argument that self-interested agents wouldn't save anything? As he says, either earlier generations have saved or they have not. There's nothing the parties can do to affect it. That seemed kind of equally compelling. Well, I think the key here is to note that the argument that is given by Rawls here implicitly assumes CDT. So Rawls notes that the parties are better off not saving, regardless of how much or how little previous generations have saved, and since the savings behaviour of previous generations is causally independent of present actions, that kind of dominance reasoning is perfectly acceptable given CDT, but not given EDT. According to EDT, we can use dominance reasoning only if the relevant states are probabilistically independent of present acts, only if the actions available to us provide no evidence as to which state is actual. And I think we can, in effect, read English as highlighting that that assumption does not hold in the present case.
Even if there's nothing that we can do now to affect the savings behaviour of previous generations, our choice of savings principle from behind the Veil of Ignorance is going to provide evidence about the savings behaviour of our ancestors, because we’re working within the framework of Rawlsian ideal theory and that allows us to assume that previous generations will have acted justly, and thus will have obeyed whatever just savings principle we now select.
(22:48) So my proposal is that we're going to imagine the choice of just saving principle from behind the Veil of Ignorance as governed by EDT, and I think that serves to clarify and precisify the idea that the correct principle is that which the members of any generation would adopt as the one their generation is to follow, and as the principle they would want preceding generations to have followed. So assuming that the parties obey EDT, I think, provides a nice, straightforward and concrete interpretation of that idea. Earlier we considered this proposal that maybe we should have this constraint, which acts as a filter on the menu of Just Savings Principles in a kind of two-step process. And I think that just, you know, raised more questions than it answered. I think we now have a more plausible and transparent proposal to hand. The thought is that in choosing Just Savings Principles in the context of Rawlsian ideal theory, the parties are to take into account the fact that their choice provides evidence about the savings behaviour of past people. Since they obey EDT, that disincentivizes the choice of a low rate of saving, because that choice would provide evidence that relatively less must have been saved for them by prior generations.
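The following toy model is not from the talk; it is a deliberately crude sketch of how the two decision theories come apart on the choice of a savings rate. Everything in it is an assumption made for illustration: utility is identified with consumption, inherited output is a base amount plus a fixed return on what the previous generation saved, and under ideal theory the chosen rate is treated as certain evidence that predecessors saved at that same rate.

```python
# Hypothetical toy model of one generation choosing a savings rate.
BASE_OUTPUT = 1.0      # assumed output available with no inherited saving
RETURN_ON_SAVING = 4.0 # assumed extra output per unit saved by predecessors
RATES = [i / 40 for i in range(41)]  # candidate savings rates from 0 to 1


def consumption(own_rate: float, previous_rate: float) -> float:
    """What the present generation consumes after saving at own_rate."""
    output = BASE_OUTPUT + RETURN_ON_SAVING * previous_rate
    return (1.0 - own_rate) * output


def cdt_best_rate(previous_rate: float) -> float:
    """CDT-style choice: hold the predecessors' rate fixed, whatever it was."""
    return max(RATES, key=lambda s: consumption(s, previous_rate))


def edt_best_rate() -> float:
    """EDT-style choice: choosing rate s is evidence predecessors also chose s."""
    return max(RATES, key=lambda s: consumption(s, s))


# Dominance: for every fixed past rate, saving nothing is best under CDT.
cdt_rates = {prev: cdt_best_rate(prev) for prev in (0.0, 0.25, 0.5, 0.75)}
edt_rate = edt_best_rate()
print("CDT best rates by assumed past rate:", cdt_rates)
print("EDT best rate:", edt_rate)
```

In this sketch the EDT chooser settles on a positive rate because a lower rate would be bad news about what was inherited, while the CDT chooser saves nothing regardless of what the past is assumed to have been. The particular rate that comes out depends entirely on the assumed return and carries no weight as an estimate of a just savings rate.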
(23:55) So the parties are going to have to try to balance the costs that they expect to bear from the choice of a higher rate of saving against the positive news value that's provided by a choice of this kind, in relation to the savings behaviour of previous people. Now exactly how they achieve that balancing act, that's going to depend on exactly what kind of decision principle they follow. I also think that when it comes to the identity-affecting character of past savings behaviour, the interpretation I'm suggesting helps us to overcome the troubles we encountered earlier. So the problem I noted earlier was that the parties might well want previous generations to have adhered to whatever idiosyncratic saving schedule is necessary for them to exist in the first place. But suppose we instead interpret the constraint that Rawls introduces along the lines suggested here. So we're imagining that the parties obey EDT and they're choosing a Just Savings Principle according to the news value of that decision.
(24:50) Now, while their existence may well depend causally on what savings schedule or what savings behaviour past people got up to, I think nothing they could learn about the behaviour of past people through their choice of any particular savings rate is going to be capable of raising or lowering their confidence that they're currently alive, because their existence is presumably already certain. So the causal dependence of their existence on some particular schedule of savings behaviour is irrelevant to their decision and it drops out of consideration. Now, some of you are going to think that it's just undesirable for a contractarian theory of intergenerational justice to rely on EDT as a decision theory. And in particular, you might worry that this appeal to EDT to solve our problems is just another ad hoc modification of the contract framework to try to deal with problems of intergenerational justice. I, on the other hand, think that there are actually good reasons to believe that Rawlsian social contract theorists have all along viewed the parties as obeying EDT, setting aside entirely issues related to intergenerational justice, and I'll explain why in this section of the talk.
(25:58) So we ought to keep in mind that the parties in the original position are to strike a contract with one another. Unlike for Harsanyi, there's not one person behind the Veil of Ignorance who gets to choose dictatorially; rather, we imagine that there are many different people. Rawls insists that although utilitarianism extends to social choice the principle of rational choice for just one agent, Justice as Fairness represents social choice as a contract between distinct persons. And by representing the original position in that way, it reflects the fact that a plurality of distinct people with different ends is an essential feature of human societies. Now, whenever there are multiple people who need to come to a joint decision, it's very natural to ask what would happen if they didn't agree. Now we might insist, as Rawls does, that agreement is to be unanimous. But then what happens if the parties don't unanimously agree? Do they wait a bit and try again? Do they keep going until they reach a unanimous decision? If so, are there any costs to delay? Or could they choose to delay indefinitely? Or maybe there are no second chances. Maybe if they don't unanimously agree, they're all ejected into a state of nature to enjoy lives that are nasty, brutish, and short.
(27:13) Now I expect some of you may have grown impatient with this line of questioning and I admit I probably hammed it up a bit, but in any case, what I've just been saying might seem to rest on a basic misunderstanding of how Rawls constructs the Original Position, because you might think that, because they're placed behind the Veil of Ignorance, the parties are identically situated. They're completely ignorant of their own idiosyncrasies, including their own particular conception of the good. Therefore, they can expect that what they choose will be chosen by everyone else. They can therefore ignore the possibility of disagreement, and we can ignore the possibility of disagreement in modeling how they will choose. Now I think all this sounds plausible enough, but I'm now going to suggest that we can't infer that the parties can ignore the possibility of disagreement, and that we can ignore that possibility in modeling how they'll choose, unless we assume that they obey EDT, at least if our only reason for ignoring that possibility is this idea that they can expect whatever they choose to be chosen by everyone else. So here's why.
(28:13) Let's assume that each person adopts a partition of the relevant state space and has beliefs about which element in this K-partition is actual, that are capable of being represented by a set of probability functions. And we're going to assume furthermore, that the parties choose in accordance with some decision principle that relies on the expected utility of the available acts relative to the probability functions in her representor. For example, the MaxminEU or MMEU principle. Since the parties all have the same beliefs and preferences, these assumptions, I think, are fully compatible with the idea that they're all going to choose similarly. Now, as I noted previously, CDT requires the agents to compute the expected utility of each available act by using the unconditional prior probability of each state, whereas EDT requires the agent to use the conditional probability.
(29:01) Now, given the construction of the Original Position, the probability that other parties will choose some principle P’ conditional on the assumption that the agent herself chooses P is going to be zero relative to every probability function in the agent’s representor. However, the unconditional prior probability that the other parties would choose P’ cannot, I think, be presumed without further argument to render the possibility of divergent choices negligible when expected utilities are calculated in accordance with CDT. So to see this in a bit more detail, let's consider an extremely simplified representation of the choice of principles in the Original Position. Let's imagine that there are only two persons there. We'll call them x and y. And there are only two principles from among which they can choose, which we call P and Q. And we'll imagine that the decision faced by x is represented by the decision matrix that you see on the screen. So the leftmost column here is indicating which options are available to x, and the uppermost row indicates the possible states, which correspond to choices that could be made by y.
(30:03) We're going to assume that x’s confidence about what y will choose can be represented by a set of probability functions R, and for the present we're not going to exclude the possibility that R is a singleton, nor the possibility that it's a multi-member set. So we let the agent’s confidence in each state be represented by a set-valued function CR whose values are sets of probabilities. And we're also going to assume that x knows that y is identically situated, such that the conditional probability that y chooses some principle, given that x has chosen it, is {1}, that is, 1 according to every probability function in R. I should also note that the values in the cells here in the center of the matrix indicate x's utilities for the different possible outcomes. So the thought is that when the two agents choose the same principle, then that principle gets to be implemented. And that gives x utility k if they both choose P and k’ if they both choose Q. I assume that if the two agents choose differently then something else would happen. We assume that the utility of this outcome is going to be the same regardless of how that divergence might come about. And I'll denote that utility value by λ. So the crucial question is going to be whether x can ignore the value of λ or, equivalently, whether we can ignore the value of λ in modeling how x will choose.
(31:23) And we're going to assume that the agent obeys the MMEU principle, which reduces to expected utility maximization if R is indeed a singleton. And I think it's easy to see that if you calculate expected utility according to EDT, then the analysis of the choice problem is pretty straightforward. x prefers the act of choosing P to the act of choosing Q just in case k > k’ and vice versa. It doesn't matter what value is assigned to λ. Given EDT, the minimum expected utility of choosing P is given by the formula 1(k) + 0(λ) and the minimum expected utility of Q is given similarly by 0(λ) + 1(k’). The important insight here is that the outcome in which the parties choose differently receives zero weight when the minimum expected utility of either option is calculated in this way, because we're relying on the conditional probabilities.
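The EDT calculation just described can be sketched in a few lines of Python. This is only an illustration of the two-person matrix from the talk: the function names and the string labels "P" and "Q" are hypothetical, and the sketch assumes, as stated above, that the conditional probability of y matching x's choice is 1 under every function in the representor, so the minimum over the representor is just the single value computed here.

```python
# Hypothetical encoding of the two-person decision matrix described in the talk.
# Rows are x's acts; columns are the states (what y chooses).

def utility(x_choice, y_choice, k, k_prime, lam):
    """x's utility: k if both choose P, k_prime if both choose Q, lam if they diverge."""
    if x_choice != y_choice:
        return lam
    return k if x_choice == "P" else k_prime


def edt_expected_utility(x_choice, k, k_prime, lam):
    """Expected utility of x's act using conditional probabilities.

    Assumption taken from the talk: y is identically situated, so
    Pr(y chooses c | x chooses c) = 1 for every probability function in R.
    """
    total = 0.0
    for y_choice in ("P", "Q"):
        cond_prob = 1.0 if y_choice == x_choice else 0.0
        total += cond_prob * utility(x_choice, y_choice, k, k_prime, lam)
    return total


def edt_prefers_p(k, k_prime, lam):
    """True when x strictly prefers choosing P to choosing Q under EDT."""
    return edt_expected_utility("P", k, k_prime, lam) > edt_expected_utility("Q", k, k_prime, lam)
```

Calling `edt_prefers_p(3, 2, lam)` with very different values of `lam` gives the same answer each time, which is the point made above: the divergent outcome gets zero weight.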
(32:16) By contrast, if expected utility is calculated in accordance with CDT, then the analysis becomes a good deal more complicated and I think there's now no guarantee that we can ignore the value of λ. Well, the reason it becomes more complicated is that we now need to know how to represent x's unconditional prior confidence concerning what y will choose. One key difficulty, of course, is that this is a strategic interaction. So x’s beliefs about what y will choose may depend on what x believes that y believes that x will choose, but what y believes that x will choose will depend on what y thinks that x believes that y believes that x will choose, and so on. The problem of how to determine a rational agent’s prior probability distribution over the other players’ strategies in these strategic interactions is, so far as I know, a vexed problem in the philosophical foundations of game theory that I am not going to be able to solve.
(33:09) But there are some specific assumptions that could be made about x's confidence that would show that λ is irrelevant. So for example, if there were some way to show that x is rationally required to adopt a uniform probability distribution over the possible states, then we could happily ignore the value of λ, because in that case, the minimum expected utility of choosing P would be 0.5(k) + 0.5(λ), and the minimum expected utility of choosing Q would be 0.5(λ) + 0.5(k’). So since λ would contribute equally to the minimum expected utility of either option, its value would be irrelevant in that comparison. The question, of course, would be how to justify the imposition of a uniform probability distribution.
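To make the cancellation explicit, the comparison under a uniform distribution can be written out as a difference. The notation EU_CDT is introduced here only for illustration; the two expected utilities are the ones stated in the talk.

```latex
\[
\mathrm{EU}_{\mathrm{CDT}}(P) - \mathrm{EU}_{\mathrm{CDT}}(Q)
  = \bigl(0.5\,k + 0.5\,\lambda\bigr) - \bigl(0.5\,\lambda + 0.5\,k'\bigr)
  = 0.5\,(k - k')
\]
```

The λ terms cancel, so the sign of the difference depends only on whether k exceeds k’.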
(33:51) Now we could instead suppose that x's unconditional confidence about what y will choose should be represented by a multi-membered set of probability functions as opposed to a singleton set. And we can assume that x's unconditional confidence that y chooses P corresponds to the closed interval [a, b], with x's confidence that y chooses Q instead corresponding to the closed interval [1 – b, 1 – a]. In this case, I also think we can't rule out that the value of λ matters without making specific assumptions. For example, if both k, k’ > λ, the minimum expected utility of choosing P is a(k) + (1 – a)(λ) and the minimum expected utility of choosing Q is b(λ) + (1 – b)(k’).
(34:36) The value of λ may then turn out to be relevant in the comparison unless we stipulate that b = 1 – a, i.e. unless [a, b] is symmetric about the midpoint of the unit interval. That actually isn't sufficient to fully justify ignoring λ. If the parties obey MMEU, we also have to exclude that [a, b] is the closed unit interval, because if it is the closed unit interval, then the value of λ could still matter to how x ought to choose. For example, in that case, if both k, k’ > λ then the two acts have the same minimum expected utility, whereas if k, k’ < λ, then x prefers P to Q just in case k > k’. So, I have not argued here that there's no way that you could justify ignoring divergent choices within a causalist interpretation of the choice situation. All I want us to conclude is that the easy confidence with which Rawlsian social contract theory assumes that the possibility of disagreement between the parties can be ignored in analyzing the decision problem they face is pretty difficult to justify without assuming that the parties obey EDT.
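The interval cases discussed above can be checked numerically with a small sketch. The function names are hypothetical, and the code assumes that x's unconditional probabilities that y chooses P fill out the whole interval [a, b]; since expected utility is linear in that probability, the minimum over the interval is reached at one of the two endpoints.

```python
def cdt_expected_utility(x_choice, p, k, k_prime, lam):
    """Expected utility of x's act, where p is the unconditional probability that y chooses P."""
    if x_choice == "P":
        return p * k + (1 - p) * lam
    return p * lam + (1 - p) * k_prime


def min_expected_utility(x_choice, a, b, k, k_prime, lam):
    """Minimum expected utility over p in [a, b].

    Expected utility is linear in p, so it suffices to check the endpoints.
    """
    return min(cdt_expected_utility(x_choice, p, k, k_prime, lam) for p in (a, b))


def mmeu_choice(a, b, k, k_prime, lam):
    """Return 'P', 'Q', or 'tie' according to the MMEU principle."""
    meu_p = min_expected_utility("P", a, b, k, k_prime, lam)
    meu_q = min_expected_utility("Q", a, b, k, k_prime, lam)
    if meu_p > meu_q:
        return "P"
    if meu_q > meu_p:
        return "Q"
    return "tie"
```

With the illustrative values k = 4 and k’ = 5, an asymmetric interval such as [0.25, 0.5] yields "Q" when λ = 0 but "P" when λ = 8, so λ matters. With the symmetric interval [0.25, 0.75] the answer is "Q" for both values of λ. With the unit interval [0, 1], λ = 0 produces a tie, which matches the observation that the two acts have the same minimum expected utility when both k and k’ exceed λ.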
(35:46) Okay. So I now want to show how assuming that the parties in the Original Position obey EDT can help to solve some other problems, in particular, how it can help to address what I think is an especially forceful challenge that arises when a contractarian framework for intergenerational justice is applied in conditions involving variable populations. So we're going to focus on a problem that's been posed by Johan Gustafsson. The question we should ask ourselves is something like this: on the basis of what we've argued so far about the best interpretation of the choice situation faced by the parties, has it actually been shown that the first generation will choose to save anything at all for posterity? You might well think that it has. From behind the Veil of Ignorance, the members of that first generation should think they could well be the members of some later-lived generation, who would be much better off if all previous generations had saved. Therefore, from behind the Veil of Ignorance, the members of the first generation have a strong incentive to choose a Just Savings Principle that prescribes a positive rate of saving, including a positive rate of saving for the very first generation. However, as Gustafsson points out, it's not so obvious that we can guarantee that the members of the first generation are ignorant of the generation to which they belong, because the choices that are available to them may indirectly tell them who they are.
(37:05) In particular, suppose that there is some option which is available to the members of the first generation, such that if it is selected as the Just Savings Principle, then it is the case that the first generation will consume everything, save nothing, and leave no descendants. In that case, we might think, the parties can infer that they must be the members of the first generation, because otherwise, this principle would not be available for them to choose. Furthermore, since this principle guarantees that the members of the first generation enjoy the highest possible index of social primary goods, they will select it. They will save nothing. Now, of course, it's very natural to object that if the availability of such a principle would indeed allow the members of the first generation to work out that they are the members of the first generation, then it should not be available to choose precisely for that reason. Its availability seems like it's just defeating the very point of the Veil of Ignorance. But I think that's actually pretty questionable. I think the availability of this principle does not allow the parties to distinguish themselves from other existing people. After all, if it's chosen, then they're the only existing people, and we've just argued that this principle, if available, will be chosen. But the Veil of Ignorance is supposed to exclude the parties from having knowledge that would distinguish them from other existing people and allow them to tailor the principles they propose or vote for in order to benefit themselves over others. It's not supposed to exclude them from knowing something about themselves which is true of everybody else who will ever live.
(38:40) An alternative solution would be to try to rule out the availability of this kind of principle, a principle whose choice means that the first generation saves nothing and consumes everything and leaves no descendants, by arguing that it violates the generality constraint that Rawls imposes on principles of justice. So that generality constraint says that it must be possible to formulate these principles without the use of what would be intuitively recognized as proper names or rigged definite descriptions. Now, any principle that explicitly mentions the members of the first generation in particular, would seem to violate this constraint. But the sort of principle that we are imagining needn't do that, it seems. It could well be framed using only general terms. For example, it could be a principle according to which every generation is to save nothing and leave no descendants. Now, fortunately, I think there's a different way in which this problem that I've just stated can be solved. And that relies on the assumption that the parties obey EDT.
(39:39) So I said earlier that if there's some option available to the parties in the Original Position, such that if it is selected as the Just Savings Principle, then it will have been the case that the first generation will have consumed everything, saved nothing and left no descendants, then the parties can infer that they must be the members of the first generation. But is that really the case? So as it turns out, I think, how we answer that question depends on whether the people in the Original Position are to think of their choices as causally efficacious, or as having what we might call ‘pure news value’. And I think it's only if they think of their choices as causally efficacious that this inference can be made. So suppose that the agent does think that by choosing P, she would make it the case that only the first generation exists, whereas by choosing P’, she would make it the case that at least two generations exist. In that case, I grant that she can reason as follows: the existence of the people who don't belong to the first generation depends causally on whether she selects P or P’, but her own existence cannot depend causally on whether she selects P or P’ because existence is a precondition for choosing. So she must be in the first generation.
(40:47) Suppose, on the other hand, that the agent does not think of her choice as being causally efficacious in any way, but merely as having news value. In other words, she thinks that by choosing P, she would get strong evidence that only the first generation exists, whereas by choosing P’, she would get strong evidence that at least two generations exist. But neither choice affects whether there exists only one generation or at least two generations. Her choice has pure news value. In that case, we can't reason as before. We can't say that, since the existence of later generations depends causally on the agent's choice, whereas her own existence doesn't, she's got to belong to the first generation. It's just not true that the existence of later generations depends causally on her choice, because nothing does. And nor, I think, can the agent reason to the same conclusion by an analogous inference that substitutes talk of evidential dependence for talk of causal dependence. In other words, I don't think she can reason that since her choice provides evidence about whether there exists either exactly one or at least two generations, whereas her choice can't provide her with evidence about whether she exists, this being already certain, then she should be certain that she's among the members of the first generation.
(41:58) So I conclude that if the agent thinks of her choice not as causally determining how many generations exist, but merely as providing evidence about how many generations exist, then she can't infer from her choice situation that she is in the first generation. She could, of course, infer that if she chooses the principle which makes it certain that only the first generation exists. But if she instead chooses the principle on which she should be certain that multiple generations exist, then she ought not to be certain that she's in the first generation. Now, of course, if the agent obeys CDT, then she's going to be indifferent between any options that don't causally affect anything that she cares about. So she might, if she thinks of her choices as not being causally efficacious, end up choosing P anyway. But if she obeys EDT, then she needn’t. Rather than learning that she belongs to that first impoverished generation, she may well prefer to learn that there exist multiple generations, many of which are much richer than the first generation, since conditional on having made this choice, she might well belong to those later, richer generations, for all she knows.
(43:03) Now, it could be said that it's one thing to assume that the parties obey evidential decision theory, and another to suppose that they view their choices as having pure news value, as only providing evidence about desirable or undesirable outcomes, and failing in any way to causally affect anything that they care about. But I think just as there are already strong grounds for supposing that Rawlsian social contract theory must interpret the parties as obeying EDT, there are also strong grounds for supposing that the theory must already have interpreted the parties as making choices among options that only have news value. So we can keep in mind that Rawls’ two principles serve primarily to regulate inequalities that are attached to people's starting places or their initial chances in life. Given that causes must precede their effects, that raises the question: how could the parties think of their choices as causally affecting their initial chances in life in any way? On its face, that would require the choice situation represented by the Original Position to take place at a point in time prior to the time of their birth. But there are some people who think the Original Position is supposed to be understood in this way. So Steven Pinker in The Blank Slate at one point discusses Rawls’ A Theory of Justice. And on his interpretation, the parties are ghosts ignorant of the machines they will haunt, deciding on what society to be born into. I think Rawls never presents the Original Position in this way, and many of his remarks are quite difficult to square with this sort of interpretation. For example, he's clear that we shouldn't think of the Original Position as representing any particular moment in time. He says it's something we should be able to enter at any time simply by reasoning for principles of justice, subject to the enumerated restrictions on information.
And a view on which adopting the perspective of the Original Position would require us to think of ourselves as disembodied souls would leave Rawlsians open to the communitarian charge that justice as fairness rests on a view of the self as metaphysically prior to those characteristics of which the parties in the Original Position are ignorant, something that Rawls explicitly wants to avoid.
(45:07) Nonetheless, Pinker's interpretation is hardly unmotivated. If we think of the parties as choosing in light of what they expect will have the best causal effects, and as choosing among principles in light of how they're expected to affect their own initial chances in life, then we are very naturally led to an interpretation of the Original Position as an assembly of disembodied souls who are waiting to be born. What alternative could we have? Well, I think, EDT. We can think of the parties simply as ordinary people who are placed in conditions of ignorance, including ignorance about what initial life chances they may have had. So these people may indeed enter the Original Position at any time just by reasoning from the enumerated restrictions on information. Now although we can't causally affect the past, we can learn about it. And we can, in principle, learn about it through our own present actions, as we do in the Newcomb problem, where we learn what was predicted. Hence, although the parties in the Original Position cannot causally affect their starting places in life, within the constraints of Rawlsian ideal theory, their choices can provide them with evidence about the character of those starting places. In relation to their starting position in life, their choices have pure news value, but some have better news value than others. And that should move the parties to prefer some of those choices over others if they obey EDT.
(46:23) Now, there is admittedly another way in which we could try to avoid asking us to imagine ourselves as disembodied souls, which Rawls may seem to have preferred, and that's to suppose that the individuals in the Original Position aren't actually existing people, but rather fictional representatives of those people. So it's not you or I who must choose from behind the Veil of Ignorance subject to conditions of ignorance, but fictional people who act as trustees for our interests. This seems to be how Rawls describes the Original Position in his later work. So in Political Liberalism, he tells us that the parties are to be seen as representatives of free and equal citizens. Now, exactly how to interpret these remarks about representatives is not altogether clear, I think. But it is natural to infer that we're now being asked to think that the parties are no longer conceived as being the free and equal citizens themselves. So we might therefore imagine that the parties in the Original Position are instead guardian angels, who exist prior to the point of our birth, and who may conceive of their choices as causally efficacious with respect to our starting places in society.
(47:25) Now, I think the solution I've proposed is preferable, in large part because it doesn't require us to reject what I've otherwise characterized as the core assumption of Rawlsian social contract theory. That's the idea that principles of justice can be derived by asking what rational mutually disinterested persons would collectively choose as the norms governing their basic social institutions, when they are deprived of knowledge about how they personally would fare under the terms of their collective agreements. I think it's striking that that assumption is affirmed in Political Liberalism almost immediately before it's apparently contradicted by the suggestion that the parties who decide in the Original Position aren't, in fact, the people who have to live under the terms of their collective agreement. So Rawls says that the “fair terms of social cooperation are conceived as agreed to by those engaged in it, that is, by free and equal citizens who are born into the society in which they live their lives.” Now, it may not seem to really make much difference in the grand scheme of things whether we say principles of justice can be derived by asking what rational mutually disinterested persons would themselves collectively choose, as opposed to what would be chosen by representative guardian angels. But I think it's not totally immaterial, if our aim is to outline fair terms of cooperation among citizens who are conceived as both equal and free.
(48:44) If the citizens are not imagined as choosing for themselves and instead as having their guardian angels choose on their behalf, then the Original Position is less apt to represent citizens as free beings. It's no surprise, therefore, that Rawls can't wholly give up on the idea that it's you or I who have to be imagined as entering into a contract with one another in the Original Position, as opposed to a gathering of our minders. So I conclude that insofar as Rawlsian social contract theory is a plausible framework for doing political philosophy whose implications for the study of intergenerational justice are worth taking seriously, there can't be strong objections to working out those implications by assuming that the parties are obeying EDT, and that they're choosing among options that just have news value, pure news value. As I've already argued, there are good grounds for thinking that Rawlsian social contract theory already presupposes that the parties are deciding in that way. Furthermore, for the reasons I've argued, this assumption that they decide in this way seems to help us to solve what might otherwise seem to be insuperable obstacles to the development of a plausible Rawlsian theory of just saving. And that's it.
Thank you very much.