Two envelopes problem
Imagine you are given two identical envelopes, each containing money. One contains twice as much as the other. You may pick one envelope and keep the money it contains. Having chosen an envelope at will, but before inspecting it, you are given the chance to switch envelopes. Should you switch?
Let x be the amount in one envelope and 2x be the amount in the other envelope.
Let y be the amount in my envelope and z be the amount in the other envelope.
[math]P(y = x) = P(z = 2y) = {1\over2}\\P(y = 2x) = P(z = {y\over2}) = {1\over2}[/math]
The expected value of z is then:
[math]E(z) = {1\over2}2y + {1\over2}({y\over2}) = {5\over4}y[/math]
This would suggest that it's rational to switch. The paradox is that we can then apply the same reasoning again, which suggests that we should switch back. And so on ad infinitum.
However, let's assume any two values that satisfy the rules for the sake of argument; let's assume that one envelope contains £10 and the other envelope contains £20.
[math]P(y = 10) = P(z = 2y) = {1\over2}\\P(y = 20) = P(z = {y\over2}) = {1\over2}[/math]
The expected value of z is allegedly:
[math]E(z) = {1\over2}2y + {1\over2}({y\over2}) = {5\over4}y[/math]
But this commits a mathematical fallacy, conflating three different values of y.
[math]E(z) = {1\over2}(2\cdot10) + {1\over2}({20\over2}) = {5\over4}12[/math]
And my envelope doesn't contain £12, assuming that one envelope contains £10 and the other envelope contains £20.
On the other hand, if we open our envelope first and find £10 (not knowing the value of the other envelope):
[math]P(y = x|y = 10) = P(z = 2y|y = 10) = {1\over2}\\P(y = 2x|y = 10) = P(z = {y\over2}|y = 10) = {1\over2}[/math]
[math]E(z) = {1\over2}(2\cdot10) + {1\over2}({10\over2}) = {5\over4}10[/math]
Prima facie there is no fallacy with E(z), which does suggest switching.
However, in the first case y = x and in the second case y = 2x, and so if in both cases y = 10 then in the first case x = 10 and in the second case x = 5. The calculation covertly conflates two different values of x, which again shows that it commits a fallacy.
The paradox has nothing to do with probability at all. It's just an improper use of variables when calculating the expected value.
E(z) is properly represented as:
[math]\begin{aligned}y_a &= {z\over2}\\y_b &= 2z\\\\E(z) &= {1\over2}2y_a + {1\over2}({y_b\over2})\\&={1\over2}({2z\over2}) + {1\over2}({2z\over2})\\&=z=2y_a= {y_b\over2}\end{aligned}[/math]
Which only tells us what we already know; that the other envelope contains either twice as much or half as much as my envelope.
So there's no reason to switch, as expected.
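As a rough check on the £10/£20 example, here is a minimal sketch that enumerates the two equally likely assignments and computes the expected value of keeping and of switching. The function name `expected_values` and the use of exact fractions are illustrative choices, not part of the original argument.

```python
from fractions import Fraction

def expected_values(small):
    """Enumerate both equally likely assignments of the pair (small, 2*small)."""
    half = Fraction(1, 2)
    # Each case is (amount in my envelope y, amount in the other envelope z).
    cases = [(small, 2 * small), (2 * small, small)]
    e_keep = sum(half * y for y, _ in cases)
    e_switch = sum(half * z for _, z in cases)
    return e_keep, e_switch

e_keep, e_switch = expected_values(Fraction(10))
print(e_keep, e_switch)  # both are 15
```

In each case y and z are tied to the same x, so no single value of y is ever both doubled and halved.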
Comments (149)
Yes, there's no possible world in which the fiver is both the larger and the smaller amount at the same time. So there's no possible world in which the other envelope may with equal probability contain either two fifty or a tenner, which is the only way to make a profit by switching.
The issue isn't with probability. Given that there's £10 in my envelope, the probability that the other envelope contains £5 is equal to the probability that the other envelope contains £20; that probability being 1/2.
As an analogy, if I flip a coin and hide it from you, it is correct for you to say that the probability that it landed heads is equal to the probability that it landed tails; that probability being 1/2.
The error made in the paradox is with its calculation of the expected value of switching. It claims that the expected value is 0.5 * 2y + 0.5 * y/2, which is technically correct, but what it fails to explain is that the first term is 0.5 * 2y where y = x/2, and the second term is 0.5 * y/2 where y = 2x.
So a more accurate formulation is:
[math]E(x) = {1\over2}({2x\over2}) + {1\over2}({2x\over2}) = {x\over2} + {x\over2} = x[/math]
It seems to amount to us saying the same thing. We must either be, in your example, in a 10/5 world or a 10/20 world. In neither possible world is switching a rational strategy. (I think we've discussed this one before btw.) But maybe your way of expressing the issue is more accurate technically.
This seems confused though with you conflating non-overlapping scenarios and adding a poor analogy. If you want to use ordinary language, just say what I said. Your maths is on the ball though.
The amount of money in the envelope is irrelevant, so I'm not sure why that's part of your equation. You have two envelopes, and one is an intended outcome. If you pick randomly, it's a 50% chance that you get the intended outcome. If you switch, it's the same odds. Your x's, y's, and z's are irrelevant to this outcome.
Then you should read this, this, and this.
Thank you, I missed your link in the original OP as well.
I still don't understand how they apply their math to this situation.
The 5/4 is an expected value, which is an average. But the situation doesn't call for an average because it's either/or. You either get A, or 2A.
Numbers and their outcomes need to match the realities they represent. Applying an average to a situation in which an average is not in consideration is not a rational application of math. It's like saying people have an average of 1.5 kids and then betting that you'll have half of a kid born. It doesn't make any sense.
P(Heads) = 0.5
P(Tails) = 0.5
I bet £1 on a coin toss. If I bet correctly I will win £1. What is my expected return if I bet on heads?
There's a 0.5 chance of losing £1 and a 0.5 chance of winning £1, so the expected return is:
(0.5 * -1) + (0.5 * 1) = 0
So there's no point in betting.
But if I can win £1.50 for betting correctly then the expected return is:
(0.5 * -1) + (0.5 * 1.5) = 0.25
So it's rational to bet.
In the case of the two envelopes paradox, the claim is that there's a 0.5 chance of doubling my winnings and a 0.5 chance of halving my winnings, and so the expected return is:
(0.5 * 2y) + (0.5 * y/2) = (5/4)y
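The three calculations above can be reproduced with a small helper. This is only a sketch; `expected_return` is a hypothetical name, and y = 10 is an arbitrary example value.

```python
def expected_return(outcomes):
    """Sum probability * payoff over (probability, payoff) pairs."""
    return sum(p * payoff for p, payoff in outcomes)

# Even-money coin bet: lose 1 or win 1.
even_bet = expected_return([(0.5, -1.0), (0.5, 1.0)])
# Same bet, but a correct guess pays 1.50.
better_bet = expected_return([(0.5, -1.0), (0.5, 1.5)])
# The claim made by the switching argument, with y = 10 as an example.
y = 10.0
envelope_claim = expected_return([(0.5, 2 * y), (0.5, y / 2)])

print(even_bet, better_bet, envelope_claim)  # 0.0 0.25 12.5
```

The last figure is (5/4)y for y = 10. Whether that calculation is legitimate is exactly what the thread disputes.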
Thanks Michael.
Ok, I think I see a little clearer what this is trying to do, but it still doesn't quite line up.
So it makes sense to use an average here in our decision whether to play. But it doesn't say anything about us switching between heads and tails repeatedly as our guess before we see the coin, which is the situation with the envelope.
All the average return on the envelope tells us is that if we keep playing over the long term, it's going to average a return greater than one. But we actually have to play. It doesn't tell us anything about our indecision or which choice we should finally land on.
That's the supposed paradox. Switching doesn't increase our expected return, but the reasoning given suggests that it does. So we need to make sense of this contradiction.
Yeah, I don't see that as a paradox, just a misunderstanding of what the math is representing. The average represents the outcomes if you select one of the outcomes. It doesn't apply in any way to whether you should switch your decision before you see the reveal.
This can be easily seen by simply swapping the money for cards that say 1 and 2. Over the course of several selections, the average result will be X. But that's irrelevant to whether we want a particular outcome. Let's say we start with wanting "2" to be selected. The calculated average of outcomes is the same. Now let's say that another person wants "1" to be selected. The calculated average of outcomes is the same.
The average of outcomes is irrelevant, then, to what we want to pick. It gives us no information regarding whether we should switch our choice or not. The addition of money does not change this; it only changes that everyone wants one of the 50% chance outcomes. But wanting a particular outcome has nothing to do with whether we should switch our choice before the reveal. It's always 50/50, no matter whether there is money or simply cards with numbers in them.
Finally, you have to calculate the average using the idea that each is 1/2, or a 50% chance of being selected. Meaning it's hard set that it's only a 50% chance. It is impossible to go from having a 50% chance to then telling someone they have a greater chance of winning what they want if they switch their initial choice. It's still 50/50.
One way of framing it: You see two envelopes on the table. You open one. It contains X. You know that the remaining envelope contains 2X or X/2. Assuming both are equally likely, half the time you switch and get an extra X, and half the time you switch and lose an X/2, so you stand to gain 0.5X - 0.25X=0.25X by switching.
Another way of framing it: Two envelopes are on the table. You know one is twice the value of the other. You open your envelope and see it contains an amount. That fixes the other envelope to contain the other amount. You open your envelope and see you have the amount X. Do you know whether the other envelope has 2X or X/2 in it? Nope. What situations can this arise in?
Either you were given X and the other envelope contains X/2, xor you were given X and the other envelope contains 2X. Those are separate cases. If you were given X and the other envelope contains X/2, you'd lose X/2. If you were given X/2 and the other envelope contains X, you'd gain X/2. In the second case, if you were given X and the other envelope contains 2X, you'd gain X, if you were given 2X and the other envelope contains X, you'd lose X. If you're in the first case, the expected gain of switching is 0, if you're in the second case, the expected gain of switching is 0, both cases are equally likely, so the expected gain of switching is 0.
The relevant question which determines the accuracy of each of these is whether you model the value of the unobserved envelope as being "twice or half" for a given observed value of X (2X, 0.5X), or whether you model the value of the unobserved envelope as being drawn from the two scenarios (X, 0.5X) or (2X, X), with switching being a move within two distinct scenarios.
The first framing construes "the sampling mechanism", which is where randomness arises in this scenario, as being the conditioning step - observing the value in the first envelope. The second framing construes "the sampling mechanism" as the allocation of pairs of values to the envelopes. In the first case, observing the value in the first envelope randomly assigns 0.5X or 2X to the unseen envelope, in the second case, observing the value in the first envelope tells you nothing about whether the unseen envelope and your envelope have possible values (X, X/2) or (X, 2X).
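To make the difference between the two framings concrete, here is a minimal sketch that computes the expected gain from switching under each one. The function names and the example amount of 100 are assumptions for illustration; both framings take every branch to have probability 1/2, as in the text.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def gain_first_framing(x):
    """Observed amount x is held fixed; the other envelope is 2x or x/2."""
    return HALF * (2 * x - x) + HALF * (x / 2 - x)

def gain_second_framing(x):
    """The pair is fixed first, (x, x/2) or (x, 2x), then an envelope is dealt."""
    # Pair (x, x/2): holding x loses x/2, holding x/2 gains x/2.
    pair_low = HALF * (x / 2 - x) + HALF * (x - x / 2)
    # Pair (x, 2x): holding x gains x, holding 2x loses x.
    pair_high = HALF * (2 * x - x) + HALF * (x - 2 * x)
    return HALF * pair_low + HALF * pair_high

x = Fraction(100)
print(gain_first_framing(x), gain_second_framing(x))  # 25 and 0
```

The first framing reproduces the 0.25X gain and the second gives zero, so the disagreement is about which sampling model applies, not about the arithmetic.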
After 5 years I remain of the opinion that the second framing is appropriate, since you really do gain no information about what's in the second envelope given what's in the first.
But, nevertheless, if you relax equal probability assumptions of all cases or use a different loss function than expected loss, it can still be more rational to switch (or not switch) depending upon your problem set up. Like if you really needed $200, but didn't care about $100 at all, you'd switch if you got $100.
Said pretty much the same thing five years ago. I think Michael has at least partially come round.
I stand by the claim that the probability that the other envelope contains twice as much as my envelope is equal to the probability that the other envelope contains half as much as my envelope, that probability being 1/2.
But what I believe I have shown is that the formula used to calculate the expected value conflates two different values of y:
[math]\begin{aligned}y_1 &= {z\over2}\\\\y_2 &= 2z\\\\E(z) &= {1\over2}2y_1 + {1\over2}({y_2\over2})\\\\&={1\over2}({2z\over2}) + {1\over2}({2z\over2})\\\\&=z=2y_1= {y_2\over2}\end{aligned}[/math]
And as such it is not more rational to switch.
It is a mathematical fallacy that leads to the conclusion that [math]E(z) = {5\over4}y[/math], much like the fallacy that covertly divides by zero to prove that [math]1 = 2[/math].
I think you're technically right about the probability but so much the worse for the way we are supposed to talk about probability. The rest we're all solidly on the same page now I reckon.
I think this is true regardless of framing.
P(the other envelope = my envelope/2)
You can get that by counting the ways this happens. In the first case (X,X/2), the probability is 0.5. In the second case (2X, X), the probability is 0.5. The probability that we're in the first case is 0.5, which means the probability that we're in the first case and the other envelope contains half mine is 0.25 total. You get another 0.25 from the same reasoning in the other case. Law of total probability gives you 0.5 total for that probability.
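The counting argument can be spelled out as a short enumeration. This is a sketch only: X = 10 is an arbitrary example value, and both coin flips are assumed fair, as in the paragraph above.

```python
from fractions import Fraction

HALF = Fraction(1, 2)
X = Fraction(10)
pairs = [(X, X / 2), (2 * X, X)]  # the two cases named in the text

p_other_is_half = Fraction(0)
for pair in pairs:        # first coin flip: which pair we are in
    for mine in pair:     # second coin flip: which envelope I hold
        other = pair[0] if mine == pair[1] else pair[1]
        if other == mine / 2:
            # Each (pair, envelope) branch has probability 1/2 * 1/2.
            p_other_is_half += HALF * HALF

print(p_other_is_half)  # 1/2
```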
No values are conflated in this case, as there's a hierarchy of random variables.
The first random variable is a coinflip, it tells you whether your envelope values are (X, X/2) or (2X, X). Within each case, you have a random variable that has probability half for each event. The only "conflation" which arises is when forgetting the distinction between the realisation of a random variable (it attaining a particular fixed value) and the event of a random variable taking a particular value (which may or may not have happened).
It isn't clear what your random variable z means in your calculation, since its sample space isn't defined. z is a particular fixed value in the first envelope, no? So it's nonrandom, as soon as you open it. So it doesn't have a probability to BE anything else.
It's explained in the OP:
Let x be the amount in one envelope and 2x be the amount in the other envelope.
Let y be the amount in my envelope and z be the amount in the other envelope.
So to bring together my various posts:
[math]A: P(y = x) = P(z = 2x) = P(z = 2y) = {1\over2}\\B: P(y = 2x) = P(z = x) = P(z = {y\over2}) = {1\over2}[/math]
The expected value of z is allegedly:
[math]E(z) = {1\over2}2y + {1\over2}({y\over2}) = {5\over4}y[/math]
However, if we assume that the amount in the other envelope (z) is £10 then:
[math]A: P(y = x | z = 10) = P(z = 2x | z = 10) = P(z = 2y | z = 10) = {1\over2}\\B: P(y = 2x | z = 10) = P(z = x | z = 10) = P(z = {y\over2} | z = 10) = {1\over2}[/math]
Notice that in A, [math]y = {z\over2} = 5[/math] and in B, [math]y = 2z = 20[/math]. That makes the addition performed in E(z) above a mathematical fallacy. You cannot add [math]y[/math] to [math]y\over4[/math], where each y has a different value, to get [math]{5\over4}y[/math].
E(z) is properly represented as:
[math]\begin{aligned}y_a &= {z\over2}\\\\y_b &= 2z\\\\E(z) &= {1\over2}2y_a + {1\over2}({y_b\over2})\\\\&={1\over2}({2z\over2}) + {1\over2}({2z\over2})\\\\&=z=2y_a= {y_b\over2}\end{aligned}[/math]
Which of course tells us nothing we didn't already know; that the other envelope contains either twice as much or half as much as my envelope.
And so there's no reason to switch, as expected.
Let P ( r | x , e) denote the probability of obtaining a value r when opening an envelope labelled e, where x represents the smallest amount of money in the two envelopes. Both e and x are assumed to be hidden variables in the sense that they aren't deducible from a drawn value of r.
The premises of the problem allow the following physical characterisation of P:
P ( r | x , e = 0) := Ind (r ; x)
P ( r | x, e = 1) := Ind (r ; 2x)
Here Ind(r; y) is the indicator function that outputs 1 when r equals y and is otherwise 0. (i.e. we get whatever is in the chosen envelope).
To "objectify" the problem, suppose that instead of choosing an envelope, an envelope is automatically drawn from a prior probability distribution p(e) and then opened. Then we could use Bayes theorem to compute a physically interpretable probability that a given value of r is "caused" by a given envelope:
P (e | r , x) = P (r | e , x) P (e) / P (r | x)
where P(r | x) is obtained by summing over e in the numerator, i.e
P(r|x) = P(r | e= 0, x) P(e = 0) + P(r | e = 1 , x) P(e=1)
But the problem doesn't specify the prior probability P(e) , which implies that P (e | r , x) doesn't have a unique solution, which in turn implies that the decision problem isn't well-posed.
Instead, one might follow the OP and a number of authors, and make the arbitrary selection
P (e = 0) = P (e = 1) = 1/2. In which case Bayes rule gives
P (r | x) = 0.5 Ind (r ; x) + 0.5 Ind (r ; 2x)
whose expected value is E [ r | x] = 1.5x
However, the problem is still ill-posed due to the fact that P(r | x) has zero mass at its expected value of 1.5x :
(r = x) => (r < 1.5x )
(r = 2x) => (r > 1.5x )
Unless we make additional assumptions about P(x) , such that we can gain knowledge of x from knowledge of r, we cannot know which of these cases is the most likely, and so cannot deduce anything about the envelope e from the assumption of a uniform prior for P(e).
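A small numerical sketch of this distribution, assuming the uniform prior over the two envelopes and an example smallest amount x = 10 (both are illustrative assumptions, as is the function name `p_r_given_x`):

```python
from fractions import Fraction

def p_r_given_x(r, x, prior_e0=Fraction(1, 2)):
    """P(r | x): mix the two indicator functions with the prior over envelopes."""
    ind = lambda a, b: 1 if a == b else 0
    return prior_e0 * ind(r, x) + (1 - prior_e0) * ind(r, 2 * x)

x = Fraction(10)
support = [x, 2 * x]  # the only values r can take
mean = sum(r * p_r_given_x(r, x) for r in support)
mass_at_mean = p_r_given_x(mean, x)

print(mean, mass_at_mean)  # 15 and 0
```

The mean is 1.5x, yet no envelope ever contains that amount, which is the zero-mass point made above.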
Assume that one envelope contains £10 and the other envelope contains £20.
Let y be the value of my chosen envelope and z be the value of the other envelope.
[math]P(y = 10) = P(z = 20) = P(z = 2y) = {1\over2}\\P(y = 20) = P(z = 10) = P(z = {y\over2}) = {1\over2}[/math]
The expected value of z is allegedly:
[math]E(z) = {1\over2}2y + {1\over2}({y\over2}) = {5\over4}y[/math]
This commits a mathematical fallacy, conflating three different values of [math]y[/math], resulting in a third value.
[math]\begin{aligned}E(z) &= {1\over2}(2\cdot10) + {1\over2}({20\over2}) = {5\over4}12\end{aligned}[/math]
But my envelope doesn't contain £12, given that one envelope contains £10 and the other envelope contains £20.
This fallacy is committed even when we only know the value of one envelope and even when we don't know the value of any envelope.
The paradox has nothing to do with probability at all. It's just an improper use of algebraic variables when calculating the expected value.
Our conclusions might agree. I am saying that only statements 1, 3, 4 and 5 are valid in the "switching argument", because any prior probability distribution over envelope choices or envelope contents is consistent with the premise
"Imagine you are given two identical envelopes, each containing money. One contains twice as much as the other. You may pick one envelope and keep the money it contains. "
And a chosen prior probability distribution mustn't refer to a person's subjective beliefs about the envelopes, but to a logically consistent set of causal hypotheses concerning the generation of the envelopes and their contents, by which a contradiction isn't derivable.
All that the two-envelope premise concretely specifies is a relation from the cartesian product of booleans and naturals to the naturals
r :: B x N ---> N
r ( 0, a) = a
r ( 1 ,a) = 2a
From which we trivially obtain the inverse relation
ir :: N --> N x N
ir a = { (0,a) , (1 , a/2) }
There is literally nothing else that can be said, deductively or inductively, from the two envelope premise alone. One cannot extract a meaningful notion of probabilities, let alone expectations and averages, in relation to that premise when it is stated without additional causal assumptions, which are necessary to give physical meaning to probabilities in that context.
Why not?
I know that one envelope contains twice as much as the other. I pick one at random. What is the probability that I picked the smaller envelope? It seems perfectly correct to say [math]1\over2[/math].
By "meaningful probabilities" I am referring to epistemically meaningful probabilities that quantify how the real world is expected to behave on the basis of past experience, as opposed to purely mathematical probabilities that merely quantify mathematical properties such as combinations and symmetry.
For example, suppose that the surface area of "heads" on an unknown but symmetric coin is roughly half of its total surface area. If nothing else is known or assumed about the coin, including how and where it is to be thrown, then one cannot deduce solely on the basis of the coin's mathematical properties that its likelihood of landing heads is 50/50. Nevertheless, probabilities are often assigned to such unknown objects purely on the basis of their mathematical properties, leading to the conflation of "mathematical probability" with "physical probability" or "epistemic probability".
The two-envelopes question is epistemic because it concerns decision making in the real-world; so more than mathematics must be appealed to when deciding whether or not a strategy exists for decision-making in that context.
So by your logic we can't even talk about the probability of a coin toss landing heads being [math]1\over2[/math]?
I don't think that's at all reasonable, or even relevant when we consider puzzles like this. For the sake of puzzles like this we assume a coin toss landing heads has a probability of [math]1\over2[/math], and we assume that my choice of envelope is truly random.
In my view, epistemic probabilities are derived from causal knowledge or assumptions on the basis of abductive reasoning and repeated trials. One cannot derive causal knowledge from mathematical concepts, as is demanded in the two-envelopes problem.
In my strong opinion, Laplace's principle of Indifference and the principle of maximum entropy are grave misuses of probability calculus that only lead to erroneous inferences and the conflation of ignorance with information.
It is fair to make probabilistic assumptions, but they shouldn't be arbitrary (unless for the sake of philosophical argument) and they must be explicitly stated, and any conclusion cannot go farther than what is explicitly assumed from the outset. The less one assumes, the less one can conclude.
Since the two-envelopes problem isn't well posed, more must be assumed and/or imprecise probabilities are needed.
The only rational response to the two-envelopes problem as it is traditionally stated without additional assumptions, is to reply
"The probability of getting a greater or lesser prize when opening the other envelope, is between 0 and 1"
I don't agree with this at all.
I have two envelopes. I have put £5 in one envelope and £10 in the other envelope. You use a true random number generator (which uses some quantum mechanical measurement like radioactive decay) to pick one of the envelopes.
What is the probability that you have picked the envelope containing £5? I say [math]1\over2[/math].
Or to be more accurate to the specific puzzle, I don't tell you how much is in each envelope; only that one contains twice as much as the other.
What is the probability that you have picked the envelope containing the smaller amount? I say [math]1\over2[/math].
Relative to your use of probabilities, how do you distinguish knowing that an outcome has probability 1/2 from not knowing the likelihood of an outcome?
From Wikipedia Two Envelopes Problem
1) Denote by A the amount in the player's selected envelope.
2) The probability that A is the smaller amount is 1/2, and that it is the larger amount is also 1/2.
3) The other envelope may contain either 2A or A/2.
4) If A is the smaller amount, then the other envelope contains 2A.
5) If A is the larger amount, then the other envelope contains A/2.
6) Thus the other envelope contains 2A with probability 1/2 and A/2 with probability 1/2.
7) So the expected value of the money in the other envelope is:
[math]\frac{1}{2}(2A)[/math] + [math]\frac{1}{2}(\frac{A}{2})[/math] = [math]\frac{5}{4}A[/math]
8) This is greater than A so, on average, the person reasons that they stand to gain by swapping.
In the first part of the equation: [math]\frac{1}{2}(2A)[/math] - the A is referring to the situation whereby the person has the smaller amount, say A is referring to £10.
In the second part of the equation: [math]\frac{1}{2}(\frac{A}{2})[/math] - the A is referring to the situation whereby the person has the larger amount, say A is referring to £20.
Therefore, within the same equation, A is referring to two different amounts. Am I correct in thinking that this is why the equation gives a false result.
Yes, that's what I show in the OP.
The paradox has nothing to do with probability at all. It's just an improper use of variables when calculating the expected value. :up:
I would say that is a potential cause of the paradox, but isn't the paradox itself. The paradox is the fact that the switching argument consists of a logically inconsistent set of probabilistic assumptions.
The switching argument, which produces a contradictory strategy for solving the two-envelope problem, starts by subjectively assuming, without evidence, the following conditional distribution, with respect to envelopes A and B whose values are a and b respectively :
P (B = (1/2) a | A = a) = P(B = 2a | A = a) = 1/2 For all values a
which yields the conditional expectation value
E [ B | A = a] = (5/4) a For all values a
So far, terrible reasoning with respect to the two-envelope problem, but no inconsistency.
Next, the switching argument further assumes, without evidence, that
P (A = a | B = (1/2) a) = 1 / 2 for all values a
However, using Bayes Theorem gives
P (A = a | B = (1/2) a) = P (B = (1/2) a | A = a) x P(A = a) / P(B = (1/2) a)
Hence the switching argument assumes that the ratio on the right hand side is always 1 :
P(A = a) / P(B = (1/2) a) = 1 for all values a
This is the most deranged part of the argument. It amounts to asserting "If we know nothing about the prior distribution of A and nothing about the prior distribution of B, then we can be certain that the ratio of their probabilities is 1 - for our ignorance cancels out! "
We then derive the contradiction
E [ A | B = b ] = (5/4) b For all values b
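To see how the conditional probabilities depend on the prior, here is a minimal sketch using a made-up prior in which the smaller amount is uniform over {1, 2, 4, 8} and envelope A is equally likely to hold either amount. The prior, the names `prior_x`, `joint` and `p_b_doubles` are all assumptions for illustration; nothing in the problem statement fixes them.

```python
from collections import defaultdict
from fractions import Fraction

HALF = Fraction(1, 2)
# Hypothetical prior over the smaller amount x (an assumption, not given by the problem).
prior_x = {x: Fraction(1, 4) for x in (1, 2, 4, 8)}

joint = defaultdict(Fraction)  # joint[(a, b)] = P(A = a, B = b)
for x, p in prior_x.items():
    joint[(x, 2 * x)] += p * HALF  # A holds the smaller amount
    joint[(2 * x, x)] += p * HALF  # A holds the larger amount

def p_b_doubles(a):
    """P(B = 2a | A = a), computed from the joint table."""
    p_a = sum(p for (a2, _), p in joint.items() if a2 == a)
    return joint.get((a, 2 * a), Fraction(0)) / p_a

for a in (1, 2, 4, 8, 16):
    print(a, p_b_doubles(a))  # 1, then 1/2 three times, then 0

e_a = sum(a * p for (a, _), p in joint.items())
e_b = sum(b * p for (_, b), p in joint.items())
print(e_a == e_b)  # True
```

Under this particular prior the conditional probability is 1/2 only for the middle values of a, and the unconditional expectations of A and B coincide, so "1/2 for all values a" is a further assumption about the prior rather than a consequence of the setup.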
It just assumes that:
P(A = the smaller envelope) = P(B = the smaller envelope) = 1/2
Which is correct. If either A is smaller than B or B is smaller than A, and if you pick one at random, then the probability that you picked the smaller envelope is 1/2. That's true a priori, is it not? A random selection from a set of two members.
Even if you want to say that your choice of A or B was biased in some way, assume that a true random number generator was used to decide which of A and B was to contain the smaller amount.
In fact, assume that you don't choose an envelope. You're just given envelope A, and a true random number generator was used to determine whether A should contain £10 or £20 (with B containing the other).
What is the probability that your envelope, A, contains £10? It's 1/2.
And this is true even if you don't know beforehand the values of the two envelopes. You're just told that there are two amounts, x and 2x, and that a true random number generator was used to determine whether A should contain x or 2x (with B containing the other).
What is the probability that your envelope, A, contains £x? It's 1/2.
That expression is used to represent the same set of initial assumptions, but is less explicit with regards to its premises, such as the fact that some distribution is responsible for placing a certain amount of money in each envelope.
For debugging probabilistic arguments and resolving paradoxes it is better to use conditionals so that every premise is made explicit, so that problematic premises are more easily identified.
I tell you that one envelope contains £20 and the other envelope contains £10. A true random number generator was used to determine which of envelopes A and B should contain which amount.
You are given envelope A.
What is the probability that your envelope contains £20? It's 1/2.
I tell you that one envelope contains twice as much money as the other. A true random number generator was used to determine which of envelopes A and B should contain which amount.
You are given envelope A.
What is the probability that your envelope contains twice as much money as the other? It's 1/2.
There's no reason that the probability in the second case should be different to the probability in the first case. You knowing the actual amounts doesn't affect the probability that you were given the larger amount. It's always going to be 1/2.
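A quick simulation of this setup, as a sketch only: Python's seeded pseudo-random generator stands in for the true random number generator, and the amounts 10 and 20 and the function name `fraction_holding_larger` are example choices.

```python
import random

def fraction_holding_larger(trials, seed=0):
    """Estimate how often envelope A receives the larger amount."""
    rng = random.Random(seed)  # pseudo-random stand-in for a true RNG
    hits = 0
    for _ in range(trials):
        amounts = [10, 20]
        rng.shuffle(amounts)  # random assignment to envelopes A and B
        a, b = amounts
        hits += a > b
    return hits / trials

print(fraction_holding_larger(100_000))  # close to 0.5
```

The loop never looks at the actual amounts beyond comparing them, which matches the claim that knowing the values does not change the probability.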
Starting with the Wikipedia Two Envelopes Problem
Imagine you are given two identical envelopes, each containing money. One contains twice as much as the other. You may pick one envelope and keep the money it contains. Having chosen an envelope at will, but before inspecting it, you are given the chance to switch envelopes. Should you switch?
There are two identical envelopes A and B
Envelope A contains the value a, and envelope B contains the value b.
Either i) if the value a is 10 euros, then the value b is 20 euros or ii) if the value a is 20 euros, then the value b is 10 euros.
The probability of my picking an envelope with 10 euros is 50% and the probability of my picking an envelope with 20 euros is 50%.
What subjectivity are you referring to ?
Again, that's an additional subjective premise that isn't objectively implied by the 'physics' of the two-envelope premise. Nevertheless we can assume it for sake of argument and continue the discussion as an exercise in subjective Bayesian probability estimation in which we hope for self-consistency.
Yes there is. That's why conditional probabilities should be used. The conditional probability
P ( B = b | A = a)
that refers to the amount of money in the unopened envelope B when conditioned on the amount of money in opened envelope A, is generally unequal to
P (A =a | B = b)
that refers to the amount of money in unopened envelope A when conditioned on the amount of money in opened envelope B.
Knowing the former conditional distribution upon opening envelope A generally says nothing about the latter distribution unless the ratio of the priors P(A=a) and P(B=b) is assumed to be 1, as indicated by Bayes Theorem.
Only if you take the ratio to be 1 do the calculated subjective conditional expectations come into conflict with respect to decision making.
Assuming your subjective premise:
- Recall the fact that the implied subjective expectation regarding the amount of money gained upon opening either envelope is 1.5x, where x is the smallest amount of money (as I showed in my first analysis).
- Also recall that unlike in my first analysis that referred to the envelopes by labels that were assigned to them prior to them being opened, here we are defining A to refer to the opened envelope. In which case the probability that we will open A is 1!
Doesn't the fact that you will definitely open A first, together with the fact that your expected gain is 1.5x imply that your subjective prior for P(B) should be different to your subjective prior for P(A)?
and
First of all, thank you both for your interesting arguments. I am following them as much as I can understand the OP started by Michael.
I think the main substance of this dilemma is discerning whether or not it is "rational" to switch envelopes once you have already chosen one.
Michael showed us arguments for when switching is rational.
Envelope A contains 10 and Envelope B contains 20. Then, according to the dilemma, I have the chance to switch my first choice and get twice what I was supposed to get. So, I agree with Michael that it's always going to be a 1/2 probability.
Yet, this scenario only happens if I am expecting a value. I decide to switch the envelopes because I expect to double the amount that is supposedly in the envelope. Again, switching is rational or "worthy" because I am expecting a value.
But... What about indifference to the amount in each envelope? For example: Let the amounts in the envelopes be X and 2X. Now by swapping, the player may gain X or lose X. So the potential gain is equal to the potential loss.
I agree on this point. There could be a scenario where switching or not is meaningless to the participant if the expected value doesn't exist.
I am sorry if I was not clear. What I mean is that this paradox and dilemma only work if the participant expects a gain or a loss; the participant could instead be indifferent to the amount, in which case the act of switching is not motivated by an expected value.
There is no opened envelope:
All I know is that one envelope contains twice as much as the other and that I picked one at random. I don't know what's in my envelope. The probability that I picked the more valuable envelope is 1/2.
No I didn't. I showed that the argument which purports to show that switching is rational commits a mathematical fallacy, and that there is no rational reason to switch.
Ok! I see this dilemma and your arguments clearer now. :up:
The switching argument begs to differ:
" 1. Denote by A the amount in the player's selected envelope."
The argument's computed expectation value of (5/4) A is a conditional expectation with respect to the variable A denoting the amount of money in the player's selected envelope.
Not opening the envelope and taking an expectation over the envelope's value means that A cannot be mentioned in the expectation value, even as a variable with an unspecified value.
The unconditional expectation of the player's envelope value is 0.5 × M + 0.5 × 2M = 1.5M, where M is the mean of the unspecified distribution F for the smallest amount of money in an envelope. No paradox arises from this calculation.
I've subsequently come to realise however, that my initial analysis wasn't quite right. If my calculations are correct (they are too tedious to post here), the switching dilemma is caused by assuming an un-normalised "non-informative prior" for F(M), which causes the conditional expectations for each envelope to diverge.
Intuitively, this can be explained as follows:
if there could be any amount of money in the two envelopes, and if all monetary values are equally likely, then whatever value you observe in one envelope, the other envelope is likely to have an even higher value.
To remove this bug, one has to replace the physically implausible non-informative prior for F with a normalised distribution, so that the conditional expectation for the unopened envelope gets lower as the observed money in one's opened envelope gets higher.
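A minimal sketch of what a normalised prior does to the conditional expectation. The particular prior here is an assumption chosen for illustration (the smaller amount M uniform on the six values 1, 2, 4, ..., 32), not something given in the problem; the names `SUPPORT`, `PRIOR`, `joint_weights` and `expected_other` are likewise made up for the example.

```python
from fractions import Fraction

# Assumption for illustration: the smaller amount M is uniform on a bounded set.
SUPPORT = [1, 2, 4, 8, 16, 32]
PRIOR = {m: Fraction(1, len(SUPPORT)) for m in SUPPORT}

def joint_weights(a):
    """Return (weight that a is the smaller amount, weight that a is the larger)."""
    half = Fraction(1, 2)  # chance of being handed either envelope of a pair
    w_small = PRIOR.get(a, Fraction(0)) * half
    w_large = PRIOR.get(a // 2, Fraction(0)) * half if a % 2 == 0 else Fraction(0)
    return w_small, w_large

def expected_other(a):
    """Conditional expectation of the other envelope given that mine holds a."""
    w_small, w_large = joint_weights(a)
    return (w_small * 2 * a + w_large * Fraction(a, 2)) / (w_small + w_large)

# Every amount that could be seen: each M and each 2M.
observable = sorted(set(SUPPORT) | {2 * m for m in SUPPORT})
ratios = {a: expected_other(a) / a for a in observable}

# Average gain from always switching, weighted by how likely each observation is.
overall_gain = sum(sum(joint_weights(a)) * (expected_other(a) - a) for a in observable)

for a in observable:
    print(a, ratios[a])
print("overall gain from always switching:", overall_gain)
```

Under this assumed prior the ratio E(other | a) / a is 2 at the smallest observable amount, 5/4 in the middle, and 1/2 at the largest, and the gain from always switching averages out to zero. The 5/4 factor only holds for every a if the support never ends, which is the un-normalised case.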
My argument with you is over the assigned probabilities. So ignore the expected value. I just want to know an answer to this:
1. One envelope contains £10
2. One envelope contains £20
3. I pick an envelope at random
4. I don't open my envelope
5. What is the probability that I picked the envelope containing £10?
My answer is 1/2. What is yours?
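A quick simulation of steps 1 to 5, as a sketch only. It assumes that "at random" means a uniform choice between the two envelopes, which is exactly the premise under discussion; the function name `fraction_picking_smaller` and the seed are arbitrary.

```python
import random

def fraction_picking_smaller(trials=100_000, seed=0):
    """Estimate how often a uniformly random pick lands on the £10 envelope."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    envelopes = [10, 20]
    hits = sum(1 for _ in range(trials) if rng.choice(envelopes) == 10)
    return hits / trials

estimate = fraction_picking_smaller()
print(estimate)
```

The estimate comes out close to 0.5, but only because the uniform choice was built in as an assumption.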
Sure, but as is mentioned on the Wikipedia article:
The paradox doesn't apply in that scenario, since the values of both envelopes are given.
To my understanding, the paradox requires,
1) Knowledge of the value of only one of the envelopes.
2) The assumption of a non-informative prior for P(M) , where M is the smallest (or greatest) money in the envelopes.
Quoting Michael
The contradiction vanishes when a normalised prior is used for P(M), since in that instance knowledge of the value of one envelope is indicative of the value of the other.
The paradox is premised on not knowing the value of any.
But do you agree that the probability in my example situation is 1/2?
That is flat out contradicted by the switching argument. Furthermore, without the premise of knowing the value of one of the envelopes, the paradox disappears.
To have an expectation value containing A on the one hand, and insisting that the envelope isn't opened on the other, is a bait and switch. Wikipedia's article isn't written very well, which might be part of the confusion.
Quoting Michael
It is a half if you assume it to be 1/2, but not necessarily. Consider for instance someone sending you the smaller of two envelopes through the post, according to a probability that they have decided. You open the letter and are informed that if you return the envelope and its contents, you will receive another envelope that has half as much or twice as much.
That's not what happens in this example. I am shown two envelopes, one containing £10 and one containing £20, and I freely choose one at random. I don't open it. The probability that I picked the one with £10 is 1/2.
Quoting sime
It's not. It's the premise of the switching argument.
I'll agree for the sake of argument. I think the problem is how we are fitting our shared understanding of the problem to probability calculus.
In my preferred description, one of the envelopes is opened to reveal a quantity A, but it isn't known whether the other envelope contains more or less than A.
In your preferred description, the quantities in both envelopes are known a priori, but neither of the envelopes is opened.
The problem with your description, is that it runs contrary to how conditional probabilities and expectations are normally interpreted. For the information upon which a probability or expectation is conditioned, is normally interpreted as observed information, rather than possessed information that isn't revealed, compensated by assumed knowledge of other quantities.
I honestly don't understand your interpretation of probability. This seems very straightforward.
Maybe a different example. I have a red ball hidden in one hand and a blue ball hidden in my other hand. You point to one of my hands at random. What is the probability that you pointed to the hand holding the red ball? It's 1/2.
It depends on what interpretation of probability you are appealing to. For those of us who reject Laplace's principle of indifference, the answer is to refrain from asserting a subjective probability.
In any case, it isn't relevant to the two envelopes problem, for it can be reproduced by appealing to a causal interpretation of probability in which a person observes the contents of an envelope he is given, without him making any decisions. A Bayesian analysis reveals that the culprit of the paradox is the assignment of a non-informative prior to the distribution that generates the envelopes' contents.
Without that assumption, the conditional expectations involved behave sensibly and the paradox dissolves.
I was leading to explaining why it's relevant, but if you disagree with me on the red ball probability then it's not going to go anywhere.
My understanding and resolution of the paradox is somewhat aligned with this perspective. The paradox was first introduced to me about 30 years ago by a friend who was a professor in statistics at UQAM (Université du Québec à Montréal). After further thought (and only after I was introduced to Bayes' theorem) I realized that the situation where it appears beneficial to switch the initially chosen envelope arises when we make an unrealistic assumption: that our belief about the probability distribution over possible envelope contents is that it is both uniform and infinite.
However, given any reasonably well-defined (and bounded) prior, opening one envelope may indeed inform our decision to either switch or stick to the original choice. This decision would be guided by the Expected Value (EV) of switching, which in turn is dictated by the revised probabilities concerning the potential contents of both envelopes. Notably, there's only one unique amount in the initially chosen envelope that would result in a zero EV for switching, rendering the choice indifferent.
The paradox seems to emerge from the assumption that opening the initial envelope provides equal probabilities for the second envelope containing either 10n or n/10 the amount in the first one, irrespective of the value of n. This is where I believe the core misunderstanding lies.
In the problem stated in the OP (taken from the Wikipedia article), there is no opening of the initial envelope:
Indeed, I concur that the paradox can also manifest when the option to switch envelopes is offered prior to opening the initially chosen one. The resolution I (and @sime also, I think) proposed also applies in this scenario. The apparent rationality of switching, in this case, is predicated on the concept that the Expected Value (EV) of the decision is expressed as (10n + n/10)/2 - n, a value that remains positive irrespective of n. This line of thought, however, is based on the assumption that the probabilities for the second envelope containing either 10n or n/10 are independent of the value of n.
If we adjust this assumption to reflect that these probabilities are indeed conditional upon n (in the manner that Bayesian updating with an informative prior would suggest), then it becomes plausible to hypothesize and likely not too challenging to demonstrate that the EV of switching remains zero.
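One way to sketch the zero result, assuming (for illustration only) that the smaller amount s is drawn from some proper prior p(s) with finite mean, and that the player is equally likely to hold either envelope of the pair (s, 10s):
[math]E(\text{gain}) = \sum_s p(s)\left[{1\over2}(10s - s) + {1\over2}(s - 10s)\right] = 0[/math]
The gain of 9s and the loss of 9s cancel within each pair, so the average over pairs is zero whatever p is.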
Before I flip a fair coin, what is the probability that it will land on heads? I say 1/2.
After flipping the coin, but before looking at the result, what is the probability that it landed on heads? I say 1/2.
Now imagine that rather than heads and tails, there is a number printed on each side. One is a 10 and one is a 20.
Before I flip the coin, what is the probability that it will land on 10? I say 1/2.
After flipping the coin, but before looking at the result, what is the probability that it landed on 10? I say 1/2.
Now imagine that we don't know the exact numbers printed on the coin, only that one is twice the value of the other.
Before I flip the coin, what is the probability that it will land on the smaller number? I say 1/2.
After flipping the coin, but before looking at the result, what is the probability that it landed on the smaller number? I say 1/2.
If the probability that it landed on the smaller number is 1/2 then the probability that the other side is the larger number is 1/2, and if the probability that it landed on the larger number is 1/2 then the probability that the other side is the smaller number is 1/2.
So the probability that the other side is the smaller number is 1/2 and the probability that the other side is the larger number is 1/2.
And given that the larger number is twice the value of the smaller number, the probability that the other side is half the value is 1/2 and the probability that the other side is twice the value is 1/2.
Which step in this line of reasoning do you disagree with?
There is nothing there that I disagree with. But I don't think the paradox arises if the values of the two envelopes are stipulated in advance ($10 and $20, say). The paradox arises when we stipulate that the probability of the second envelope containing the largest amount is 1/2 and is not conditionally dependent on the value of the initially chosen envelope. In the example that you give, the probability of the second envelope containing $20 conditionally on the first envelope containing $10 is 1 (and vice versa).
Then the paradox arises. The probability that the other side is half the value is 1/2 and the probability that the other side is twice the value is 1/2. Do I take the value of the coin as it landed (which I still don't know), or do I take the value of the other side?
Quoting Pierre-Normand
We don't know the value of the initially chosen envelope because we don't look. All we know is that one envelope contains twice as much as the other. The probability that I will pick the smaller amount is 1/2, and so the probability that I did pick the smaller amount is 1/2, and so the probability that the other envelope contains the larger amount is 1/2, exactly as is the case with the coin toss.
Quoting Pierre-Normand
Yes, and the probability of the other side of the coin being tails conditionally on my side of the coin being heads is 1 (and vice versa). But this is irrelevant if we don't look at the result. The probability that the coin will land heads is 1/2, and so the probability that it did land heads is 1/2, and so the probability that the other side is tails is 1/2.
Now assume that tails is worth twice as much as heads. The probability that the other side is worth twice as much as my side is 1/2.
Yes, we're in full agreement. By "non-informative" I was referring to the distributional conditions of both uniformity of probability mass and of infinite support. But you're right in pointing out that a "non informative prior" is often used by Bayesians to refer only to uniformity of probability mass, in which the range of the support is considered to be a separate independent hyper-parameter.
1) Imagine you are given two identical envelopes, each containing money. One contains twice as much as the other. You may pick one envelope and keep the money it contains. Having chosen an envelope at will, but before inspecting it, you are given the chance to switch envelopes. Should you switch?
2) Denote by A the amount in the player's selected envelope.
3) The probability that A is the smaller amount is 1/2, and that it is the larger amount is also 1/2.
4) The other envelope may contain either 2A or A/2.
5) If A is the smaller amount, then the other envelope contains 2A.
6) If A is the larger amount, then the other envelope contains A/2.
7) So the expected value of the money in the other envelope is: (1/2)(2A) + (1/2)(A/2) = (5/4)A
Item 4) "The other envelope may contain either 2A or A/2" is a problem
1) Two envelopes containing £10 and £20
2) Selected envelope contains A, either £10 or £20
3) The probability that A is the smaller amount is 1/2, and that it is the larger amount is also 1/2.
Item 4) is a problem as it contradicts item 1).
If the selected envelope contains £10, then the other envelope must contain £20. It is not true that the other envelope may contain either £20 or £5. It cannot contain £5.
If the selected envelope contains £20, then the other envelope must contain £10. It is not true that the other envelope may contain either £40 or £10. It cannot contain £40.
Items 4), 5) and 6) should be reworded as: "the other envelope may contain either 2A if A is the smaller amount or A/2 if A is the larger amount."
It then follows that there is no value in switching.
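The reworded items can be checked with a small calculation. This is a sketch for the £10/£20 example only; the variable names are made up for illustration. Each branch pairs the selected amount with what the other envelope actually holds in that branch, rather than reusing a single A in both terms.

```python
from fractions import Fraction

half = Fraction(1, 2)

# (selected, other) for the two equally likely branches of item 3).
branches = [(10, 20), (20, 10)]

expected_selected = sum(half * selected for selected, _ in branches)
expected_other = sum(half * other for _, other in branches)

print(expected_selected, expected_other)
```

Both expectations come out as £15, which is the sense in which there is no value in switching here.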
Ultimately the disagreement you're having with @Michael there is about the representation of the agent's belief state. If the agent knew that one envelope contained 10, and one envelope contained 20, and they saw 10, the other envelope has 20 in it with probability 1. Notably, this requires the agent knowing the values in both envelopes. In effect, there is a secondary conditioning step in your analysis; you're also conditioning on the set of values {n=10 or n=20} being the values in the envelope, and aggregating a (probably?) uniform prior on probability (the chance of each amount being in each envelope) into the n=10 and n=20 case.
The ability to condition on {n=10 or n=20} isn't something the agent can do while representing their state of evidence, if they don't know that n=10 or n=20 at the start.
Nevertheless, if they observe n=10 in the first envelope, I still think there's a problem with assigning a probability distribution on the values (5, 20) in the other envelope. This is because that stipulates there being three possible values in the envelopes combined; (5, 10, 20); whereas the agent knows only two are possible.
So IMO the issue with "conditioning" on (n=10) when opening the envelope isn't with the conditioning operation itself, it's with the background specification of the sample space's events. Two envelopes (pairs), where one must be half the other. On the possible values (5,10,20), this is just (5,10) and (10,20).
A further illustration of why this is weird; if the agent really thinks that the possible values in the envelopes are (5,10,20), and they observe (10) in theirs, then assign equal probability mass to (5,20) based on that observation, that means initially there will have been a nonzero probability assigned to each (5,10,20). A choice between (5,10,20) needs a sampling mechanism associated with it (whence the randomisation), and there isn't one which produces three values anywhere in the problem.
In other words, given that your envelope is 10, making "the other envelope" (5,20) doesn't give you an event which is possible given the (alleged) randomisation that puts the pairs of values in the envelopes in the first place.
Edited: changed 1/3 to nonzero.
I think learning the value of your envelope is an uninformative posterior and so gives you no information with which to reassess the prior probability.
I tell you that one side of a fair coin is worth more than the other.
What is the probability that it will land on heads? 1/2. What is the probability that it will land on the more valuable side? 1/2.
After flipping it, but before looking at the result, what is the probability that it landed on heads? 1/2. What is the probability that it landed on the more valuable side? 1/2.
You check the coin and see that it landed on heads. What is the probability that it landed on the more valuable side? 1/2. You knowing that it landed on heads makes no difference as you don't know which of heads and tails is the more valuable.
I tell you that heads is worth £30. What is the probability that it landed on the more valuable side? 1/2. You knowing that it landed on heads and that heads is worth £30 makes no difference as you don't know which of heads and tails is the more valuable or how valuable the more valuable side is.
I think you either have to say that after flipping the coin, but before looking, the probability is undecidable (or, rather, "is either 1 or 0"), or you have to accept that the probability after looking and learning the value is 1/2. I think it's inconsistent to say anything else.
In general I think it's one of these cases where describing exactly what is random, what isn't, and what is conditioned upon is completely necessary. The debate in thread consists in different ways of setting up the randomness in the problem.
I'm of the opinion that knowing the value of your envelope tells you nothing about what you should do, so long as you can assume the probabilities of each value being in each envelope are equal. Here is what I think describes the randomness in the situation.
1 ) There are two pairs of envelopes. One of them contains (5,10), one of them contains (10,20).
2 ) Someone flips a coin and assigns you one of these pairs of envelopes.
Then depending upon the formulation, either you open it or don't.
Case A, you open it:
A3 ) If you open it and see 10, you don't know if your 10 is in the (5,10) pair or the (10,20) pair.
A4 ) Each of those is equally likely.
A5 ) Assume you're in the (5,10) pair, switching has 0 gain there under equal probability and expected loss.
A6 ) Assume you're in the (10,20) pair, switching has 0 gain under the same assumptions.
A7 ) The expected value of switching is 0.
Edit: A5 is a conditioning step (let your envelope pair be 5,10), A6 is a conditioning step (let your envelope pair be 10,20), A7 is a calculation using the law of total expectation.
Case B, you don't open it:
B3 ) If you don't open it, you don't know if your pair is the (5,10) pair or the (10,20) pair.
The reasoning is exactly the same.
Open or don't open, with that framing, there's no gain from switching.
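Steps A5 to A7 can be sketched as a short calculation. It assumes, as in the steps above, a fair coin between the two pairs and an equal chance of holding either envelope within a pair; the function name `gain_within_pair` is made up for the example.

```python
from fractions import Fraction

half = Fraction(1, 2)
pairs = [(5, 10), (10, 20)]  # step 1): the two possible pairs

def gain_within_pair(pair):
    """Expected gain from switching once the pair is fixed (steps A5 / A6)."""
    low, high = pair
    # Holding low and switching gains (high - low); holding high loses the same.
    return half * (high - low) + half * (low - high)

per_pair = [gain_within_pair(p) for p in pairs]

# A7: law of total expectation over the coin flip in step 2).
total = sum(half * g for g in per_pair)

print(per_pair, total)
```

Note that this averages over which envelope of the pair you hold; it does not condition on a particular amount having been seen.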
If you want to assume the unknown envelope contains (5,20) given that you observe 10 in your envelope, the randomness mechanism looks like:
C1) Someone assigns 10 to your envelope and you open it.
C2 ) Someone assigns 5 or 20 to the other envelope with equal probability
C3 ) If you switch, you either lose 5 or gain 10, with equal probability
C4 ) You'd gain 2.5 by switching.
The major contrast is between the randomness coming in C2 with the randomness coming in the steps 1) and A3 or B3. Those are totally distinct sampling mechanisms (A3 and B3 are the same).
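For comparison, the Case C mechanism written out as a calculation. This is a sketch that takes C1 and C2 at face value; the variable names are arbitrary.

```python
from fractions import Fraction

mine = 10               # C1: your envelope is stipulated to hold 10
other_values = [5, 20]  # C2: the other envelope is assigned 5 or 20
half = Fraction(1, 2)   # with equal probability

# C3 / C4: lose 5 or gain 10, each with probability 1/2.
expected_gain = sum(half * (value - mine) for value in other_values)

print(expected_gain)
```

This gives the 2.5 of step C4.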
Edit: there actually isn't a conditioning step in this presentation, since the assignment of 10 to the envelope you see is nonrandom. You simply stipulate you open an envelope with 10 in it. If you want to frame this as conditioning, then the probability distribution is just assigning probability 1 to the value 10, for the value in your envelope. (And strictly speaking that isn't a probability distribution which comes with its own subdiscussion... And its support isn't clear, which comes with its own subdiscussion)
I've mentioned this before, but from the Wikipedia article:
It doesn't matter if you can frame the situation in such a way that there is no rational reason to switch. What matters is whether or not the switching argument in the paradox is sound. Specifically, is this formula correct?
[math]E(z) = {1\over2}2y + {1\over2}({y\over2}) = {5\over4}y[/math]
I don't think the problem with this formula is with its probability assignments. If I know that one envelope contains more money than the other, and if I pick one at random, then whether I open it or not the probability that I picked the envelope with more money is [math]1\over2[/math]. That seems perfectly correct to me.
Probability assignments are done with respect to a space of events. The event spaces in case A and C are different.
I'm not sure what you mean. Perhaps you could answer the questions I posed earlier?
If I flip a coin and don't look at the results then what is the probability that it landed heads?
If heads is twice as valuable as tails then what is the probability that it landed on the more valuable side?
I say that the answer to both is [math]1\over2[/math].
It's a "wu" thing, the presumptions in your questions already define away how to dissolve the problem IMO. So I don't think it's wise of me to answer them. I do agree, given what you just said and your framing, that the calculation of gain is correct. What's wrong is the framing, not the calculation.
What I'm saying: case C makes the assignment of (5,20) to envelopes random. Even though (5,20) could never be an assignment of envelopes. Case C has a perfectly cromulent way of calculating expectations, it's just not the random assignment mechanism to the envelopes.
If I gave you an envelope containing 10 pounds, and I told you I had an envelope containing (5 or 20), you should switch. Just... That's not what situation we're in. We're in a situation where we don't know whether the pairs of envelopes are (5,10) or (10,20).
The framing is the paradox. I pick one of two envelopes at random. One is twice the value of the other. Given that the probability that I picked the more valuable envelope is [math]1\over2[/math], it is rational to switch. But also to then switch back.
And the distinction between Case C and Case A is my solution to it. Ambiguous phrasing suggests we're in Case C, whereas we're actually in case A or B.
I don't see how your cases solve the problem.
Quoting fdrake
If there is a 50% chance that I am in (5, 10) and a 50% chance that I am in (10, 20), and if I have £10, then there is a 50% chance that the other envelope contains £5 and a 50% chance that the other envelope contains £20. According to [math]E(z)[/math] it is rational to switch.
Quoting fdrake
Yes, the reasoning is the same. There is a 50% chance that the other envelope contains twice as much as what's in my envelope and a 50% chance that the other envelope contains half as much as what's in my envelope. According to [math]E(z)[/math] it is rational to switch.
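The £10 case can be enumerated directly. This is a sketch that assumes the set-up is exactly a fair coin between the pairs (5, 10) and (10, 20), followed by a uniform choice of envelope within the pair; `joint` and `posterior_other` are names invented for the example.

```python
from collections import defaultdict
from fractions import Fraction

half = Fraction(1, 2)
pairs = [(5, 10), (10, 20)]

# Joint probability of (mine, other): pick a pair, then one envelope of it.
joint = defaultdict(Fraction)
for low, high in pairs:
    joint[(low, high)] += half * half
    joint[(high, low)] += half * half

def posterior_other(mine):
    """Distribution of the other envelope given the amount seen in mine."""
    seen = {other: p for (m, other), p in joint.items() if m == mine}
    norm = sum(seen.values())
    return {other: p / norm for other, p in seen.items()}

post = posterior_other(10)
expected_other = sum(value * p for value, p in post.items())

print(post, expected_other)
```

Under these assumptions, seeing £10 leaves £5 and £20 equally likely, with an expected value of £12.50 for the other envelope. In the same model `posterior_other(5)` and `posterior_other(20)` both put all their weight on £10, so the average gain over everything that might be seen is still zero.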
That changes the sample space. The subcase in A5's sample space is (5,10) - IE an envelope contains 5 or 10, the subject just doesn't know that. The subcase in A6's sample space is (10,20), the subject just doesn't know that.
When you assign (5,20) to the other envelope, that's not an event in the sample space in A5 or A6. It's only an event in the sample space in case C.
That's case C! Drawing a new ball is the same as assigning a value to the other envelope. It's just not assigned at that step in the envelope set up. Hence, the distinction between (A and B) and C. It seems to me you're having difficulty seeing the distinction between A and C?
Exactly. You asked me to pick one, then treated that like drawing a ball from the bag.
I don't understand what you're saying or how this is any different to the envelopes.
I put a coloured ball in one envelope and a coloured ball in another envelope. You pick an envelope, open it, and see the ball to be red. What is the probability that the ball in the other envelope is white, given that it must be either white or blue?
To break it down, this comes with the following steps:
1) The possible ball types are red, blue, white.
2) Two distinct types of ball are taken.
3) They're put in two distinct envelopes.
4) I choose an envelope.
5 ) I know that the only balls I could have chosen are red, blue, white
6 ) I observe white.
7 ) I know that the only balls remaining are red, blue
8 ) Each occurs with probability half
Step 5) there corresponds to telling the agent, in the envelope case, that the possible values in the other envelope are 5 or 20. This makes your framing case C.
Case A, however, does not have the agent aware that the possible values in the other envelope are 5 or 20. Precisely because they are epistemically indifferent to the cases where the pair is (5,10) or (10,20). The randomness there is epistemic. But it's equivalent to randomly assigning red or blue to the other package at that time by a coin flip. They represent exactly the same set up. But no coin flip is done at that stage; IE, nothing actually happens to the agent in the envelope set up in case A. But they would in case C or your ball example.
They are aware. They are told before the experiment starts that one envelope contains twice as much as the other. They open their envelope to find £10. They know that the possible values in the other envelope are £5 or £20.
Their awareness corresponds to a different fact. I don't think we're making any progress here. Do you understand the difference between case A and case C?
Re-wording my argument:
The puzzle states that one envelope contains twice as much as the other. Let one envelope contain x euros and the other 2x euros.
The player selects one envelope without opening it. This envelope contains either x euros or 2x euros
The puzzle states that the amount in the selected envelope is A
Possibility one
If the selected envelope contains x euros, then A = x
The puzzle goes on to state that the other envelope may contain either 2A or A/2, meaning either 2x euros or x/2 euros.
But the puzzle had previously established that the only amounts in the two envelopes are x euros and 2x euros.
Therefore, the statement "The other envelope may contain either 2A or A/2" leads to a contradiction.
Possibility two
If the selected envelope contains 2x euros, then A = 2x
The puzzle goes on to state that the other envelope may contain either 2A or A/2, meaning either 4x euros or x euros.
But the puzzle had previously established that the only amounts in the two envelopes are x euros and 2x euros.
Therefore, the statement "The other envelope may contain either 2A or A/2" leads to a contradiction.
Conclusion
It seems to me that the statement "The other envelope may contain either 2A or A/2" is where the flaw in the line of reasoning lies, as it leads to a contradiction.
Yes, and I think it is perfectly correct to say, in case A, that the probability that the other envelope contains £20 is 50%.
As I mentioned before, knowing that there is £10 in my envelope is an uninformative posterior. I know that one envelope contains twice as much as the other, I know that the probability that I will pick the more valuable envelope is 50%, and I know that after picking one the probability that I did pick the more valuable envelope is 50%, and so I know that the probability that the envelope I didn't pick is the more valuable envelope is 50%. That then leads to the E(z) calculation.
Opening the envelope and finding £10 or £20 or £60 provides me with no information that will lead me to reassess that prior probability.
It isn't though. "The other envelope contains 20" in case A subcase 1 has probability 0. It just isn't an event in that case. The only events in that case are "my envelope contains 5 and the other envelope contains 10" and "my envelope contains 10 and the other envelope contains 5". In case A subcase 2, "the other envelope contains 20" has probability half, the events in that one are "my envelope contains 10 and the other envelope contains 20" and "my envelope contains 20 and the other envelope contains 10".
Nowhere in that set up is there ever a way of considering that the other envelope contains 5 or 20! That's only an available move in case C.
There is a difference between these two claims:
1. I have a 1 in 2 chance of picking the larger of two envelopes.
2. The envelope I have chosen has a 1 in 2 chance of being the larger of the two envelopes.
Suppose the pair is (5, 10). I have a 1 in 2 chance of picking 10. But 10 does not have a 1 in 2 chance of being greater than 5; it _is_ greater than 5.
I flip a coin but don't look at the result. The probability that it landed heads is [math]1\over2[/math].
You both seem to be mixing up the participant's subjective assessment and some God's eye objective assessment.
Not true, a different subjective assessment. Your uncertainty concerns the values (5,20), which is case C. Case A's uncertainty concerns the pairs (5,10) and (10,20). When you switch, you don't know if you're in case A subcase 1 or case A subcase 2. So you average the gain of switching over each of those. Which is 0.
I don't know how you get 0.
If you're in subcase 1 then you lose £5 by switching. If you're in subcase 2 then you gain £10 by switching. There is a 50% chance that you're in subcase 1 and a 50% chance that you're in subcase 2. So according to the E(z) calculation, it is rational to switch.
You don't know whether you're in Case A subcase 1 or case A subcase 2. Each of those has probability half. If you're in case A subcase 1, if you switch you gain 5 or lose 5. If you're in case A subcase 2, if you switch you gain 10 or lose 10. Each of those has 0 gain.
The relevant question is; let's say you condition on having 10, does that give you any information on what pair you're in (IE, whether you're in subcase A 1 or subcase A 2), and the answer is no. If you knew what case you're in, knowing that you have 10 makes the choice deterministic. The conflation between case C and case A turns precisely on this point.
If you know you have 10, it becomes very tempting to say that you know the other envelope has 5 or 20. Whereas you don't know that, you just know that either (your envelope is 10 and the other is 20) or (your envelope is 10 and the other envelope is 5), and that tells you nothing about whether you're in case A subcase 1 or case A subcase 2.
When you're stipulating that you "have 10", you're using that to resolve uncertainty within subcases. Whereas at the point you're shown it, you could only gain information about which subcase you're in - and it provides no information there.
What's the probability of your envelope being 10? Why can you condition on it?
If you're in subcase 1 and you have £10 and you switch then you lose £5, if you're in subcase 2 and you have £10 and you switch then you gain £10. Each of these is equally likely, and nothing else is possible, hence E(z) suggesting you should switch.
Eh, we're just asserting the same thing over and over again at this point. I stop.
Your assertion that 'only two values are possible' for the contents of the envelopes in the two-envelope paradox deserves further exploration. If we consider that the potential amounts are ($5, $10, $20), we might postulate some prior probabilities as follows:
P_1 = P(a) = P(($5, $10)) = 3/4,
P_2 = P(b) = P(($10, $20)) = 1/4,
which translates into priors for the unopened envelope:
P_3 = P(A) = P(($5)) = 3/8,
P_4 = P(B) = P(($10)) = 1/2,
P_5 = P(C) = P(($20)) = 1/8.
This distribution could reflect an informed guess about Joe, the envelope-filler, who is more likely to risk a smaller rather than a larger amount.
Suppose Ann chooses an envelope. If it contains either $5 or $20, she can unambiguously update her priors to 1 and 0, or 0 and 1, respectively. The decision to switch or not becomes trivial. If, however, her envelope contains $10, she must update her beliefs about the contents of the other envelope using Bayes' theorem:
P_updated(A) = P_updated((unseen=$5)) = P((unseen=$5) | (seen=$10)) = (1 * 3/8) / (1/2) = 3/4.
Given this posterior, if Ann sees $10 in her envelope, the expected value (EV) for switching is negative:
(3/4)$5 + (1/4)$20 - $10 = -$1.25.
Therefore, she should retain her $10, as her prior for Joe having included $20 is sufficiently low. Regardless, before she inspects the second envelope, both outcomes ($5 or $20) remain possible.
If we return to the original problem scenario (addressing @Michael's concern), where the first envelope remains sealed, the initial value probabilities become (3/8, 1/2, 1/8) for $5, $10, and $20 respectively. This gives an initial expected value of:
3/8 * $5 + 1/2 * $10 + 1/8 * $20 = $9.375.
The expected value if Ann switches relies on the weighted sum of the expected values for the unopened envelope, conditional on the potential contents of the chosen envelope. As choices of $5 and $20 guarantee $10 in the other envelope, while a choice of $10 leads to an expected value of $8.75 for the other envelope, this calculates to:
3/8 * $10 + 1/2 * $8.75 + 1/8 * $10 = $9.375. (Lo and behold!)
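The arithmetic above can be checked with exact fractions. This is only a sketch under the priors stated in this post (3/4 for the ($5, $10) pair, 1/4 for the ($10, $20) pair, and a fair pick within the pair); the names `pair_prior`, `joint` and `ev_unseen_given` are mine, chosen for illustration.

```python
from fractions import Fraction as F

# Priors over envelope pairs (smaller, larger), as assumed in the post.
pair_prior = {(5, 10): F(3, 4), (10, 20): F(1, 4)}

# Joint distribution over (seen, unseen): each envelope of a pair
# is picked with probability 1/2.
joint = {}
for (lo, hi), p in pair_prior.items():
    joint[(lo, hi)] = joint.get((lo, hi), 0) + p / 2
    joint[(hi, lo)] = joint.get((hi, lo), 0) + p / 2

# Marginal prior for the content of the chosen envelope.
seen_marginal = {}
for (seen, _), p in joint.items():
    seen_marginal[seen] = seen_marginal.get(seen, 0) + p

def ev_unseen_given(seen_value):
    """Expected content of the other envelope, conditional on the seen amount."""
    total = sum(p for (s, _), p in joint.items() if s == seen_value)
    return sum(p * u for (s, u), p in joint.items() if s == seen_value) / total

gain_given_10 = ev_unseen_given(10) - 10            # EV of switching after seeing $10
ev_keep = sum(p * s for s, p in seen_marginal.items())
ev_switch = sum(p * ev_unseen_given(s) for s, p in seen_marginal.items())

print(seen_marginal, gain_given_10, ev_keep, ev_switch)
```

Under these assumptions the conditional EV of switching after seeing $10 is negative, while the unconditional EVs of keeping and switching coincide.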
Simulation in R to demonstrate the different sampling mechanisms.
Onto something that sounds like Case_A, and something you can do to it to make it superficially resemble Case_C.
You notice that when you condition on "having your envelope be 10" in that set up, you're subsetting to cases where the gain is -5 or 10, and those occur with equal probability since the cases (5,10) and (10,20) occur with equal probability. If you were to condition on your envelope being 10 in the loop (case index=1), you end up with exactly the same gain numerically but it represents a subjectively different belief state for an agent. Why? The subsetting done at the end lets you look at the whole ensemble of cases where the first envelope was 10, where the assignment of 10 was random. Fixing the envelope as 10 within case makes the assignment of 10 nonrandom.
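For readers without R, here is a rough Python sketch of the kind of subsetting described above. It is not the R code referred to in this post; it assumes the pairs (5,10) and (10,20) are dealt with equal probability and that the envelope is allocated by a fair pick within the pair. The function name `simulate` and its parameters are hypothetical.

```python
import random

def simulate(n_trials=100_000, seed=0):
    """Simulate Case_A-style dealing and record the gain from switching."""
    rng = random.Random(seed)
    gains_all, gains_given_10 = [], []
    for _ in range(n_trials):
        pair = rng.choice([(5, 10), (10, 20)])  # which pair was dealt
        mine, other = rng.sample(pair, 2)       # random allocation within the pair
        gain = other - mine
        gains_all.append(gain)
        if mine == 10:                          # subsetting done "at the end"
            gains_given_10.append(gain)
    mean = lambda xs: sum(xs) / len(xs)
    return mean(gains_all), mean(gains_given_10)

overall_gain, sim_gain_given_10 = simulate()
print(overall_gain, sim_gain_given_10)
```

With these assumptions the overall average gain should come out close to 0, and the average over the subset where the first envelope was 10 close to 2.5, since that subset only contains gains of -5 and 10 in roughly equal numbers.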
The thing we're butting heads on, in my view, is the Case_A inner loop line 1, which is where the randomness comes in through the allocation of pairs. If at any point, in the loop, the agent *knows* what case they're in, their gain is deterministic. When you grant that knowledge hypothetically, you either enter subcase A 1 (the first gain line) or subcase A 2 (the second gain line). Once you've done those hypothetical calculations, you reintroduce the randomness of allocating envelopes in the next line to choose the received gain.
You can conclude either strategy is optimal if you can vary the odds (Bayes or nonconstant probability) or the loss function (not expected value). Like if you don't care about amounts under 20 pounds, the optimal strategy is switching. Thus, I'm only really interested in the version where "all results are equally likely", since that seems essential to the ambiguity to me.
Quoting Pierre-Normand
As I wrote, the prior probabilities wouldn't be assigned to the numbers (5,10,20), they'd be assigned to the pairs (5,10) and (10,20). If your prior probability that the gameshow host would award someone a tiny amount like 5 is much lower than the gigantic amount 20, you'd switch if you observed 10. But if there's no difference in prior probabilities between (5,10) and (10,20), you gain nothing from seeing the event ("my envelope is 10"), because that's equivalent to the disjunctive event (the pair is (5,10) or (10,20)) and each constituent event is equally likely.
Edit: then you've got to calculate the expectation of switching within the case (5,10) or (10,20). If you specify your envelope is 10 within case... that makes the other envelope nonrandom. If you specify it as 10 here and think that specification impacts which case you're in (informing whether you're in (5,10) or (10,20)), that's close to a category error. Specifically, that error tells you the other envelope could have been assigned 5 or 20, even though you're conditioning upon 10 within an already fixed sub-case: (5,10) or (10,20).
The conflation in the edit, I believe, is where the paradox arises from. Natural language phrasing doesn't distinguish between conditioning "at the start" (your conditioning influencing the assignment of the pair (5,10) or (10,20) - no influence) or "at the end" (your conditioning influencing which of (5,10) you have, or which of (10,20) you have, which is totally deterministic given you've determined the case you're in).
I don't think your simulation is relevant to our disagreement given that I don't believe that it is rational to switch.
I don't believe it's rational to switch either. I'd believe it's rational to switch if we were in Case C.
If we assume that all results are equally likely, the EV of switching given that the chosen envelope was seen to contain n is (2n + n/2)/2 - n = 0.25n. Hence whatever value n might be seen in the initially chosen envelope, it is irrational not to switch (assuming our only goal is to maximize EV). This gives rise to the paradox since if, after the initial dealing, the other envelope had been chosen and its content seen, switching would still be +EV.
Quoting fdrake
I did indeed first assign priors to the two cases (5, 10) and (10, 20) and only derivatively calculated priors regarding the possible contents of the first chosen envelope (or of the other one).
I am not conditioning within an already fixed sub-case. Seeing that there is $10 in one envelope still leaves it open that there might be $5 or $20 in the other one. To be clear, you seem to be emphasizing the role of conditioning and timing, arguing that my observation of the $10 should not affect my decision about whether I'm in the (5,10) or (10,20) case. You appear to argue that, by the time I've observed the $10, the case is already determined, and I can't switch between cases based on my observation.
However, it seems that you've misunderstood my use of Bayesian updating. I am not arguing that observing the $10 allows me to switch between cases. Rather, I'm saying that, given an observation of $10, I can update my beliefs about the probability of being in the (5,10) case or the (10,20) case.
(Some further discussion with GPT-4)
[hide]
GPT-4
The only suggestion I would make is to add a sentence or two at the end summarizing your main point, to ensure it is clear to fdrake and others following the discussion. Something like:
"Essentially, I'm arguing that upon observing a value in the chosen envelope, we can revise our beliefs about the probabilities of the different cases, which subsequently alters the expected value of switching envelopes. This can resolve the two-envelope paradox without assuming a non-uniform prior or altering the objective of maximizing expected value."
Other than that, your reply seems to cover all necessary points.
PN:
I think what ultimately resolves the paradox is that when we don't actually observe the content of the initially chosen envelope, then the EV of switching is the weighted average of the EV of switching for all the suitably conditioned (updated) cases regarding the possible contents of the other envelope. And this EV is zero. Is that clear?
GPT-4
Yes, your point is clear.
When we don't observe the content of the initially chosen envelope, the Expected Value (EV) of switching indeed comes down to the weighted average of the EV for each possible content of the other envelope. This, however, requires an important assumption: the weights used in this average should represent our prior beliefs about the probabilities of each case (i.e., each possible content of the other envelope). The probabilities have to be updated (conditioned) based on the information available, which in this case is none as we don't observe the content of the initially chosen envelope.
If these conditions are met, and given that one envelope has exactly double (or half) the money of the other, the overall EV of switching is indeed zero, which aligns with our intuition that without any extra information, both envelopes are equally likely to contain the larger amount.
PN:
About your remark that "the probabilities have to be updated (conditioned) based on the information available, which in this case is none as we don't observe the content of the initially chosen envelope," what I am rather assuming is that, for whatever content n of the initially chosen envelope, the EV of switching in that particular case, conditionally on our being in that case, is equal to the EV of switching calculated by means of Bayesian updating if we were to gain knowledge that we are in that case. Hence, it is not being assumed that we actually have the information.
GPT-4
Ah, I see what you're getting at now. You're right. I apologize for misunderstanding your point.
In the case of the Two Envelopes Problem, you are considering the expected value of switching under each possible value of the initially chosen envelope (i.e., you're taking a "case by case" approach) and then averaging these values according to their prior probabilities. In other words, you are considering hypothetical situations in which you would know the value of the initially chosen envelope, even though you do not actually observe its content.
This approach effectively evaluates the average EV of switching without actually knowing the content of the initially chosen envelope, but considering all possible outcomes.
To rephrase and clarify: The expected value (EV) of switching needs to be calculated conditionally for each possible outcome. Then, these conditional EVs are averaged over, using the prior probabilities associated with each outcome (not necessarily uniform). As long as the expected values of switching are calculated properly, taking into account the specific conditions of each case, the overall expected value of switching will turn out to be zero, regardless of the specific shape of the prior distribution. This reflects the intuitive idea that without additional information, switching or not switching should yield the same overall expected outcome, hence resolving the paradox.
[/hide]
Here's a swish link so you can try it out.
The idea is that if you run a query like this
you'll be able to backtrack all the way up the tree and back down, like this:
(There's a duplicate 10, but oh well. It shouldn't really be a tree but just a graph, since there's two routes from root to 10. Might redo it, but this was quick.)
This (nearly -- ignoring the dupe issue) represents @Michael's view of the problem construction.
BUT, the design of the actual problem is more like this:
If you insert a cut between the two predicates, backtracking all the way up to the selection of the envelope pair is blocked. When you ask for more solutions, you get just the one, not three.
Asking for another solution -- backtracking -- corresponds cleanly to swapping. (It's why the predicates are written a little odd, to be both randomized and backtrackable.)
What does the cut correspond to?
I think I showed in the OP why this isn't the case. The EV calculation commits a fallacy, using the same variable to represent more than one value. That's the source of the paradox, not anything to do with probability.
Answering that gives you the origin of the paradox, right?
I think I see what you mean. Would you agree that which envelope pair you're in is conditionally independent of the observation that your envelope is 10? Assuming that equation (my envelope = 10) lets that "random variable which equals 10" remain random, anyway.
Quoting Pierre-Normand
I think this depends how you've set up your random variables to model the problem. If you can assign a random variable to the other envelope and "flip its coin", so to speak, you end up in case C where there's a positive EV of switching.
I can see that you've assigned probabilities to the pairs (5,10) and (10,20), I'm still unclear on what random variables you've assigned probabilities to, though. Where are you imagining the randomness comes in your set up?
Since you're an R user, you might find it interesting to define a model in RStan, using different choices for the prior P(S) for the smallest amount S put into an envelope. Provided the chosen prior P(S) is proper, a sample from the posterior distribution P(S | X), where X is the observed quantity of one of the envelopes, will not be uniform, resulting in consistent and intuitive conditional expectations for E[Y | X] (where Y refers to the quantity in the other envelope).
I'm trying to understand your use of conditioning, so can we start at the very beginning?
Let [math]x[/math] be the amount in one envelope and [math]2x[/math] be the amount in the other envelope.
I pick an envelope at random but don't look at the contents.
Let [math]y[/math] be the amount in my envelope.
[math]P(y = x) = {1\over2}\\P(y = 2x) = {1\over2}[/math]
Do you agree with this?
Next we look at the contents of our envelope and find it to contain £10.
We then want an answer to this:
[math]P(y = x|y = 10) =\space?\\P(y = 2x|y = 10) =\space?[/math]
My understanding is that given that we don't know the value of the other envelope, or how the values were initially chosen, [math]y = 10[/math] is an uninformative posterior; it provides us with no new information with which to reassess the prior probability. As such, [math]P(y = x|y = 10) = P(y = x)[/math].
This seems to be where we disagree?
Initial Specification
1. Someone flips a coin determining if the envelope pair is Heads=(5,10), Tails=(10,20)
2. They give you the resultant envelope pair.
3. You open your envelope and see 10.
Step 3 doesn't tell you anything about the result of the coin flip in step 1. What's the expectation of "the other envelope"? That means assigning a random variable to it. How do you do that? I think that scenario plays out like this:
Two Flips Specification
1. Someone flips a coin determining if the envelope pair is Heads_1=(5,10), Tails_1=(10,20). Call this random variable Flip_1.
2. They give you the resultant envelope pair.
3. You open your envelope and see 10.
4. You flip a coin, Heads_2 means you're in case (5,10) and the other envelope is 5. Tails_2 means you're in the case (10,20) and the other envelope is 20. Call this random variable Flip_2.
I'm abbreviating Heads to H, Tails to T and Flip to F.
The paradox is arising due to stipulated relationships between Flip_1 and Flip_2.
The following reflects the intuition that the coin flip in 1 totally determines the realisation of the envelope in 4. It equates the two random variables. Let's say that we can equate the random variable Flip_1 and Flip_2, that means they have the same expectation. The expectation of Flip_1 would be... how do you average a pair? You don't. The expectation of Flip_2 is transparently 12.5. That means you can't equate the random variables, since they have differing expectations. In essence, the first random variable chooses between apples, and the latter chooses between oranges.
Let's say that instead of there being an equation of Flip_1 with Flip_2, there's the relationship:
P(H_1 = H_2) = 1
P(T_1 = T_2) = 1
But that would mean H_1=(5,10) is equal to H_2 = 5, and they aren't equal. Similarly with T_1 and T_2. So the relationship can't be that.
Then there's a conditional specification:
P(F_2 = H_2 | F_1=H_1)=1
P(F_2 = H_2 | F_1=T_1)=0
P(F_2 = T_2 | F_1=H_1)=0
P(F_2 = T_2 | F_1=T_1)=1
Which legitimately makes sense. Then we can ask the question: what does "You open your envelope and see 10" relate to? It can't be a result of F_1, since those are pairs. It can't be a result of F_2, since that's the other envelope. It seems I can't plug it in anywhere in that specification. It's been used in the definition of F_2, to "remove" 10 from the appropriate space of values and render that a coin flip.
In fact, "opening my envelope and seeing 10" and updating on that is only interpretable as another modification of the set up:
Bivariate Distribution Specification
1) You generate a deviate from the following distribution (S,R), where S is the other envelope's value and R is your envelope's value. The possible values for S are (5,10,20) and the possible values for R are the same. S and R have a bivariate distribution (S,R) with the constraint (S=2R or R=2S). It has the following probabilities defining it:
P(S=5, R=10) = 0.25
P(S=10, R=20) =0.25
P(S=20, R=10) = 0.25
P(S=10, R=5) =0.25
A Bayesian solution modifies this distribution. I've just assumed all envelope pairs are equally likely.
2) You then observe R=10.
3) That gives you a conditional distribution P(S=5 | R=10)=0.5, P(S=20 | R=10)=0.5
4) Its expectation is 12.5 as desired.
5) You should switch, as S|R=10 has higher expectation than 10.
A different loss function modifies step 4.
And with that set up, you switch. With that set up, if you don't open your envelope, you don't switch.
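A quick exact check of those two statements, as a sketch that takes the four joint probabilities of the Bivariate Distribution Specification at face value (the helper name `conditional_S_given_R` is just for illustration):

```python
from fractions import Fraction as F

# Joint distribution P(S, R) from the specification:
# S is the other envelope's value, R is your envelope's value.
joint = {
    (5, 10): F(1, 4),
    (10, 20): F(1, 4),
    (20, 10): F(1, 4),
    (10, 5): F(1, 4),
}

def conditional_S_given_R(joint, r):
    """Distribution of S after observing R = r."""
    p_r = sum(p for (_, r_val), p in joint.items() if r_val == r)
    return {s: p / p_r for (s, r_val), p in joint.items() if r_val == r}

cond = conditional_S_given_R(joint, 10)
expected_S_given_10 = sum(s * p for s, p in cond.items())

# Expected gain from switching when the envelope stays unopened:
# average S - R over the joint distribution.
expected_gain_unopened = sum(p * (s - r) for (s, r), p in joint.items())

print(cond, expected_S_given_10, expected_gain_unopened)
```

Conditioning on R=10 leaves S equal to 5 or 20 with equal probability, while averaging the gain over the whole joint distribution gives zero.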
There's the question of whether the "Bivariate Distribution Specification" reflects the envelope problem. It doesn't reflect the one on Wiki. The reason being the one on the wiki generates the deviate (A,A/2) OR (A,2A) exclusively when allocating the envelope, which isn't reflected in the agent's state of uncertainty surrounding the "other envelope" being (A/2, 2A).
It only resembles the one on the Wiki if you introduce the following extra deviate, another "flip" corresponding to the subject's state of uncertainty when pondering "the other envelope":
1) You generate a deviate from the following distribution (S,R), where S is the other envelope's value and R is your envelope's value. The possible values for S are (5,10,20) and the possible values for R are the same. S and R have a bivariate distribution (S,R) with the constraint (S=2R or R=2S). It has the following probabilities defining it:
P(S=5, R=10) = 0.25
P(S=10, R=20) =0.25
P(S=20, R=10) = 0.25
P(S=10, R=5) =0.25
2) You get the envelope 10, meaning the other envelope is 5 or 20 with equal probability. Call the random variable associated with these values F_1.
3) That gives you a conditional distribution P(F_1=5)=0.5, P(F_1=20)=0.5
4) Its expectation is 12.5 as desired.
5) You should switch, as F_1 produces a gain of 2.5 over not switching.
In that model, there's no relationship between the bivariate distribution (S,R) and F_1. Trying to force one gets us back to the Two Flips Specification, and its resultant equivocations. The presentation above is not supposed to make sense as a calculation, it's supposed to highlight where the account would not make sense.
Quoting sime
It would!
Maybe my response here helps.
Is it correct that, given what he knows, [math]P(y = x) = {1\over2}[/math]?
Is it correct that, given what he knows, [math]P(y = x|y =10) = P(y = x)[/math]?
If so then, given what he knows, [math]P(y = x|y = 10) = {1\over2}[/math].
Perhaps this is clearer if we understand that [math]P(y = x|y = 10)[/math] means "a rational person's credence that his envelope contains the smaller amount given that he knows that his envelope contains £10".
In the Wikipedia article, the problem is set up thus: "Imagine you are given two identical envelopes, each containing money. One contains twice as much as the other. You may pick one envelope and keep the money it contains. Having chosen an envelope at will, but before inspecting it, you are given the chance to switch envelopes. Should you switch?"
Your setup for the bivariate distribution specification is a special case of the problem statement and is perfectly in line with it. Let's call our participant Sue. Sue could be informed of this specific distribution, and it would represent her prior credence regarding the contents of the initially chosen envelope. If she were then to condition the Expected Value (EV) of switching on the hypothetical situation where her initially chosen envelope contains $10, the EV for switching, in that particular case, would be positive. This doesn't require an additional coin flip. She either is in the (5, 10) case or the (10, 20) case, with equal prior (and equal posterior) probabilities in this scenario. However, this is just one hypothetical situation.
There are other scenarios to consider. For instance, if Sue initially picked an envelope containing $5, she stands to gain $5 with certainty by switching. Conversely, if she initially picked an envelope with $20, she stands to lose $10 with certainty by switching.
Taking into account all three possibilities regarding the contents of her initially chosen envelope, her EV for switching is the weighted sum of the updated (i.e. conditioned) EVs for each case, where the weights are the prior probabilities for the three potential contents of the initial envelope. Regardless of the initial bivariate distribution, this calculation invariably results in an overall EV of zero for switching.
This approach also underlines the flaw in the popular argument that, if sound, would generate the paradox. If we consider an initial bivariate distribution where the potential contents of the larger envelope range from $2 to $(2^m) (with m being very large) and are evenly distributed, it appears that the Expected Value (EV) of switching, conditioned on the content of the envelope being n, is positive in all cases except for the special case where n=2^m. This would suggest switching is the optimal strategy. However, this strategy still yields an overall EV of zero because in the infrequent situations where a loss is guaranteed, the amount lost nullifies all the gains from the other scenarios. Generalizing the problem in the way I suggested illustrates that this holds true even with non-uniform and unbounded (though normalizable) bivariate distributions.
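To make this concrete, here is a small exact calculation. It is a sketch that assumes one particular reading of the distribution just described: the pairs are (2^(k-1), 2^k) for k = 1..m, each with prior 1/m, and the envelope is picked fairly within the pair. The function name `switching_evs` and the choice m = 10 are illustrative only.

```python
from fractions import Fraction as F

def switching_evs(m):
    """Conditional and overall EV of switching for a uniform prior over doubling pairs."""
    # Joint distribution over (seen, unseen): each pair has prior 1/m,
    # and either envelope of the pair is seen with probability 1/2.
    joint = {}
    for k in range(1, m + 1):
        lo, hi = 2 ** (k - 1), 2 ** k
        joint[(lo, hi)] = F(1, 2 * m)
        joint[(hi, lo)] = F(1, 2 * m)

    p_seen, gain_seen = {}, {}
    for (seen, unseen), p in joint.items():
        p_seen[seen] = p_seen.get(seen, 0) + p
        gain_seen[seen] = gain_seen.get(seen, 0) + p * (unseen - seen)

    # EV of switching conditional on each seen amount, and the overall EV.
    conditional = {n: gain_seen[n] / p_seen[n] for n in p_seen}
    overall = sum(gain_seen.values())
    return conditional, overall

conditional_gain, overall_gain = switching_evs(10)
for n in sorted(conditional_gain):
    print(n, conditional_gain[n])
print("overall:", overall_gain)
```

Under these assumptions every seen amount except the largest has a positive conditional EV of switching, and the single guaranteed loss at the top amount exactly cancels them in the prior-weighted sum.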
The normalizability of any suitably chosen prior distribution specification (which represents Sue's credence) is essentially a reflection of her belief that there isn't an infinite amount of money in the universe. The fallacy in the 'always switch' strategy is somewhat akin to the flaw in Martingale roulette strategies.
In a sense yes. The cut is a "committed choice" thing, and you could take that as representing the dealer's not monkeying with the envelopes once he's offered them to you.
But we know how the game works objectively; the problem is why some ways of modeling your own epistemic situation as the player work and others, though quite natural, don't. It's clear that including the cut in your model works; but it's not as clear why you must include it.
It's not even perfectly clear to me what sort of epistemic move the cut is; what have I done when I've done that? I'm not considering certain options for backtracking; okay, but why shouldn't I consider those options? The dealer may not monkey with the envelopes after he presents them, but I still don't know which branch (perhaps of many) he went down, so why shouldn't I consider those?
It is, after all, simply true that the other envelope must contain half the value of mine or twice. (And the problem here doesn't seem to be spurious or-introduction, because {mine/2, mine*2} is the minimal set for which that claim is always true.)
But it's also just as clear that this is not equivalent to the {x, 2x} framing. Both envelopes, mine included, are members of that set, but mine is of course never a member of {mine/2, mine*2}. Given a set that includes both, you get a proper disjunctive syllogism: whichever one I have, the other one can't be, so it's the only other one in the set. Reasoning from the "mine" framing isn't nearly so clean.
Feels like the explanation is right here, but I'm not quite seeing it.
I agree that it doesn't require an additional coin flip when you've interpreted it like you have. It's still a different interpretation than assigning a random deviate to the unobserved envelope after you've seen your envelope has 10 in it. The former is a distribution on pairs that you condition on the observation that your envelope is 10, then down to a univariate distribution. The latter takes the subject's epistemic state upon seeing their envelope and treats the unobserved envelope as a univariate coinflip. For the stated reasons, these aren't the same scenario as the random variables in them are different.
I do agree that if you fix (5,10) and (10,20) as the possible pairs, and you're in the bivariate case, it's rational to switch. The wikipedia article, instead, treats "A" as a random quantity, which is univariate. A prior on A is thus a prior on one variable. And it's not necessarily bounded - as you've pointed out previously, I believe.
In contrast, the priors in the bivariate case are all bounded - because they're on the probabilities of amounts being in envelopes. Probabilities get bounded priors since they lie in [0,1].
Why I'm drawing such a distinction between the Wikipedia case, and the one discussed in this thread, is because the Wikipedia case treats A as a random deviate, which already isn't a bivariate random variable. In that case, the remarks about unbounded support on A (the amount in the other envelope) make sense.
Quoting Pierre-Normand
I agree in the bivariate set up. The (5,10) and (10,5) cases have opposite gains from switching, so do the (10,20) and (20,10) cases. They're all equally likely. So the expected gain of switching is 0. That's the expectation of switching on the joint, rather than conditional, distribution.
Quoting Pierre-Normand
This makes sense. If we're in the case on the wiki, where A is a random variable, A's set of possible values are in [0, inf). A "uniform" prior there gives you an infinite expectation (or put another way, no expectation). At that point you get kind of a "wu" answer, since expected loss doesn't make any sense with that "distribution".
If you assume both envelopes are unobserved in the Wiki case, and you want to force an expectation to exist, there's a few analyses you can do:
Give the support of a distribution on A an upper bound, [0,n], making it a genuine uniform distribution. The expectation of A is n/2. The other envelope is stipulated to contain (2n) or (n/4) based on a coin flip. If this really reflects the unobserved envelope's sampling mechanism, the expectation of the other envelope is n(1+1/8), which gives you switching being optimal. Switching will always be optimal. What stops you from "reasoning the other way" in this scenario is that the order of operations is important, if my envelope has values (2n), (n/4) with expectation (1+1/8)n it can't be identical with the other envelope's random variable which has expectation (n/2). One is a coinflip, one is a guess about the value of A. This might be an engineer's proof, though. "Why can't you equate the scenarios?" "You know they have different expectations, because the order you assign deviates to the envelopes matters from the set up. "Your" envelope and "the other" envelope aren't symmetric".
Now consider the case where the values A and 2A are assigned to the envelopes prior to opening. This gives the pair (A, 2A) as the possibilities. The envelope is unopened. The agent gets allocated one of the two (A, or 2A) based on the result of a coinflip. If they know the possible amounts in their envelope are A and 2A, and they just don't know whether A is in their envelope and 2A is in the other, or vice versa, then the expectation of their envelope is 1.5A. So is the expectation of the other envelope. Switching is then of expected value 0.
Quoting Pierre-Normand
I think that if the "always switch" strategy is wrong, it's only wrong when the sampling mechanism which proves it doesn't apply to the problem. It just isn't contestable that if you flip a coin, get 20 half the time and 5 half the time, the expectation is 12.5. The devil is in how this doesn't apply to the scenario.
In my view that comes down to misspecification of the sampling mechanism (like the infinite EV one), or losing track of how the random variables are defined. Like missing the asymmetry between your envelope and the other envelope's deviates, or equating the deviate of drawing a given pair (or the value A) with the subject's assignment of a deviate to the other envelope.
It's like the Monty Hall problem. But in this case, there's a 50/50 chance of choosing the envelope with the larger amount of money, so switching makes no sense. You're not given any more information, so I really don't follow the rest of the calculations.
[Edit: I see this has been gone over quite a bit, so forgive the late interjection.]
The argument is:
Let x be the amount in one envelope and 2x be the amount in the other envelope.
Let y be the amount in the chosen envelope and z be the amount in the unchosen envelope.
[math]P(y = x) = P(z = 2y) = {1\over2}\\P(y = 2x) = P(z = {y\over2}) = {1\over2}[/math]
The expected value of z is:
[math]E(z) = {1\over2}2y + {1\over2}({y\over2}) = {5\over4}y[/math]
Given that the expected value of the unchosen envelope is greater than the value of the chosen envelope, it is rational to switch.
Thus you can say, if I have the smaller, I have x and the other is 2x, or you can say, I have x/2 and the other is x. The main thing is that you have to use the same scheme for the other case, where you have the larger.
If you choose to label the envelope you've chosen "A", it's true that the other envelope contains either 2A or A/2, but that's because there are two consistent ways to assign actual variables, and the "or" there is capturing the alternative variable schemes available, not the values that might be in the other envelope. "A" is an alias for one of the members of {x, 2x} or for one of the members of {x, x/2}, but you still have to choose which before you can think about expected values.
The key is that "or" up there ("2A or A/2") is not a matter of probability at all; once you've chosen a set, which member of the chosen set A is, is a matter of probability. But we won't be summing over the possible choices of variable scheme.
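One way to write out the two variable schemes, as a sketch that keeps y for the chosen envelope and z for the unchosen one, and assumes only that the chosen envelope is equally likely to be either member of whichever set has been fixed:

[math]\{x, 2x\}:\quad E(y) = {1\over2}x + {1\over2}2x = {3\over2}x,\quad E(z) = {1\over2}2x + {1\over2}x = {3\over2}x[/math]
[math]\{{x\over2}, x\}:\quad E(y) = {1\over2}{x\over2} + {1\over2}x = {3\over4}x,\quad E(z) = {1\over2}x + {1\over2}{x\over2} = {3\over4}x[/math]

In either scheme the two expectations agree, so neither gives a reason to switch. The 5/4 figure only appears when one term is taken from each scheme.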
Seems like nonsense to me. There's a 50% chance of choosing the envelope with the greater amount. That's it. That's all we're given. If we were given any other information, as in the Monty Hall problem, then perhaps switching is correct. But in that problem you start with a 1/3 chance of choosing the prize, and another option is revealed afterwards. In this case, however, nothing has changed.
So what we're left with is the claim that going through the motion of picking one envelope and then, with absolutely nothing else changing, switching and picking up the other envelope is somehow rational. That's simply wrong.
Is it worth doing? Yes. But we have no way of improving our choice by switching, any more than switching our call of heads to tails.
The expected value doesnt matter. That just tells you its profitable to take the bet, especially in the long run. It doesnt change the probability of choice. If I offer $1,000,000 if it comes up heads and you pay me $100 if it comes up tails, thats a very profitable bet to take especially if run many times. But it tells us nothing whatsoever about whether to switch our call, because the probability of choosing correctly is still 50/50. The simple act of changing our minds doesnt magically change that.
Going back five years.
The "puzzle" is not figuring out what the right way to analyze this is -- although @Michael did argue, at length, in the old thread, for switching, and there are some interesting practical aspects to switching in the real world where the probability distributions need to be possible -- but why a seemingly natural way to analyze the game is wrong. There's not even agreement among analysts about whether this is a probability problem. (I don't think it is.)
Neither do I. I believe I showed what the problem is in the OP.
Yes, the conclusion is certainly wrong. The puzzle is in figuring out which step in the reasoning that leads to it is false.
According to this interpretation of the paradox, the paradox is only psychological and concerns the mental state of an agent who derives contradictory credence assignments that conflict with his understanding of his mental state. So this interpretation isn't adequately analysed by appealing to a physical model.
Suppose the participant called Bob, before opening either envelope, tells himself that he knows absolutely nothing regarding the smallest quantity of dollars S that has been inserted into one of the two envelopes:
Before opening either envelope, Bob reasons that since he knows absolutely nothing about the value of S, he should appeal to Laplace's principle of indifference (PoI) by assigning equal credence to any of the permissible values for S. He justifies this to himself by arguing that if he truly knows nothing about the value for S, then he doesn't even know the currency denomination that is used to describe S. So he assigns
P(S = s) = P(S = 2s) = P(S = 3s) = P(S = 4s) ..... for every positive number s.
There is only one "distribution" satisfying those constraints, namely the constant function P(S) = c, which cannot be normalised, where c is any positive number and can therefore be set to c = 1. This is called an 'improper prior', and its use often results in conflicting credence estimates, as shown by other paradoxes, such as Bertrand's Paradox.
Having chosen this so-called "prior", Bob reasons that when conditioned on the unknown quantity S, the unknown quantity X in his unopened envelope has the value S with a subjective probability p, else the value 2S with subjective probability (1 - p):
P(X | S) = p Ind (X,S) + (1- p) Ind (X, 2S) (where Ind is the indicator function)
He again appeals to PoI and assigns p = 1/2 (which is merely a non-informative proper prior).
Substituting his choices for P(S) and p, Bob realises that the unnormalised joint distribution P(S,X) describing his joint credences for S and X is
P (S , X) is proportional to 0.5 Ind (X, S) + 0.5 Ind (X, 2S)
Summing over S, he derives his credences for X, namely P(X) that he realises is also an improper prior.
P(X) is proportional to 1
Consequently, his subjective 'unnormalized' posterior distribution (which does in fact sum to 1, but is nevertheless the ratio of the two unnormalised distributions P(S,X) and P(X) ) is described by
P (S | X) is proportional to 0.5 Ind (X, S) + 0.5 Ind (X, 2S)
Bob wonders what would happen if he were to naively compute expectations over this 'unnormalised' distribution. He decides to compute the implied expectation value for the other envelope V conditioned on the value X of his own unopened envelope:
P (V = 2x | X = x ) = P(S = x | X = x) = 0.5
P(V = 0.5 x | X = x) = 1 - P(S = x | X = x) = 0.5
E [V | X ] = 5/4 X
Bob decides that he cannot accept this expectation value, because it contradicts his earlier credences that are totally agnostic with regards to the states of S and X. However, Bob also knows that this conditional expectation value is a fallacious value, due to the fact that his subjective probability distribution P(S | X) isn't really normalised, in the sense of it being the ratio of two unnormalised distributions P(S,X) and P(X).
Bob therefore knows how to avoid the paradox, without needing to revise his earlier credences.
Crucially, Bob realises that his 'unnormalised' subjective distribution P(S | X) should only be used when calculating ratios of P(S | X) .
So instead of strongly concluding that E [ V | X ] = 5/4 X, which involved averaging with respect to an unnormalised posterior distribution P(S | X), he reasons more weakly and concludes only
P(V= 2x | X = x) / P( V = 0.5x | X = x) = 1
Which merely states that his credences for V=2X and V=0.5X should be the same.
So if Bob is mad enough to reason with subjective probability distributions (which IMO should never be used in science, and which can be avoided even when discussing credences by using imprecise probabilities), Bob can nevertheless avoid self-contradiction without revising his earlier credences, simply by recognising the distinction between legitimate and non-legitimate expectation values.
As an aside, AFAIK Bayesian applied statistics uses priors not based on previously collected data all the time. I don't know if researchers care about the distinction between subjective and objective priors nowadays.
Don't you mean:
[math]E(x) = {1\over2}({2x}) + {1\over2}({x}) = 1.5x[/math]
?
The [math]x[/math] in that case wasn't referring to the smaller value but to the value of the other envelope. I'll rephrase it to make it clearer:
Let [math]x[/math] be the amount in one envelope and [math]2x[/math] be the amount in the other envelope.
Let [math]y[/math] be the amount in the chosen envelope and [math]z[/math] be the amount in the unchosen envelope.
[math]P(y = x) = P(z = 2y) = {1\over2}\\P(y = 2x) = P(z = {y\over2}) = {1\over2}[/math]
[math]E(z) = {1\over2}2y[where\space y={1\over2}z] + {1\over2}({y\over2})[where\space y=2z]\\E(z) = {1\over2}({2z\over2}) + {1\over2}({2z\over2}) = {z\over2} + {z\over2} = z[/math]
[math]P(y = x) = P(z = 2y = 2x) = {1\over2}\\P(y = 2x) = P(z = {y\over2} = x) = {1\over2}\\\\E(z) = {1\over2}2y[where\space y={1\over2}z = x] + {1\over2}({y\over2})[where\space y=2z = 2x]\\\\E(z) = {1\over2}({2x}) + {1\over2}({2x\over2}) = x + {x\over2} = 1.5x = E(x)[/math]
I agree with that. I think I wrote something similar, but with more words, in my reply here. I think that's the right generating mechanism for what's random in the scenario. The agent doesn't know which of (x,2x) they have. But they know (x,2x) are the amounts in the envelopes. Rather than the agent doesn't know which of (x/2, 2x) is in the other envelope - since (x/2, 2x) isn't a pair of envelopes, it's a representation of the agent's epistemic uncertainty that doesn't reflect the uncertainty of allocating amounts to the envelope pair. The variable they ascribe to the other envelope doesn't describe the allocation of amounts to envelopes at all; so their expectation, based on that variable, can't be expected to reflect the true expectation of switching.
But if in reality, someone told them they had 10 in their envelope, flipped a coin to choose between 5 and 20, shoved the resultant amount into another envelope, and presented the choice to switch to the agent - in that scenario they should switch.
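The difference between the two generating mechanisms can be sketched in a few lines. This is only an illustration under assumptions of mine: the amounts (10 and 20, or 10 with a coin flip between 5 and 20), the trial count, and the function names `pair_mechanism`, `coin_mechanism` and `average_gain` are not from the thread.

```python
import random

def pair_mechanism(rng, x=10):
    """Envelopes hold (x, 2x); the agent picks one of them at random."""
    envelopes = [x, 2 * x]
    rng.shuffle(envelopes)
    return envelopes[0], envelopes[1]   # (mine, other)

def coin_mechanism(rng, mine=10):
    """Agent is handed `mine`; a fair coin then puts mine/2 or 2*mine in the other."""
    other = rng.choice([mine / 2, 2 * mine])
    return mine, other

def average_gain(mechanism, trials=100_000, seed=1):
    """Mean of (other - mine), i.e. the average gain from always switching."""
    rng = random.Random(seed)
    gain = 0.0
    for _ in range(trials):
        mine, other = mechanism(rng)
        gain += other - mine
    return gain / trials

pair_gain = average_gain(pair_mechanism)
coin_gain = average_gain(coin_mechanism)
print(pair_gain, coin_gain)
```

In the pair mechanism the gain from switching is +10 or -10 with equal probability, so it averages out to roughly zero; in the coin mechanism it is +10 or -5, so it averages out to roughly 2.5, which is the 5/4 figure applied to a genuinely fixed 10.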
Do we have the same take?
I don't see any disagreement.
Same. Alright, I'm a bit less confused now. Thank you.
What are we supposed to learn here?
I mainly take it as a logic puzzle: there's a bit of fallacious reasoning, you have to spot it, explain why it's fallacious, and why it's attractive, easy to fall into.
On the other hand there is a way of taking it as an interesting probability problem, where you have to model several different sorts of issues, and under some additional conditions there are interesting switching systems available. (I don't have the background for that stuff anyway, as you know.)
So there's kinda an old school and a new school response possible.
The old school take is just another case study in faulty reasoning, but might have some interest in how people elide different sorts of uncertainty and chance. And last time I recall speculating about some use for the new school approach. (Something about improving performance without feedback -- it was pretty half-baked.)
Besides an opportunity to play with our respective toys, do you get anything out of this?
I could imagine using it for teaching probability modelling. Get students to analyse the problem. Then do it IRL with both sampling mechanisms. Should be a cool demonstration of "physical" differences between what's seen as a merely "epistemic" probability assignment!
Even if we accept the premise (as I do) that [math] P(z = 2y) = P(z = {y\over2}) = {1\over2}[/math], there's still no reason to switch, so the paradox has nothing to do with probability at all.
Eh, probability modelling also includes assigning random variables. It has a lot to do with what random variables you put in play.
But in this case they're using a variable to represent more than one value at the same time. It's a fallacy to add [math]y[/math] to [math]{1\over4}y[/math] for two different values of [math]y[/math].
This is most obvious when we assume two values for the sake of argument, e.g. £10 and £20, where [math]y[/math] is the value of the chosen envelope and [math]z[/math] the value of the unchosen envelope:
[math]P(y = 10) = P(z = 2y) = {1\over2}\\P(y = 20) = P(z = {y\over2}) = {1\over2}[/math]
[math]E(z) = {1\over2}2y + {1\over2}({y\over2}) = {5\over4}y\\E(z) = {1\over2}(2\cdot10) + {1\over2}({20\over2}) = {5\over4}12[/math]
Given that, assuming one envelope contains £10 and the other £20, the chosen envelope doesn't contain £12, it is false to assert that [math]E(z) = {5\over4}y[/math], where [math]y[/math] is the value of the chosen envelope. [math]y[/math] is in fact a different value entirely.
This is true for all [math]x[/math] and [math]2x[/math] pairs.
That's all the paradox is.
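The arithmetic for the £10/£20 case can be checked exactly. A minimal sketch, with variable names of my own choosing, that plugs into each term the value [math]y[/math] actually has in that case and then solves [math]{5\over4}y = E(z)[/math] for [math]y[/math]:

```python
from fractions import Fraction

half = Fraction(1, 2)
pair = (10, 20)   # assumed contents, as in the example above

# Each term uses the value y actually takes in that case:
# y = 10 when z = 2y, and y = 20 when z = y/2.
E_z = half * (2 * pair[0]) + half * Fraction(pair[1], 2)
E_y = half * pair[0] + half * pair[1]

# The y that would make (5/4) * y equal to E(z).
implied_y = E_z / Fraction(5, 4)

print(E_z, E_y, implied_y)   # 15 15 12
```

Both envelopes have the same expected value, and the [math]y[/math] implied by the 5/4 formula is neither of the two amounts in the pair.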
With respect to the case where we open the chosen envelope to find £10:
[math]P(y = x|y =10) = P(z = 2y|y =10) = {1\over2}\\P(y = 2x|y =10) = P(z = {y\over2}|y =10) = {1\over2}[/math]
[math]E(z) = {1\over2}2y + {1\over2}({y\over2})\\E(z) = {1\over2}2x + {1\over2}({2x\over2}) = {3\over2}x\\E(z) = {1\over2}(2\cdot10) + {1\over2}({2\cdot5\over2}) = {3\over2}8.\bar{3}[/math]
Given that the chosen envelope contains £10, the smaller envelope doesn't contain £8.333..., and so once again [math]E(z)[/math] commits a fallacy in using the same variable to represent more than one value at the same time.
Given a set {10, 20}, the expected value of a number selected from that set is 15. There's nothing wrong with your first set of equations, and it gives the right answer. You don't have to go through all that; you just need the average.
The second set of equations is different.
Are the situations described in the following questions the same? If not, what's the difference?
(1) What are the chances that y = x and the chances that y = 2x if y is chosen randomly from a set {x, 2x}? (You may, if you like, write it backwards as x = y and x = y/2.)
(2) What are the chances that a y chosen randomly from a set {x, 2x} was chosen from a set {y, 2y} and the chances it was chosen from a set {y/2, y}?
My concern is in explaining where the switching argument goes wrong.
The switching argument says that because [math]E(z) = {5\over4}y[/math], where [math]y[/math] is the value of the chosen envelope and [math]z[/math] the value of the unchosen envelope, it is rational to switch.
But given the set [math]\{10, 20\}[/math], [math]E(z) = 15 = {5\over4}12[/math], and [math]12[/math] isn't the value of the chosen envelope.
The same reasoning holds for all [math]\{x, 2x\}[/math] sets (where [math]x \gt 0[/math]).
What this shows is that the [math]y[/math] in [math]E(z) = {5\over4}y[/math] isn't the value of the chosen envelope, and so it does not suggest to switch.
Quoting Srap Tasmaner
I think I explained it best here:
And if the above is true then so too is [math]P(y = 2x|y = 10) = {1\over2}[/math].
Learning the value of the chosen envelope is an uninformative posterior, and so the prior probability of picking/having picked the envelope containing the smaller amount is maintained.
That's true. It's not in the set. Neither is 15. But 15 is, for all that, the expected value of both envelopes.
The total value of any pair of envelopes is 3x, where x is the smaller of the two. I choose one, and call it y; if I got the smaller then y is x, if the larger then y is 2x; so x is either y or y/2. The total value of the envelopes is then either 3y or 3y/2, so the average total value is 9y/4. Alternatively, you could just say that since x is y or it's y/2, the average value of x is 3y/4, again making a total of 9y/4, on average. By definition, my envelope is y, so the other envelope must be worth 5y/4, on average.
Where have I gone wrong?
I explained it in more detail in that earlier post above. The variable [math]y[/math] is used to represent three different values, two of which are the possible values of the chosen envelope, and the third (the one in [math]{5\over4}y[/math]) isn't the value of the chosen envelope, and so the conclusion that the unchosen envelope has a greater expected value than the chosen envelope doesn't follow.
1. [math]y = x[/math] or [math]y = 2x[/math]
2. [math]E(z) = {5\over4}y = {3\over2}x[/math]
3. [math]y = {6\over5}x[/math] (solving [math]{5\over4}y = {3\over2}x[/math] for [math]y[/math])
3 contradicts 1.
This is clearer if we assume the value of [math]x[/math] for the sake of argument:
1. [math]y = 10[/math] or [math]y = 20[/math]
2. [math]E(z) = {5\over4}y = 15[/math]
3. [math]y = 12[/math] (solving [math]{5\over4}y = 15[/math] for [math]y[/math])
3 contradicts 1.
So the conclusion of the switching argument, that [math]E(z) = {5\over4}y[/math], where [math]y[/math] is the value of the chosen envelope, is false.
Despite the initial definition, the [math]E(z)[/math] formula covertly redefines [math]y[/math].
And for the case where we know that [math]y = 10[/math]:
1. [math]x = 5[/math] or [math]x = 10[/math]
2. [math]E(z) = {3\over2}x = {5\over4}10[/math]
3. [math]x = 8.\bar{3}[/math] (solving [math]{3\over2}x = {5\over4}10[/math] for [math]x[/math])
3 contradicts 1.
The paradox arises because the same variable ([math]x[/math] and/or [math]y[/math]) is used to represent more than one value. It's a disguised fallacy, unrelated to any probability assignments.
The math is completely OK.
But the experiment cannot be set up because the distribution of the envelopes is not producible.
Give an example of a set of envelopes and I'll demonstrate.
I can be sure it isn't as per the post immediately before yours.
The task is quite simple. Name a few amounts that are in a set of envelopes and it becomes quite clear where the paradox is.
This and this explain it quite clearly I think.
I think you're conflating two different expectations. I think your post should read :
Let [math]y[/math] be the value of the chosen envelope and [math]z[/math] be the value of the unchosen envelope.
1. Let [math] y = x [/math] or [math] y = 2x [/math]
2. [math] E[z | y ] = {5\over4}y [/math] and [math] E[y | x] = {3\over2}x [/math]
This is the reasoning that leads to the switching argument:
[math]P(y = x) = P(z = 2y) = {1\over2}\\P(y = 2x) = P(z = {y\over2}) = {1\over2}[/math]
[math]E(z) = {1\over2}2y + {1\over2}({y\over2}) = {5\over4}y[/math]
But then by the exact same logic (or by performing the appropriate substitutions):
[math]E(z) = {1\over2}2x + {1\over2}x = {3\over2}x[/math]
Therefore:
[math]E(z) = {5\over4}y = {3\over2}x[/math]
Yes, my previous criticism was directed more towards the confusing use of notation with respect to expectation values. It is clearer to condition expectation values on the variables and information they assume.
But the switching argument isn't invalid; its contradictory expectation values don't rest on faulty reasoning, given acceptance of the premises. Rather, the switching argument is unsound, for among its premises is an improper prior distribution over x, the smallest amount of money in an envelope. And this premise isn't possible in a finite universe.
Intuitively, its contradictory conclusions make sense; if the smallest amount of money in an envelope could be any amount of money, and if the prior distribution over the smallest amount of money is sufficiently uniform, then whatever value is revealed in your envelope, the value of the other envelope is likelier to be higher.
I believe it does, as I showed above. It covertly redefines [math]y[/math] such that when it concludes [math]E(z) = {5\over4}y[/math], [math]y[/math] is no longer the value of the chosen envelope.
Quoting sime
It doesn't require anything like that. The only premises are that one envelope contains twice as much as the other and that I pick one at random. The probability that I will pick the envelope with the smaller amount is [math]1\over2[/math].
No covert redefinitions of y are happening, just covert pathological expectations over infinity. For the problem implicitly assumes
[math] P(x) = c [/math]
This is an improper prior that has a divergent mean.
[math] E [x] = \infty [/math]
The distribution of envelope values conditioned on the smallest value is symmetric, consisting of two spikes on the infinite plane at (x,2x) and (2x,x)
[math] P(y , z| x) = 0.5 \delta(x,2x) + 0.5 \delta (2x,x)[/math]
Summing out, say, z gives
[math] P(y | x) = 0.5 \delta(x) + 0.5 \delta (2x)[/math]
The two previous formulas imply the conditional
[math] P(z | y, x) = 0.5 \delta(2x)^{y=x} + 0.5 \delta(x)^{y=2x} [/math]
Since [math] P(x) [/math] is improper, one cannot "integrate out" x from the last equation and get a well-defined distribution. Instead one gets a function defined up to proportionality.
[math] P(z | y) \propto 0.5 \delta(2y)+ 0.5\delta((1/2)y) [/math]
Taking the expectation of [math] z [/math] over this function as if it were a distribution gives
[math] E [ z | y ] = \frac 5 4 y [/math]
Hence by symmetry
[math] E [ y | z ] = \frac 5 4 z [/math]
No redefinitions, covert or otherwise, of any variable were involved here.
But I perhaps should correct myself a bit; the expectation values aren't contradictory in the sense of
[math] E [ y | z ] > E [ z | y ] > E [ y | z ] [/math]
rather, the expectations imply contradictory strategies in the context of utility maximisation.
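To see how much work the improper prior is doing here, one can swap in a proper prior and compute [math] E [ z | y ] [/math] exactly. A minimal sketch, assuming (my choice, not part of the problem) that x is uniform on {1, 2, 4, ..., 2^(n-1)}; the function name `envelope_table` and n = 5 are likewise just for illustration.

```python
from collections import defaultdict
from fractions import Fraction

def envelope_table(n=5):
    """Exact E[z | y], E[y] and E[z] when x is uniform on {1, 2, ..., 2**(n-1)}."""
    p = Fraction(1, 2 * n)              # weight of each (x, which-envelope) case
    mass = defaultdict(Fraction)        # P(y)
    weighted_z = defaultdict(Fraction)  # sum over cases of z * P(y, z)
    for k in range(n):
        x = 2 ** k
        # Chosen envelope is either the smaller (x) or the larger (2x).
        for y, z in ((x, 2 * x), (2 * x, x)):
            mass[y] += p
            weighted_z[y] += p * z
    cond = {y: weighted_z[y] / mass[y] for y in sorted(mass)}
    e_y = sum(y * m for y, m in mass.items())
    e_z = sum(weighted_z.values())
    return cond, e_y, e_z

cond, e_y, e_z = envelope_table()
for y, ez in cond.items():
    print(y, ez, ez / y)   # value of y, E[z | y], and their ratio
print(e_y, e_z)
```

Under this assumed prior the ratio E[z | y] / y is 5/4 only for the interior values of y; it is 2 at the smallest possible y and 1/2 at the largest, and the unconditional expectations of y and z coincide.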
It doesn't require anything like that. The only premises are that one envelope contains twice as much as the other and that I pick one at random. The probability that I will pick the envelope with the smaller amount is 1/2.
There is. I explained it above. I'll do it again.
Assume, for the sake of argument, that one envelope contains £10 and one envelope contains £20, and that I pick an envelope at random.
[math]P(y = 10) = P(z = 2y) = {1\over2}\\P(y = 20) = P(z = {y\over2}) = {1\over2}[/math]
[math]E(z) = {1\over2}2y + {1\over2}({y\over2}) = {5\over4}y\\E(z) = {1\over2}(2\cdot10) + {1\over2}({20\over2}) = {5\over4}12[/math]
Notice that in [math]E(z)[/math] the variable [math]y[/math] stands for 3 different values. In the first case it stands for the value of the smaller envelope (10), in the second case it stands for the value of the larger envelope (20), and in the third case it stands for a different value entirely (12).
That third value (12) isn't the value of the chosen envelope, given that the chosen envelope contains either £10 or £20.
This is true for every possible pair of values and is true even when we don't assume the values of the two envelopes (it's just harder to see).
And it should be obvious that the probability assignments are correct. If one envelope contains £10 and one envelope contains £20 and I pick one at random then the probability that I will pick the envelope that contains £10 (the smaller amount) is [math]1\over2[/math] and the probability that I will pick the envelope that contains £20 (the larger amount) is [math]1\over2[/math].
Yes, I see that. So why are you redefining y?
Your two definitions of E[z] aren't equivalent. The first one is implicitly referring to E[z | y], the conditional expectation of z given the value y of the chosen envelope, whereas the second definition refers to E[z | x], the conditional expectation of z given the value x of the smallest envelope.
I'm not redefining y, the switching argument is. I'm showing you what it covertly does.
I haven't redefined y, and still derive the switching argument from its premises.
They are equivalent.
There are two, equally probable, situations that [math]E(z)[/math] uses:
1. [math]z = 2y[/math] and [math]y = 10[/math]
2. [math]z = {y\over2}[/math] and [math]y = 20[/math]
So it's doing this:
[math]E(z) = {1\over2}2y[where\space y=10] + {1\over2}({y\over2})[where\space y=20][/math]
I'm just making this more explicit:
[math]E(z) = {1\over2}(2\cdot10) + {1\over2}({20\over2})[/math]
Which leads to:
[math]E(z) = 15 = {5\over4}12[/math]
Hence the redefinition of [math]y[/math]. It no longer stands for the value of the chosen envelope given that the value of the chosen envelope isn't 12.