Non-Normalizable Probability Measures for Fun and Profit

By Sean Carroll | May 17, 2010 9:43 am

Here’s a fun logic puzzle (see also here; originally found here). There’s a family resemblance to the Monty Hall problem, but the basic ideas are pretty distinct.

An eccentric benefactor holds two envelopes, and explains to you that they each contain money; one has two times as much cash as the other one. You are encouraged to open one, and you find $4,000 inside. Now your benefactor — who is a bit eccentric, remember — offers you a deal: you can either keep the $4,000, or you can trade for the other envelope. Which do you choose?

If you’re a tiny bit mathematically inclined, but don’t think too hard about it, it’s easy to jump to the conclusion that you should definitely switch. After all, there seems to be a 50% chance that the other envelope contains $2,000, and a 50% chance that it contains $8,000. So your expected value from switching is the average of what you will gain — ($2,000 + $8,000)/2 = $5,000 — minus the $4,000 you lose, for a net gain of $1,000. Pretty easy choice, right?

A moment’s reflection reveals a puzzle. The logic that convinces you to switch would have worked perfectly well no matter what had been in the first envelope you opened. But that original choice was completely arbitrary — you had an equal chance of choosing either envelope. So how could it always be right to switch after the choice was made, even though there is no Monty Hall figure who has given you new inside information?

Here’s where the non-normalizable measure comes in, as explained here and here. Think of it this way: imagine that we tweaked the setup by positing that one envelope had 100,000 times as much money as the other one. Then, upon opening the first one, you found $100,000 inside. Would you be tempted to switch?

I’m guessing you wouldn’t, for a simple reason: the two alternatives are that the other envelope contains $1 or $10,000,000,000, and they don’t seem equally likely. Eccentric or not, your benefactor is more likely to be risking one dollar as part of a crazy logic game than to be risking ten billion dollars. This seems like something of an extra-logical cop-out, but in fact it’s exactly the opposite; it takes the parameters of the problem very seriously.

The issue in this problem is that there couldn’t be a uniform distribution of probabilities for the amounts of money in the envelopes that stretches from zero to infinity. The total probability has to be normalized to one, which means that there can’t be an equal probability (no matter how small) for all possible initial values. Like it or not, you have to pick some initial probability distribution for how much money was in the envelopes — and if that distribution is finite (“normalizable”), you can extract yourself from the original puzzle.

We can make it more concrete. In the initial formulation of the problem, where one envelope has twice as much money as the other one, imagine that your assumed probability distribution is the following: it’s equally probable that the envelope with less money has any possible amount between $1 and $10,000. You see immediately that this changes the problem: namely, if you open the first envelope and find some amount between $10,001 and $20,000, you should absolutely not switch! Whereas, if you find $10,000 or less, there is a good argument for switching. But now it’s clear that you have indeed obtained new information by opening the first envelope; you can compare what was in that envelope to the assumed probability distribution. That particular probability distribution makes the point especially clear, but any well-defined choice will lead to a clear answer to the problem.
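This is easy to check numerically. Here is a small Monte Carlo sketch of exactly that setup (the uniform prior, the $10,000 threshold, and the trial count are just the illustrative assumptions from the paragraph above):

```python
import random

def play(strategy, trials=200_000, seed=1):
    # Average payoff of `strategy` when the smaller amount is drawn
    # uniformly from $1..$10,000 (the assumed prior from the text).
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        low = rng.randint(1, 10_000)
        envelopes = (low, 2 * low)
        pick = rng.randrange(2)
        seen = envelopes[pick]
        total += envelopes[1 - pick] if strategy(seen) else seen
    return total / trials

always_keep = lambda seen: False
always_switch = lambda seen: True
informed = lambda seen: seen <= 10_000   # switch only if the other could be larger

print(play(always_keep), play(always_switch), play(informed))
```

Always keeping and always switching come out the same on average, while the prior-aware strategy does strictly better: the usable information really does come from comparing the observed amount against the assumed distribution.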


  • TimG

    Do you mean for the assumed probability distribution in your last paragraph to go from $1 to $20,000, not $1 to $10,000 as you currently state it?

  • Sean

    No; if the envelope with less money is between $1 and $10,000, the envelope with more money would be between $2 and $20,000.

  • TimG

    Ah, sorry, I misread it.

  • Jim

    This sounds similar to the reasons that many rational actors shouldn’t insure against “infinite” losses.

    Take for example the present situation in the Gulf. What additional amount of money should BP have spent to avoid this gushing oil well? If there was only an extremely small chance that this well would completely destroy the Gulf and Atlantic ecosystems, there would have been no reason to spend (the inverse of switching envelopes) a tremendous amount of money just to add a little more safety, because the company would not survive the lawsuits (keeping the envelope) anyway.

  • ollie

    Delightful problem!

    What you are doing is showing that the practical problem is NOT equivalent to the following theoretical problem: you have two envelopes, each with a number in it. One number is X and the other is 2X. I offer you Y dollars right now. If you switch and get the envelope with X, you get Y/2 dollars; if you switch and get 2X, you get 2Y dollars. You have no way of knowing whether you have envelope X or 2X.

  • Sili

    I can’t decide if I should say “$10,002” or just accept that between is exclusive and inclusive in this case.

  • ollie

    I just thought about it a bit more: this is really a conditional probability problem, isn’t it? The question really is “what is the probability that I have the larger amount given that the upper bound is approximately X dollars”, isn’t it?


  • John

    I always loved that problem. I came across it in high school, and couldn’t get over the fact that it was so clearly and obviously better to switch, but equally clear and obvious that it couldn’t matter. I didn’t figure it out until I decided I would write a little computer program to try it a bunch of times and figure out which was right. The instant I started working on the simulation, it became obvious :)

    It is also a great example of how people approach problems and especially disagreements. I’ve given the problem to groups of people before, and it’s amazing how hard it is to get people to refute one argument once they’ve settled on the second. Once people take sides, it can be incredibly hard to make them address the counter argument, other than by saying “I’m right, and so you must be wrong”. But of course, when both sides are “equally” right, you get people just repeating what they’ve said and totally shutting out the counter argument.
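    A minimal version of a simulation like the one John describes (this is a hypothetical reconstruction, not his actual code; the amounts and trial count are arbitrary):

```python
import random

def average_payoffs(x=4_000, trials=100_000, seed=0):
    # To simulate, you are forced to decide how the envelopes are filled;
    # here, a fixed pair (x, 2x) shuffled each round.
    rng = random.Random(seed)
    keep_total = switch_total = 0
    for _ in range(trials):
        envelopes = [x, 2 * x]
        rng.shuffle(envelopes)
        first, other = envelopes
        keep_total += first     # payoff if you keep your pick
        switch_total += other   # payoff if you always switch
    return keep_total / trials, switch_total / trials

print(average_payoffs())  # both averages sit near 1.5 * x
```

    The point becomes obvious exactly as John says: to write the program at all, you must decide how the envelopes get filled, and once the pair is fixed, keeping and switching average out identically.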

  • Koray

    I posted the following at the original blog:

    You have to identify the random events first.

    I have two envelopes: red and blue and I randomly pick one envelope to put the higher amount in. This is the random event that happens first. (Let’s say I picked blue.)

    Secondly, you pick one envelope at random, say, red. This is the second and *final* random event.

    There are *no more random events*. No matter how much you’d like to believe that “the other envelope has X% chance”, it does not since money doesn’t teleport between envelopes while you keep changing your mind. Everything’s been already decided.

  • Cody

    Reminds me of a problem Randall Munroe posted a while back, with intriguing results. I think it resolves Ollie’s first formulation of the problem as well. It doesn’t yet sit comfortably with me, but I get the idea and other people seem plenty confident that it works.

    Randall’s post:

    This cool puzzle (and solution) comes from my friend Mike.

    Alice secretly picks two different real numbers by an unknown process and puts them in two (abstract) envelopes. Bob chooses one of the two envelopes randomly (with a fair coin toss), and shows you the number in that envelope. You must now guess whether the number in the other, closed envelope is larger or smaller than the one you’ve seen.

    Is there a strategy which gives you a better than 50% chance of guessing correctly, no matter what procedure Alice used to pick her numbers?

    I initially thought there wasn’t, or that the problem was paradoxically defined, but it turns out that it’s perfectly valid and the answer is “Yes.” See my first comment for an example of one such winning strategy.
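    For the curious, here is one winning strategy of the kind the puzzle asks for (a sketch, not necessarily the one from Randall’s comment thread): draw a random threshold from any distribution that puts positive probability on every interval, and guess that the hidden number is larger exactly when the shown number falls below the threshold.

```python
import random

def guess_hidden_is_larger(shown, rng):
    # Random threshold from a full-support distribution (a Gaussian here,
    # an arbitrary choice); guess "larger" iff the shown value is below it.
    return shown < rng.gauss(0.0, 10.0)

def win_rate(a, b, trials=200_000, seed=0):
    # Alice's two numbers a, b are fixed; one is revealed uniformly at random.
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        shown, hidden = (a, b) if rng.random() < 0.5 else (b, a)
        wins += guess_hidden_is_larger(shown, rng) == (hidden > shown)
    return wins / trials

print(win_rate(3, 7))  # strictly above 0.5
```

    Whenever the threshold lands strictly between Alice’s two numbers, you win for sure; otherwise it is a fair coin flip, so the overall win rate exceeds 50% for any fixed pair.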

  • Alex

    Suppose the problem states that the amount of money given away will not exceed N. For all cases in which the first envelope contains an amount greater than N/2 the solution is trivial, but if the first envelope contains n, n < N/2 and the amounts are randomly determined (but constrained by N) it seems that you always benefit by switching, the original paradox. Have I completely missed the point?

    edit: I see the point now… I did miss it the first time:p

  • Eugene

    That’s funny that you post this. We were just having a workshop over the weekend on Eternal Inflation and were arguing about this!

  • Strether

    Sean & Koray —

    Isn’t it easier to follow Keynes and to think of this problem as relating to “the weight of the evidence,” under uncertainty, rather than as a conventional “probability” problem? I don’t see how hypothesizing a “probability distribution” before you open the first envelope makes anything easier.

    All you want to know is whether, upon seeing the amount of money in the first envelope, you have any probative evidence — or are, instead, just as ignorant (uncertain) as you were before. In order to think this through, you *don’t need to “pick an initial probability distribution” for all amounts of money.* You can just count the money in the first envelope, and *then* ask yourself whether you have a good reason to favor (or disfavor) the possibility that the other envelope contains 2x that amount. That’s the only bet that matters! There is no other relevant probability, and I submit that the “initial distribution” is both total speculation and window dressing for the correct solution to the puzzle.

  • Strether

    And I think I agree with BK’s comment on Cody’s puzzle theorem:

    “…. the [proposed solution] implicitly assumes that A and B are independently drawn. The logistic distribution assumes zero is the halfway point, and that values closer to plus or minus infinity are less likely than values near zero. We assume that 10 is less likely than nine (i.e. that Alice isn’t being human). These are all probably OK, but add structure that was not stated in the problem. Given what was written, the only not-incorrect distributional assumption would be a neutral prior, like the improper uniform distribution (p(x)=1 for all x), where the CDF is hopelessly undefined, and we’re back to not having a way to guess which envelope has the bigger value. Instead, you’ve made a number of assumptions in the logistic distribution, and then used those assumptions to assign subjective probabilities. That’s a fine human way to approach the problem, but works exactly to the extent that you believe the assumptions.”

  • Nav

    There are some interesting variations on the two envelope problem which are not so easily resolved. See this post of Tim Gowers (and comments) for a discussion:

    Here’s a summary:

    Let there be two envelopes with amounts 10^n and 10^{n+1} dollars, and let P(n) = 2^{-n}, where n is an integer greater than zero. Here the distribution is normalizable, but it still seems like you should switch. Given that you’ve picked an envelope with, say, 10^m dollars, then either n=m-1 (i.e. you picked the larger envelope, with prior weight 2^{-(m-1)}) or n=m (the smaller envelope, with prior weight 2^{-m}). This gives conditional probabilities of 2/3 of having the larger envelope and 1/3 of having the smaller. However, since the potential gain from switching is much larger, it’s still worth it – the expected return from switching is a factor of (2/3)(1/10) + (1/3)(10) = 3.4.

    Any thoughts?
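    The Bayes arithmetic in this summary can be checked exactly; note that the posterior works out to 2/3 for the larger envelope and 1/3 for the smaller, which is what produces the switching factor of 3.4:

```python
from fractions import Fraction

def posterior_larger(m):
    # You see 10^m (m >= 2). Either n = m-1, i.e. you hold the larger
    # envelope (prior weight 2^-(m-1)), or n = m, the smaller (weight 2^-m).
    w_larger = Fraction(1, 2 ** (m - 1))
    w_smaller = Fraction(1, 2 ** m)
    return w_larger / (w_larger + w_smaller)   # = 2/3, independent of m

def expected_switch_factor(m):
    p_larger = posterior_larger(m)   # switching multiplies your money by 1/10
    p_smaller = 1 - p_larger         # switching multiplies your money by 10
    return p_larger * Fraction(1, 10) + p_smaller * 10

print(posterior_larger(5), float(expected_switch_factor(5)))
```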

  • dtlocke

    “Like it or not, you have to pick some initial probability distribution for how much money was in the envelopes — and if that distribution is finite (”normalizable”), you can extract yourself from the original puzzle.”

    And what is the solution if the probability distribution is not finite?

  • Brendon Brewer

    >>All you want to know is whether, upon seeing the amount of money in the first envelope, you have any probative evidence — or are, instead, just as ignorant (uncertain) as you were before. In order to think this through, you *don’t need to “pick an initial probability distribution” for all amounts of money.*<<

    Except that you do…any rule for deciding your state of knowledge after obtaining the information is equivalent to a choice for the prior probabilities.

  • Brendon Brewer

    >>And what is the solution if the probability distribution is not finite?<<

    Then there is no well-defined answer: with an improper prior, the expected value of either envelope diverges to infinity.

    Thank you Ed Jaynes.

  • Sean

    Nav– that’s a good extension of the puzzle. But I tend to believe the resolution that is alluded to right in the post you linked — the probability distribution over dollars is actually not normalizable. (The expected payoff is infinite.)

  • Koray


    You haven’t learned anything that favors the other envelope, which is why people tend to call this a paradox: they think the “math” tells them otherwise.

    In the 10^n with 2^(-n) variant it’s the same mistake. Re-stating the problem:
    * 2nd random event: they pick red or blue at random to place the larger amount.
    * 3rd random event: you pick red or blue at random.

    I didn’t even list the 1st random event because it’s irrelevant. You can draw a decision tree and assign any probability you want for generation of amounts in envelopes, but you’ll see that all that matters is whether their pick matches yours.

  • Strether

    @21, Koray, I *think* I’m trying to say the same thing you are: The math leads people astray because they insist on creating a “probability distribution” when the problem is really one of uncertainty, in the sense J.M. Keynes described. All that matters is whether you can find some “ticket” out of the fundamental uncertainty — like the idea of the upper bound on the payoff that Sean introduced.

    @18, B Brewer, no, you actually *don’t* need to choose a probability function (1) “before” opening the first envelope or (2) for any dollar amount *other than* 2x what’s in that envelope. So the whole idea of generalized (but finite) “initial probability distribution” isn’t doing any work here.

    Or look at it another way: 300 years ago, no one would have “picked an initial probability distribution,” because they didn’t know what that was. But they could still solve this puzzle (in situations when it can be solved at all) by common sense. The solution can be dressed up in “probability” language, but you’re really just assigning math symbols to a guess (which creates an illusion of hard rationality, which generates the “paradox,” because the true nature of the guess is buried in the math, rather than helpfully exposed by it).

  • Moshe

    Interesting example; maybe this can be phrased as an anomaly: the symmetry between the two initial envelopes is broken by any attempt to regulate the problem. Not sure what this says about eternal inflation, where the benefactor with unlimited resources is the spatially infinite universe; maybe assuming it is strictly infinite makes no logical sense?

  • Brian137

    There would be a natural upper bound on the possibilities: there are only a finite number of dollars in circulation. Before we hit that limit, depending on the size of the envelopes, only a certain number of bills could fit inside (ask any drug dealer).

  • Strether

    I recommend the comment of Bernard Kirzner, M.D., at the page linked by Nav.

  • TwoHalves

    See also “The Error in the Two Envelope Paradox”, arXiv:physics/0608172, which retains a flat distribution.

  • steeleweed

    The difference between theory and reality:

    If I flip a coin, it’s 50/50 on coming up Heads. If I flip it a second time, it’s still 50/50, presumably ad infinitum. But if I flip Heads 1000 times in a row, would you still consider the 1001st flip a 50/50? Not me!

    The odds against an honest, truly random coin yielding 1000 consecutive Heads are far longer than the odds against the coin being crooked and non-random.
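    That instinct is a Bayes computation in disguise. With a made-up prior (the one-in-a-million figure below is purely illustrative), even a tiny suspicion of a two-headed coin dominates after 1000 straight heads:

```python
from fractions import Fraction

# Illustrative numbers: a one-in-a-million prior that the coin is two-headed.
prior_crooked = Fraction(1, 10**6)
p_heads_fair = Fraction(1, 2) ** 1000   # P(1000 straight heads | fair coin)
p_heads_crooked = Fraction(1)           # P(1000 straight heads | two-headed)

posterior_crooked = (prior_crooked * p_heads_crooked) / (
    prior_crooked * p_heads_crooked + (1 - prior_crooked) * p_heads_fair
)
print(float(posterior_crooked))  # indistinguishable from 1.0
```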

  • Tim

    A related problem:
    Suppose you have two cards with a random number on each. You have to turn over one card, observe the number and then judge whether it is larger than the hidden number on the second. Obviously you can win exactly half the time by always stopping with the first number, or always stopping with the second number, without even peeking.

    Claim: There exists a strategy with which you can win this game more than half the time.

    (Readers take comfort: When mathematicians first heard this claim, many of us found it implausible)
    Solution in Ted Hill’s “Optimal Stopping” article:

  • Joshua Zelinsky

    My favorite version of this puzzle is a similar argument with two people considering a deal to switch how much money each has in their wallet. Almost identical logic suggests that both have a positive expected gain from switching. The problem is the same: the lack of a consistent uniform distribution.

  • Craig

    Whenever you follow the proper rules for statistical analysis and end up with a paradox you should say ‘The problem is poorly formulated’. There is no single correct answer to what is wrong with the problem. The Monty Hall problem is completely different because it can be formulated so that no paradoxes emerge.

  • OXO

    I took the view that if the opened envelope had $4000 cash, the other one could contain either:

    $2000 and a cheque for $1,000,000 (not cash you see, well you did say he was eccentric)



  • Philoponus

    Sean, your point about the probability distribution is well taken, but what drives me crazy in these examples are the assumptions (or maybe stipulations) that the utility of the bet is well measured by its nominal payoff ($x) and that this utility increases linearly or in some other uniform fashion ($4,000 is twice as good as $2,000 and half as good for the agent as $8,000). Except for trivial bets, this is almost NEVER the case with real agents. Suppose you are a poor person whose quality of life could be significantly improved by a $4,000 windfall. Then you are absolutely crazy to risk this amount in a gamble that may cut your windfall in half. Or suppose you are another poor person who needs $7,500 for critical medical treatment. Then $4,000 won’t do it, and you must gamble for enough money. Familiar stuff, I know, but if the ultimate point here is to assess the rationality of the bets real agents make, then we cannot model their utilities with some linear nominal payoff distribution.

  • Koray


    There are probability distributions (and the problem is not poorly stated), but the 2^(-n) distribution is irrelevant. They might as well flip a coin or roll a die (biased if needed) to determine whether it’s going to be {$100, $1000} or {$10, $100} in the envelopes. Draw the tree and look at the cases where you find $100.

    Yes, uncertainty is not synonymous with randomness; a coin on my table does not have a “1/2 chance” of having heads up just because you can’t see it, imho. However, there’s an interpretation of probability called Subjective in philosophy where you are allowed to assign probabilities to things any way you want (as long as a Dutch book can’t be made against you, which keeps it still logically consistent).

  • Aaron F.

    This is the best explanation of the two-envelope paradox that I’ve ever heard! I never realized that the “paradox” comes from the absence of an obvious choice of prior, and from the convention that priors in probability puzzles are rarely stated explicitly.

    Has anyone here seen the two-envelope paradox used to illustrate the necessity of choosing a prior distribution (whether or not you realize you’re doing it), and the dependence of your predictions on the choice of prior?

  • Peter Coates

    Congratulations on a very nice explanation, but this is also a great example of why mathematical arguments are so unpersuasive to most people. ‘Should’ to a mathematician means that the expected value of a choice is greater, but personal satisfaction and expected value are not at all the same, nor is the human value of a sum of money linear in the dollar amount.

    Take the first case. If you’re a mathematics student who is four thousand dollars short of next semester’s tuition, then the four thousand in hand is life-changing. On the other hand, an additional four thousand would merely be nice to have. Gambling and ending up with two thousand might mean that you’ll be driving a cab instead of attending school.

    Even without a life-changing condition, two psychological factors argue for the opposite conclusion. The personal joy derived from windfalls tails off in something like a logarithmic curve. Finding a $100 bill only makes you a little happier than finding a $20. Moreover, winning and losing are not symmetrical for most people. Most people are more distressed by loss than they are pleased by gain.
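    The logarithmic-joy point has a tidy consequence for this puzzle: if utility is logarithmic in money, the naive 50/50 switching computation yields exactly zero expected utility gain, since ½·log(2Y) + ½·log(Y/2) = log Y.

```python
import math

Y = 4_000.0  # illustrative amount in the opened envelope
keep_utility = math.log(Y)
switch_utility = 0.5 * math.log(2 * Y) + 0.5 * math.log(Y / 2)
# The log(2) terms cancel, so the two utilities agree (up to float rounding).
print(keep_utility, switch_utility)
```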

  • Sam Gralla

    Thank you for pointing out this awesome example (of why speculating about initial conditions for the universe is meaningless)

  • efp

    “After all, there seems to be a 50% chance that the other envelope contains $2,000, and a 50% chance that it contains $8,000.”

    This is the mistake in the problem. You’re treating the amount of money in the second envelope as a random variable, when it is not. This would be the logic if you are given an amount X, and given a choice to take 2X or X/2 based on a fair coin flip. You take the flip.

    If you open the envelope and find X, it is true that the other envelope contains either 2X or X/2, but there is not a 50% chance of it being either one; it IS one or the other. The concept of ‘expectation value’ does not apply to a non-random value.

    If you start the analysis calling the lesser value X, there being a 50% probability of having chosen X or 2X to begin with, then the expectation value of switching or not is identical.
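    The contrast is easy to see numerically. In the genuinely stochastic version (the coin-flip game described above), switching really does pay; a quick sketch with arbitrary amounts:

```python
import random

def flip_game(x=4_000.0, trials=200_000, seed=2):
    # You hold x; a fair coin then decides whether the alternative is 2x or x/2.
    # This is the stochastic game where the naive expected-value argument holds.
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += 2 * x if rng.random() < 0.5 else x / 2
    return total / trials

print(flip_game())  # close to 1.25 * x
```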

  • Strether

    @efp, Maybe I’m heading off-track — but isn’t that also true in the Monty Hall problem? The prize IS behind one of the two curtains that Monty didn’t open. Yet the conventional solution holds that your original odds of winning the car were 1/3, and that after Monty opens a curtain, there is a 2/3 chance or expectation that the car is behind the curtain you didn’t pick. That seems to be treating the objectively non-random location of the car as a random variable — but it also seems correct??


  • Baby Bones

    Hi Sean,

    I’d like to draw your attention to two bizarre little problems. One is Penney’s Game. I read about it back in the seventies in an issue of Scientific American and rediscovered it when I created a similar problem. It’s a game where you pick a sequence of heads and tails, and for any choice you make, your opponent can find a sequence of the same length that is more likely to appear first. It is a binary probabilistic version of Rock, Paper, Scissors.

    The other involves infinite sums of distributions. That is, in comparing infinite sums of Gaussians and Cauchy-Lorentz distributions, one arrives at a strange insight. In Ewald’s trick to deal with charge distributions, one uses a sum of identical equally spaced Gaussians (equally space the centers of the Gaussians along the x-axis). Way back when I studied this I was confounded by the statement that the value of the sum (not the area under the curve) is periodic in the finite case. For years I thought that this was approximate because of edge effects, but eventually I realized that even if you take the infinite case, the sum will be truly periodic and finite. Furthermore, it works in any finite number of dimensions. The max value of the sum function converges.

    However, the same is not true for all distributions. If you replace the Gaussians with Cauchy-Lorentz distributions, the sum now diverges for two or more dimensions (and maybe even for one dimension). I should be careful here to say that this doesn’t have anything to do with conditional convergence in the sense of alternating series. The series are all positive definite. Hence, one is left with the strange observation that convergence depends entirely on the shape of the function and not the amount of stuff under the curve. I wonder if this points to a flaw in risk management strategies employed in the investment business. For instance, the natural tendency would be to spread out risk in the market. This phenomenon would be more like distributing according to a Lorentzian (a long-tailed distribution) rather than a Gaussian, so if you model such risk with Gaussians you will significantly underestimate the actual risk. If other investors make the same mistake, the underestimates compound.
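    The two-dimensional contrast is easy to probe numerically; a rough sketch (unit-width distributions centered on the integer lattice, evaluated at the origin):

```python
import math

def lattice_sum(f, radius):
    # Sum f(r^2) over all 2-D integer lattice points within `radius` of the origin.
    R = int(radius)
    total = 0.0
    for i in range(-R, R + 1):
        for j in range(-R, R + 1):
            if i * i + j * j <= radius * radius:
                total += f(i * i + j * j)
    return total

gauss = lambda r2: math.exp(-r2)        # unit-width Gaussian
lorentz = lambda r2: 1.0 / (1.0 + r2)   # unit-width Cauchy-Lorentz

for R in (10, 20, 40):
    print(R, lattice_sum(gauss, R), lattice_sum(lorentz, R))
```

    The Gaussian partial sums stabilize almost immediately, while the Lorentzian ones keep growing roughly like log R: exactly the shape-dependent divergence described above.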

  • efp

    @Strether, you’re not off-track; it is subtle. I had to think a while about how to answer. If I say: there are two envelopes with amounts X and 2X in them, my sample space is {X, 2X}. If I pick one with a coin flip, my expectation value is 3X/2. This is the amount my average will converge to if I repeat the process many times (of course, it is impossible to get 3X/2 on any single experiment). Suppose I pick one, and call the amount in it Y. Then either Y = X or Y = 2X. But my sample space is still {X, 2X}, because if Y = X then the other envelope MUST have 2X, and if Y = 2X then the other envelope MUST have X.

    It is correct that the other contains an amount 2Y or Y/2, and it’s even correct to say those possibilities are equally likely, but it is incorrect to use those probabilities to compute an expectation value, as the scenarios {Y, 2Y} and {Y, Y/2} represent separate sample spaces. If the amount in the envelopes is fixed at the beginning (at 3X), it is only one of these (Y=X or Y=2X), and expectation values must be computed from one or the other (with different values of Y!), both of which will come to 3X/2.

    If the amount in the envelopes is not fixed at the beginning, then it’s the coin-flip game and you will win by switching. The ‘actuality’ of what’s in the envelopes (or behind the curtains) is established at the beginning of the game, when the sample space is defined. The trick to the paradox is fooling you into thinking the sample space is {2Y, Y/2}, which is only true if it is a stochastic process.

  • efp

    I thought of a more concise way to put it: in order to compute an expectation value, you need to know what the sample space is. In this case, you need to know the total amount in the envelopes.

    If you don’t know this, you can’t (correctly) compute an expectation value after opening one envelope, any more than you could before opening either envelope.

    If you do know this, and you open one envelope, you know what is in the other.

  • Strether

    @efp. Thanks! So it sounds like — pretty much as I was saying upthread :-) — the error isn’t so much in treating the unknown as a “variable,” the error is in thinking you can deduce something about the “variability” when you don’t know the sample space. As I’d put it, we’re in the land of uncertainty, not probability.

  • Brendon J. Brewer

    >>As I’d put it, we’re in the land of uncertainty, not probability.<<

    Same thing (Cox, Jaynes, Caticha, et al)

  • skepticistical rootoftisast

    Myself, my ‘ol daddy used to say “a bird in the hand worth two in the bush”. I’d keep the first envelope if it had sufficient money to be useful to me (right now that’s anything over $4.99).

  • Ahmad

    @ 44 (Brendon): No, they’re not the same in real life. Real life is quantum mechanical (throughout). For details on the difference between Bayesian formulations of probability and quantum mechanical uncertainty, please see Feynman’s 1982 “Simulating Physics with Computers”.

    As for the discussion here, I actually agree with you. The reason they are disagreeing with you is a matter of perspective (they don’t see what you are saying), which is often the case in such discussions on probability.

  • Ahmad

    @ 45: People will be thinking you are making the silliest comment on the thread so far, but you are actually 100% correct. I would keep what I know for sure that I have. Winning in games of chance is dependent on large, isolated ensembles of the same experiment being repeated. For one chance, I would do the same thing as yourself.

  • Strether

    Hmmm …..

    Reviewing the bidding in this thread, we have one group of obviously expert people saying that no matter what you do with the envelopes, you’re implicitly using a probability distribution. (But unless I misread them, this group seems to admit that a distribution is probably useless in a given case.) We have another group which — excluding myself — seems similarly well-informed about the subject, saying that, unless boundary constraints on the size of the prize begin to kick in, it will always be logically wrong to try to assign or use an initial probability distribution.

    Is this always the way the discussion of this paradox plays out? Are we hitting on some deep philosophical divide? Or is there some authoritative source that discusses this paradox (or a similar one) and adjudicates between these analyses?

  • locke

    This discussion reminds me of the time the Monty Hall problem was popular. In talking to members of the math dept. at my college at the time (I was actually a part of that dept. but thru a historical accident as I’m no mathematician), not a single person could correctly solve the problem (and it’s quite a good liberal arts college and a well respected department). After reading the solution, EVERY SINGLE MEMBER declared that he had known that all along and that the solution was, of course, trivial.

  • Aaron Sheldon

    Sorry to burst the Bayesian priori-zing bubble, but…this puzzle only demonstrates the shortcomings in our commonplace interpretation of the expected value. For a single round of the envelope game, using statistical parameters to justify a strategy is ill-conceived. The realization of the expected profit only comes after many successive trials. You can only use arguments about being better off on average when you have the opportunity to play repeatedly, that is, the opportunity to actually realize the average. This game is the dual of the martingale doubling-up strategy, essentially reversing the roles of the house and the gambler.

  • Aaron Sheldon

    Oh and there is also a bit of a cheat in this puzzle, the conceit being the implicit use of a scale (dollars). If you re-formulate the problem as:

    A game master has two tokens marked A and B, and a sealed envelope which contains the game master’s prior choice of the relative values of A and B. In the game you choose A or B, and then are allowed to choose the other, after which the envelope is opened.

    In this context one can see the second choice is irrelevant. This also demonstrates the significant impact of the extra information of the choice of units in the puzzle. If you left out just the dollars part the puzzle is trivial.

    So what this teaches us is not anything about how counter intuitive and strange probability is, but rather how tricky our implicit assumptions are.

  • Brendon J. Brewer

    @Aaron Sheldon

    Re: Post 50. Tosh and piffle!

    Re: Post 51. Excellent point, probably the most useful post so far. :-)

  • Strether

    @49 😉
    But can anyone help with the meta-question @48?

  • Aaron Sheldon

    @53: The only way I have been able to personally rectify the debate is by adopting the following definition of probability that cleanly separates the role of empirical frequency:

    Probability is a hypothesized inference on the frequency of occurrence of future events that must agree with the actual observed empirical frequency of events in the limit of a large number of repetitions of a trial.

    There are similar statements in the literature; this is essentially an adaptation of consistency. I have found this definition useful because it allows me to reconcile the Bayesian and frequentist interpretations. The reason this works is that the statistical tests in either interpretation can be shown to be consistent estimators: the predicted probability converges to the empirical frequency. Both interpretations also suffer from the same flaw of not being particularly robust against violations of assumptions about the form of the sampling distribution.
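    As a concrete illustration of this limiting-frequency definition, here is a minimal Python sketch (the flip counts and seed are arbitrary choices, not anything from the thread): the hypothesized probability of 1/2 is only recovered as the number of trials grows.

```python
import random

def empirical_frequency(trials, p=0.5, seed=0):
    """Observed frequency of heads in `trials` simulated flips of a
    coin whose hypothesized probability of heads is `p`."""
    rng = random.Random(seed)
    heads = sum(rng.random() < p for _ in range(trials))
    return heads / trials

# A handful of flips can wander far from the hypothesized 1/2;
# a large number of flips cannot.
few = empirical_frequency(10)
many = empirical_frequency(100_000)
```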

  • Strether

    Huh. Certainly doesn’t square with the ordinary meaning of probability. If there are 3 envelopes and 2 contain money, then the “probability” of choosing one with money is 2/3, no? Doesn’t require any prior trials. But if you’re distinguishing “odds” from “probability” — why? Your definition seems to mean we can’t deduce probabilities (2 of 3 envelopes) and can only get them by induction. Why?

  • Aaron Sheldon

    Your example contains an implicit hypothesis: that 2 of 3 envelopes contain money. You can only ever statistically test that hypothesis by repeating the trial a large number of times. One trial of opening a single envelope cannot meaningfully test the hypothesis that 2 of 3 envelopes contain money.

    The other, non-statistical method is to open all three envelopes at once.

    The main reason for making such a delicate cut is to avoid the paradoxes involved in interpreting frequencies of events that have already occurred as probabilities, which leads to contradictory statements about the odds of observing events that have already occurred.

    This is in part a method of expanding statements like:

    The probability of getting a heads given the flip of a fair coin is 1/2

    Into the more precise and analytically tractable:

    Given the hypothesis that the coin and the toss are fair, the probability of flipping heads is 1/2.

    The latter statement more clearly illustrates that it will take a large number of trials to statistically test the hypothesis.

    *By paradox I mean real, actual paradoxes of the same order as those that had to be overcome during the formulation of set theory.


  • Ahmad

    Aaron, good point about the hypothesis assumption. A couple of things, too:

    There is no such thing as a fair coin. In the real world, the fairness of a coin is a continuous chaotic function of time that changes even as the coin is in the air. And by relating the ‘fairness’ of the experiment to the outcome, statistics essentially makes a grand assumption that the probability is a frequency distribution with no underlying information. You are defining the probability of one of two events to be a one-out-of-two frequency, if the setup is ‘fair’. That fairness is an illusion that can be violated very heavily depending on the nature of the chaotic function producing the outcome.

  • Brian137

    About the problem of the two envelopes, one containing twice as much money as the other, can we just say that the fact that at least one of two non-negative numbers is positive does not imply that the two numbers must be equal? Upon discovering that the envelope we have chosen contains $4000, all we really know is that at least one of the two a priori probabilities P(2000, 4000) and P(4000, 8000) is nonzero.
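    This point can be made quantitative with a small sketch: once the two prior weights are allowed to differ, the naive +$1,000 argument for switching can change sign. (The weights below are hypothetical, chosen purely for illustration.)

```python
def expected_gain_from_switching(observed, prior):
    """Expected dollar gain from trading away the `observed` envelope.
    `prior` maps (smaller, larger) amount pairs to unnormalized weights."""
    w_high = prior.get((observed / 2, observed), 0.0)  # we hold the larger amount
    w_low = prior.get((observed, observed * 2), 0.0)   # we hold the smaller amount
    total = w_high + w_low
    if total == 0:
        raise ValueError("observation impossible under this prior")
    # Switching loses observed/2 if we hold the larger amount,
    # and gains observed if we hold the smaller one.
    return (w_high * (-observed / 2) + w_low * observed) / total

# Equal weights reproduce the naive argument: expected gain of $1,000...
naive = expected_gain_from_switching(4000, {(2000, 4000): 1, (4000, 8000): 1})
# ...but a prior that makes bigger stakes rarer flips the sign.
skeptical = expected_gain_from_switching(4000, {(2000, 4000): 3, (4000, 8000): 1})
```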

  • Robert

    What happens if the money is a foreign currency that I don’t recognize and can’t value?

    It seems to me that, mathematically, switching should still be the correct move.

  • Brian137

    My friend, Hermione Granger, explained the problem to me this way. Call the amounts in the two envelopes that you are presented with Y and 2Y. If the envelope you chose contains Y dollars, then you will gain 2Y – Y = Y dollars by switching. If, on the equally likely other hand, you chose the envelope containing 2Y dollars, then you will lose 2Y – Y = Y dollars by switching. Hermione says there is no advantage in switching. I started to object, but I just can’t stand that patronizing look of hers. Too bad we can’t ask Dumbledore.

  • Tim

    I’d be like, “Woohoo, 4k in free money!”
    Then think again and say to myself, “Well, 2k would be free too, and it wouldn’t take that much to earn the other 2k to get back up to my current 4k. Certainly easier than coming up with the 4k to get to the potential 8k.”
    “Okay, gimme the other envelope.”
    That’s right; I’m not a mathematician. 😉

  • lumberjack

    I am disturbed by the contrast between “Hermione”’s elegant solution (#61) and the gross failure of what seems to be a closely related algebraic approach; i.e., why should calculating in dollar deltas work better than in dollar fractions?

    Hermione’s solution is simple and leads to the intuitively correct result, so why does the mathematical formulation of the solution in the original post fail?

    I.e. the description:

    “After all, there seems to be a 50% chance that the other envelope contains $2,000, and a 50% chance that it contains $8,000. So your expected value from switching is the average of what you will gain ($2,000 + $8,000)/2 = $5,000 minus the $4,000 you lose, for a net gain of $1,000.”

    would seem to be equivalent to:

    The value of the current envelope is, say, X:

    Vc = X

    and the expected value of the other envelope is the average of 0.5X and 2X:

    Vo = 0.5 * (0.5X + 2X) = 1.25X

    Vo > Vc for positive X and so it is always worth switching.

    Ironically, considering the original post is a non-physics problem on a physics blog, my fix is to introduce some units.

    As given, the problem involves two sums of money e.g. $2000 and $4000. Call the smaller amount one “small” (S) and the larger amount one “large” (L) where the conversion rate is 1L = 2S.

    Having chosen an envelope the value of the current envelope is either 1S or 1L, with 50% probability each:

    Vc = 0.5 * (1S + 1L) = 0.5 * (1S + 2S) = 1.5S

    When the current envelope holds an S then the other holds an L and vice versa. The expected value of the other envelope is then:

    Vo = 0.5 * (1L + 1S) = 0.5 * (2S + 1S) = 1.5S

    So, now Vc = Vo and there is no value in switching.

    Introduction of units has forced us to consider the probabilistic nature of Vc (rather than just calling it X) and then allowed us to calculate Vc and Vo in a consistent manner i.e. in units of S.

    Likewise Hermione’s solution implicitly uses a consistent set of units i.e. also my S units, where Y = 1S.

    Note that efp (#41) gave effectively the same explanation as here but I was keen to clarify the contrasting success of Hermione’s solution and the original post’s approach.
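    The bookkeeping above is easy to verify mechanically. Here is a sketch using exact fractions, with variable names following the S/L units from this comment and Hermione's Y = 1S:

```python
from fractions import Fraction

# Units: smaller envelope = 1S, larger = 1L = 2S.
S = Fraction(1)
L = 2 * S
half = Fraction(1, 2)

Vc = half * (S + L)   # value of the chosen envelope: 3/2 S
Vo = half * (L + S)   # value of the other envelope: also 3/2 S

# Hermione's version: gain Y by switching up, lose Y by switching down.
Y = S
expected_gain = half * Y + half * (-Y)   # zero: no advantage in switching
```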

  • SteveM

    The puzzle results from “stopping” in the middle and computing the expected value there. Since the values in the envelopes don’t change, you have to calculate the expected value of the entire process: the initial choice plus the decision to switch or not. So there are 4 states: pick envelope A and stick with A, pick A and switch to B, pick B and stick with B, or pick B and switch to A. Let A hold X dollars and B hold 2X dollars. For the two states where you stick with your original choice, the total value is 3X, so the EV = 3X/2. For the two states where you switch, the total is again 3X, for an EV of 3X/2. So there is no value in switching.

    Another way to put it is to only look at what you end up with at the end of your two choices: 2 states yield X, 2 states yield 2X, all with equal probability. The “gain or loss” after making only one choice is an illusion, because it doesn’t matter whether you open the first envelope or not; it is still just as unknown (with respect to the value in the other envelope) as it was before. Since no real information is gained by opening your first envelope, your probability of having chosen the 2X envelope has not changed. Just as in the Monty Hall problem, where Monty really doesn’t add any information, so the probability that you chose the correct door initially remains 1/3.
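    This whole-process accounting is also easy to check by brute force. A minimal simulation (trial count and seed are arbitrary) of the two fixed strategies, measured in units of the smaller amount X:

```python
import random

def average_payout(switch, trials=100_000, seed=1):
    """Average payout, in units of the smaller amount X, of always
    switching or always sticking over many plays of the game."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        envelopes = [1, 2]        # X dollars and 2X dollars
        pick = rng.randrange(2)   # the initial choice is 50/50
        if switch:
            pick = 1 - pick       # trade for the other envelope
        total += envelopes[pick]
    return total / trials

stick = average_payout(switch=False)
swap = average_payout(switch=True)
# Both come out near 3X/2: switching confers no advantage.
```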

  • Atider84

    @SteveM… Monty does add information because he opens up a goat door among the two doors not initially chosen. In the end, it tells you the probability of finding the car between the door you chose and the only other remaining door.

  • SteveM

    @Atider84, The reason Monty does not add any information is that you already know that there is at least one door with a goat. Monty showing you a goat does not add any information you didn’t already know. If Monty added any information then the probability of your door being the winner would have to change.
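    Whatever language one prefers about “information,” the Monty Hall frequencies themselves are uncontroversial and easy to simulate. A minimal sketch assuming the standard rules (Monty always opens a goat door the player did not pick): the probability that the initial pick was right indeed stays at 1/3, which is exactly why switching wins 2/3 of the time.

```python
import random

def monty_hall_win_rate(switch, trials=100_000, seed=2):
    """Fraction of games won under a fixed always-switch or
    always-stick strategy, with Monty always revealing a goat."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        car = rng.randrange(3)
        pick = rng.randrange(3)
        # Monty opens a door that hides a goat and was not picked.
        opened = next(d for d in range(3) if d != car and d != pick)
        if switch:
            pick = next(d for d in range(3) if d != pick and d != opened)
        wins += pick == car
    return wins / trials

stick = monty_hall_win_rate(switch=False)  # stays near 1/3
swap = monty_hall_win_rate(switch=True)    # rises to near 2/3
```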

