Quantum Mechanics and Decision Theory

By Sean Carroll | April 16, 2012 8:20 am

Several different things (all pleasant and work-related, no disasters) have been keeping me from being a good blogger as of late. Last week, for example, we hosted a visit by Andy Albrecht from UC Davis. Andy is one of the pioneers of inflation, and these days has been thinking about the foundations of cosmology, which brings you smack up against other foundational issues in fields like statistical mechanics and quantum mechanics. We spent a lot of time talking about the nature of probability in QM, sparked in part by a somewhat-recent paper by our erstwhile guest blogger Don Page.

But that’s not what I want to talk about right now. Rather, our conversations nudged me into investigating some work that I have long known about but never really looked into: David Deutsch’s argument that probability in quantum mechanics doesn’t arise as part of a separate ad hoc assumption, but can be justified using decision theory. (Which led me to this weekend’s provocative quote.) Deutsch’s work (and subsequent refinements by another former guest blogger, David Wallace) is known to everyone who thinks about the foundations of quantum mechanics, but for some reason I had never sat down and read his paper. Now I have, and I think the basic idea is simple enough to put in a blog post — at least, a blog post aimed at people who are already familiar with the basics of quantum mechanics. (I don’t have the energy in me for a true popularization at the moment.) I’m going to try to get to the essence of the argument rather than being completely careful, so please see the original paper for the details.

The origin of probability in QM is obviously a crucial issue, but becomes even more pressing for those of us who are swayed by the Everett or Many-Worlds Interpretation. The MWI holds that we have a Hilbert space, and a wave function, and a rule (Schrödinger’s equation) for how the wave function evolves with time, and that’s it. No extra assumptions about “measurements” are allowed. Your measuring device is a quantum object that is described by the wave function, as are you, and all you ever do is obey the Schrödinger equation. If MWI is to have some chance of being right, we must be able to derive the Born Rule — the statement that the probability of obtaining a certain result from a quantum measurement is the square of the amplitude — from the underlying dynamics, not just postulate it.

Deutsch doesn’t actually spend time talking about decoherence or specific interpretations of QM. He takes for granted that when we have some observable X with some eigenstates |x_i>, and we have a system described by a state

$latex |\psi\rangle = a |x_1\rangle + b |x_2\rangle , $

then a measurement of X is going to return either x1 or x2. But we don’t know which, and at this stage of the game we certainly don’t know that the probability of x1 is |a|^2 or the probability of x2 is |b|^2; that’s what we’d like to prove.

In fact let’s just focus on a simple special case, where

$latex a = b = \frac{1}{\sqrt{2}} . $

If we can prove that in this case, the probability of either outcome is 50%, we’ve done the hard part of the work — showing how probabilistic conclusions can arise at all from non-probabilistic assumptions. Then there’s a bit of mathematical lifting one must do to generalize to other possible amplitudes, but that part is conceptually straightforward. Deutsch refers to this crucial step as deriving “tends to” from “does,” in a mischievous parallel with attempts to derive ought from is. (Except I think in this case one has a chance of succeeding.)

The technique used will be decision theory, which is a way of formalizing how we make rational choices. In decision theory we think of everything we do as a “game,” and playing a game results in a “value” or “payoff” or “utility” — what we expect to gain by playing the game. If we have the choice between two different (mutually exclusive) actions, we always choose the one with higher value; if the values are equal, we are indifferent. We are also indifferent if we are given the choice between playing two games with values V1 and V2 or a single game with value V3 = V1 + V2; that is, games can be broken into sub-games, and the values just add. Note that these properties make “value” something more subtle than “money.” To a non-wealthy person, the value of two million dollars is not equal to twice the value of one million dollars. The first million is more valuable, because the second million has a smaller marginal value than the first — the lifestyle change that it brings about is much less. But in the world of abstract “value points” this is taken into consideration, and our value is strictly linear; the value of an individual dollar will therefore depend on how many dollars we already have.
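
(For the programmatically inclined, here is a tiny sketch of that last point; the concave utility function is an arbitrary stand-in I picked for illustration, nothing more.)

    import math

    # Dollars are not value points: with any increasing, concave utility for money,
    # the second million is worth less than the first.
    def value_points(dollars):
        return math.log1p(dollars)   # arbitrary concave choice, purely illustrative

    print(value_points(2e6) < 2 * value_points(1e6))   # True: $2M is worth less than twice $1M

    # Value points themselves are linear by assumption: playing two games in
    # succession is worth the sum of the two individual values.
    V1, V2 = 3.0, 4.5
    print(V1 + V2)   # 7.5, the value of the combined game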

There are various axioms assumed by decision theory, but for the purposes of this blog post I’ll treat them as largely intuitive. Let’s imagine that the game we’re playing takes the form of a quantum measurement, and we have a quantum operator X whose eigenvalues are equal to the value we obtain by measuring it. That is, the value of an eigenstate |x> of X is given by

$latex V[|x\rangle] = x . $

The tricky thing we would like to prove amounts to the statement that the value of a superposition is given by the Born Rule probabilities. That is, for our one simple case of interest, we want to show that

$latex V\left[\frac{1}{\sqrt{2}}(|x_1\rangle + |x_2\rangle)\right] = \frac{1}{2}(x_1 + x_2) . \qquad\qquad (1)$

After that it would just be a matter of grinding. If we can prove this result, maximizing our value in the game of quantum mechanics is precisely the same as maximizing our expected value in a probabilistic world governed by the Born Rule.

To get there we need two simple propositions that can be justified within the framework of decision theory. The first is:

Given a game with a certain set of possible payoffs, the value of playing a game with precisely minus that set of payoffs is minus the value of the original game.

Note that payoffs need not be positive! This principle explains what it’s like to play a two-person zero-sum game. Whatever one person wins, the other loses. In that case, the values of the game to the two participants are equal in magnitude and opposite in sign. In our quantum-mechanics language, we have:

$latex V\left[\frac{1}{\sqrt{2}}(|-x_1\rangle + |-x_2\rangle)\right] = - V\left[\frac{1}{\sqrt{2}}(|x_1\rangle + |x_2\rangle)\right] . \qquad\qquad (2)$

Keep that in mind. Here’s the other principle we need:

If we take a game and increase every possible payoff by a fixed amount k, the value is equivalent to playing the original game, then receiving value k.

If I want to change the value of playing a game by k, it doesn’t matter whether I simply add k to each possible outcome, or just let you play the game and then give you k. I don’t think we can argue with that. In our quantum notation we would have

$latex V\left[\frac{1}{\sqrt{2}}(|x_1+k\rangle + |x_2+k\rangle)\right] = V\left[\frac{1}{\sqrt{2}}(|x_1\rangle + |x_2\rangle)\right] + k . \qquad\qquad (3)$

Okay, if we buy that, from now on it’s simple algebra. Let’s consider the specific choice

$latex k = -x_1 - x_2 $

and plug this into (3). We get

$latex V\left[\frac{1}{\sqrt{2}}(|-x_2\rangle + |-x_1\rangle)\right] = V\left[\frac{1}{\sqrt{2}}(|x_1\rangle + |x_2\rangle)\right] - x_1 - x_2 . $

You can probably see where this is going (if you’ve managed to make it this far). Use our other rule (2) to make this

$latex -2 V\left[\frac{1}{\sqrt{2}}(|x_1\rangle + |x_2\rangle)\right] = -x_1 - x_2 , $

which simplifies straightaway to

$latex V\left[\frac{1}{\sqrt{2}}(|x_1\rangle + |x_2\rangle)\right] = \frac{1}{2}(x_1 + x_2) , $

which is our sought-after result (1).
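
(If you like to watch a computer do the algebra, here is a throwaway sympy check of exactly that manipulation; it uses nothing beyond the two relations above.)

    import sympy as sp

    # Let V stand for the unknown value of the equal-amplitude game. Rule (3) with
    # k = -x1 - x2 says the shifted game has value V - x1 - x2, while rule (2) says
    # that same shifted game (payoffs -x2 and -x1) has value -V. Equate and solve.
    x1, x2, V = sp.symbols('x1 x2 V')
    print(sp.solve(sp.Eq(-V, V - x1 - x2), V))   # [x1/2 + x2/2], which is equation (1)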

Now, notice this result by itself doesn’t contain the word “probability.” It’s simply a fairly formal manipulation, taking advantage of the additivity of values in decision theory and the linearity of quantum mechanics. But Deutsch argues — and on this I think he’s correct — that this result implies we should act as if the Born Rule is true if we are rational decision-makers. We’ve shown that the value of a game described by an equal quantum superposition of states |x1> and |x2> is equal to the value of a game where we have a 50% chance of gaining value x1 and a 50% chance of gaining x2. (In other words, if we acted as if the Born Rule were not true, someone else could make money off us by challenging us to such games, and that would be bad.) As someone who is sympathetic to pragmatism, I think that “we should always act as if A is true” is the same as “A is true.” So the Born Rule emerges from the MWI plus some seemingly-innocent axioms of decision theory.

While I certainly haven’t followed the considerable literature that has grown up around this proposal over the years, I’ll confess that it smells basically right to me. If anyone knows of any strong objections to the idea, I’d love to hear them. But reading about it has added a teensy bit to my confidence that the MWI is on the right track.

  • anon.

    This is cute, but one aspect of it is bothering me. Believing in QM and understanding decoherence gets you to the point that Hamiltonian evolution in the presence of an environment gives you states that have some “weight,” measured by the Hilbert space measure, clustered around apparent classical outcomes. The inner product, which measures this “weight,” is an intrinsic part of QM, I think. I see the problem of deriving the Born Rule as being the problem of showing that if you repeat an experiment a number of times, the frequencies approach those corresponding to counting these states by the Hilbert space weight. In other words, the inner product isn’t just a mathematical device that hangs around, it plays a key role in determining observable outcomes. So: where’s the inner product on Hilbert space hiding in the argument you outlined above? It might be hiding in some assumption about how the x states are normalized, but can it be made explicit in a way that shows that this is really addressing the right question?

    • http://blogs.discovermagazine.com/cosmicvariance/sean/ Sean Carroll

      The step from the equation just before “You can probably see where this is going” to the equation just after makes implicit use of the inner product. (Update: oops, not true, see #6 and #7 below.) Note that we switched the order of |x_1> and |x_2> in the sum, which wouldn’t have been possible if they didn’t have equal amplitudes.

  • MPS17

    Thanks for the post. Zurek has some ideas on this too. Although I haven’t read the paper, I heard the talk and they seemed more in line with the ways we physicists like to approach problems.

    http://arXiv.org/abs/arXiv:1105.4810

    UPDATE: I think this links to the original literature. I haven’t thought carefully about this so please excuse if this discusses a differently nuanced issue:

    http://arxiv.org/abs/quant-ph/0211037

  • http://blogs.discovermagazine.com/cosmicvariance/sean/ Sean Carroll

    I think it is the same kind of issue, and Zurek’s papers are extremely interesting. Instead of talking about decision theory, he talks about symmetries. He claims that, once we allow for the existence of an environment, there is a new symmetry (“envariance”) that applies to states like (1), so that the probabilities of getting x_1 and x_2 must be equal. From there the same reasoning applies.

    There is some critique along the lines of “Zurek shows that if it’s appropriate to think of quantum mechanics in terms of probabilities at all, then those probabilities should obey the Born Rule, but he doesn’t actually demonstrate the need for probabilities.” It’s not clear to me that this couldn’t also be applied to Deutsch’s argument. But this is philosophical terrain, and I think the underlying thrusts of Deutsch and Zurek are actually quite similar, although using quite different vocabularies.

  • http://www.dudziak.com will

    the 1/sqrt(2) does not seem justified, and as that is the crux of the discussion, this argument does not convince me well.

    You might as well replace 1/sqrt(2) with a variable ‘m’ for example throughout all the equations, and your final conclusion would be just as “correct”.

    With 1/sqrt(2) removed, the whole argument becomes a tautology… interesting no doubt, but proving nothing except that the author is well versed in basic algebra.

  • http://mattleifer.info Matt Leifer

    Sean, that is not using the inner product. It is simply using the vector space structure. You can’t assume that the inner product has any a-priori relevance within this approach because that is what you are trying to derive, i.e. the only reason you pay attention to things like inner products and unitarity within conventional quantum mechanics is because you are trying to avoid negative probabilities, but you have no reason for connecting those two things until you have first derived the Born rule.

    I too like this argument, although I have my own version of it that makes use of Gleason’s theorem which I prefer, since it tells you that you should structure your probability assignments according to traces of operators against some density operator, even if you don’t know what the “wavefunction of the universe” is.

    There are legitimate issues surrounding the interpretation of probability in this approach, i.e. should one also be trying to derive a limiting frequency. Many of these issues are not specific to QM, since people differ on whether this is required even in the classical case. However, whether or not you think frequencies are required, it must be admitted that getting the decision theoretic interpretation right is even more important. After all, if I could derive a relative frequency, but was not able to derive the fact that I should use probabilities to inform my decisions then that would be a complete disaster. What use is it if I can derive that a fair quantum coin should have limiting 50/50 relative frequencies, but not that I should consider a bet on heads at stake $1 that pays $2 to be fair?

    There are also issues surrounding the very meaning of terms like “probability” and “utility” in this approach, since we are assuming that all outcomes actually occur. The two concepts get mushed together into something like a “caring weight” which measures how much we should care about each of our successors at the end of a quantum experiment. If you think about that for a minute it leads to moral issues, e.g. why should I care less about a successor who lives in a branch that happens to have a small amplitude. In the analogous classical case we can say it is because there is a very small chance that such a successor will exist, but quantum mechanically they definitely will exist. Thus, one can question whether it is moral to accept a scenario in which you get a large sum of money on a large amplitude branch, but die a horrible painful death in another branch, even with an amplitude that is epsilon above zero. In light of the Deutsch-Wallace argument, this indicates one of two things, either:

    - The usual intuitions about decision theory break down in a many-worlds scenario.
    - They do not break down, but we would always use extremal utilities, which makes it vacuous.

    By an extremal utility, I mean one that is infinity or -infinity on some outcomes, e.g. dying a painful death. The principle of maximum expected utility is useless in such cases.

    I have a lot more to say on this subject, but not the energy to go into it right now. I do have a paper on the backburner at the moment that deals with these issues.

  • http://blogs.discovermagazine.com/cosmicvariance/sean/ Sean Carroll

    Matt– You’re right, I was being very sloppy. That’s just the vector-space structure. The role of the inner product is essentially what you’re trying to derive, as you say. Thanks for the other comments. As you say, most of the additional issues refer to the nature of probability (or the definition of “value”), not really specifically to quantum mechanics.

    will– The argument certainly isn’t a tautology. Of course you could replace the 1/sqrt{2} by any number, as long as the coefficient of both terms is the same (that’s what was used in the argument just referenced). But that’s what you want! If that number were something else, you would have a non-normalized wave function. But you would still want to have equal probabilities for two branches with equal weights.

  • Peli Grietzer

    This fantastic paper by Adrian Kent has some great arguments about why the ‘but what does speaking about probabilities even mean’ issue for MW is sharply unlike any similar issues that arise for one-world theories: http://arxiv.org/abs/0905.0624

  • CU Phil

    There is quite a bit of criticism of the decision-theoretic proposal (most vociferously from David Albert and Adrian Kent) as well as several papers advocating the approach in this volume:

    http://ndpr.nd.edu/news/24515-many-worlds-everett-quantum-theory-and-reality/

    The review gives a nice summary of the debate. Also, Bob Wald reviewed the above volume in Classical and Quantum Gravity:

    http://iopscience.iop.org/0264-9381/28/22/229001

    and also gives an insightful review.

  • Michael Bacon

    Peli,

    I don’t think that Kent’s argument succeeds in proving the failure of the Everett program. However, assuming that his argument does succeed, Kent goes on to say that such Everettian failure “adds to the likelihood that the fundamental problem is not our inability to interpret quantum theory correctly but rather a limitation of quantum theory itself.” Perhaps, but at least for now, my money remains on quantum theory.

  • http://mmcirvin.livejournal.com/ Matt McIrvin

    @will: The requirement that state vectors have norm 1 is already a requirement of quantum mechanics separate from any interpretation of amplitudes as probabilities. Given that, the factor of 1/sqrt(2) (up to some arbitrary complex phase) is necessary if the two terms have equal coefficients.

    Once you make any move in the direction of a probabilistic interpretation, the Born rule falls out as the only one that makes mathematical sense; there are many ways of demonstrating this. But that first step is a doozy, and I always have the sneaking suspicion that arguments like this one have somehow smuggled their conclusion in as part of an assumption that only seems less controversial.

  • http://mmcirvin.livejournal.com/ Matt McIrvin

    …my own favorite handwaving quasi-derivation of the Born rule was a probably-not-original stochastic argument that I thought up on a long walk along the Charles River many years ago.

    Consider the Feynman path integral for a particle that travels from point A to point B. Now suppose that you put a screen between point A and point B that randomly tweaks the particle’s wavefunction phase to a different value at each point (maybe coarse-grain it a little to make the math tractable: divide it into tiny “pixels” that each have a different random phase factor).

    Now consider the amplitude that the particle goes from point A to point B traveling through some coarser-grained but still small bundle of pixels. The amplitudes for each pixel will add like a random walk, yielding an overall amplitude that increases as the square root of the number of pixels. Which is exactly what you’d get by interpreting the square of the amplitude as a probability.
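
    A quick numerical illustration of that random-walk scaling (the pixel counts and trial numbers below are arbitrary):

        import numpy as np

        rng = np.random.default_rng(0)

        # Add up N unit-magnitude amplitudes with independent random phases and
        # look at the typical magnitude of the total: it grows like sqrt(N).
        for N in (100, 1_000, 10_000):
            trials = 400
            phases = rng.uniform(0.0, 2.0 * np.pi, size=(trials, N))
            totals = np.exp(1j * phases).sum(axis=1)
            rms = np.sqrt(np.mean(np.abs(totals) ** 2))
            print(N, round(rms, 1), round(np.sqrt(N), 1))   # rms magnitude tracks sqrt(N)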

  • Moshe

    I’m puzzled about something really basic: you are trying to argue for an expression that is quadratic in the coefficients a,b of your wavefunction (something that encodes in it interference, the essential mystery of QM). Instead you are deriving an expression which is linear in these coefficients (as pointed out, you have only used the linear structure of the Hilbert space, not the inner product). The derivation seems to use in an essential way the equality of both coefficients a=b, and of course that is precisely the only case where quadratic and linear expressions have the same consequences. But, what happens in the generic case? For example, what happens if a,b only differ by a phase? that should still lead to the same final expression. It seems to me that if you put a=-b and repeat your derivation, you’d find the same minus sign in the RHS of (1), instead of the result predicted by the Born rule.

    • http://blogs.discovermagazine.com/cosmicvariance/sean/ Sean Carroll

      Moshe– I encourage you to put a minus sign in front of the x_2 term and go through the math. :)

      Obviously there is work to be done generalizing to other amplitudes, but that’s done in the paper; I don’t think there’s much controversy about that part.

  • http://www.uweb.ucsb.edu/~criedel/ Jess Riedel

    Sean: Like Peli Grietzer, I highly recommend Kent’s criticism of the decision-theory approach. To add to what Peli said, I think Kent conclusively shows that the axioms of decision theory in the many-worlds context are not nearly as obvious as they first appear, to the point that they become much less attractive than approaches which rest on Gleason’s theorem like Matt Leifer suggests.

    Of course, this is all truly philosophy; the game here is to try to reduce the axioms of quantum mechanics to their most beautiful (and, usually, simple) form. Sometimes, this improvement is so dramatic that I think everyone should agree that the new axioms are superior [such as my advisor Zurek's work--which I am constantly advertising--showing that the mysterious declaration that observables be Hermitian operators can be traced back to the linearity of evolution and the need for amplification (http://arxiv.org/abs/quant-ph/0703160)]. But sometimes, it’s just a matter of taste.

    Also, I’d like to clarify Michael Bacon’s comment. Kent’s paper strongly concentrates on attacking the decision-theoretic basis of Born’s rule, and only addresses the attractiveness of quantum theory in general as an aside. In particular, by the “Everett program”, Kent means the claim that quantum theory need not be supplemented by an ad-hoc assumption for extracting probabilities. I believe Kent is open to the idea that quantum theory need not be modified *if* a sufficiently attractive assumption can be found which allows the extraction of unambiguous probabilities (e.g. if the “set-selection problem” in the consistent histories framework could be solved, which he has written about). But yes, Kent does take the extreme difficulty of finding a non-ad-hoc assumption as weak evidence that quantum theory is fundamentally wrong.

  • Michael Bacon

    Jess,

    You obviously are closer to this than I am, and you may well be right that all Kent really thinks is that the extreme difficulty of finding non-ad-hoc assumptions is “weak” evidence that quantum theory is fundamentally wrong. However, that’s not what the language I quoted says. At least here, he’s clearly saying that there is a “likelihood” that quantum theory is wrong — i.e, more likely than not. And, that his work merely adds to that “likelihood”. Nevertheless, perhaps I’m making too much of the particular words he chose to describe his view. By the way, I love the picture of you in your natural environment on your web page. ;)

  • Anonymous Coward

    I’d be interested in how you view the relation to classical thermodynamics.

    There, likewise, a probability distribution “falls out of the sky”. There is some justification in things like the Sinai-Boltzmann Conjecture, stating that the standard (Liouville-phase-space-) measure is the only sensible one (uniquely ergodic, for the toy problem of hard-ball billiards)… IF you assume that the god who has chosen the initial conditions of the world has done so with an absolutely continuous probability distribution (SRB-measure). If you admit “pathological” probability measures, the entire argument collapses unto itself.

    I always viewed, maybe naively, the Born rule as a similar thing. People conjecture and hope to prove at some point that the Born rule follows if we make the pretty basic (and mind-bogglingly subtle!) assumption that the initial conditions of our universe have been picked compatibly with some infinite-dimensional generalization of Lebesgue measure.

    [sorry for the theistic metaphor... personifying some aspects of nature helps me think more clearly]

    • http://blogs.discovermagazine.com/cosmicvariance/sean/ Sean Carroll

      I think it’s certainly a good question. People like Albrecht and Deutsch believe that the only way to justify any classical probability distribution is ultimately in terms of the Born Rule. I wouldn’t necessarily think it’s a failure if the answer is “that’s the most natural measure there is,” but I’m hopeful that some better picture of the connection between QM and classical stat mech (plus perhaps some initial-conditions input from cosmology) will explain why the Liouville measure is the “right” one.

  • Moshe

    I see where I was confused: you are using a linear structure in the space of eigenvalues, not for the coefficients, so the value for a=-b is not determined by the above considerations. I should probably take a look at the paper sometime, sounds mysterious how one can get anything quadratic from what you wrote so far.

  • Ben

    Hi Sean,
    I remember a great lecture by Nima Arkani-Hamed at T.A.S.I. 2007, http://physicslearning2.colorado.edu/tasi/hamed_02/SupportingFiles/video/video.wmv , where he points out that the Born Rule can be derived from the operator postulate, i.e. that physical measurement outcomes can be identified with the eigenvalues of a corresponding Hermitian operator.

    The argument is as follows: Construct the tensor-product state of N identically prepared copies of a|x1> + b|x2>. This could be expanded out using binomial coefficients. There is a Hermitian operator N1 which counts how many copies are in the state |x1>. Then if we take N1/N in the limit N to infinity, we obtain a Hermitian operator whose eigenvalue is |a|^2, i.e. it is the probability operator.

    So we get the Born Rule for free!
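
    A small numerical check of that claim (toy code with an arbitrary choice of a and b; note that measuring the convergence already uses the usual Hilbert-space norm):

        import numpy as np

        # Build the N-fold tensor product of a|x1> + b|x2> and check that the
        # frequency operator N1/N (fraction of factors found in |x1>) acts more and
        # more like multiplication by |a|^2 as N grows.
        a, b = 0.6, 0.8                          # arbitrary normalized amplitudes, |a|^2 = 0.36
        single = np.array([a, b])                # index 0 = |x1>, index 1 = |x2>

        for N in range(2, 13, 2):
            psi = single.copy()
            for _ in range(N - 1):
                psi = np.kron(psi, single)
            ones = np.array([bin(i).count("1") for i in range(2 ** N)])   # factors in |x2>
            freq_x1 = (N - ones) / N                                      # eigenvalues of N1/N
            residual = np.linalg.norm((freq_x1 - a ** 2) * psi)           # ||(N1/N - |a|^2)|psi>||
            print(N, residual)                    # shrinks like 1/sqrt(N), here 0.48/sqrt(N)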

  • http://jbg.f2s.com/quantum2.txt James Gallagher

    You can’t get fundamental probability out without putting fundamental probability in, the Everett approach is just untenable and even quite ridiculous imho compared to just accepting that fundamental randomness exists – then the Born rule emerges as a kind of thermodynamic property of the Schrödinger evolution – the Bohmian guys have even demonstrated this (based on their wrong ontological model)

    Also, as I keep trying to tell everyone, the past universe does not exist, you have to look at the (discrete) flow of the Schrödinger evolution exp(hL).U(t) – U(t) to describe what we observe, and in this case we get 3D space as period-3 points in the Hilbert Space.

  • Colin

    The math is incorrect on your equation 3 (and also in Deutsch’s original paper). You only add 1/rt2 K to each outcome of the game on the left side of the equation, whereas you add an entire K to the right side of the equation. In reality, where you have 1/rt2(x1+x2) standing in place for the entire system Psi, you can do one of two things to manipulate the equation: value psi as a game with only one outcome, and add a single k to each side (trivial)… Or you can keep 1/rt2(x1) and 1/rt2(x2) separate, and add an entire K to each… but still only 1 k on the right.

    EDIT… I noticed that this argument is a little skewed as you are adding K to each eigenstate… so it’s not the simple math; but the premise is still correct… what has been added to each outcome on the left is not what has been added to the entire game on the right. If I started with V(Psi>) instead of V(1/rt2x1> + 1/rt2x2>) (which are identical by assumption), I would add K to get V(psi>+k).

  • Pingback: Daily Run Down 04/16/2012 | Wayne's Workshop

  • Pingback: Linkblogging for 16/04/12 « Sci-Ence! Justice Leak!

  • http://qpr.ca/blog/ Alan Cooper

    The reference to state vectors of form |x+k> seems to be to eigenvectors for the operator X+k rather than for X, so I am not clear that it makes sense to say |x1-(x1+x2)>=|-x2>
    (In fact, making the operator explicit, we would seem to have
    |x1-(x1+x2) for X-(x1+x2)>=|x1 for X> not |-x2 for X>)

    And in any case the argument seems to be showing that if there was an expectation function with the expected properties then it would have to satisfy the Born rule. But that is not the same as saying that such a function should actually have a probabilistic interpretation. (Actually I guess this is the same complaint as what you alluded to in the second para of your comment #4 but I do think it’s a serious one.)

  • http://prce.hu/w/ Huw Price

    I’m disappointed that CU Phil thinks that my objections in the Many Worlds@50 volume are less vociferous than those of Adrian Kent and David Albert! My piece is here: http://philsci-archive.pitt.edu/3886/

    I think that the plausibility of the Deutsch-Wallace axioms actually presupposes what needs to be shown, viz that there is some analogue of classical uncertainty in the MW picture. Moreover, if we assume that the argument is a good one with that assumption made explicit, then we can exploit a point noticed by Hilary Greaves to show the assumption must be false.

    Here’s how. Let P = “There is a suitable analogue of classical uncertainty in MW”, and Q= “Rationality requires that any Everettian agent should maximise her expected in-branch utility, using weights given by the Born Rule”.

    Then if the Deutsch argument works, it establishes:

    1. If P then Q.

    But Greaves’ observation shows that Q simply can’t be true, because MW introduces a new kind of outcome that an agent may have preferences about, namely the shape of the future wave function itself (or at least, the portion of it causally downstream from the agent’s current choice). In effect, Q is telling us that rationality requires us to prefer future wave functions with a characteristic feature, that of maximising Born-rule-weighted in-branch utility. But this is obviously and trivially wrong, in the case of an agent who has preferences about the shape of the wave function itself, and just prefers (i.e., assigns a higher utility to) some other kind of future wave function. Decision-theoretic rationality tells us what to do, given our preferences. It doesn’t tell us what our preferences should be. (But wouldn’t such an agent already be crazy, for some other reason? No — see the paper for details.)

    Given Greaves’ observation, then, there are only two possibilities: either P is false, or the Deutsch argument fails — either way, it’s bad news for the project of making sense of MW probabilities in terms of decision-theoretic considerations.

  • http://www.pipeline.com/~lenornst/index.html Len Ornstein

    Sean:

    It seems you’re approaching this ‘problem’ as a Platonist, looking for a model(s) which comes closest to some preconceived (and not widely entertained) concept of an absolutely true representation of reality – rather than from the general scientific requirement that a model’s status must be judged by how closely the empirical record can be matched?

    For the Platonist ‘test’, the issue is whether or not Born’s added ‘axioms’ and his formulation fit together with QM better than do the construction of Decision Theory and ITS unique axioms – perhaps an Occam’s Razor type question.

    For the more generally accepted scientific requirement – match to the empirical record – you have so far offered no arguments to distinguish the ‘performance’ of Born’s probability interpretation from that of a Decision Theoretical approach!

  • http://skepticsplay.blogspot.com miller

    So… suppose that we have N identical systems, each with state |x1>+|x2>, where x1 and x2 are eigenvalues of operator X. And suppose we have an operator Y which represents a simultaneous measurement of X in all of the N systems. Operator Y gives the value 1 if nearly half of the measurements of X result in x1. Otherwise, operator Y gives the value 0.

    If I understand Deutsch’s paper, we cannot say that a measurement of Y has a high probability of returning 1. But if we are rational decision makers, we would treat the expected value of Y as being close to 1 (and getting even closer to 1 as N goes to infinity).

    This may not prove that the results actually follow the frequency distribution given by Born’s rule, but it sure seems like the next best thing.

  • http://blogs.discovermagazine.com/cosmicvariance/sean/ Sean Carroll

    Huw– I’ll admit I haven’t read your paper or Greaves’s, but that objection doesn’t seem very convincing at face value. Can’t we just say that preferences are something that people have about outcomes of measurements, not about wave functions? Outcomes are what we experience, after all.

  • http://prce.hu/w/ Huw Price

    Sean, I don’t think that response is going to help Deutsch and Wallace, who are trying to establish a claim about any rational agent, not just about agents with the kind of preferences we happen to have. But in any case, it is easy to think of examples of preferences for wave functions of the kind my objection needs, which are themselves grounded on what the wave functions imply about the experiences of people in the branches of those wave functions — e.g., a preference for a wave function in which I don’t get tortured in any branches (even very low weight branches), over a wave function in which I do get tortured in a very low weight branch, but get rich in all the high weight branches. (My Legless at Bondi example in the paper is much like this, and I discuss why MW makes such a difference, compared to the classical analogue.)

  • http://qpr.ca/blog/ Alan Cooper

    The key equation seems to be asserting that the “Value” (expectation?) of an observation (of the observable X-(x1+x2)) where the possible values are -x2 and -x1 is the same as the subtracting (x1+x2) from the Value of an observation (of X) where the possible values are x1 and x2. And then the application of (2) seems to be saying that the Value of that observation (of the observable X-(x1+x2)) where the possible values are -x2 and -x1 is the negative of the Value of an observation (of X) where the possible values are x1 and x2. But if (2) is being applied this way – ie without regard to which observable is involved and so without regard to which of the two terms is associated with which value, then isn’t that essentially assuming that equal probabilistic weights are being assigned to each of the two outcomes which amounts to begging the question of probabilistic weights being equal when the vector magnitudes are?

    (After all, the principle that negating the payoffs negates the expectation requires keeping the same probabilities, and switching cases only works if the probabilities are equal:
    eg 1/3(-x1)+2/3(-x2)= -{1/3(x1)+2/3(x2)} but {1/3(-x2)+2/3(-x1)} is not the same)

  • http://alastairwilson.org/ Alastair Wilson

    I think the Greaves/Price objection is a serious worry for probability in EQM in general, and for the decision-theory strategy in particular. Assigning objective probabilities to outcomes does seem to presuppose the possibility of uncertainty about which of the outcomes will occur. But EQM seems to say they all occur. So there’s a prima facie problem here. (Greaves’ response is: so much the worse for probability in Everett, but Everettians can do without it.)

    Wallace doesn’t think the problem is too serious these days (in contrast to his older papers which argue that Everettians must make sense of ‘subjective uncertainty’) – roughly, he now thinks that the objection appeals to pre-theoretic intuitions about the nature of uncertainty, and that intuition is unreliable in such areas. However, in his new book he does provide a semantic proposal which allows us to recover the truth of ordinary platitudes about the future (like ‘I will see only one outcome of this experiment’), by interpreting them charitably as referring only to events in the speaker’s own world.

    I have a new paper forthcoming in British Journal for Philosophy of Science which argues that the Greaves/Price objection can be met on its own terms, by leaving the physics, the epistemology and the semantics alone and instead tinkering with the metaphysics. Here’s the link: http://alastairwilson.org/files/opieqmweb.pdf

    Sean’s remarks above capture the spirit of my suggestion nicely: if Everett is right, then our ordinary thought and talk about alternative possibilities *just is* thought and talk about other Everett worlds. To reply to Huw’s last points from this perspective: a) if Everett worlds are (real) alternative possibilities then any possible rational agent (not just one with preferences like ours) is going to be an agent with in-branch preferences, b) the kinds of ‘preferences for wave-functions’ that you describe can be made sense of on this proposal, though I would describe them differently; they correspond to being highly risk-averse with respect to torture.

  • http://www.astro.multivax.de:8000/helbig/helbig.html Phillip Helbig

    “Last week, for example, we hosted a visit by Andy Albrecht from UC Davis.”

    What do you think of Andy’s de Sitter equilibrium cosmology (e.g. http://arxiv.org/abs/1104.3315 and references therein)?

    • http://blogs.discovermagazine.com/cosmicvariance/sean/ Sean Carroll

      Philip– I think it’s an interesting idea, although the chances that it’s right are pretty small. Andy takes the requirement of accounting for the arrow of time much more seriously than most cosmologists do, which is a good thing. But his intuition is that the real world is somehow finite, while my intuition is the opposite. (Intuition can’t ultimately carry the day, of course, but it can guide your research in the meantime.)

  • http://prce.hu/w/ Huw Price

    Alastair, Thanks for the link, though as you know, I prefer to tinker with metaphysics as little as possible ;)

    Concerning your (a), my point doesn’t depend at all on denying that we have in-branch preferences, but only on pointing out that the new ontology of the Everett view makes it possible for us to have another kind of preference, too — a preference about the shape of the future wave function. Concerning (b), any ordinary notion of risk-aversion is still a matter of degree, whereas the worry about low weight branches isn’t a matter of degree. So you’ll need infinite risk aversion, won’t you? And in any case, what does the response buy you? A demonstration that the choices of an ordinary agent in an Everett world should be those of a highly risk-averse agent in a classical world? That doesn’t seem good enough, for the Deutsch-Wallace program. They want to show that the ordinary agent should make the same choices in the two cases.

  • Daryl McCullough

    Alan,

    I’m not sure I understand what you’re saying.

    In Sean’s derivation, all the states are eigenstates of the X operator. The meaning of the state |x> is the eigenstate of the X operator with eigenvalue x. |x+k> is an eigenstate of the X operator with eigenvalue x+k.

    Sean’s assumptions might make more sense to you if we explicitly introduce some additional operators.

    Let T(k) be the operator (the translation operator) defined by T(k) |x> = |x+k>.
    Let P be the operator (the parity operator) defined by P |x> = |-x>.
    We assume that they are linear, which means
    T(k) (|Psi_1> + |Psi_2>) = T(k) |Psi_1> + T(k) |Psi_2>
    P (|Psi_1> + |Psi_2>) = P |Psi_1> + P |Psi_2>

    So Sean’s assumptions about the value function V(|Psi>) are basically:

    (1) V(|x>) = x
    (2) V(T(k) |Psi>) = V(|Psi>) + k
    (3) V(P |Psi>) = -V(|Psi>)

    (2) and (3) follow from (1) for eigenstates of the X operator, but we need the additional assumption that they hold for superpositions of eigenstates, as well.
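
    The converse is easy to check numerically, for what it’s worth: the Born-rule value does satisfy (1)-(3). A quick sketch (the dictionary representation of a state is just my own bookkeeping for the example):

        # A state is a dict {eigenvalue x: amplitude}. Born-rule value:
        # V(psi) = sum_x |c_x|^2 x / sum_x |c_x|^2.
        def V(psi):
            norm2 = sum(abs(c) ** 2 for c in psi.values())
            return sum(abs(c) ** 2 * x for x, c in psi.items()) / norm2

        def T(k, psi):                       # translation operator: |x> -> |x+k>
            return {x + k: c for x, c in psi.items()}

        def P(psi):                          # parity operator: |x> -> |-x>
            return {-x: c for x, c in psi.items()}

        psi = {1.0: 0.3 + 0.4j, 2.5: -0.7j, -4.0: 0.5}    # arbitrary superposition
        k = 3.2
        print(abs(V(T(k, psi)) - (V(psi) + k)) < 1e-12)   # property (2): True
        print(abs(V(P(psi)) + V(psi)) < 1e-12)            # property (3): True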

  • http://qpr.ca/blog/ Alan Cooper

    ok – Maybe trying three times is considered rude, but I would really appreciate it if someone could explain what I have wrong here. In the Deutsch paper we have “It follows from the zero-sum rule (3) that the value of the game of acting as ‘banker’ in one of these games (i.e. receiving a payoff -x_a when the outcome of the measurement is x_a) is the negative of the value of the original game. In other words” followed by your equation (2). But acting as banker is *not* the same as just having a *set* of outcome values which are the negatives of those of the player. They also have to be matched to the outcomes – ie it is the *ordered* sets which must be negatives. And in the case with Y=X-(x1+x2) it is in the situation where X sees x2 that Y sees -x1 and in the situation where X sees x1, Y sees -x2. This is not the same as Y being the “banker” when X is the “player” so I don’t see why the values should sum to zero. Please, what am I missing?

  • Daryl McCullough

    Alan,

    I don’t understand what you mean when you say “in the case with Y=X-(x1+x2) it is in the situation where X sees x2 that Y sees -x1 and in the situation where X sees x1, Y sees -x2”

    That doesn’t agree with the meaning of the “game” as described. I think you’re confusing a sum of states with a tensor product of states.

    There is no need to talk about X and Y. You only need to talk about one operator, X. The game works by starting in a state |Psi>, measuring X in that state to get a value x. If x > 0, then the banker pays the player x dollars. If x < 0, then the player pays the banker -x dollars. So it's not that the banker measures one observable and the player measures a different one. There is only one measurement, and that determines who pays who. The banker's winnings are always the negative of the player's winnings.

  • Sudip

    Dear Sean,

    It seems to me that assuming the “two simple propositions” is just a way of putting the Born rule through a backdoor. Of course, they seem very intuitive but how sure can we be that nature upholds them? I’m reminded of von Neumann’s proof of the impossibility of deriving QM from a deterministic theory. As Bell pointed out, von Neumann made seemingly innocuous assumptions which may not be true. After all why V|x+k> has to be V|x>+k? Why it can’t be V|x>+k^2? I understand that these are justified using decision theory. However decision theory is a theory of decision making by rational agents – why should it have any relevance in the natural world?

    I admit that I haven’t looked at Deutsch’s paper or at Zurek’s paper mentioned in the comments.

    On a related note, do you know of any attempt at defining what constitutes a measurement in the context of MWI? As the wave function branches, it seems to me that a fully formulated theory should explain where those branchings occur.

  • Anonymous Coward

    @Sudip:
    As far as I understood MWI (correct me if I’m wrong; I didn’t read Everett’s paper, just a couple of graduate textbooks) the words “branching” and “measurements” should be viewed as a heuristic description of the following process and theorem:
    Suppose you do a measurement (for simplicity, of the spin of an electron); the measurement is described by a unitary operator $U_M$ (time propagation of your apparatus). You call it a branching into two possible worlds (orthogonal subspaces spanning the entire Hilbert space of the MWI-world) $+$ and $-$, if the time-propagation for all later times leaves these subspaces almost invariant. If this should be the case, we can simplify all further calculations by projecting onto one of the subspaces and calculating the future evolution of each of these branches (“collapse the wavefunction”). What a nifty trick to get approximate results!

    Everett’s contribution was to show that for suitable limits (larger Hilbert space, many particles, suitable definition of “almost invariant”) and actual measurement devices (full QM toy models of amplifiers), this does in fact occur. Therefore, Schroedinger’s equation alone implies the very good heuristic of collapsing wave-functions. Furthermore, if we should ever wish to assign weights to different branches, the only way to do this consistently is the Born rule — where consistently means “If I collapse after two measurements and calculate the evolution until the second measurement in full QM, I get roughly the same result as if I collapsed after the first measurement and again after the second one”.

    This way, even if we believed in magical “Copenhagen collapse induced only by human observers”, Everett has shown that “occasional collapse + Born rule” yields very good approximate methods to calculate time-evolution until the “magical collapse”.

    From here it is not far-fetched to postpone the “magical collapse” into the far future or *gasp* remove it altogether. Furthermore, we can set out to precisely define “branching in the sense of invariance of subspaces up to $\varepsilon$” or “up to order such and so”. However, the words “branching” or “measurement” without further qualifiers should remain an (undefinable but not meaningless) heuristic, like “two points are close”.
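
    A toy numerical version of that branching picture, in case it helps (the model, angle, and qubit counts are all arbitrary choices of mine): a system qubit imprints itself on n environment qubits, and the coherence between the two branches, read off from the system’s reduced density matrix, dies off as the environment keeps more records.

        import numpy as np

        a, b = 1 / np.sqrt(2), 1 / np.sqrt(2)             # system starts in a|0> + b|1>
        theta = 0.4                                        # per-qubit kick when system is |1>
        e_if_0 = np.array([1.0, 0.0])                      # environment qubit if system is |0>
        e_if_1 = np.array([np.cos(theta), np.sin(theta)])  # environment qubit if system is |1>

        for n in range(0, 21, 5):
            env_0, env_1 = np.array([1.0]), np.array([1.0])
            for _ in range(n):
                env_0 = np.kron(env_0, e_if_0)
                env_1 = np.kron(env_1, e_if_1)
            # Full state a|0>|env_0> + b|1>|env_1>; tracing out the environment, the
            # off-diagonal element of the system's density matrix (the coherence
            # between branches) is a b* <env_1|env_0> = a b* cos(theta)^n.
            coherence = a * np.conj(b) * (env_0 @ env_1)
            print(n, abs(coherence))                       # decays exponentially with n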

  • Daryl McCullough

    Sudip writes:

    “After all why V|x+k> has to be V|x>+k? Why it can’t be V|x>+k^2?”

    The meaning of |x> is that it is a state such that the measurement of operator X is certain to produce result x. So the expected result of an X-measurement is V(|x>) = x.
    Similarly, |x+k> is a state such that the measurement of X is certain to produce result x+k. So V(|x+k>) is x+k.

  • Neal J. King

    What leaves me unsatisfied about this approach is that you are postulating the existence of an operator V with a complete set of states that behaves in the manner indicated; and then applying the inferred “Born’s rule” to the rest of quantum mechanics.

    Can you make the argument work for real quantum operators that we have some reason to believe in? Like the z-component of spin-1/2 ?

  • Hal S

    I am not entirely sure why the Born rule is hard to understand.

    The point of process is to allow one to use the computational flexibility associated with functions on the order of the reals and extract certain features of those functions (like the peaks and valleys…or extrema of the function). Remember, the wave function itself is a continuous deterministic function.

    More specifically we operate in the complex plane in order to exploit the computational power associated with manipulating systems with uncountable bases.

    If we accept the information extraction interpretation, the question is how to economise that process. Since we are dealing with complex numbers, and we are dealing with countable features of the wave function, we can ask the question what happens when we take the function to other powers.

    Since 2 is the smallest prime number, we can interpret any even numbered power to simply being a rescaling of the information associated with squaring the number.

    If we consider odd powers, we can interpret the effect as being a rescaling of the wave function by some real number.

    If we consider all the potential combinations, one quickly must consider all possibilities, and essentially one quickly realizes that what they are really doing is trying to capture all the information in the wave function and essentially are also building a type of matrix that should be recognizable as an operator in a type of transformation procedure.

    In any case, squaring the amplitude is a process that economizes the information extraction from the complex plane into a series of integer indexed real numbers.

  • Hal S

    It makes me wonder if one can make and argument that if all the trivial zeros of the zeta function lie on the real line, then all the non-trivials have to be on the one-half real line. Interesting.

  • Pingback: The Alternative to Many Worlds « My Discrete Universe

  • Sudip

    @Anonymous Coward Thanks, that’s helpful.

    @Daryl Sorry, I didn’t mean to say that. Of course V|x+k>=V|x>+k by definition. What I intended to ask was why should V act linearly on a superposition of kets?

  • http://jbg.f2s.com/quantum2.txt James Gallagher

    The biggest criticism of Sean’s post is that the argument fails to explain why the Born Rule must obey a squared power relation rather than a quartic or higher one.

    Even Pauli recognized this problem back in 1933 (republished in English translation in his ‘General Principles of Quantum Mechanics’ Ch 2 p15) where he deduced that the Born Rule must be a positive definite quadratic form in ‘psi’, and that anything not involving the product psi.psi* would not be conserved by the Schrödinger Evolution, so we only have terms in psi.psi* = |psi|^2 and higher powers as possibilities.

    Pauli, being a genius, realised that only Nature then determines that the rule is a squared one (rather than a higher even power) and ultimately the rule is fixed by experimental observation – not deduced from anything simpler (and certainly not from obfuscating arguments involving rational beings and decision theory!)

    I mentioned above that there is a Bohmian argument for how the absolute squared law is emergent from the dynamics of the Evolution ( eg http://arxiv.org/abs/1103.1589 ) – this is true unless your initial distribution was a higher power invariant one – so the squared power one seems favoured on a positive measure set of starting distributions (maybe even measure 1).

    But you can just chuck away all the troublesome baggage that the Bohmian model entails and accept fundamental randomness – then the squared power rule is the most likely outcome, a large numbers result – ie it is a thermodynamic property of the evolution
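
    The conservation issue is easy to poke at numerically (arbitrary Hamiltonian and initial state, numpy only): under Schrödinger evolution the sum of |c_i|^2 over basis coefficients stays fixed, while the corresponding sum of |c_i|^4 generally does not.

        import numpy as np

        H = np.array([[0.0, 1.0], [1.0, 0.5]])       # generic Hermitian 2x2 Hamiltonian
        evals, evecs = np.linalg.eigh(H)
        psi0 = np.array([1.0, 0.0], dtype=complex)

        for t in np.linspace(0.0, 3.0, 7):
            U = evecs @ np.diag(np.exp(-1j * evals * t)) @ evecs.conj().T   # exp(-iHt)
            psi = U @ psi0
            p2 = np.sum(np.abs(psi) ** 2)            # conserved: always 1
            p4 = np.sum(np.abs(psi) ** 4)            # not conserved: wanders with t
            print(f"t={t:.1f}  sum|c|^2={p2:.6f}  sum|c|^4={p4:.6f}")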

  • Hakon Hallingstad

    Sean @ 16 and Moshe,

    If one carries out the calculation for a|-x2> - a|-x1>, one comes to the equation:
    V[a|x2> - a|x1>] + V[-a|x2> + a|x1>] = x1 + x2.

    However under the assumptions above, we are not allowed to assume
    V[a|x2> - a|x1>] = V[-a|x2> + a|x1>]
    and so the derivation stalls at this point. It is absolutely crucial for the argument that the coefficients in a|x1> + b|x2> are equal, contrary to QM which allows an arbitrary phase.

    For instance
    V[a|x1> + b|x2>] = (|a| x1 + |b| x2) / (|a| + |b|),
    would be consistent with the 2 axioms. As far as I can tell, the axioms imply
    - V is linear in x1 and x2
    - The coefficient of x1 is some function f(a, b), with f(1, 0) = 1
    - The coefficient of x2 is f(b, a) = 1 – f(a, b)

    In order to show V is the expected average value of a measurement of X, one will have to prove f(a, b) = |a|^2/(|a|^2 + |b|^2), so there is still a lot of derivations left to be done. And showing the coefficient goes as |a|^2 is the hard part of the Born rule.
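
    A quick check of that counterexample (arbitrary complex a, b and payoffs), confirming that this alternative V satisfies both the negation rule (2) and the translation rule (3), so those two rules alone don’t force |a|^2 weights:

        # Alternative value assignment from the comment above.
        def V_alt(a, x1, b, x2):
            return (abs(a) * x1 + abs(b) * x2) / (abs(a) + abs(b))

        a, b = 0.3 + 0.1j, -0.7j
        x1, x2, k = 1.7, -4.2, 2.9
        print(abs(V_alt(a, -x1, b, -x2) + V_alt(a, x1, b, x2)) < 1e-12)              # rule (2): True
        print(abs(V_alt(a, x1 + k, b, x2 + k) - (V_alt(a, x1, b, x2) + k)) < 1e-12)  # rule (3): True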

  • http://alastairwilson.org/ Alastair Wilson

    Huw – actually, I’d have thought that freely modifying metaphysics in situations like this is congenial to pragmatism. The ‘harder’ scientific claims of physics, confirmation theory, natural language semantics, etc aren’t meddled with; we just pick (on a pragmatic basis) whichever metaphysical framework allows the harder claims to hang together most naturally.

    On a) – I was suggesting that any possible agent is going to be an agent with *only* in-branch preferences – sorry for being unclear. From the perspective I advocate, the whole state of the wavefunction is a non-contingent subject-matter: the only contingency is self-locating. On a functionalist account of mental states, it makes no sense to ascribe preferences defined over non-contingent subject-matters. (What’s going on here is that the modal framework is helping reinforce Wallace’s ‘pragmatic’ argument for his principle Branching Indifference.)

    On b) – yes, the equivalent of wanting to avoid torture in any world, in the limiting case of infinitely many worlds, will be infinite risk aversion. Is that a problem? (In any case, the limiting case might turn out to be metaphysically impossible – that’s an empirical matter.) What the response is meant to buy is a translation between ‘preferences over wavefunctions’ and ordinary preferences. Everettians who take this line can explain away the apparent coherence of preferences over wavefunctions by showing that they’re just ordinary kinds of preferences (i.e. preferences about self-location) under an unfamiliar mode of presentation.

  • Hal S

    Just one last note.

    Using ‘ to represent an index, an equation that makes some of the previous comments clearer is

    <E> = Sum (E' |z'|^2)

    which is understood as meaning that the probability of seeing eigenvalue E is the absolute value of complex number z squared.

    Now Dirac has some interesting points that should be considered in ‘The Principles of Quantum Mechanics 4th ed’.

    pg35

    “One might think one could measure a complex dynamical variable by measuring separately its real and imaginary parts. But this would involve two measurements or two observations, which would be all right in classical mechanics, but would not do in quantum mechanics, where two observations in general interfere with one another-it is not in general permissible to consider that two observations can be made exactly simultaneously,..”

    pg38

    “In the special case when the real dynamical variable is a number, every state is an eigenstate and the dynamical variable is obviously an observable. Any measurement of it always gives the same result, so it is just a physical constant, like the charge of an electron.”

    pg74

    “Even when one is interested only in the probability of an incomplete set of commuting observables having specified values, it is usually necessary first to make the set a complete one by the introduction of some extra commuting observables and to obtain the probability of the complete set having specified values (as the square of the modulus of a probability amplitude), and then to sum or integrate over all possible values of the extra observables.”

    So an observer can not make two simultaneous measurements of the same observable, physical constants are real numbers, and if you don’t have enough indices to fully describe the state you add more indices and consider all potential values.

    Since this procedure can continue indefinitely one begins running into the same problems with the continuum.

    The point in this rambling is that although we can not know whether such higher order hierarchy has real existence, we have to resort to it from a computational standpoint.

  • http://van.physics.illinois.edu/qa/index.php Michael Weissman

    Just a quick semi-coherent placeholder note, since I have to run now. As you say, the issue of P in MW is much trickier than if you have some sort of extra collapse in which to insert special new rules. The traditional argument justifying Born is the one that Ben refers to, reproduced by Arkani-Hamed, but that’s long since been known to be invalid, since the limiting procedure is irrelevant.

    On Deutsch and decision theory: “Given a game with a certain set of possible payoffs, the value of playing a game with precisely minus that set of payoffs is minus the value of the original game.” What does a “precisely minus payoff” even mean, except in the context of little financial games, where the statement is well-known to be false?

    The question is not so much what a rational actor would bet, but how the existence of rational actors can be reconciled with the unitary structure+decoherence. The problem becomes one of why the probabilities for sequential observations factorize, i.e. why the chance of Schroedinger’s cat having survived the Tuesday experiment doesn’t change on Wednesday due to quantum fleas. As has repeatedly been shown, only the standard quantum measure gives the conserved flow needed to allow that factorization and hence allow the existence of rational actors.
    So that’s a requirement but not an explanation. The best (only) explanation I’ve seen is by Jacques Mallah. If the state consists of the usual part we think about plus some maximal entropy white noise, a physical definition of a thought as a robust quantum computation, together with ordinary signal-to-noise constraints on robustness (square-root averaging), gives the Born rule from ratios of counts of thoughts!

    Why that particular (mixture of low S +high S parts) starting state? Mallah doesn’t like this idea but I suggest the old cheat: anthropic selection. If that type of state is needed to allow the existence of rational actors, nobody will be arguing about why they find themselves part of some other type of state.

    I’ll try to get back to fill this in more coherently in 24 hours.

    p.s. Zurek’s paper sneaks in context-independent probabilities, and thus doesn’t really address the core question.

  • Abram Demski

    How do the coefficients enter into the story at all? It looks like assumptions (2) and (3) make just as much sense if the coefficients for the two states are different, but if that’s true, then we can derive (1) for the case when the coefficients are different as well… in other words, taken at face value, the argument seems to prove that V[a|x_1> + b|x_2>] = (x_1 + x_2)/2 no matter what ‘a’ and ‘b’ are.

  • Abram Demski

    I revoke my previous question (after actually trying to carry through the math).

  • Michael Weissman

    I should make at least one small correction to my hasty and over-compact note. The background entropy in Mallah’s picture is high, not maximal.

  • Hakon Hallingstad

    Since this article doesn’t explain where the absolute square of the amplitude comes in with Deutsch’s argument (48), I have read his paper which introduces it in equations 16 – 21. However I don’t understand the argument. It would be great if someone could explain why the value of eq. 18 equals the LHS of eq. 16, i.e. why is

    V[|x1>|y1> + ... + |x1>|ym> + |x2>|y_{m+1}> + ... + |x2>|yn>] =
    V[sqrt(m) |x1> + sqrt(n - m) |x2>]

    when y1 + … + ym = y_{m+1} + … + yn = 0? Can this actually be derived or is it an axiom? If the former, it does seem to rely on the state vectors being normalized, which would also need to be postulated?
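    For what it’s worth, here is one way to read that step (my own reconstruction, not a quote from the paper, assuming the |ya> are orthonormal and contribute nothing to the payoff because their values sum to zero within each group): group the n equal-amplitude branches according to which x-eigenvalue they carry,

    |Psi> = |x1>|y1> + ... + |x1>|ym> + |x2>|y_{m+1}> + ... + |x2>|yn>
          = sqrt(m) |x1>|Y1> + sqrt(n - m) |x2>|Y2>,

    where |Y1> = (|y1> + ... + |ym>) / sqrt(m) and |Y2> = (|y_{m+1}> + ... + |yn>) / sqrt(n - m) are normalized. On this reading the square roots appear precisely because |Y1> and |Y2> have been normalized, so the step does seem to rely on working with normalized state vectors.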

  • Hal S

    @47

    Got a copy of Pauli’s book. Good stuff.

    I like this, on the first page, written in 1933:

    “The solution is obtained at the cost of abandoning the possibility of treating physical phenomena objectively, i.e. abandoning the classical space-time and causal description of nature which essentially rests upon our ability to separate uniquely the observer and the observed.”

    Combined with the fact that any bound state can be represented in a quantum field theory, it appears we are getting closer to completely abandoning any notion that general relativity is even needed.

  • http://qpr.ca/blog/ Alan Cooper

    Daryl, thank you for responding (@36 & 38) to my question. Unfortunately I have been away for a few days and so have been slow to respond, but I hope you are still around and following this discussion, as I remain puzzled.

    I have no problem with agreeing that your conditions (1)(2)(3) imply the Born rule (and similarly for Sean’s and David’s similarly numbered equations) but I still don’t see how these are implied by decision theory without essentially assuming the Born rule to start with.

    Yes, the “states” in question are all eigenfunctions for the same observable, but on the two sides of each value equation (other than (1)) they correspond to different eigenvalues so they are not actually the same states.

    In fact, the decision-theoretic increment of value expected from replacing X by X+k, and the sign reversal that comes from replacing X with -X, seem to me to be obvious only if we work with the same state and consider the observable to be what is changing.

    To ask for these to also apply when the operator stays the same but the states are changed seems to involve an implicit assumption that V(a1|x1>+a2|x2>) is a linear combination of x1 and x2 with coefficients p1(a1,a2) and p2(a1,a2) which are independent of |x1> and |x2>. And to me that looks very much like begging the question.

    Is there a way to show (without assuming the usual expectation formula) that V(X+k, |Psi>) = V(X, T(k)|Psi>)?
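    For concreteness, here is a sketch of what that identity amounts to, under the assumption (mine, for illustration only) that T(k) acts on eigenstates as T(k)|xi> = |xi + k>. If |Psi> = a1 |x1> + a2 |x2>, then

    measuring X + k on |Psi>: outcomes x1 + k and x2 + k, with amplitudes a1 and a2
    measuring X on T(k)|Psi>: outcomes x1 + k and x2 + k, with amplitudes a1 and a2

    so the identity holds as soon as one assumes that V depends only on the list of (outcome, amplitude) pairs, which is exactly the kind of assumption being questioned here.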

  • http://qpr.ca/blog/ Alan Cooper

    What seems odd about this business of starting with the Hilbert space and inferring a probabilistic interpretation after the fact is that the Hilbert space itself arises naturally as a way of representing the possible families of probability distributions for observables. In that approach, pioneered by von Neumann and Mackey, and nicely developed and summarized in the books by Varadarajan, the starting point is a lattice of questions (observables with values in {0,1}), and the notion of probability for these seems no less elementary than that of decision-theoretic “value”, since the expected value of a proposition in any state is just the probability that it is observed to be true.

  • Hakon Hallingstad

    Here’s an example where (2) and (3) are consistent with a different probability rule than Born’s.

    Just before measuring the observable X (or “playing the game”, in Deutsch’s terminology), we will scale |psi> such that the sum of the expansion coefficients is 1.
    1. |psi> = a1 |x1> + a2 |x2>
    2. a1 + a2 = 1
    This scaling is not as physically illogical as you might think; for instance, the collapse of the wavefunction can also be viewed as involving a rescaling of the observed eigenvector immediately after/during the observation.

    Let |base> be the sum of all eigenvectors of X.
    3. |base> = |x1> + |x2> + …

    I’m going to show that the following definition of the expected value of the measurement (“payoff”) satisfies (2) and (3) in this article.
    4. V[|psi>] = <base| X |psi>

    Here’s how (2) is satisfied:
    5. V[a1 |x1 + k> + a2 |x2 + k>]
    = a1 (x1 + k) <base|x1 + k> + a2 (x2 + k) <base|x2 + k>
    = a1 x1 + a2 x2 + k
    = V[a1 |x1> + a2 |x2>] + k

    Above, <base|x1 + k> is 1 since |base> is the sum of the eigenvectors and |x1 + k> is an eigenvector. Similarly, (3) is satisfied because:
    6. V[a1 |-x1> + a2 |-x2>]
    = -a1 x1 <base|-x1> - a2 x2 <base|-x2>
    = -(a1 x1 + a2 x2)
    = -V[a1 |x1> + a2 |x2>]

    The main result in the article is then reproduced easily:
    7. V[(|x1> + |x2>) / 2] = (x1 + x2) / 2
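    As a quick numerical sanity check of this value functional (my own sketch, with arbitrary made-up numbers; it just evaluates V[|psi>] = <base| X |psi> = a1 x1 + a2 x2 and tests properties (2), (3), and the equal-coefficient case), in Python:

    import numpy as np

    # Hakon's alternative functional: V[|psi>] = <base| X |psi> = sum_i a_i * x_i,
    # since <base|x_i> = 1 for every eigenvector when the |x_i> are orthonormal.
    def V(coeffs, eigenvalues):
        return sum(a * x for a, x in zip(coeffs, eigenvalues))

    x1, x2 = 1.0, 3.0   # eigenvalues of X (arbitrary)
    a1, a2 = 0.3, 0.7   # coefficients scaled so that a1 + a2 = 1
    k = 2.5

    # (2): shifting every eigenvalue by k shifts the value by k
    assert np.isclose(V([a1, a2], [x1 + k, x2 + k]), V([a1, a2], [x1, x2]) + k)
    # (3): negating every eigenvalue negates the value
    assert np.isclose(V([a1, a2], [-x1, -x2]), -V([a1, a2], [x1, x2]))
    # equal coefficients reproduce the unweighted average (x1 + x2) / 2
    print(V([0.5, 0.5], [x1, x2]))   # 2.0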

  • Hakon Hallingstad

    Let me follow the same arguments in this blog article and Deutsch’s, to prove something other than the Born rule:
    0. V[a1 |x1> + a2 |x2> + …] = |a1| x1 + |a2| x2 + …

    To be able to make the argument we will need to postulate that the state vector should be scaled just prior to the measurement, such that the sum of the absolute values of the probability amplitudes is 1, instead of being normalized.
    1. If |psi> = a1 |x1> + a2 |x2> + …, then |a1| + |a2| + … = 1

    Because of this postulate, instead of (|x1> + |x2>) / sqrt(2) we will use (|x1> + |x2>) / 2, etc. If we now assume the equivalents of equations (2) and (3) from this blog article:
    2’. V[(|-x1> + |-x2>) / 2] = -V[(|x1> + |x2>) / 2]
    3’. V[(|x1 + k> + |x2 + k>) / 2] = V[(|x1> + |x2>) / 2] + k

    we will end up with the equivalent equation, for exactly the same reasons given in this article, since the sqrt(2) never enters the derivation.
    4’. V[(|x1> + |x2>) / 2] = (x1 + x2) / 2

    Let’s move over to Deutsch’s article, and the chapter “The general case”. We first want to prove the equivalent of equation (Deutsch.12):
    12’. V[(|x1> + |x2> + … + |xn>) / n] = (x1 + x2 + … + xn) / n

    The proof is made by induction in two stages. Now I must admit that I don’t understand the first stage, but it doesn’t sound like that will be a problem for my argument (the sqrt(2) is again not used). For the second stage we can use the same arguments of “substitutability”. Let V[|psi1>] = V[|psi2>] = v; then:
    13’. V[(a |psi1> + b |psi2>) / (|a| + |b|)] = v

    If we now set
    14’. |psi1> = (|x1> + … + |x_{n-1}>) / (n - 1), |psi2> = | V[|psi1>] >, a = n - 1, b = 1

    Then (13’) implies:
    15’. (x1 + x2 + … + x_{n-1} + V[|psi1>]) / n = V[|psi1>]

    Note, (15’) is identical to (Deutsch.15).

    Now to the crucial part of Deutsch’s argument. What we want to show, the equivalent of (Deutsch.16), is:
    16’. V[(m |x1> + (n - m) |x2>) / n] = (m x1 + (n - m) x2) / n

    and (Deutsch.17), (Deutsch.18), and (Deutsch.20) are:
    17’. sum_{a = 1}^m |ya> / m or sum_{a = m + 1}^n |ya> / (n - m)
    18’. (sum_{a = 1}^m |x1>|ya> + sum_{a = m + 1}^n |x2> |ya>) / n
    20’. (sum_{a = 1}^m |x1 + ya> + sum_{a = m + 1}^n |x2 + ya>) / n

    Again, we’re allowed to do this according to the postulate, because we’re just about to do a measurement, and then we need to scale such that the sum of the absolute values of the probability amplitudes is 1.

    Equations (Deutsch.19) and (Deutsch.21) are not changed. (Deutsch.22) obviously reads:
    22’. sum_a p_a |x_a>, sum_a p_a = 1

    The next arguments may pose a problem. They’re supposed to show that even though the above results are valid for p_a being rational numbers, they should also apply if p_a is a real number.

    For instance, the unitary transformation is imagined to transform eigenvectors into eigenvectors with higher eigenvalues. The value of the game is then guaranteed to increase. Not so with our postulate, since we need to scale our state vector just prior to a measurement, and in general the scale factor would be different before and after a unitary transformation.

    I’m guessing there is an argument for proving how to extend it to real numbers, but I just don’t see it yet. So for now, we will have to be content with the probability amplitudes being rational numbers.

    The conclusion of all of this is that the normalization of the state vector is crucial for Deutsch’s derivation.
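    To make the contrast concrete, here is a small numerical comparison (my own sketch, with made-up amplitudes) of the Born rule against the |a|-scaling rule used above, in Python:

    import numpy as np

    # One physical ray, written with unnormalized amplitudes for |x1> and |x2>.
    a, b = 1.0, 2.0

    # Born rule: rescale so |a|^2 + |b|^2 = 1 and weight each outcome by |amplitude|^2.
    born = np.array([abs(a)**2, abs(b)**2])
    born /= born.sum()                 # [0.2, 0.8]

    # Alternative rule from this comment: rescale so |a| + |b| = 1 and weight by |amplitude|.
    alt = np.array([abs(a), abs(b)])
    alt /= alt.sum()                   # [0.333..., 0.666...]

    print(born, alt)                   # the two rules agree only when |a| = |b|

    The two prescriptions coincide in the equal-amplitude case that the 50/50 argument rests on, which is why the choice of scaling postulate has to do the remaining work.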

  • DanW

    @Hakon Hallingstad:

    Your reasoning here is, I’m afraid, totally bogus. I’m not trying to be nasty, and I’m sure you won’t take it as such since you seem in other posts to be pretty keen on learning properly how to do these things. One particular error I can spot:

    “For instance, the unitary transformation is imagined to transform eigenvectors into eigenvectors with higher eigenvalues.”

    In what follows, m* = “Hermitian conjugate of m”, not “times by” :-) .

    Unitary operators have eigenvalues of magnitude 1. To see this, consider that the definition of a unitary operator is that its inverse is equal to its Hermitian conjugate.

    U*U = UU* = 1 by the definition of unitarity.
    If U|a> = m |a>, this implies <a| U* = <a| m*,
    but <a| U*U |a> = <a|a> by the unitarity definition.
    From above,
    <a| U*U |a> = m m* <a|a>, hence m m* = 1.
    This means that the magnitude of m is 1. So you can’t have a “unitary transformation” that makes “the eigenvalues higher”. It is a contradiction in terms.
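    A quick numerical illustration of the same point (my own sketch; it builds a random unitary via a QR factorization and checks that its eigenvalues all have modulus 1), in Python:

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
    Q, _ = np.linalg.qr(A)            # QR of a random complex matrix gives a unitary Q
    eigvals = np.linalg.eigvals(Q)
    print(np.abs(eigvals))            # all entries are ~1.0, as the argument above requires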

    cheers

  • Hakon Hallingstad

    @DanW

    > your reasoning here is, I’m afraid, totally bogus. [...]

    Right, assumption (61.1) does not hold in Quantum Mechanics proper. I’m interested in knowing about other flaws you can point out, and to see whether those flaws can also be applied to Deutsch’s original arguments.

    > One particular error I can spot. [...]

    I was too careless with my choice of words, so you misunderstood me. I was only trying to refer to Deutsch’s argument on page 12, where for instance he says “Now, if U transforms each eigenstate |xa> of X appearing in the expansion of |psi> to an eigenstate |xa’> with higher eigenvalue.” See there for details.
