## Bayesian statistics and Bell’s inequalities

I was surfing the web and stumbled across a fascinating example of the application of Bayesian statistics that I thought had some pedagogical power to it. The original post, which is self-admittedly excruciating, is here. In any case, here’s the data for the example:

1.) 1% of women over the age of 40 who participate in routine screening have breast cancer

2.) 80% of women with breast cancer will get positive mammographies

3.) 9.6% of women *without* breast cancer also get positive mammographies

Question: A woman in this age group has a positive mammography. What is the likelihood she has cancer?

Apparently most doctors get the answer to this question wrong. Perhaps surprisingly (depending upon how your brain naturally processes these statistics), most doctors answer that the likelihood that the woman has breast cancer is somewhere between 70% and 80%. Before I show you the correct answer, I will note that only 15% of doctors actually get this right as it is worded above. If it is instead worded as follows, 46% of doctors get it correct:

1.) 100 out of 10,000 women over the age of 40 who participate in routine screening have breast cancer

2.) 80 of 100 women with breast cancer will get a positive mammography

3.) 950 of 9900 women *without* breast cancer will also get positive mammographies

Question: A woman in this age group has a positive mammography. What is the likelihood she has cancer?

The correct answer is 7.8%. How do you get that result? Well, the total number of women who get *positive* mammographies is 950 + 80 = 1030 (notice that it doesn’t really matter how many get a mammography at all, just how many who had one got a positive result). But only 80 of them actually turned out to have cancer. Therefore:

Pr(cancer if MMO +) = 80/1030 = 0.07767 or 7.8%
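The same number falls out of Bayes' theorem applied directly to the percentages. Here is a minimal sketch in Python; the function and variable names are my own, and the inputs are just the three figures quoted above:

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(cancer | positive mammography) via Bayes' theorem."""
    true_pos = prior * sensitivity                     # has cancer AND tests positive
    false_pos = (1.0 - prior) * false_positive_rate    # no cancer AND tests positive
    return true_pos / (true_pos + false_pos)

# Percentage wording: 1% prevalence, 80% detection rate, 9.6% false positives.
from_percentages = posterior(0.01, 0.80, 0.096)

# Natural-frequency wording: 80 true positives, 950 false positives.
from_counts = 80 / (80 + 950)

print(f"{from_percentages:.4f} {from_counts:.4f}")
```

The two values differ slightly in the fourth decimal place only because 950 is a rounded version of 9.6% of 9,900.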

What’s the key to Bayesian statistics? The key is prior knowledge. Bayesian probabilities can easily be modified if the given information changes. In a sense this is because there is some correlation or link between certain quantities.

The way in which a typical probability is interpreted is as a measure of how frequent an event is. So if you roll a pair of dice 10 times in a row and come up with a roll of four more times than any other roll, you might be tempted to think that four is the most likely roll on any pair of dice, which is patently false (theoretically a roll of seven is the most likely). This is the frequentist interpretation of probabilities. The Bayesian interpretation assigns probabilities to propositions that are uncertain, since a probability is in some sense a measure of the degree of certainty. Certainly there are plenty of instances when the two give the same result, but there are often cases where they do not.

In the analysis above it is important not to care about frequencies, but rather just the exact data for the given situation. In the dice example the number of rolls wouldn’t make a difference. Rather, a Bayesian analysis might look at the full situation and make an argument from that. (In that sense I argue that the idea of counting microstates and macrostates in order to determine probabilities is Bayesian, since it has nothing to do with how frequently an event *occurs* but rather how many possible combinations are available, that is to say the amount of knowledge one has.)
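The counting argument for the dice can be written out explicitly. This is only a sketch, assuming two fair six-sided dice so that all 36 ordered outcomes are equally likely; the variable names are mine:

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate all 36 equally likely (die1, die2) outcomes and group them by total.
ways = Counter(a + b for a, b in product(range(1, 7), repeat=2))
probabilities = {total: Fraction(count, 36) for total, count in sorted(ways.items())}

# The total reachable by the most combinations is the most probable one.
most_likely = max(probabilities, key=probabilities.get)
print(most_likely, probabilities[most_likely])  # 7 1/6
print(probabilities[4])                         # 1/12
```

No rolls are needed here: the probabilities come purely from how many combinations produce each total.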

What does this have to do with Bell’s inequalities? Well, in Wigner’s derivation of Bell’s inequalities he *clearly* uses the frequentist approach to probabilities. Are Bell’s inequalities inherently frequentist then? Not necessarily, since it is quite clear that one could consider even the Wigner form and assume that information about two of the systems independently depends on (or can be informationally updated from) a third. Plenty of authors have considered this point of view, but the details are beyond this current post.

**Note to my students:** Think you’ve found an elusive macroscopic violation of (A, not B) + (B, not C) ≥ (A, not C)? Post it here!
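For anyone hunting, it helps to see why ordinary collections of objects can never violate the counting inequality (A, not B) + (B, not C) ≥ (A, not C). The sketch below assumes every member of a population has a definite yes/no value for each of A, B and C, and represents a member as a tuple of three booleans; the function names are my own:

```python
from itertools import product

def counts(population):
    """Return N(A, not B), N(B, not C), N(A, not C) for (a, b, c) triples."""
    a_not_b = sum(1 for a, b, c in population if a and not b)
    b_not_c = sum(1 for a, b, c in population if b and not c)
    a_not_c = sum(1 for a, b, c in population if a and not c)
    return a_not_b, b_not_c, a_not_c

def satisfies_inequality(population):
    a_not_b, b_not_c, a_not_c = counts(population)
    return a_not_b + b_not_c >= a_not_c

# Each of the 8 possible member types satisfies the inequality on its own,
# and counts simply add, so any population built from them satisfies it too.
member_types = list(product([False, True], repeat=3))
all_types_ok = all(satisfies_inequality([t]) for t in member_types)
print(all_types_ok)  # True
```

A violation would therefore require dropping the assumption that all three properties have definite values at once.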

September 14, 2007 at 8:28 am

Hi,

take a look at these two articles:

1) The conjunction fallacy and interference effects, http://xxx.lanl.gov/pdf/0708.3948

2) The inverse fallacy and quantum formalism, http://xxx.lanl.gov/pdf/0708.2972

They show how the quantum formalism implies violations of Bayes’ theorem, and these violations correspond to known heuristics of cognitive science.

September 14, 2007 at 11:56 pm

Interesting. I will have to look at these. There is an entire subset of quantum theorists who are adherents to a Bayesian interpretation of quantum probabilities (e.g. Chris Fuchs). Have you been in touch with any of these folks? The Bayesian formalism has certain attractions for quantum probabilities so perhaps there is a middle ground that might be found. I’ll have to read your papers though.

On the other hand, a pseudo-Bayesian interpretation does not necessarily require Bayes’ theorem in its existing form. It simply requires a belief in the “updating” ability of successive probabilities. For instance, I just had a discussion with two of my colleagues about the lottery. In theory, every drawing is completely independent of every other drawing.

But let’s simplify things for a moment and consider a “lottery” consisting of just two numbers where these numbers can vary between 1 and 6 (i.e. a pair of dice). Let’s say I choose the numbers 1 and 1 (snake eyes). If I keep playing those numbers over and over again, eventually I’ll win since a pair of dice will, over the long run, exhibit the typical distribution expected. In fact we’ve been collecting data from a probability lab we run in our classes in which we have assembled thousands of rolls over many years and the distribution matches what you’d expect.
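The "eventually I'll win" part can be made quantitative. A rough sketch, assuming fair dice and independent rolls, so that snake eyes has probability 1/36 on each play (the names below are mine):

```python
from fractions import Fraction

P_SNAKE_EYES = Fraction(1, 36)  # one favourable outcome out of 36

def chance_of_at_least_one_win(plays):
    """P(snake eyes shows up at least once in `plays` independent rolls)."""
    return 1 - (1 - P_SNAKE_EYES) ** plays

# The chance of having won at least once grows towards 1 as the plays pile up.
for plays in (1, 36, 100, 500):
    print(plays, round(float(chance_of_at_least_one_win(plays)), 3))
```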

So, if you scale it up, by playing the same numbers repeatedly, you should in theory increase your chances of winning since those numbers will come up at some point (though it may be thousands of years after you’re dead). On the other hand, playing different numbers each time is like tossing darts at a moving target. In theory, the probability of having your number come up is the same either way, but since that probability is based on outcomes, previous outcomes can be used to “update” this probability. To me, that’s the Bayesian interpretation. Of course, this is a seriously subtle concept and some people might not think there’s really any difference.

Anyway, thanks for the post and I’ll have to look at those articles.