## Statistical Superpowers


### Statistical Superpowers

Suppose you are exposed to nuclear radiation. You must be tested to find out whether you have either cancer or superpowers.

Only 10 out of 100 people exposed in this way develop superpowers.

The test for superpowers is 90% reliable in the sense that:

If 100 superheroes were tested then 90 of them would get a positive result
If 100 irradiated cancer sufferers were tested then 90 of them would get a negative result

You get tested and the result is positive.

What is the likelihood that you are now a superhero?
genemachine
Member

Posts: 166
Joined: 01 Apr 2005

### Re: Statistical Superpowers

50% ?

BioWizard

Posts: 12071
Joined: 24 Mar 2005
Location: United States

### Re: Statistical Superpowers

BioWizard wrote:50% ?

No credit. The "?" shows a lack of confidence, and you did not show your work. :p

But you are correct. Out of 100 cases, you will get 9 true positive results (90% of 10) and 9 false positive results (10% of 90), so exactly half of the positives are real.

I can actually make your correct answer seem even less likely for those who don't do the math, by making the accuracy of the test higher, but lowering the prevalence rate.

If 5% of a population become superheroes, and a test for superheroism is 95% accurate, what is the chance that a person from the population who tests positive is actually a superhero?

50%. But that seems really WRONG for a test that is 95% accurate. A coin flip, really? Run the numbers and yep, really. The 95% who are non-superheroes test positive 5% of the time (4.75% of the population). The 5% who are superheroes test positive 95% of the time (also 4.75%). Equal odds on either side.

For the general case:

http://en.wikipedia.org/wiki/Positive_predictive_value
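
For anyone who would rather check both versions by machine, here is a minimal Python sketch of the general case. The function and parameter names are illustrative and not from the linked page, and it assumes "accuracy" means the test is right at the same rate for superheroes and non-superheroes alike.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Return P(condition | positive test) using Bayes' rule.

    prevalence  -- fraction of the population with the condition
    sensitivity -- P(positive | condition)
    specificity -- P(negative | no condition)
    """
    true_positives = prevalence * sensitivity
    false_positives = (1.0 - prevalence) * (1.0 - specificity)
    return true_positives / (true_positives + false_positives)


# Original puzzle: 10% prevalence, test right 90% of the time either way.
original = positive_predictive_value(0.10, 0.90, 0.90)

# Harder-looking version: 5% prevalence, test right 95% of the time.
rarer = positive_predictive_value(0.05, 0.95, 0.95)

print(f"original puzzle: {original:.3f}")
print(f"5% / 95% version: {rarer:.3f}")
```

Both calls come out at one half, because in each case the prevalence equals the error rate of the test.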

Ursa Minimus
Member

Posts: 605
Joined: 05 Feb 2012
Location: Northwoods, USA

### Re: Statistical Superpowers

But what does cancer have to do with the likelihood of being a superhero? This is what I find confusing. Are we to assume that all those exposed develop either cancer or superpowers, and none develop both? This is not stated.

Shouldn't the question read as follows?

Suppose you are exposed to nuclear radiation. You must be tested to find out whether you have superpowers or not.

Only 10 out of 100 people exposed in this way develop superpowers.

The test for superpowers is 90% reliable in the sense that:

If 100 superheroes were tested then 90 of them would get a positive result
If 100 non-superheroes were tested then 90 of them would get a negative result

You get tested and the result is positive.

What is the likelihood that you are now a superhero?
Positor
Active Member

Posts: 1090
Joined: 05 Feb 2010

### Re: Statistical Superpowers

Ursa Minimus wrote:
BioWizard wrote:50% ?

No credit. The "?" shows a lack of confidence, and you did not show your work. :p

My work was scribbled amongst my notes and I was too lazy to copy it here, so I just posted the answer. The "?" was to keep the thread rolling and others guessing.

BioWizard

Posts: 12071
Joined: 24 Mar 2005
Location: United States

### Re: Statistical Superpowers

I think Positor is correct:
my familiarity with the English language is not sufficient to make me sure that "either cancer or superpowers" strictly means nobody gets both and nobody gets neither.

The quiz is correct only if "either or" intrinsically implies XOR, but even in that case, it would not be bad to explicitly say that nobody can have both and nobody can be free of either effect.

neuro
Forum Moderator

Posts: 2624
Joined: 25 Jun 2010
Location: italy

### Re: Statistical Superpowers

neuro wrote:I think Positor is correct:
my familiarity with the English language is not sufficient to make me sure that "either cancer or superpowers" strictly means nobody gets both and nobody gets neither.

The quiz is correct only if "either or" intrinsically implies XOR, but even in that case, it would not be bad to explicitly say that nobody can have both and nobody can be free of either effect.

I felt that the sentence below made it clear that it's either/or.

genemachine wrote:Suppose you are exposed to nuclear radiation. You must be tested to find out whether you have either cancer or superpowers.

But I guess there's no harm in extra clarification, just to be sure.

BioWizard

Posts: 12071
Joined: 24 Mar 2005
Location: United States

### Re: Statistical Superpowers

Bayes to the rescue!

Prior probability of being super-hero: P(Super) = .1

Likelihood of testing positive for Superheroitus:
P(Positive) = P(Positive|Super) * P(Super) + P(Positive|Not Super) * P(Not Super)
= .9 * .1 + .1 * .9
= .18

Hence, probability of being super given you tested positive:
P(Super | Positive) = P(Positive | Super) * P(Super) / P(Positive)
= .9 * .1 / .18
= .50

Hooray for BioWizard!
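
If the algebra still feels like a trick, a brute-force check may help. This is a rough Monte Carlo sketch in Python under the same assumptions as the puzzle (10% become superheroes, the test is right 90% of the time for everyone). The function name, sample size, and seed are arbitrary choices for illustration.

```python
import random


def simulate(n_people=200_000, prevalence=0.10, accuracy=0.90, seed=0):
    """Estimate P(Super | Positive) by simulating irradiated people."""
    rng = random.Random(seed)
    positives = 0
    super_and_positive = 0
    for _ in range(n_people):
        is_super = rng.random() < prevalence
        test_is_correct = rng.random() < accuracy
        # A correct test reports the truth; an incorrect one reports the opposite.
        tests_positive = is_super if test_is_correct else not is_super
        if tests_positive:
            positives += 1
            super_and_positive += is_super
    return super_and_positive / positives


estimate = simulate()
print(f"simulated P(Super | Positive) is roughly {estimate:.3f}")
```

The estimate should land close to the .50 derived above, with some sampling noise that shrinks as n_people grows.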

xcthulhu
Resident Member

Posts: 2156
Joined: 14 Dec 2006
Location: Cambridge, MA

### Re: Statistical Superpowers

Well done to everyone who got this right, especially xcthulhu and Ursa Minimus. Sorry for any confusion caused by my rephrasing of the original question.

This question was recently used to estimate science literacy and only 3% of people got it right.

The phrasing used then was:

"Suppose you have a close friend who has a lump in her breast and must have a mammogram. Of 100 women like her, 10 of them actually have a malignant tumor and 90 of them do not. Of the 10 women who actually have a tumor, the mammogram indicates correctly that 9 of them have a tumor and indicates incorrectly that 1 of them does not have a tumor. Of the 90 women who do not have a tumor, the mammogram indicates correctly that 81 of them do not have a tumor and indicates incorrectly that 9 of them do have a tumor. The table below summarizes all of this information. Imagine that your friend tests positive (as if she had a tumor), what is the likelihood that she actually has a tumor?"

Also, supposedly, only 15% of doctors get the right answer to this variant which is a bit harder:

"1% of women at age forty who participate in routine screening have breast cancer. 80% of women with breast cancer will get positive mammographies. 9.6% of women without breast cancer will also get positive mammographies. A woman in this age group had a positive mammography in a routine screening. What is the probability that she actually has breast cancer?"
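
Spoiler for the harder variant: a short Python sketch using exact fractions, with the three numbers taken straight from the quoted wording. The variable names are illustrative only.

```python
from fractions import Fraction

prevalence = Fraction(1, 100)               # 1% have breast cancer
sensitivity = Fraction(80, 100)             # 80% of those test positive
false_positive_rate = Fraction(96, 1000)    # 9.6% without cancer test positive

true_positives = prevalence * sensitivity
false_positives = (1 - prevalence) * false_positive_rate

# P(cancer | positive) = true positives / all positives
posterior = true_positives / (true_positives + false_positives)

print(posterior)
print(f"{float(posterior):.1%}")
```

The exact value is 25/322, a little under 8%, which is far below what intuition suggests for a test that catches 80% of cancers.
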
genemachine
Member

Posts: 166
Joined: 01 Apr 2005

### Re: Statistical Superpowers

BioWizard wrote:My work was scribbled amongst my notes and I was too lazy to copy it here, so I just posted the answer. The "?" was to keep the thread rolling and others guessing.

I understand your intent was to leave things open, and I assumed you did run the numbers. I wonder if you understand my intent by posting " :P " was to indicate that the words preceding the emoticon were to be taken less than seriously?

While I am all for puzzles and letting people thrash with them...

genemachine wrote:This question was recently used to estimate science literacy and only 3% of people got it right.

...

Also, supposedly, only 15% of doctors get the right answer to this variant which is a bit harder:

... in this case, people should know how these things work, but they don't. Doctors should know, but they don't. It is a health issue that most people will face at some time, especially with cancer screenings (for an example, look at the recent issue with screening tests for prostate cancer; search PSA+testing+controversy to see the details). I think it is very important that people know how to rationally face their loved ones' and their own health challenges. Which is why I showed the work, gave another example of why people get these wrong using "common" sense, and provided a link to the general PPV equation for all to see and use as they need in the future.

Ursa Minimus
Member

Posts: 605
Joined: 05 Feb 2012
Location: Northwoods, USA

### Re: Statistical Superpowers

xcthulhu wrote:Bayes to the rescue!

I believe this is particularly appropriate!

The point is that the a priori probability of being a superhero is 10%, rather low.
The a posteriori probability (given the positive test) is 50%: a quite relevant difference!

This is what our mind implicitly captures before we do any computation: it makes us feel that the correct answer, 50%, is a bad score, and that the probability of being a superhero, given the positive test, should be higher... The point is that 50% is by no means a bad score, given we only had 10% before!

We should stress that the probability is much (5-fold) higher now, after the positive test. This would make everything more acceptable, intuitively.
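
One way to make the "how much did the test move us" view explicit is the odds form of Bayes' rule: posterior odds = prior odds × likelihood ratio. A minimal Python sketch, with names chosen only for illustration:

```python
from fractions import Fraction


def posterior_odds(prior, sensitivity, false_positive_rate):
    """Odds of the condition after one positive test (odds form of Bayes' rule)."""
    prior_odds = prior / (1 - prior)
    likelihood_ratio = sensitivity / false_positive_rate
    return prior_odds * likelihood_ratio


prior = Fraction(1, 10)
odds = posterior_odds(prior, Fraction(9, 10), Fraction(1, 10))
probability = odds / (1 + odds)
fold_increase = probability / prior

print(f"posterior odds {odds}, probability {probability}, {fold_increase}-fold increase")
```

The prior odds of 1:9 are multiplied by a likelihood ratio of 9, giving even odds: the same 50%, and a 5-fold jump from the starting 10%.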

neuro
Forum Moderator

Posts: 2624
Joined: 25 Jun 2010
Location: italy

### Re: Statistical Superpowers

I understand statistics is an important tool for policymakers, but it doesn't help all that much when it fails to respond to what the patient wants to know. Do I (or did I) have prostate cancer or not? I tested marginally positive for it on the basis of a PSA test (around 4). I went for a biopsy and it came back with a fairly high score. I was then diagnosed as having prostate cancer. I did see pictures from the original film, and I suppose it can look scary, but it's not the picture of a tumor or set of tumors so I'm not sure what I'm looking at. I asked for a second opinion, but the second opinion was based on the film of the same test, though it reduced the aggressiveness value slightly (and if taken seriously would have put it into a lesser category, but it wasn't taken seriously by my oncologist). Now, since I'm at the mercy of all these tests and their diagnoses, and am not able to confirm them on the basis of how I feel or on any other symptoms I might have, I'm left without any real assurances on my condition. The doctors base their diagnosis on statistics (at least in how they represent their responses when questioned), so I don't quite know whether to trust that I'm an exception or the rule. The logic of a statistical account doesn't tell me what I want to know. As such, whenever I respond to a questionnaire, or questions about my past medical history, I can't quite come to grips with the question they ask me when they ask me if I've had cancer. I can only say I've been diagnosed with (and treated for) cancer. (I even have a problem with the diagnosis of lung cancer that presumably killed my mother, even though I saw the large tumor on the X-ray, one which I feel confident was the result of long years of smoking, though she had quit in later years and was subsequently given a clean reading of her lung. What I saw on the screen was not a tumor in the lung itself, but it seemed attached to a region in her anatomy adjacent to the vertebrae at or near the place where you put your hand for CPR. But what can I say, I have no real experience in reading such charts, nor much in the way of anatomy either.)

James
owleye
Honored Member

Posts: 5685
Joined: 19 Sep 2009

### Re: Statistical Superpowers

Ursa Minimus wrote:
BioWizard wrote:My work was scribbled amongst my notes and I was too lazy to copy it here, so I just posted the answer. The "?" was to keep the thread rolling and others guessing.

I understand your intent was to leave things open, and I assumed you did run the numbers. I wonder if you understand my intent by posting " :P " was to indicate that the words preceding the emoticon were to be taken less than seriously?

Sorry, I'm impervious to emoticons.

BioWizard

Posts: 12071
Joined: 24 Mar 2005
Location: United States

### Re: Statistical Superpowers

Does anyone here find Bayes' rule intuitive and apply it by default on such a problem?

I solve these problems by starting with the priors

(A = superpowers)
P(A) = 0.1
P(~A) = 0.9

Add in the test result (B = positive test result) to get the joint probabilities

P(A and B) = 0.1*0.9 = 0.09
P(A and ~B) = 0.1*0.1 = 0.01
P(~A and B) = 0.9*0.1 = 0.09
P(~A and ~B) = 0.9*0.9 = 0.81

We know we've got a positive test result, so we're in either

P(A and B) = 0.1*0.9 = 0.09
or
P(~A and B) = 0.9*0.1 = 0.09

So that's 1:1, 50%
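
The same four-cell table can be built and conditioned on mechanically. This is only a sketch of the table method above in Python, with exact fractions so the cells match the hand-worked values. The dictionary layout and variable names are illustrative.

```python
from fractions import Fraction

p_a = Fraction(1, 10)              # P(A): superpowers
p_b_given_a = Fraction(9, 10)      # P(B | A): positive test if super
p_b_given_not_a = Fraction(1, 10)  # P(B | ~A): positive test if not super

# Joint probability of every (status, test result) combination.
joint = {
    ("A", "B"): p_a * p_b_given_a,
    ("A", "~B"): p_a * (1 - p_b_given_a),
    ("~A", "B"): (1 - p_a) * p_b_given_not_a,
    ("~A", "~B"): (1 - p_a) * (1 - p_b_given_not_a),
}
assert sum(joint.values()) == 1  # the four cells cover every case

# Keep only the cells consistent with a positive test, then renormalise.
positive_cells = {key: p for key, p in joint.items() if key[1] == "B"}
posterior = joint[("A", "B")] / sum(positive_cells.values())

for key, p in joint.items():
    print(key, float(p))
print("P(A | B) =", posterior)
```
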
genemachine
Member

Posts: 166
Joined: 01 Apr 2005

### Re: Statistical Superpowers

owleye wrote:I understand statistics is an important tool for policymakers, but it doesn't help all that much when it fails to respond to what the patient wants to know. Do I (or did I) have prostate cancer or not? I tested marginally positive for it on the basis of a PSA test (around 4). I went for a biopsy and it came back with a fairly high score. I was then diagnosed as having prostate cancer. I did see pictures from the original film, and I suppose it can look scary, but it's not the picture of a tumor or set of tumors so I'm not sure what I'm looking at. I asked for a second opinion, but the second opinion was based on the film of the same test, though it reduced the aggressiveness value slightly (and if taken seriously would have put it into a lesser category, but it wasn't taken seriously by my oncologist). Now, since I'm at the mercy of all these tests and their diagnoses, and am not able to confirm them on the basis of how I feel or on any other symptoms I might have, I'm left without any real assurances on my condition. The doctors base their diagnosis on statistics (at least in how they represent their responses when questioned), so I don't quite know whether to trust that I'm an exception or the rule. The logic of a statistical account doesn't tell me what I want to know. As such, whenever I respond to a questionnaire, or questions about my past medical history, I can't quite come to grips with the question they ask me when they ask me if I've had cancer. I can only say I've been diagnosed with (and treated for) cancer. (I even have a problem with the diagnosis of lung cancer that presumably killed my mother, even though I saw the large tumor on the X-ray, one which I feel confident was the result of long years of smoking, though she had quit in later years and was subsequently given a clean reading of her lung. What I saw on the screen was not a tumor in the lung itself, but it seemed attached to a region in her anatomy adjacent to the vertebrae at or near the place where you put your hand for CPR. But what can I say, I have no real experience in reading such charts, nor much in the way of anatomy either.)

James

James,

Are you using yourself as an example, or do you still actually have concerns? I'll assume an example, but if you have concerns, feel free to send me a private message and we can talk in as much detail as you are comfortable with. I am not an MD, but I have some experience wading through medical records from my research. Also, there are some huge problems in communication that result from a variety of things, like medical providers not listening very well, ignoring emotional cues. One might even say that some of them are "impervious" to the concerns being expressed by verbal and non-verbal means because they are not focused on trying to understand what is TRYING to be said. There are ways to get around that if you run into it, but you have to be aware it might happen and ready for it. Not something most are in a mind to do when dealing with a diagnosis of cancer. What you describe sounds like it is a communication issue to me, but that would require a lot of details to get into. Again, maybe not a thing you want to do in the thread.

But for the diagnosis process itself....

Keep in mind that any other symptoms are also part of a diagnosis. If these symptoms presented before the test, then that would make me think it is more likely to be an accurate result than if one was symptom free.

Biopsy means they took a physical sample. I would NEVER accept a cancer diagnosis without a biopsy, just based on film. Tumor, sure, malignant tumor, no.

That sample is then tested. The first test is to look at it, both without magnification (called a "gross examination") and with. The techs will look at it before sending it to the lab. In my experience, they will tell you something before you leave, in one of two forms:

"It looked fine, and we will send it to the lab and the doctor will call you when the results are in."

"The procedure went fine, and we will send it to the lab and the doctor will call you when the results are in."

The second version means it looked malignant. If I heard the first version THEN got a positive lab result, I might do more than get a second opinion. I might get a second biopsy.

While the PSA test is a statistical one (in terms of the false positive rate), the actual examination involved non-statistical methods. So if one wants to check the work, one should find the actual descriptions of what the tissue looked like in one's medical records (pathology reports, specifically). If it looked bad, and the numbers say it was cancer, then I personally would rest assured that I had cancer.

Is there a chance I would not have it? Yes, lab errors like sample switching can take place. A second biopsy would take that down to negligible levels.

I will not go into all the ways things can be screwed up in this process. But if you want one thought that might help you feel better, if there was a mistake in your case, you are unlikely to die from it at this point. A false negative, yes, but not a false positive.

And as I said, if you had symptoms, if you heard the tech say "procedure went well", if you heard that it looked cancerous to a visual inspection (or find that out now, you can look at your own medical records and check), then you almost certainly had cancer and should act accordingly.

Now, the real problem for many people is that they get a positive diagnosis, but are told it is a slow growing cancer. Is that right? Maybe, maybe not. But with proper follow up testing over time, as multiple tests lower the uncertainty, one can figure it out pretty well. So long as the first follow up test is soon enough of course.

I hope that helps.

Ursa Minimus
Member

Posts: 605
Joined: 05 Feb 2012
Location: Northwoods, USA

### Re: Statistical Superpowers

Well... people don't naturally use Bayes' Rule when trying to judge likelihoods - there's a bunch of papers by Nobel Prize winner Prof. Daniel Kahneman on this.

I thought I'd give another one, which is a little easier to accept (from an actuarial exam):

A survey of a group’s viewing habits over the last year revealed the following
information:
(i) 28% watched gymnastics
(ii) 29% watched baseball
(iii) 19% watched soccer
(iv) 14% watched gymnastics and baseball
(v) 12% watched baseball and soccer
(vi) 10% watched gymnastics and soccer
(vii) 8% watched all three sports.

Calculate the percentage of the group that watched none of the three sports
during the last year.

xcthulhu
Resident Member

Posts: 2156
Joined: 14 Dec 2006
Location: Cambridge, MA

### Re: Statistical Superpowers

xcthulhu,
genemachine
Member

Posts: 166
Joined: 01 Apr 2005

### Re: Statistical Superpowers

xcthulhu,

Also, good question. Does Kahneman have anything to say on how we solve these questions? I used a Venn diagram of sorts.
genemachine
Member

Posts: 166
Joined: 01 Apr 2005

### Re: Statistical Superpowers

genemachine wrote:xcthulhu,

Also, good question. Does Kahneman have anything to say on how we solve these questions? I used a Venn diagram of sorts.

I don't know what Kahneman would say, but in the case of exam questions there is often a short cut that will save a lot of time over crunching the numbers. Because exams test speed, and also insight for problem solving. No need to do the math if you can game the people who make up the questions!

I won't reveal it, but I see one shortcut that makes this a 2 second question. Either that, or I am making a rookie mistake.

Ursa Minimus
Member

Posts: 605
Joined: 05 Feb 2012
Location: Northwoods, USA

### Re: Statistical Superpowers

Ursa Minimus wrote:I won't reveal it, but I see one shortcut that makes this a 2 second question. Either that, or I am making a rookie mistake.

I imagine you refer to:
watching some sport:
(i)+(ii)+(iii) - (iv)[counted twice] - (v)[idem] - (vi)[idem] + (vii)[subtracted twice] = 48%
not watching any = 52%

Obviously you can write down the full equation:
p0 = 1-p(1)-p(2)-p(3)+p(1&2)+p(1&3)+p(2&3)-p(1&2&3),
where p(1) means watch sport1, p(1&2) means watch sports 1 and 2 etc.
p0 = (100-28-29-19+14+12+10-8)% = 52%,

But it is of interest to observe that you get intuitively to the solution if you simply reverse the order of the statements:
(vii) 8% watched all three sports.
(vi) 10% watched gymnastics and soccer, ... then only 2% did so and did not watch baseball
(v) 12% watched baseball and soccer, ... then only 4% did so and did not watch gymnastics
(iv) 14% watched gymnastics and baseball, ... then only 6% did so and did not watch soccer
(iii) 19% watched soccer, 8+2+4% also watched other sports: ... 5% watched soccer only
(ii) 29% watched baseball, 8+6+4% also watched other sports: ... 11% watched baseball only
(i) 28% watched gymnastics, 8+2+6% also watched other sports: ... 12% watched gymnastics only
8+2+4+6+5+11+12 = 48 -->> 52% did not watch any sport
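
For completeness, the inclusion-exclusion sum can be written so that it works for any number of overlapping groups. A minimal Python sketch using the survey percentages from the question. The dictionary keyed by frozenset and the function name are just one illustrative way to lay it out.

```python
from itertools import combinations

# Percentage of the group that watched every sport in the given set.
watched = {
    frozenset({"gymnastics"}): 28,
    frozenset({"baseball"}): 29,
    frozenset({"soccer"}): 19,
    frozenset({"gymnastics", "baseball"}): 14,
    frozenset({"baseball", "soccer"}): 12,
    frozenset({"gymnastics", "soccer"}): 10,
    frozenset({"gymnastics", "baseball", "soccer"}): 8,
}
sports = ["gymnastics", "baseball", "soccer"]


def watched_at_least_one(watched, sports):
    """Inclusion-exclusion: add single sets, subtract pairs, add triples, ..."""
    total = 0
    for size in range(1, len(sports) + 1):
        sign = 1 if size % 2 == 1 else -1
        for group in combinations(sports, size):
            total += sign * watched[frozenset(group)]
    return total


any_sport = watched_at_least_one(watched, sports)
none = 100 - any_sport
print(f"watched at least one sport: {any_sport}%")
print(f"watched none: {none}%")
```

This agrees with both hand calculations above: 48% watched something, 52% watched nothing.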

neuro
Forum Moderator

Posts: 2624
Joined: 25 Jun 2010
Location: italy

### Re: Statistical Superpowers

genemachine wrote:xcthulhu,

Also, good question. Does Kahneman have anything to say on how we solve these questions? I used a Venn diagram of sorts.

Well, most people get the problem wrong - this is known as the base rate fallacy.

Kahneman and Tversky argue in "Subjective probability: A judgment of representativeness" (1972) that when reasoning about this sort of problem, we appeal to a representativeness heuristic. They define "representativeness" as "the degree to which [an event] (i) is similar in essential characteristics to its parent population, and (ii) reflects the salient features of the process by which it is generated".

xcthulhu
Resident Member

Posts: 2156
Joined: 14 Dec 2006
Location: Cambridge, MA

### Re: Statistical Superpowers

neuro wrote:
But it is of interest to observe that you get intuitively to the solution if you simply reverse the order of the statements...

That's the ticket.

It is common for people to process the information in such questions in the order they read the information. Many times on exams, later information can be the surest key to the problem. See the key, answer faster. This is intentional on the part of test designers, especially for something like an actuarial exam, where everyone taking it is assumed to be good at math. Test speed, test efficiency, hopefully test insight, on top of knowledge.

Ok, 2 seconds to see the answer path, longer to actually do the calcs (which I did not do). I'm not quite that fast I admit. But it seems my test question breakdown mojo hasn't faded over the decades of not being used :)

Ursa Minimus
Member

Posts: 605
Joined: 05 Feb 2012
Location: Northwoods, USA