An acquaintance recently mentioned that "20/20 or one of those news shows just did a
piece on the use of hair testing. They sent a standard poodle's hair in for testing to 4 or 5 labs
and got different results from each. I have come to the educated conclusion that hair testing is a
great farce. But for some it might lead to being treated by just the thing that their body needs. But
don't think it would be due to the accuracy of testing, just luck!"
The following is some information to help anyone, including medical professionals,
understand why diagnostic tests vary so widely in accuracy. That accuracy
depends on a variety of factors.
Results from most diagnostic tests, not just hair testing, will differ
from laboratory to laboratory. There are several reasons for this, ranging from the
training and skill of the lab technicians, to proper test set-up protocol (for example, when
analyzing hair samples, which portion of the hair shaft the sample came from is critical), to -
as 20/20 or whoever was trying to point out - the actual ability of the test to measure what it is
supposed to measure.
We know of labs in Wisconsin that were given awards when their accuracy rate
(that is, their rate of correctly carrying out tests from start to finish) was 97%, and I would
guess that nationwide 97% is considered an 'award-winning' rate of accuracy for correctly
performing diagnostic tests.
But this whole issue raises a point that is key to understanding the
accuracy of any diagnostic test: even with the best training and perfect protocols, laboratory
tests do not yield simple yes/no answers. The accuracy of these tests is derived from statistical
distributions. Because of this, the rarer a condition is in the general population, the higher the
probability that a laboratory test will yield inaccurate results.
Here's a bit of explanation to help you understand why I say that.
<Judy pulls out and dusts off her statistics hat, and promises that
this is NOT math!>
A diagnostic test investigates the statistical relationship between test
results and the presence of disease. For every diagnostic test there are two critical components
that determine its accuracy: sensitivity and specificity.
Sensitivity is the probability that a test is positive,
given that the person has the disease.
Specificity is the probability
that a test is negative, given that the person does not have the
disease.
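To make those two definitions concrete, here is a minimal sketch (my addition, not
part of the original explanation) that computes both as conditional probabilities
from the four cells of a test-versus-truth table; the function names and the counts
in the example calls are mine, chosen to match the AIDS example below:

    # Sensitivity and specificity, computed from the four cells
    # of a test-vs-truth table.
    def sensitivity(true_pos, false_neg):
        # P(test positive | person has the disease)
        return true_pos / (true_pos + false_neg)

    def specificity(true_neg, false_pos):
        # P(test negative | person does not have the disease)
        return true_neg / (true_neg + false_pos)

    print(sensitivity(true_pos=99, false_neg=1))      # 0.99
    print(specificity(true_neg=9801, false_pos=99))   # 0.99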
Using AIDS as an example: assume for
this example that both the specificity and the sensitivity of the test for
AIDS are 99%.
Given this, if you took an AIDS test
and got a positive result, the test itself may be only 50%
accurate!
How can that be? Well, it turns out
that when you work with statistics you also have to take into account
the prevalence of something - how common it is - in the population at
large. Let's assume the prevalence of AIDS in the United States is 1%,
and then test 10,000 people (since I'm assuming that 1% have AIDS, that
means that 100 people out of these 10,000 really have AIDS, and 9900
really do not). We would get the following results from our
diagnostic laboratory test (each number in the table is the number of
people who fall into that category):
                       Test Result   Test Result   Total Number
                       Positive      Negative      of People
  Really has AIDS         99 (a)          1             100
  Doesn't have AIDS       99 (c)       9801 (b)        9900
  Totals                 198           9802          10,000
(a) If you look across the
first row you will see that the sensitivity - the probability that the
test found that the patient had AIDS, given that the person really has
AIDS - is 99% (99/100 = 99%).
(b) If you look across the
second row you will see that the specificity - the probability that the
test didn't find AIDS, given that the person doesn't have AIDS - is
also 99% (9801/9900).
(c) But here's where it gets
weird - look DOWN the first column. What this says is: the probability
that you have AIDS, given that you tested positive for AIDS, is only 50%
(99/198). Looking down this column, you can see that 99 of the 198
people who tested positive for AIDS do NOT actually have AIDS.
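For anyone who wants to check the arithmetic, here is a short sketch (again my
addition, not from the original write-up) that rebuilds the table above from the
three inputs - prevalence, sensitivity, and specificity:

    # Rebuild the AIDS table: 10,000 people, 1% prevalence,
    # 99% sensitivity, 99% specificity.
    n, prevalence, sens, spec = 10_000, 0.01, 0.99, 0.99

    has_disease = n * prevalence           # 100 really have AIDS
    no_disease  = n - has_disease          # 9900 really do not

    true_pos  = sens * has_disease         # 99   (cell a)
    false_neg = has_disease - true_pos     # 1
    true_neg  = spec * no_disease          # 9801 (cell b)
    false_pos = no_disease - true_neg      # 99   (cell c)

    # Probability of disease given a positive test:
    print(true_pos / (true_pos + false_pos))   # 99 / 198 = 0.5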
Bottom line: when someone tells you
that a diagnostic test, done properly, is 99% accurate (meaning that
both the specificity and the sensitivity equal 99%), the actual 'accuracy'
of the test really depends on how common the disease you
are testing for is in the population you are testing.
This example clearly points out why
diagnostic tests are designed to confirm a diagnosis of rare
conditions. They should not be used to go 'fishing' for a diagnosis
(which, unfortunately, is an all-too-common misuse of many diagnostic
tests).
In real life, we don't nab 10,000
people at random and run them through AIDS tests. The people who tend to get
AIDS tests are those at high risk for a variety of reasons
(lifestyle, occupation, a medical condition such as hemophilia,
having received a transfusion, and the like), so in a statistical sense
they are a different 'population' than the general population.
Even so, keep in mind that the test is
most certainly NOT 99% 'accurate' in the way all of us think about
accuracy, even though both its specificity and sensitivity are 99%. You
must always take into account how common the disease you are
testing for is in the population you are testing. Many, MANY tests have
much lower specificity and sensitivity than the 99% I've used in this
example. However, for a given level of specificity and sensitivity, what
we think of as 'accuracy' improves as the disease becomes more prevalent
in the population being tested.
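To see how strong that prevalence effect is, here is another small sketch (mine,
for illustration) that holds sensitivity and specificity at 99% and varies only
the prevalence; the prevalence values are arbitrary examples:

    # How the chance that a positive result is a true positive
    # changes with prevalence, at fixed 99% sensitivity/specificity.
    sens = spec = 0.99
    for prevalence in (0.001, 0.01, 1/9, 0.5):
        true_pos  = sens * prevalence
        false_pos = (1 - spec) * (1 - prevalence)
        ppv = true_pos / (true_pos + false_pos)
        print(f"prevalence {prevalence:6.1%}: P(disease | positive) = {ppv:.1%}")

    # prevalence   0.1%: P(disease | positive) = 9.0%
    # prevalence   1.0%: P(disease | positive) = 50.0%
    # prevalence  11.1%: P(disease | positive) = 92.5%
    # prevalence  50.0%: P(disease | positive) = 99.0%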
However, no professional should be
complacent about the 'high level of accuracy' they get from laboratory
diagnostic tests, even for 'common' conditions. Let's quickly run
through an example of a 'best case' scenario, using the assumption that
we wish to screen for a new fictional bacterium (I'll call it "Pyro")
which affects one in nine people in the population at any given time.
If it affects 1 in 9 people, the
prevalence of Pyro in the United States is about 11%. Suppose we test 10,000
people (if roughly 11% have Pyro, that means that 1111 people out of these
10,000 really have Pyro, and 8889 really do not). We would
get the following results from our diagnostic laboratory test (each
number in the table is the number of people who fall into that category):
                       Test Result   Test Result   Total Number
                       Positive      Negative      of People
  Really has Pyro       1100 (a)        11            1111
  Doesn't have Pyro       89 (c)      8800 (b)        8889
  Totals                1189          8811          10,000
(a) If you look across the
first row you will see that the sensitivity - the probability that the
test found that the patient had Pyro, given that the person really has
Pyro - is 99% (1100/1111 = 99%).
(b) If you look across the
second row you will see that the specificity - the probability that the
test didn't find Pyro, given that the person doesn't have Pyro - is
also 99% (8800/8889).
(c) Now look DOWN the first
column. What this says is: the probability that your test incorrectly
says someone has Pyro, when in fact they do NOT have Pyro, is about 7.5%
(89/1189).
So, in this case - where we have a
condition that is widespread (1 out of 9 is pretty widespread!) - your best
diagnostic test, one which has 99% sensitivity and 99% specificity,
is still going to generate at best about 92.5% 'accuracy' (1100/1189, in
what we usually think of as accuracy).
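The same one-line Bayes calculation from the earlier sketch confirms the Pyro
numbers (once more, this is my illustration, not part of the original):

    # Pyro check: prevalence 1 in 9, 99% sensitivity and specificity.
    prevalence, sens, spec = 1/9, 0.99, 0.99
    ppv = (sens * prevalence) / (sens * prevalence + (1 - spec) * (1 - prevalence))
    print(f"{ppv:.1%}")   # 92.5% -- matches 1100/1189 from the table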
Ok, I think I'll put that statistics
hat away for the night...
I'd like to take credit, btw, for
laying out all the above, but it was a joint effort between my husband
and me (yes, we are both boring statistical types!
<g>). It was fueled by an article we saw many years ago by
Marilyn vos Savant, where she first pointed this out and knocked our
socks off! :-)