An acquaintance recently mentioned that "20/20 or one of those news shows just did a
piece on the use of hair testing. They sent a standard poodle's hair in for testing to 4 or 5 labs
and got different results from each. I have come to the educated conclusion that hair testing is a
great farce. But for some it might lead to being treated by just the thing that their body needs. But
don't think it would be due to the accuracy of testing, just luck!"
The following is some information to help anyone, including medical professionals,
understand why diagnostic tests vary so widely in accuracy. That accuracy
depends on a variety of factors.
Results from most diagnostic tests, not just hair testing, will differ
from laboratory to laboratory. There are several reasons for this, ranging from the
training and skill of the lab technicians, to proper test set-up protocol (for example, when
analyzing hair samples, which portion of the hair shaft the sample came from is critical), to -
as 20/20 or whoever was trying to point out - the actual ability of the test to measure what it is
supposed to measure.
We know of labs in Wisconsin that were given awards when their accuracy rate
(that is, their rate of correctly carrying out tests from start to finish) was 97%, and I would
guess that nationwide 97% is considered an 'award-winning' rate of accuracy for correctly
performing diagnostic tests.
But this whole issue raises a point that is key to understanding the
accuracy of any diagnostic test: even with the best training and perfect protocols, laboratory
tests do not yield simple yes/no answers. The accuracy of these tests is derived from statistical
distributions. Because of this, the rarer a condition is in the general population, the higher the
probability that a laboratory test will yield inaccurate results.
Here's a bit of explanation to help you understand why I say that.
<Judy pulls out and dusts off her statistics hat, and promises that
this is NOT math!>
A diagnostic test investigates the statistical relationship between test
results and the presence of disease. For every diagnostic test there are two critical components
that determine its accuracy: sensitivity and specificity.
Sensitivity is the probability that a test is positive,
given that the person has the disease.
Specificity is the probability
that a test is negative, given that the person does not have the
disease.
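To make those two definitions concrete, here is a minimal sketch (my addition, not
part of the original explanation) that computes both as conditional probabilities
from the four cells of a test-versus-truth table; the function names and the counts
in the example calls are mine, chosen to match the AIDS example below:

    # Sensitivity and specificity, computed from the four cells
    # of a test-vs-truth table.
    def sensitivity(true_pos, false_neg):
        # P(test positive | person has the disease)
        return true_pos / (true_pos + false_neg)

    def specificity(true_neg, false_pos):
        # P(test negative | person does not have the disease)
        return true_neg / (true_neg + false_pos)

    print(sensitivity(true_pos=99, false_neg=1))      # 0.99
    print(specificity(true_neg=9801, false_pos=99))   # 0.99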
Using AIDS as an example: assume for
this example that both the specificity and the sensitivity of the test for
AIDS are 99%.
Given this, if you took an AIDS test
and got a positive result, the test itself may be only 50%
accurate!
How can that be? Well, it turns out
that when you work with statistics you also have to take into account
the prevalence of something - how common it is - in the population at
large. Let's assume the prevalence of AIDS in the United States is 1%,
and then test 10,000 people (since I'm assuming that 1% have AIDS, that
means that 100 people out of these 10,000 really have AIDS, and 9900
really do not). We would get the following results from our
diagnostic laboratory test (each number in the table is the number of
people who fall into that category):
                       Test Result   Test Result   Total Number
                       Positive      Negative      of People
  Really has AIDS         99 (a)          1             100
  Doesn't have AIDS       99 (c)       9801 (b)        9900
  Totals                 198           9802          10,000
(a) If you look across the
first row you will see that the sensitivity - the probability that the
test found that the patient had AIDS, given that the person really has
AIDS - is 99% (99/100 = 99%).
(b) If you look across the
second row you will see that the specificity - the probability that the
test didn't find AIDS, given that the person doesn't have AIDS - is
also 99% (9801/9900).
(c) But here's where it gets
weird - look DOWN the first column. What this says is: the probability
that you have AIDS, given that you tested positive for AIDS, is only 50%
(99/198). Looking down this column, you can see that 99 of the 198
people who tested positive for AIDS do NOT actually have AIDS.
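For anyone who wants to check the arithmetic, here is a short sketch (again my
addition, not from the original write-up) that rebuilds the table above from the
three inputs - prevalence, sensitivity, and specificity:

    # Rebuild the AIDS table: 10,000 people, 1% prevalence,
    # 99% sensitivity, 99% specificity.
    n, prevalence, sens, spec = 10_000, 0.01, 0.99, 0.99

    has_disease = n * prevalence           # 100 really have AIDS
    no_disease  = n - has_disease          # 9900 really do not

    true_pos  = sens * has_disease         # 99   (cell a)
    false_neg = has_disease - true_pos     # 1
    true_neg  = spec * no_disease          # 9801 (cell b)
    false_pos = no_disease - true_neg      # 99   (cell c)

    # Probability of disease given a positive test:
    print(true_pos / (true_pos + false_pos))   # 99 / 198 = 0.5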
Bottom line: when someone tells you
that a diagnostic test, done properly, is 99% accurate (meaning that
both the specificity and the sensitivity equal 99%), the actual 'accuracy'
of the test really depends on how common the disease you
are testing for is in the population you are testing.
This example clearly points out why
diagnostic tests are designed to confirm a diagnosis of rare
conditions. They should not be used to go 'fishing' for a diagnosis
(which, unfortunately, is an all-too-common misuse of many diagnostic
tests).
In real life, we don't nab 10,000
people at random and run them through AIDS tests. The people who tend to get
AIDS tests are those at high risk for a variety of reasons
(lifestyle, occupation, a medical condition such as hemophilia,
having received a transfusion, and the like), so in a statistical sense
they are a different 'population' than the general population.
Even so, keep in mind that the test is
most certainly NOT 99% 'accurate' in the way all of us think about
accuracy, even though both its specificity and sensitivity are 99%. You
must always take into account how common the disease you are
testing for is in the population you are testing. Many, MANY tests have
much lower specificity and sensitivity than the 99% I've used in this
example. However, for a given level of specificity and sensitivity, what
we think of as 'accuracy' improves as the disease becomes more prevalent
in the population being tested.
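To see how strong that prevalence effect is, here is another small sketch (mine,
for illustration) that holds sensitivity and specificity at 99% and varies only
the prevalence; the prevalence values are arbitrary examples:

    # How the chance that a positive result is a true positive
    # changes with prevalence, at fixed 99% sensitivity/specificity.
    sens = spec = 0.99
    for prevalence in (0.001, 0.01, 1/9, 0.5):
        true_pos  = sens * prevalence
        false_pos = (1 - spec) * (1 - prevalence)
        ppv = true_pos / (true_pos + false_pos)
        print(f"prevalence {prevalence:6.1%}: P(disease | positive) = {ppv:.1%}")

    # prevalence   0.1%: P(disease | positive) = 9.0%
    # prevalence   1.0%: P(disease | positive) = 50.0%
    # prevalence  11.1%: P(disease | positive) = 92.5%
    # prevalence  50.0%: P(disease | positive) = 99.0%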
However, no professional should be
complacent about the 'high level of accuracy' they get from laboratory
diagnostic tests, even for 'common' conditions. Let's quickly run
through an example of a 'best case' scenario, using the assumption that
we wish to screen for a new fictional bacterium (I'll call it "Pyro")
which affects one in nine people in the population at any given time.
If it affects 1 in 9 people, the
prevalence of Pyro in the United States is about 11%. Suppose we test 10,000
people (if roughly 11% have Pyro, that means that 1111 people out of these
10,000 really have Pyro, and 8889 really do not). We would
get the following results from our diagnostic laboratory test (each
number in the table is the number of people who fall into that category):
                       Test Result   Test Result   Total Number
                       Positive      Negative      of People
  Really has Pyro       1100 (a)        11            1111
  Doesn't have Pyro       89 (c)      8800 (b)        8889
  Totals                1189          8811          10,000
(a) If you look across the
first row you will see that the sensitivity - the probability that the
test found that the patient had Pyro, given that the person really has
Pyro - is 99% (1100/1111 = 99%).
(b) If you look across the
second row you will see that the specificity - the probability that the
test didn't find Pyro, given that the person doesn't have Pyro - is
also 99% (8800/8889).
(c) Now look DOWN the first
column. What this says is: the probability that your test incorrectly
says someone has Pyro, when in fact they do NOT have Pyro, is about 7.5%
(89/1189).
So, in this case - where we have a
condition that is widespread (1 out of 9 is pretty widespread!) - your best
diagnostic test, one which has 99% sensitivity and 99% specificity,
is still going to generate at best about 92.5% 'accuracy' (1100/1189, in
what we usually think of as accuracy).
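The same one-line Bayes calculation from the earlier sketch confirms the Pyro
numbers (once more, this is my illustration, not part of the original):

    # Pyro check: prevalence 1 in 9, 99% sensitivity and specificity.
    prevalence, sens, spec = 1/9, 0.99, 0.99
    ppv = (sens * prevalence) / (sens * prevalence + (1 - spec) * (1 - prevalence))
    print(f"{ppv:.1%}")   # 92.5% -- matches 1100/1189 from the table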
Ok, I think I'll put that statistics
hat away for the night...
I'd like to take credit, btw, for
laying out all the above, but it was a joint effort between my husband
and me (yes, we are both boring statistical types!
<g>). It was fueled by an article we saw many years ago by
Marilyn vos Savant, where she first pointed this out and knocked our
socks off! :-)