I don't like to cite articles in the New York Times, because they are free
on the web only for a couple of weeks. But an article by Denise Grady,
"Nominal Benefits Seen in Drugs for Alzheimer's," published on April 7, is
worth mentioning. Try to review this article before it disappears on April
21.
Grady writes that drugs to treat Alzheimer's patients are expensive, and it
is unclear how much they really help:
Clearly, the drugs can alter brain chemistry, and some studies show
statistically significant improvements on tests that measure thinking and
memory. But while a few extra points on a mental exam, or other changes
obvious to a specialist, may be enough to get a drug approved by the Food
and Drug Administration, they may not be enough to help a person with
Alzheimer's dementia function in the real world. "You can name 11 fruits in
a minute instead of 10," said Dr. Thomas Finucane, a professor at Johns
Hopkins and a geriatrician. "Is that worth 120 bucks a month?"
This is a classic example of two issues: clinical importance and surrogate
end points. Patients in randomized trials are typically assessed using tests
of memory. But is the ability to name fruits really of interest to patients?
That is the surrogate end point question. And is naming 11 fruits rather than
10 worthwhile from a patient's perspective? That is the clinical importance
question.
Dr. Jason Karlawish, a geriatrician at the University of
Pennsylvania's Institute on Aging, said, "There is substantial controversy
over the claim that current F.D.A.-approved treatments improve function or
slow a patient's decline." He blamed several factors for the controversy,
including "the lack of widely understood and accepted measures to show
improvement or slowing of decline, small effects on the few measures some
experts agree are appropriate, and controversial and even outrageous
approaches to analyzing the data to make the claim the drug slows a
patient's decline." Dr. Karlawish, Dr. Finucane and other researchers said
they were particularly irked by a study published last July in The Journal
of the American Geriatrics Society, claiming that Aricept could delay a
patient's need for nursing-home care by nearly two years - something that
would clearly matter to patients and families.
The latter claim, although it has been criticized, concerns exactly the type
of outcome that patients are interested in. There is an acronym, POEM, which
stands for Patient-Oriented Evidence that Matters. Patients are interested
only in morbidity, mortality, or quality of life.
Along the same lines, I co-authored an editorial with Jay Portnoy, "Is
3-mm Less Drowsiness Important?", which appeared in the October 2003 issue
of the Annals of Allergy, Asthma and Immunology. We commented on a study that
showed a statistically significant difference in drowsiness scores.
Drowsiness was measured using a visual analog scale. This is a line, usually
10 centimeters in length. You are asked to draw a mark on the line
representing how drowsy you feel, with the left end of the line representing
no drowsiness and the right end of the line representing the most drowsiness
possible. In this study, the difference between the two drugs was 3
millimeters. Here's an image that shows the size of 3 millimeters on a 10
centimeter scale. On your screen, of course, or on your printer, these lines
may not be equal to exactly 10 centimeters, but the distances should be
proportional.
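The same proportion can also be drawn in plain text. This is a minimal Python sketch for illustration only: the function name `render_vas` and the two mark positions (50 mm and 53 mm) are made up; only the 100 mm scale length and the 3 mm gap come from the description above.

```python
def render_vas(marks_mm, length_mm=100):
    """Draw a visual analog scale as text, one character per millimeter.

    marks_mm holds hypothetical positions (in mm from the left end)
    where respondents marked the line.
    """
    cells = ["-"] * (length_mm + 1)   # positions 0..length_mm inclusive
    for mark in marks_mm:
        cells[mark] = "|"
    return "".join(cells)


# Two illustrative marks that differ by 3 mm (assumed positions).
scale = render_vas([50, 53])
print("no drowsiness " + scale + " most drowsiness")
print(f"3 mm is {3 / 100:.0%} of the full scale")
```

The two marks sit almost on top of each other, which is the point: the gap is a small fraction of the whole line.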

Is a difference of 3 mm large enough to justify a claim of "less drowsy"?
The authors of the study do not discuss this, and quite honestly, most
researchers do not either (Chan 2001; Thomson 2000). It seems that once you
compute a p-value, you stop thinking. This is a bad habit. While statistical
significance is an important finding, it is also important to discuss
practical relevance. Does an extra 3 mm make it safe to drive a car or to
operate heavy machinery?
The visual analog scale is also used for pain assessments, and here there
is some guidance for what is considered clinically important. A score of 30
mm or less is considered little or no pain (Bodian 2001), and a change of 10
mm is considered clinically relevant (Powell 2001).
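To see how a small difference can be statistically significant without being clinically important, here is a rough sketch in Python. The standard deviation of 20 mm and the sample sizes are assumptions chosen for illustration, not figures from the drowsiness study. Borrowing the 10 mm threshold from the pain literature (Powell 2001) for a drowsiness scale is also an assumption. The test is a simple two-sample z-test with a known, common standard deviation.

```python
import math


def two_sided_p(diff_mm, sd_mm, n_per_group):
    """Two-sided p-value for a two-sample z-test with equal group sizes."""
    se = sd_mm * math.sqrt(2.0 / n_per_group)   # standard error of the difference
    z = diff_mm / se
    return math.erfc(abs(z) / math.sqrt(2.0))   # equals 2 * (1 - Phi(|z|))


def clinically_important(diff_mm, threshold_mm=10.0):
    """Compare a difference with a minimum clinically important difference."""
    return abs(diff_mm) >= threshold_mm


DIFF_MM = 3.0      # difference reported in the drowsiness study
SD_MM = 20.0       # assumed standard deviation (hypothetical)

for n in (50, 1000):   # hypothetical sample sizes per group
    p = two_sided_p(DIFF_MM, SD_MM, n)
    print(f"n={n}: p={p:.4f}, significant={p < 0.05}, "
          f"clinically important={clinically_important(DIFF_MM)}")
```

With the same 3 mm difference, the p-value drops below 0.05 once the sample is large enough, while the comparison against the threshold does not change.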
In other areas, it is well known that grapefruit consumption can alter the
activity of CYP3A, an enzyme that metabolizes many drugs. Is this alteration
large enough, though, to warrant a warning? An editorial (Abernethy 1997)
discusses this and concludes that only an unusually high consumption of
grapefruit would warrant any serious concern.
A review of randomized trials of head injury (Dickinson 2000) suggests that
a 5% absolute improvement would be considered clinically relevant, though a
letter to the editor (Murray 2000) suggests that perhaps even a 10% absolute
improvement would not be clinically important. Given the severity of outcomes
in head injuries, I would tend to side with Dickinson, but it is very
difficult to detect differences this small.
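The difficulty is easy to see from a standard sample size approximation for comparing two proportions. The sketch below is illustrative only: the 50% control event rate, the two-sided alpha of 0.05, and the 80% power are assumptions made here, not values taken from Dickinson or Murray, and `n_per_group` is just a name chosen for the example.

```python
import math
from statistics import NormalDist


def n_per_group(p_control, p_treated, alpha=0.05, power=0.80):
    """Approximate sample size per group for comparing two proportions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p_control + p_treated) / 2
    null_term = z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
    alt_term = z_beta * math.sqrt(p_control * (1 - p_control)
                                  + p_treated * (1 - p_treated))
    return math.ceil(((null_term + alt_term) / (p_control - p_treated)) ** 2)


# Hypothetical control event rate of 50%, improved by 5 or 10 percentage points.
n_for_5_points = n_per_group(0.50, 0.45)
n_for_10_points = n_per_group(0.50, 0.40)
print(n_for_5_points, n_for_10_points)
```

Under these assumptions, halving the difference to be detected roughly quadruples the number of patients needed in each group.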
There are many aspects to this problem which I'll try to discuss when I
have time:
- clinical importance in negative studies
- particularizing findings to an individual patient
- measuring clinical importance using absolute measures (e.g., NNT)
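As a preview of the last item, the number needed to treat (NNT) is the reciprocal of the absolute risk reduction, conventionally rounded up to a whole patient. Here is a minimal sketch, using the 5% and 10% absolute improvements discussed above for head injury as inputs; the function name is just for illustration.

```python
import math


def number_needed_to_treat(absolute_risk_reduction):
    """NNT = 1 / ARR, rounded up to the next whole patient."""
    if absolute_risk_reduction <= 0:
        raise ValueError("absolute risk reduction must be positive")
    # Round first so floating-point noise (e.g. 20.000000000000004)
    # does not push the ceiling up by one.
    return math.ceil(round(1.0 / absolute_risk_reduction, 9))


for arr in (0.05, 0.10):
    print(f"ARR {arr:.0%}: treat {number_needed_to_treat(arr)} patients "
          "to benefit one additional patient")
```

Expressed this way, an absolute measure is often easier to weigh against cost and side effects than a p-value or a relative risk.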
Grapefruits and drugs: when is statistically significant clinically
significant? Abernethy DR. J Clin Invest 1997: 99(10); 2297-8.
The visual analog scale for pain: clinical significance in postoperative
patients. Bodian CA, Freedman G, Hossain S, Eisenkraft JB, Beilin Y.
Anesthesiology 2001: 95(6); 1356-61.
How well is the clinical importance of study results reported? An
assessment of randomized controlled trials. Chan KB, Man-Son-Hing M,
Molnar FJ, Laupacis A. CMAJ 2001: 165(9); 1197-202.
Size and quality of randomised controlled trials in head injury: review
of published studies. Dickinson K, Bunn F, Wentz R, Edwards P, Roberts I.
British Medical Journal 2000: 320; 1308-1311.
Quality of randomised controlled trials in head injury. Trials in head
injury are more complex than review suggests. Murray GD, Teasdale GM.
British Medical Journal 2000: 321(7270); 1223.
Is 3-mm Less Drowsiness Important? Portnoy JM, Simon SD. Annals of
Allergy, Asthma and Immunology 2003: 91(4); 324-325.
Determining the minimum clinically significant difference in visual
analog pain score for children. Powell CV, Kelly AM, Williams A. Ann
Emerg Med 2001: 37(1); 28-31.
Audit and feedback: effects on professional practice and health care
outcomes. Thomson OB, Oxman AD, Davis DA, Haynes RB, Freemantle N, Harvey
EL. Cochrane Database Syst Rev 2000: (2); CD000259.