Stats
Finding only the important studies (January 21, 2008).
Someone wrote to the MedStats listserv asking about a process they had
chosen for selecting "important" articles in a particular research area. This
was, I presume, a qualitative summary of interesting results in a broad
medical area rather than a quantitative synthesis of all available research
addressing a specific medical treatment. The reason I suspect this is that the
person mentioned that they had used the statistical significance of the
studies as a filter and eliminated any negative studies from further
consideration.
The normal goal in any systematic overview is to find all the available
research, not just the research that meets some statistical criterion.
Selecting only positive studies in a systematic overview guarantees biased
results. In fact, the folks who do systematic overviews go to great lengths
to avoid selecting studies with a particular statistical outcome, going as
far as to blind the reviewers to the results section while they are
determining whether a study fits the entry criteria.
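To see why a significance filter biases the summary, here is a minimal simulation sketch. Everything in it is an assumption made for illustration: the true effect of 0.1 (a standardized mean difference), the 50 patients per arm, the normal approximation for each study's estimate, and the function names. It compares the average estimate across all simulated studies with the average across only those that reach two-sided significance at the 5% level.

```python
import random
import statistics

def simulate_estimates(true_effect, n_per_arm, n_studies, seed=0):
    """Simulate (estimate, standard error) pairs for identical two-arm studies.

    Assumes a standardized mean difference with unit variance in each arm,
    so the standard error is sqrt(2 / n_per_arm).
    """
    rng = random.Random(seed)
    se = (2.0 / n_per_arm) ** 0.5
    return [(rng.gauss(true_effect, se), se) for _ in range(n_studies)]

def significant_only(studies, z_crit=1.96):
    """Keep only the studies whose estimate is statistically significant."""
    return [(est, se) for est, se in studies if abs(est / se) > z_crit]

def mean_effect(studies):
    """Unweighted average of the study estimates."""
    return statistics.fmean(est for est, _ in studies)

# Hypothetical scenario: a small true effect studied many times.
studies = simulate_estimates(true_effect=0.1, n_per_arm=50, n_studies=10_000)
kept = significant_only(studies)

all_mean = mean_effect(studies)
kept_mean = mean_effect(kept)

print(f"studies kept by the filter: {len(kept)} of {len(studies)}")
print(f"mean estimate, all studies: {all_mean:.3f}")
print(f"mean estimate, significant studies only: {kept_mean:.3f}")
```

Under these assumptions, the unfiltered average sits close to the true effect, while the filtered average is pulled well above it, because a study of this size can only reach significance by overestimating the effect.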
If the goal is to identify important findings, then filtering out the
negative results is still bad, as several people pointed out. Sometimes
the negative studies are very important as well.
So how should you go about identifying important studies? There are several
definitions of important, but most of them require human judgment. One possible
quantitative definition would be that a study is important if a systematic
overview without that study reaches a different conclusion than a systematic
overview that includes that study. In other words, a study that turns the
consensus from efficacy to lack of efficacy would be important (as would be
the reverse). By this definition, a study with statistical significance could
be important (the first definitive proof of a new treatment), or it could be
trivial (yet another study supporting an already accepted new treatment). A
study without statistical significance could be important (the first real
evidence that a well accepted practice might be worthless) or it could be
trivial (yet another nail in the coffin of a treatment that most people have
already abandoned). I mention efficacy here, but you can substitute safety and
get pretty much the same results.
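The definition above can be sketched as a leave-one-out check. This is only an illustration under stated assumptions: a fixed-effect, inverse-variance pooled estimate stands in for the "systematic overview", the "conclusion" is reduced to whether the pooled 95% confidence interval excludes zero, and the four studies (names, effect estimates, standard errors) are made up. A real overview would involve far more judgment than this.

```python
import math

def pooled_estimate(studies):
    """Fixed-effect inverse-variance pooled estimate and its standard error.

    Each study is a (name, estimate, standard_error) tuple.
    """
    weights = [1.0 / se ** 2 for _, _, se in studies]
    total = sum(weights)
    est = sum(w * e for w, (_, e, _) in zip(weights, studies)) / total
    return est, math.sqrt(1.0 / total)

def conclusion(studies, z_crit=1.96):
    """Crude stand-in for the consensus: does the pooled CI exclude zero?"""
    est, se = pooled_estimate(studies)
    if est - z_crit * se > 0:
        return "efficacy"
    if est + z_crit * se < 0:
        return "harm"
    return "inconclusive"

def important_studies(studies):
    """Names of studies whose removal changes the pooled conclusion."""
    full = conclusion(studies)
    return [
        name
        for i, (name, _, _) in enumerate(studies)
        if conclusion(studies[:i] + studies[i + 1:]) != full
    ]

# Hypothetical data: (name, effect estimate, standard error).
studies = [
    ("A", 0.30, 0.20),
    ("B", 0.25, 0.25),
    ("C", 0.40, 0.10),
    ("D", 0.05, 0.30),
]

print("conclusion with all studies:", conclusion(studies))
print("studies that change the conclusion when removed:", important_studies(studies))
```

With these made-up numbers, only study C is flagged: the pooled interval excludes zero with C included and straddles zero without it. Note that the flag depends on what the other studies already show, not on the study's own p-value, which is the point of the definition.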
So by that logic, it appears that using ANY filter on statistical
significance is going to be worthless. Instead, a filtering process should
only incorporate scientific and medical considerations. Possibly you could
include sample size in your filter, since any study with ten patients total is
unlikely to shift the research consensus. You could also filter out those
studies that use surrogate outcomes. Both of these filters have their own
problems, of course, but they are more defensible than the filter of
statistical significance.
You could also use other people's judgments as a filter. If a study gets
reported in the New York Times, it is probably important. Or you could see how
often the article is blogged. Or you could only include articles that are
published in journals that have an impact factor above a certain threshold.
These approaches also lead to problems with bias, most notably an English
language bias, but you can still mount a defense of them.
I suspect, though, that there is no automated way to separate unimportant
studies from important ones, other than reading all the papers and making a
subjective judgment yourself.
This webpage was written on 2008-01-21 and was last modified on 2008-07-08.
Category: Systematic overviews