Stats
Post hoc power (November 1, 2002)
Category: Ask Professor Mean
Category: Post hoc power
Dear Professor Mean, The results of my study
were negative, and the journal reviewer insists that I perform a post hoc power
calculation. How do I do this? -Jittery Jerry
Dear Jittery, Post hoc power calculations are very bad. If it's the only
way you can get the paper published, we can do this calculation, but a
confidence interval calculation is far better.
What the confidence interval tells you
Compare the width of the confidence interval to the range of clinical
indifference.
When a confidence interval is very narrow, a negative finding is
impressive. You have a large enough sample size to rule out the possibility
of any large and clinically relevant difference. This is especially true if
your confidence interval lies entirely inside a range of clinical
indifference.
A wide confidence interval, on the other hand, is an indication of an
inadequate sample size. This is especially true if your confidence interval
includes values that might be considered clinically relevant.
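Here is a minimal sketch of that comparison in Python. It assumes a comparison of two group means and a large-sample normal approximation; the function names `diff_ci` and `interpret`, the summary statistics, and the indifference range of plus or minus 2 units are all made up for illustration and are not taken from any real study.

```python
import math
from statistics import NormalDist

def diff_ci(mean1, sd1, n1, mean2, sd2, n2, level=0.95):
    """Large-sample (normal approximation) CI for mean1 - mean2."""
    se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    diff = mean1 - mean2
    return diff - z * se, diff + z * se

def interpret(ci, indifference):
    """Compare a CI (lo, hi) to a range of clinical indifference (lo, hi)."""
    ci_lo, ci_hi = ci
    ind_lo, ind_hi = indifference
    if ind_lo <= ci_lo and ci_hi <= ind_hi:
        return "entirely inside the range of clinical indifference"
    if ci_hi < ind_lo or ci_lo > ind_hi:
        return "entirely outside the range of clinical indifference"
    return "includes clinically relevant values"

# Hypothetical numbers: same means and SDs, two different sample sizes.
indifference = (-2.0, 2.0)
big_study = diff_ci(10.2, 4.0, 200, 10.0, 4.0, 200)
small_study = diff_ci(10.2, 4.0, 20, 10.0, 4.0, 20)
print(big_study, interpret(big_study, indifference))
print(small_study, interpret(small_study, indifference))
```

With these invented numbers, the study with 200 per group gives an interval that sits inside the indifference range, while the study with 20 per group gives an interval wide enough to include clinically relevant values. For small samples you would want a t-based interval rather than this normal approximation.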
Post hoc power as an update of a priori calculations
The one approach to post hoc power that is somewhat defensible is an update
of your a priori power calculation. You did do a power calculation prior to
collecting your data, didn't you?
Great! Remember that in that calculation, you used an estimate of
variability from a pilot study or from previous research. Sometimes, your
data has a lot more variability or a lot less variability than you thought it
would. Look at the variability of your data and use that rather than the a
priori estimate of variability.
Keep the estimate of the clinically relevant difference the same. This is
very important. Report both the a priori and the post hoc power calculations.
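As a rough sketch of this kind of update, the Python below computes power for a two-sided comparison of two means using a normal approximation. The function name `power_two_sample` and every number (a clinically relevant difference of 5, a planned standard deviation of 10, an observed standard deviation of 14, 64 subjects per group) are hypothetical. Note that only the standard deviation changes between the two calls.

```python
import math
from statistics import NormalDist

def power_two_sample(delta, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z test.

    delta is the clinically relevant difference, sd the common
    standard deviation, n_per_group the sample size in each group.
    """
    nd = NormalDist()
    se = sd * math.sqrt(2 / n_per_group)
    z_crit = nd.inv_cdf(1 - alpha / 2)
    shift = delta / se
    # Probability of landing in either rejection region.
    return nd.cdf(shift - z_crit) + nd.cdf(-shift - z_crit)

clinically_relevant_difference = 5.0  # fixed in advance, never updated

a_priori = power_two_sample(clinically_relevant_difference, sd=10.0, n_per_group=64)
updated = power_two_sample(clinically_relevant_difference, sd=14.0, n_per_group=64)
print(f"a priori power:  {a_priori:.2f}")
print(f"post hoc power:  {updated:.2f}")
```

In this invented example the data turned out noisier than planned, so the updated power is noticeably lower than the a priori power. A real calculation would normally use a t-based method from a statistics package; the normal approximation is used here only to keep the sketch short.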
Post hoc power using observed effects
Sometimes people will update both the estimate of variability and the
clinically relevant difference. They mistakenly call the difference actually
observed in the data set the clinically relevant difference and use that in
the power calculation.
This is a serious mistake. Clinical relevance requires clinical judgment,
and the mindless substitution of the value you observed in your study
abandons any intelligent consideration of this issue.
Unfortunately, the problem is worse than this. When you use the estimated
variability and combine it with the observed effects, you get a value which
marches in lock step with the p-value of the study. When the p-value is
small, the post hoc power using observed effects is large. When the p-value
is large, the post hoc power is small.
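You can see the lock step for yourself with a small sketch. It assumes a two-sided z test, where plugging the observed effect and observed variability into the power formula leaves a quantity that depends only on the p-value and the alpha level. The function name `observed_power` is made up for this illustration.

```python
from statistics import NormalDist

def observed_power(p_value, alpha=0.05):
    """Post hoc power from observed effects, two-sided z test.

    The observed test statistic is recovered from the p-value and then
    treated as if it were the true standardized effect.
    """
    nd = NormalDist()
    z_obs = nd.inv_cdf(1 - p_value / 2)
    z_crit = nd.inv_cdf(1 - alpha / 2)
    return nd.cdf(z_obs - z_crit) + nd.cdf(-z_obs - z_crit)

for p in (0.001, 0.01, 0.05, 0.20, 0.50, 0.90):
    print(f"p = {p:5.3f}  observed power = {observed_power(p):.2f}")
```

Notice that the function takes no data at all, just the p-value. Under this approximation, a p-value right at 0.05 always gives an observed power of about one half, and any larger p-value gives something smaller.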
Thus, the post hoc power becomes a self-fulfilling prophecy. When the
p-value is small enough to reject the null hypothesis, you congratulate
yourself on your intelligence and good planning because the post hoc power is
large. When the p-value is too large to reject the null hypothesis, you
notice a small post hoc power, and congratulate yourself on studying an area
that merits further research, if only someone would give you a big fat
research grant.
Never will a post hoc power based on observed effects tell you that a
negative finding is truly negative. So its calculation is pretty much
pointless.
Further reading
- Negative results of randomized clinical trials published in the
surgical literature: equivalency or error? J. B. Dimick, M. Diener-West,
P. A. Lipsett. Arch Surg 2001: 136(7); 796-800.
- Post hoc power analysis--another view. J. Fogel. Pharmacotherapy
2001: 21(9); 1150.
- Post hoc power analysis: an idea whose time has passed? M.
Levine, M. H. Ensom. Pharmacotherapy 2001: 21(4); 405-9.
- The use of predicted confidence intervals when planning experiments
and the misuse of power when interpreting results. Steven Goodman.
Annals of Internal Medicine 1994: 121(3); 200-206.
- Resolving discrepancies among studies: the influence of dose on
effect size. I. Hertz-Picciotto, R. R. Neutra. Epidemiology 1994: 5(2);
156-63.
- The Abuse of Power: The Pervasive Fallacy of Power Calculations for
Data Analysis. John M. Hoenig, Dennis M. Heisey. The American
Statistician 2001: 55(1); 19-24.
- The Overemphasis On Power Analysis. Thomas Knapp. Nursing
Research 1996: 45(6); 379.
- Some Practical Guidelines for Effective Sample Size Determination.
R.V. Lenth. The American Statistician 2001: 55(3); 187-193.
- Confidence limit analyses should replace power calculations in the
interpretation of epidemiologic studies. A. H. Smith, M. N. Bates.
Epidemiology 1992: 3(5); 449-52.
This page was written and was last modified on
07/14/2008.