Stats
Do I have enough data after 24 months of time? (April 5, 2005)
Someone asked me about a correlation coefficient that he computed on a data
set that represented 24 months of data collection. A particular correlation
of interest (a correlation between staff turnover and resident falls) was not
significantly different from zero, but this person wanted to know how much
more data to collect before safely concluding that no relation has been or
likely will be established.
First compute a confidence interval for the correlation coefficient. If
that interval is so narrow that you can rule out the possibility of a
clinically important correlation, then your sample size is large enough. How
large a correlation is clinically important? That's very hard to say. The
correlation is a unitless quantity, and usually you need some measure in
physical units (meters, kilograms, etc.) before you can talk about clinical
importance.
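As a rough illustration of the confidence interval step, here is a minimal sketch that uses Fisher's z transformation, a common large-sample approximation. The function name and the example values (r = 0.10, n = 24) are made up for illustration and are not taken from the data described above.

```python
import math
from statistics import NormalDist


def correlation_confidence_interval(r, n, confidence=0.95):
    """Approximate CI for a Pearson correlation via Fisher's z transformation."""
    if n <= 3:
        raise ValueError("Fisher's approximation needs more than 3 observations")
    z = math.atanh(r)                 # transform r to the z scale
    se = 1.0 / math.sqrt(n - 3)       # approximate standard error on the z scale
    z_crit = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    # Build the interval on the z scale, then transform back to the r scale.
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# Hypothetical values: r = 0.10 observed over n = 24 months.
low, high = correlation_confidence_interval(0.10, 24)
print(f"95% CI for r: ({low:.2f}, {high:.2f})")
```

With only 24 observations the interval extends well into both negative and positive correlations, which is why the width of the interval, rather than the p-value, is the thing to judge.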
You might want to look instead at the regression coefficient, which does
have units of measure. I assume that turnover is your independent
variable and falls is your dependent variable. Think, then, about how much of
an increase in falls per unit change in turnover is important from a clinical
perspective. If that value is (I'm just making up a number) 0.5, then your
sample size is adequate as long as the confidence interval for the slope lies
entirely between -0.5 and +0.5.
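The slope check can be sketched as follows, assuming ordinary least squares with a single predictor. The function names and the data are hypothetical, and the critical t value is passed in by the caller (2.776 is the standard two-sided 95% table value for 4 degrees of freedom, which matches the six made-up points below).

```python
import math


def slope_confidence_interval(x, y, t_crit):
    """CI for the slope of a simple linear regression of y on x.

    t_crit is the two-sided critical t value for n - 2 degrees of freedom,
    looked up by the caller from a t table.
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y must have equal length of at least 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares and the standard error of the slope.
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_slope = math.sqrt(sse / (n - 2) / sxx)
    return slope - t_crit * se_slope, slope + t_crit * se_slope


def rules_out_important_slope(interval, threshold):
    """True when the whole interval lies strictly inside (-threshold, +threshold)."""
    low, high = interval
    return -threshold < low and high < threshold


# Hypothetical data: turnover (x) and falls (y) for six periods.
turnover = [10, 12, 15, 11, 14, 13]
falls = [5, 6, 5, 7, 6, 5]
ci = slope_confidence_interval(turnover, falls, t_crit=2.776)
print(ci, rules_out_important_slope(ci, threshold=0.5))
```

The threshold of 0.5 is the same made-up number used in the text; the clinically important value has to come from someone with clinical judgment.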
Please realize that an outsider like me can't tell you what's clinically
important, because that requires clinical judgment, something I lack. A good
general overview about clinical importance is on my web pages at
Stats: Confidence intervals
If this is an ongoing project, perhaps you might also find some value in
using a control chart. A control chart allows for continuous monitoring of
important processes. Who knows, maybe something that is not apparent now will
become apparent because of some of the recent changes in health care? I have
a brief outline of control charts at
Stats: Guidelines for quality control models
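To make the control chart idea concrete, here is a minimal sketch of one common type, a c-chart for counts. It assumes that falls are recorded as a count per month and that exposure (for example, the number of residents) is roughly constant; the text above does not say which type of chart is appropriate, so treat the choice, the function names, and the numbers as illustrative assumptions.

```python
import math


def c_chart_limits(baseline_counts):
    """Return (lower limit, center line, upper limit) for a c-chart.

    Uses the usual three-sigma limits for count data:
    center +/- 3 * sqrt(center), with the lower limit floored at zero.
    """
    center = sum(baseline_counts) / len(baseline_counts)
    spread = 3.0 * math.sqrt(center)
    return max(0.0, center - spread), center, center + spread


def out_of_control_points(counts, lower, upper):
    """Indices of counts that fall outside the control limits."""
    return [i for i, c in enumerate(counts) if c < lower or c > upper]


# Hypothetical monthly fall counts: a baseline period, then new months to monitor.
baseline = [4, 6, 3, 5, 4, 7, 5, 4, 6, 5, 3, 5]
lower, center, upper = c_chart_limits(baseline)
new_months = [5, 4, 13, 6]
print(out_of_control_points(new_months, lower, upper))
```

Each new month is compared against limits set from the baseline period, so the chart supports ongoing monitoring without re-running a hypothesis test every month.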
Another issue is that it is dangerous to look at 12 months' worth of data,
then 13, then 14, etc. because you are testing a single hypothesis multiple
times. It's sort of like being dealt three poker hands and choosing
which one you like best. It would be better to select a sample size (time
interval) prior to data collection and then test only once. If you do test
multiple times, you need to adjust your alpha level. See
Stats: Interim analysis and
Stats: Early stopping in an animal study (July 1, 2004)
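The danger of repeated looks can be illustrated with a small simulation. This is a sketch under stated assumptions: the two variables are simulated as independent normal noise (so any "significant" correlation is a false positive), significance is judged with the Fisher z approximation at a two-sided 5% level, and the data are tested at every month from 12 through 24. The function names are hypothetical. The Bonferroni adjustment shown at the end is one simple, conservative way to adjust alpha, not necessarily the method described on the linked pages.

```python
import math
import random

Z_CRIT = 1.959964  # two-sided 5% critical value of the standard normal


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def is_significant(x, y):
    """Approximate test of zero correlation using Fisher's z transformation."""
    z = math.atanh(pearson_r(x, y)) * math.sqrt(len(x) - 3)
    return abs(z) > Z_CRIT


def simulate_repeated_looks(first_look=12, last_look=24, n_sims=2000, seed=1):
    """Return (rate of any significant look, rate at the final look only)."""
    rng = random.Random(seed)
    any_hits = final_hits = 0
    for _ in range(n_sims):
        # Independent noise: the true correlation is zero.
        x = [rng.gauss(0, 1) for _ in range(last_look)]
        y = [rng.gauss(0, 1) for _ in range(last_look)]
        looks = [is_significant(x[:n], y[:n])
                 for n in range(first_look, last_look + 1)]
        any_hits += any(looks)
        final_hits += looks[-1]
    return any_hits / n_sims, final_hits / n_sims


def bonferroni_alpha(alpha, n_looks):
    """Per-look alpha under a Bonferroni adjustment (conservative)."""
    return alpha / n_looks


any_rate, final_rate = simulate_repeated_looks()
print(f"false positive rate, any of 13 looks: {any_rate:.3f}")
print(f"false positive rate, single final look: {final_rate:.3f}")
print(f"Bonferroni per-look alpha for 13 looks: {bonferroni_alpha(0.05, 13):.4f}")
```

Because the final look is one of the thirteen looks, the "any look" rate can never be lower than the single-look rate, and the gap between the two shows how much the repeated testing inflates the chance of a false alarm.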
This webpage was written on 2008-xx-xx and was last modified on 2008-07-08.
Category: Clinical importance, Category: Early stopping