Category: Sample size justification. These pages provide formulas and advice for justifying the sample size in a research study. Some of these pages describe the pragmatic and ethical concerns about sample size. Articles are arranged by date with the most recent entries at the top. You can find the theme and closely related categories and other resources at the bottom of this page.

Stats: Too much power and precision? (January 9, 2008). There was a discussion on EDSTAT-L about studies with too much power and precision. You can indeed have too much power/precision, and here is a pragmatic example.

Stats: Justifying the sample size for a microarray study (August 9, 2007). I'm helping out with a grant proposal that is using microarrays for part of the analysis. A microarray is a system for quantitative measurement of circulating mRNA in human, animal, or plant tissue. A microarray will typically measure thousands or tens of thousands of different mRNA sequences. An important issue for this particular grant (and many grants involving microarray data) is how to justify the sample size. Here are a few references that I will use to develop such a justification.

Stats: What is an adequate sample size for establishing validity and reliability? (April 9, 2007). Someone from Mumbai, India wrote in asking whether a sample of 163 was sufficiently large for a study of reliability and validity. This was for a project that was already done, and this person was worried that someone would complain that 163 is too small.

Stats: IRB review of a pilot study (March 26, 2007). Dear Professor Mean: I am the new chair of the IRB at a county hospital. Many of the studies we review are pilot studies with small samples. I have been trying to locate criteria for the scientific review of pilot studies, but have not found a consensus in the literature that I have seen. Is a pilot study merely a "dry run" of the procedures that will be used in a later, larger-scale study? Or, is it reasonable for the IRB to demand that the investigator provide specific criteria for determining whether the pilot has been a success? And, should the IRB furthermore demand that specific hypotheses be formulated? My impression is that many investigators declare their studies to be pilots in order to avoid more rigorous scrutiny of their proposals.

Stats: Do your own power and sample size calculations (January 30, 2007). Someone asked me for some power calculations and the problem was stated very tersely and completely: "Alpha .05, Power 0.8. What is sample size to detect an outcome difference of 20% vs 30% for an adverse event. Thank you." Usually people have difficulty in elaborating the conditions of the power or sample size calculation, and I am always glad to help with that process. But if you already know the conditions, you can find very nice web sites that will do power calculations for you.
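
The question quoted above is complete enough to work by hand. Here is a minimal sketch in Python, assuming a two-sided test of two independent proportions with equal group sizes and the usual normal approximation without a continuity correction. The function name is my own, and other programs may use a slightly different formula and give a slightly different answer.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Per-group sample size for comparing two independent proportions.

    Assumes a two-sided test, equal group sizes, and the normal
    approximation without a continuity correction.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    p_bar = (p1 + p2) / 2                          # average rate under the null
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p1 - p2) ** 2)


# Adverse event rates of 20% versus 30%, alpha .05, power 0.8
print(n_per_group_two_proportions(0.20, 0.30))  # 294 per group
```

Adding a continuity correction, or using an exact method, would push the number up somewhat.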

Stats: Variable cluster sizes and their impact on sample size calculations (January 3, 2007). A recently published article in the International Journal of Epidemiology discusses sample size requirements for cluster randomized trials when the size of the cluster itself varies. The authors develop an approximation that uses the coefficient of variation (CV) of the distribution of cluster sizes.
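
As a rough illustration of how a CV-based approximation can work, here is a sketch of a design effect that grows with the variability in cluster size. The formula below, 1 + ((cv^2 + 1) * m - 1) * icc, is the form I recall being associated with this approach; treat it as an assumption and check it against the article before relying on it. The function name and the example numbers are hypothetical.

```python
def design_effect(mean_cluster_size, icc, cv=0.0):
    """Approximate design effect for a cluster randomized trial.

    mean_cluster_size: average number of patients per cluster
    icc: intraclass correlation coefficient
    cv: coefficient of variation of the cluster sizes (0 means equal sizes)

    With cv = 0 this reduces to the familiar 1 + (m - 1) * icc.
    """
    return 1 + ((cv ** 2 + 1) * mean_cluster_size - 1) * icc


# Hypothetical inputs: 20 patients per cluster on average, ICC of 0.05
for cv in (0.0, 0.5, 1.0):
    print(cv, round(design_effect(20, 0.05, cv), 2))
```

You would multiply the sample size from an individually randomized design by this factor, so more variable cluster sizes mean a larger trial.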

Stats: Be sure to account for dropouts in your sample size calculation (December 29, 2006). I helped out a colleague with an NIH grant, and when the critique came back, it mentioned two issues that I should have been aware of. First, they pointed out the need for an intention-to-treat analysis strategy. Second, they noted the long duration of the study, with a full year of evaluations on any particular patient, and seemed bothered that we presumed that 100% of the patients would complete the full study.

Stats: Is a 10% shortfall in sample size critical? (October 23, 2006). Dear Professor Mean, I'm reviewing a paper where they did a power calculation based on 60 patients per group, but in the research study, they ended up only getting 55/58 per group. Since their sample size was much less than what they originally planned for, does this mean that the study had inadequate power?
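
One way to put a number on a shortfall like this is to ask how much power is left at the smaller sample size for the same effect. The sketch below assumes a two-sided, two-sample comparison at alpha = 0.05 that was planned for 80% power, uses a normal approximation, and treats both groups as having the same size. Those planning values are assumptions for illustration, not details taken from the paper under review.

```python
from math import sqrt
from statistics import NormalDist

ND = NormalDist()


def detectable_effect(n_per_group, alpha=0.05, power=0.80):
    """Standardized difference the planned sample size was powered to detect."""
    return (ND.inv_cdf(1 - alpha / 2) + ND.inv_cdf(power)) * sqrt(2 / n_per_group)


def achieved_power(n_per_group, effect, alpha=0.05):
    """Approximate power at a given per-group n (ignores the far tail)."""
    return ND.cdf(effect * sqrt(n_per_group / 2) - ND.inv_cdf(1 - alpha / 2))


effect = detectable_effect(60)  # what 60 per group was planned to detect
for n in (60, 58, 55):
    print(n, round(achieved_power(n, effect), 3))
```

Under these assumptions the power falls by only a few percentage points when the sample drops from 60 to 55 per group.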

Stats: R libraries for sample size justification (July 28, 2006). There are a lot of good commercial and free sources for sample size justification. Note that most people use the term power calculation, but there is more than one way to justify a sample size, so I try to avoid the term "power calculation" as being too restrictive. Anyway, I just noted an email on the MedStats list that suggests two R libraries.

Stats: How many charts should I pull? (March 30, 2006). I got a question from someone doing a quality review. She needs to pull a certain number of medical records out of 892 and see whether the doctors followed the clinical guidelines properly. The question is how to determine the proper number of charts to pull.
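
A common starting point for a question like this is the sample size for estimating a proportion, adjusted for the fact that the population is finite. The sketch below assumes a 95% confidence interval, a margin of error of plus or minus 5 percentage points, and a worst-case compliance rate of 50%. Those three values are my assumptions for illustration; only the 892 comes from the question.

```python
from math import ceil
from statistics import NormalDist


def charts_to_pull(population, margin=0.05, p=0.5, conf=0.95):
    """Sample size for a proportion with a finite population correction."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    n_infinite = z ** 2 * p * (1 - p) / margin ** 2         # ignoring population size
    n_finite = n_infinite / (1 + (n_infinite - 1) / population)
    return ceil(n_finite)


print(charts_to_pull(892))  # 269
```

A wider margin of error, or a compliance rate known to be far from 50%, would bring the number down.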

Stats: Sample size for a binomial confidence interval (October 3, 2005). Someone asked me for some help with a homework question. I hesitate to offer too much advice in these situations because I don't want to disrupt the teacher's efforts to get the students to think on their own.

Stats: Sample size for a binary endpoint (August 12, 2005). Someone sent me an email asking for the sample size needed to detect a 10% shift in the probability of recurrence of an event after one of two different surgical procedures is done.

Stats: Confidence interval for a correlation coefficient (July 11, 2005). In many exploratory research studies, the goal is to examine associations among multiple demographic variables and some outcome variables. How can you justify the sample size for such an exploratory study? There are several approaches, but one simple way that I often use is to show that any correlation coefficients estimated by this research study will have reasonable precision. It may not be the most rigorous way to select a sample size, but it is convenient and easy to understand.
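
To see what "reasonable precision" might look like, here is a small sketch that computes a confidence interval for a correlation using the Fisher z transformation, which is one standard way to do it. The anticipated correlation of 0.3 and the candidate sample sizes are hypothetical.

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist


def correlation_ci(r, n, conf=0.95):
    """Confidence interval for a correlation via the Fisher z transformation."""
    z_crit = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    se = 1 / sqrt(n - 3)            # standard error on the transformed scale
    center = atanh(r)               # Fisher z transform of r
    return tanh(center - z_crit * se), tanh(center + z_crit * se)


# How wide is the interval around r = 0.3 at several candidate sample sizes?
for n in (30, 50, 100, 200):
    lower, upper = correlation_ci(0.3, n)
    print(n, round(lower, 2), round(upper, 2))
```

You can then pick the smallest sample size whose interval is narrow enough for your purposes.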

Stats: Sample size calculation for a nonparametric test (March 8, 2005). I got an email inquiry about how to calculate power for a Wilcoxon signed ranks test. I don't have a perfect reference for this, but I do have a brief discussion on sample size calculations for the Mann Whitney U test as part of my pages on selecting an appropriate sample size. The same considerations would apply for the Wilcoxon test.

Stats: Unequal sample sizes (November 24, 2004). For some reason, it seems to unnerve people when the sample sizes in the treatment and control groups are not the same. They worry about whether the tests that they would run on the data would be valid or not. As a general rule, there is no reason that you cannot analyze data with unequal sample sizes.

Stats: Ratio of observations to independent variables (November 17, 2004). A widely quoted rule is that you need 10 or 15 observations per independent variable in a regression model. The original source of this rule of thumb is difficult to find. I briefly commented on this in an earlier weblog entry, but here is a more complete elaboration.

Stats: Sample size for an ordinal outcome (September 22, 2004). Someone asked me for some help with calculating an appropriate sample size for a study comparing two treatments, where the outcome measure is ordinal (degree of skin toxicity: none, slight, moderate, severe). It turns out that an excellent discussion of the various approaches appears in a recent journal article with full free text on the web.

Stats: Sample size calculations in studies with a baseline (July 23, 2004). Many research studies evaluate all patients at baseline and then randomly assign the patients to groups, conduct the interventions, and then re-evaluate them at the end of the study. The sample size calculations for this type of study are a bit tricky.

Stats: Sample size for a diagnostic test (July 5, 2004). Someone asked me how to determine the sample size for a study involving a diagnostic test. It seems like a tricky thing, because most studies of diagnostic tests don't have a formal hypothesis. What you need to do instead is to specify a particular statistic that you are interested in estimating and then select a sample size so that the confidence interval for this estimate is reasonably precise.
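
As a sketch of the precision-based approach, suppose the statistic of interest is sensitivity. The code below assumes a simple normal-approximation confidence interval and hypothetical planning values: an anticipated sensitivity of 90%, a half-width of 5 percentage points, and a disease prevalence of 25% in the patients you will recruit.

```python
from math import ceil
from statistics import NormalDist


def n_diseased_for_sensitivity(sensitivity, half_width, conf=0.95):
    """Diseased patients needed so the CI for sensitivity has the given half-width."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return ceil(z ** 2 * sensitivity * (1 - sensitivity) / half_width ** 2)


def total_patients(n_diseased, prevalence):
    """Total recruitment needed to expect n_diseased patients with the disease."""
    return ceil(n_diseased / prevalence)


n_dis = n_diseased_for_sensitivity(0.90, 0.05)
print(n_dis, total_patients(n_dis, 0.25))  # 139 diseased, 556 total
```

The same calculation applied to specificity, using the non-diseased patients, gives a second requirement, and you would plan for whichever is larger.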

Stats: Sample size for cluster randomized trials (May 27, 2004). One of my favorite people to work with, Vidya Sharma, was asking me how to compute the sample size in a cluster randomized trial. I had started to write a web page about this, but never found the time to finish it. A cluster randomized trial selects several large groups of patients and then randomly assigns a treatment to all of the patients within a group. A cluster randomized trial requires a larger sample size than a simple randomized trial.

Stats: Sample size calculation example (May 20, 2004). I received a question from Hong Kong about how to double check a power calculation in a paper by Tugwell et al. in the 1995 NEJM. In the paper, they state that "With the tender-joint count used as the primary outcome, a sample of 75 patients per group was needed in order to have a 5 percent probability of a Type I error and a power of 80 percent to detect a difference of 5 tender joints between groups, with a standard deviation of 9.5, and to allow for a 25 percent dropout rate."
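
Here is one way to double check a statement like this, as a rough sketch rather than a reconstruction of what the authors actually did. It assumes a two-sided comparison of two means with equal group sizes and a normal approximation, and it inflates for dropout by dividing by the expected completion rate. Those choices are my assumptions.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_two_means(delta, sd, alpha=0.05, power=0.80):
    """Per-group sample size for comparing two means (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)


n_complete = n_per_group_two_means(delta=5, sd=9.5)  # patients who finish the study
n_enrolled = ceil(n_complete / (1 - 0.25))           # allow for 25% dropout
print(n_complete, n_enrolled)  # 57 completers, 76 enrolled per group
```

This lands within a patient or so of the 75 per group quoted in the paper. Differences in rounding, or in how the dropout allowance is applied, can easily account for a gap that small.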

Stats: Sample size for a survival data model (May 13, 2004). I got an email from Japan recently with an interesting question. The question was about an analysis of mortality for children under 5 years of age. There were 1992 patients and 272 of them died. I was asked if this was an adequate sample size and whether there was a problem because the median survival time was unavailable for some of the subgroups.

Stats: Cluster randomization (May 9, 2003). This appears to be a duplicate of the May 27, 2004 weblog entry.

Stats: Three things you need for a power calculation (November 8, 2001). Dear Professor Mean, I want to do research. Is forty subjects enough, or do I need more? Didn't I hear you mention something about three things you need for a power calculation? -- Eager Edward

Stats: Documenting negative results in a research paper (October 11, 2001). Dear Professor Mean, I have just finished a well-designed research study and my results are negative. I'm worried about publication bias; most journals will only accept papers that show positive results. How do I document the negative findings in a research paper in a way that will convince a journal to accept my paper? -- Apprehensive Arturo

Stats: Quick sample size calculations (October 11, 2001). Dear Professor Mean, I'm reading a research paper. I suspect that the sample size is way too small. I don't like the findings of the study anyway, so I'm hoping that you will help me discredit this study. Is there a quick sample size calculation that I can use? -- Cynical Chris

Stats: Confidence interval with zero events (January 19, 2001). Dear Professor Mean, I was working with a colleague on some confidence intervals for the probability of an adverse event during several different types of operations. One of the proportions was zero, since the event never occurred. My friend computed a confidence interval and it went from zero to zero. I told him that this couldn't be right and computing a confidence interval with zero events is impossible. Isn't that right? -- Killjoy Karlina
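
An interval with zero events is not impossible. Two well-known options are the exact one-sided upper bound, which solves (1 - p)^n = 0.05 for p, and the "rule of three" shortcut, 3/n. Here is a minimal sketch of both; the sample size of 100 operations is hypothetical.

```python
def exact_upper_bound(n, conf=0.95):
    """Exact one-sided upper confidence bound for p when 0 of n had the event."""
    return 1 - (1 - conf) ** (1 / n)


def rule_of_three(n):
    """Quick approximation to the 95% upper bound with zero events."""
    return 3 / n


n = 100  # hypothetical number of operations with no adverse events
print(round(exact_upper_bound(n), 4), rule_of_three(n))
```

The lower limit is zero, but the upper limit is not, and it shrinks as the number of operations grows.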

Stats: The minimal impact of population size on power and precision (January 19, 2001). Dear Professor Mean, Can you explain why it is okay to have similar sample sizes for populations of very different sizes? For example, why is it that a sample size for a population of 10 million doesn't have to be much larger than a sample size for a population of 10 thousand? -- Skeptical Sam
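
A quick numerical check can make this less mysterious. The sketch below computes the margin of error for a proportion using the finite population correction, assuming a simple random sample of 1,000, a 95% confidence level, and a true proportion of 50%. The sample size and proportion are hypothetical; the two population sizes come from the question.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(n, population, p=0.5, conf=0.95):
    """Margin of error for a proportion with the finite population correction."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    fpc = sqrt((population - n) / (population - 1))  # close to 1 unless n is a large share
    return z * sqrt(p * (1 - p) / n) * fpc


for population in (10_000, 10_000_000):
    print(population, round(margin_of_error(1000, population), 4))
```

The two margins differ by only a fraction of a percentage point, because precision is driven mostly by the size of the sample and hardly at all by the size of the population it came from.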

Stats: Sample size for Mann-Whitney U (September 28, 2000). Dear Professor Mean, I need to calculate the sample size for the Mann-Whitney U test. How do I do this? -- Bewildered Bob

Stats: Binary outcome sample size calculations (August 23, 2000). Dear Professor Mean, I have to calculate a sample size for a binary outcome variable. The research study is on breast feeding failures within 7 to 10 days of birth for mothers who intended to breast feed. The rate of failure overall is expected to be about 12%. What sample size do I need? -- Baffled Bob

Stats: Sample size for a confidence interval (January 26, 2000). Dear Professor Mean, We have a large dataset with about 400 million records. We need to randomly select a subsample from it. However we need help in determining the sample size. What sample size do we need for the confidence interval calculations? -- Frantic Frank

Stats: Sample size for a diagnostic study (September 3, 1999). Dear Professor Mean, How big should a study of a diagnostic test be? I want to estimate a sample size for the sensitivity and specificity of a test. I guess confidence intervals would address this, but is there a calculation analogous to a power analysis that would apply to figure out the size of the groups beforehand? -- Jovial John

Theme and closely related categories:

Other resources:

Creative Commons License This work is licensed under a Creative Commons Attribution 3.0 United States License. It was written by Steve Simon on 2007-08-09, edited by Steve Simon, and was last modified on 2008-04-03. Send feedback to ssimon at cmh dot edu or click on the email link at the top of the page.