Category: Measuring agreement. There are several ways to calculate the degree of agreement between two variables that purport to measure the same thing. In addition to describing these measures, this category includes discussion of how to assess reliability and validity, which is typically done by establishing a strong degree of agreement. Articles are arranged by date with the most recent entries at the top. You can find the theme and closely related categories, definitions, and other resources at the bottom of this page.
Stats: Cohen's Kappa with small cell sizes (April 26, 2007). Someone on Edstat-L wrote in asking about using Cohen's Kappa when some of the cells have a small sample size.
Stats: What is an adequate sample size for establishing validity and reliability? (April 9, 2007). Someone from Mumbai, India wrote in asking whether a sample of 163 was sufficiently large for a study of reliability and validity. This was for a project that was already done, and this person was worried that someone would complain that 163 is too small.
Stats: What is construct validity? (March 8, 2006). Someone asked me to define face validity, criterion validity, and construct validity. That's a tall order. In general, validity means that a measurement that we take represents what we think it should. This is important to establish, because many times we think we are measuring one thing, but we are measuring something else entirely. It is important to remember that validity is a journey and not a goal. You never reach a place called the land of valid measurements. Instead, you gradually strengthen the evidence for validity, but there is no threshold that you cross where you can say, "We can now conclude that the measure is valid." Similarly, there is no region we can point to where we can say with confidence "We have not yet reached the point where we can say that the measure is valid."
Stats: Very low values from Cronbach's Alpha (July 19, 2005). Someone came to me with an analysis of a scale of 16 items, where the Cronbach's alpha computed for that scale was only 0.14. The first thing you should look for in this situation is whether some of the items on the scale are reverse scaled.
Stats: Measuring agreement (April 19, 2005). Someone reviewing a paper asked me about all the "weird statistics" being used in the paper, such as the Bland-Altman plot and Deming regression.
Stats: What's a good value for Cronbach's Alpha? (September 9, 2004). Someone sent me a question a month ago that I never got around to answering. She asks, "What would you consider a Cronbach alpha of .60 to be in terms of 'label' (i.e., fair, poor, etc.)?"
Stats: Goodness of fit test (May 18, 2004). The chi-square test appears in a lot of different places. Some recent data on Astrology, published in the May newsletter of the Skeptic Society, offers an interesting opportunity to show one of these tests.
Stats: Establishing validity and reliability (November 6, 2002). Dear Professor Mean, I need to establish validity and reliability of a new measurement. How do I do this? --- Flustered Fred
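The reverse-scaling check mentioned in the July 19, 2005 entry on very low values of Cronbach's alpha can be illustrated with a small sketch. The data below are made up for illustration (six respondents, four items on a 1 to 5 scale, with the last item worded in the opposite direction), and the function names cronbach_alpha and reverse_score are hypothetical helpers, not part of any package. The formula used is the standard one: alpha = k/(k-1) * (1 - sum of item variances / variance of the total score).

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    items = list(zip(*rows))                       # one tuple of scores per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])   # variance of the summed scale
    return k / (k - 1) * (1 - item_var_sum / total_var)


def reverse_score(rows, item_index, low=1, high=5):
    """Flip one item so that high becomes low (assumes a low..high response scale)."""
    return [
        [(low + high - v) if j == item_index else v for j, v in enumerate(r)]
        for r in rows
    ]


# Made-up responses; the item at index 3 runs in the opposite direction.
raw = [
    [1, 2, 1, 5],
    [2, 2, 3, 4],
    [3, 3, 3, 3],
    [4, 3, 4, 2],
    [4, 5, 4, 2],
    [5, 5, 5, 1],
]

alpha_raw = cronbach_alpha(raw)
alpha_fixed = cronbach_alpha(reverse_score(raw, 3))

print(f"alpha before reverse scoring: {alpha_raw:.2f}")
print(f"alpha after reverse scoring:  {alpha_fixed:.2f}")
```

With these invented numbers, alpha is very low (in fact negative) until the oppositely worded item is flipped, which is the pattern to look for before concluding that a scale is unreliable.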
Theme and closely related categories:
- Stats: What is a point biserial correlation?
- Stats: What is a correlation? (Pearson correlation)
- Stats: What is a Kappa coefficient? (Cohen's Kappa)
- Stats: What are odds?
- Stats: What is an odds ratio?
- Stats: What is a phi coefficient?
- Correlation coefficients in medical research: from product moment correlation to the odds ratio
- Knee-heel length measurement in healthy preterm infants. Description: This article provides an illustrative example of how to use the coefficient of variation to measure agreement on a continuous trait among several raters.
This webpage was written by Steve Simon on 2007-06-27, edited by Steve Simon, and was last modified on 2008-07-08. Send feedback to ssimon at cmh dot edu or click on the email link at the top of the page.
