You have a questionnaire which asks several related questions on a Likert
scale (1=Strongly Disagree, 2=Disagree, etc.). You want to add these items
together and then report an average. Is this a legitimate thing to do?
It depends on who you talk to. There is no real consensus in the research
community. That means that you are free to use whatever approach you want,
but prepare yourself for the possibility that your supervisor/your
dissertation committee/the journal peer reviewer will force you to switch to
the "other" way.
Basically, when you assign numbers like 1, 2, 3, 4, and 5 to the categories
strongly disagree, disagree, neutral, agree, strongly agree, you are making
an assumption that the difference between any two successive values is
comparable. So a shift from disagree to neutral is comparable to a shift from
neutral to agree. Equivalently, you are assuming that a patient who strongly
disagrees with half of the statements and is neutral on the remaining half is
comparable to a patient who simply disagrees with all items on the scale.
A perfectly reasonable alternative is to assign the values -3, -1, 0, 1, 3
to the five categories. This assignment makes the assumption that a strong
disagreement is three times as serious as a simple disagreement.
Since there is more than one reasonable way to assign numbers to the
categories, you might wish to use an ordinal model that provides the same
answer no matter what values you decide to assign.
This is not unlike the process of assigning grades. When you calculate a
grade point average, you assign the numbers 0, 1, 2, 3, and 4 to the grades
F, D, C, B, and A. Is this a reasonable thing to do? It is if you believe
that a student with two B's is comparable to a student with an A and a C. Or
more extremely, you would believe that a student with two C's is comparable
to a student with an A and an F.
Perhaps you could assign alternate numbers: A=100, B=90, C=80, D=70, F=0.
That would penalize someone quite strongly for a single F, much more so than
the scoring system that everyone uses.
One alternative to averaging is to rank the data. With a small number of
ordinal categories, the ranks would have a lot of ties. It seems like a
reasonable approach, but it can sometimes give nonsensical results. Consider
a salary survey that asks for your yearly salary using the following
categories:
- 0 to 10 thousand dollars
- 10 to 20 thousand dollars
- 20 to 50 thousand dollars
- 50 to 100 thousand dollars
- more than 100 thousand dollars
Suppose that the number of people responding in each category is
- 49 people select 0 to 10 thousand dollars
- 21 people select 10 to 20 thousand dollars
- 9 people select 20 to 50 thousand dollars
- 3 people select 50 to 100 thousand dollars
- 3 people select more than 100 thousand dollars
Then the average ranks are 25, 60, 75, 81, and 84. This says that the
difference between 0 to 10 and 10 to 20 (45 units) is three times more
severe than the difference between 10 to 20 and 20 to 50 (15 units).
Even worse, the difference between 0 to 10 and 10 to 20 is fifteen times more
severe than the difference between 50 to 100 and more than 100.
A much better approach for this type of data is to assign the midpoint to
each interval and assign a reasonably large value (say 150 thousand or 200
thousand) to the last interval.
There isn't any real consensus, so you can probably find a justification
for just about any type of approach in the list of readings offered below. I
have no problem with averaging ordinal data, because I haven't seen that many
situations where using something more complex has resulted in a substantively
different conclusion.
Further reading
- Regression models for ordinal responses: a review of methods and
applications. Ananth CV, Kleinbaum DG. International Journal of
Epidemiology 1997: 26(6); 1323-33.
- Pearson's R and Coarsely Categorized Measures. Bollen KA, Barb K.
American Sociological Review 1981: 46; 232-39.
- Tutorial in Biostatistics: A review of tests for detecting a monotone
dose-response relationship with ordinal response data. Chuang-Stein C,
Agresti A. Statistics in Medicine 1997: 16(22); 2599-618.
-
Logistic
Regression. Garson GD, College of Humanities and Social
Sciences, North Carolina State University. Accessed on 2003-08-28.
www2.chass.ncsu.edu/garson/pa765/logistic.htm
- Alternative models for ordinal logistic regression. Greenland S.
Stat Med 1994: 13(16); 1665-77.
- Multivariable prognostic models: issues in developing models,
evaluating assumptions and adequacy, and measuring and reducing errors.
Harrell FE, Jr., Lee KL, Mark DB. Stat Med 1996: 15(4); 361-87.
- Development of a clinical prediction model for an ordinal outcome:
the World Health Organization Multicentre Study of Clinical Signs and
Etiological agents of Pneumonia, Sepsis and Meningitis in Young Infants.
WHO/ARI Young Infant Multicentre Study Group. Harrell FE, Jr., Margolis
PA, Gove S, Mason KE, Mulholland EK, Lehmann D, Muhe L, Gatchalian S,
Eichenwald HF. Statistical Medicine 1998: 17(8); 909-44.
[Medline]
- Multivariate Analysis and Ordinal Data. Henry F. American
Sociological Review 1982: 47; 299-307.
- Ordinal Measures in Multiple Indicator Models: A Simulation of
Categorization Error. Johnson D, Creech J. American Sociological Review
1983: 48; 398-407.
- Multivariate Analysis of Ordinal Variables. Kim J. American
Journal of Sociology 1975: 81; 261-98.
- Multivariate Analysis of Ordinal Variables Revisited. Kim J.
American Journal of Sociology 1978: 84; 448-56.
- The Assignment of Numbers to Rank Order Categories. Labovitz S.
American Sociological Review 1970: 35; 515-24.
- The Use of Pearson's R with Ordinal Data. O'Brien R. American
Sociological Review 1979: 44; 851-57.
- Likelihood ratios with confidence: sample size estimation for
diagnostic test studies. Simel DL, Samsa GP, Matchar DB. J Clin
Epidemiol 1991: 44(8); 763-70.
[Medline]
- Sample size and power estimation for studies with health related
quality of life outcomes: a comparison of four methods using the SF-36.
Walters SJ. Health Qual Life Outcomes 2004: 2(1); 26.
[Medline]
[Abstract] [Full text]
[PDF]
-
Data
Levels and Measurement. Garson GD, North Carolina State
University. Accessed on 2003-11-19. www2.chass.ncsu.edu/garson/pa765/datalevl.htm