The keynote address at the 18th Annual Applied Statistics in Agriculture
Conference, sponsored by Kansas State University, was "Random Observations
with Mixed Feelings", given by Oliver Schabenberger, SAS Institute Inc. The
original title was "Estimating Gene Expression Profiles Using All Available
Information." Here are my notes from that seminar.
Dr. Schabenberger started with a historical overview. Approximately 25
years ago, the general linear mixed model was an obscure, complicated
mathematical construction that required specialized software and was
used by a limited number of statisticians. The times have changed, but
there is still more work that needs to be done.
He noted that the typical users have unreasonable expectations. They expect
that all mixed models make sense, converge, and behave as expected (that is,
behave well). "The pace of mixed model applications is outrunning knowledge
acquisition."
The goal of his talk was to take another look at the mixed model through
the Mixed Model Equations, dispel some myths about mixed models, and discuss
where we are going.
The mixed model is Y = Xb + Zg + e. The original solution by Henderson was
to minimize a quadratic form, which leads to the mixed model equations. The
secret is to understand the lower right corner of these equations.
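For reference, a textbook statement of Henderson's criterion and the resulting equations is given here. The symbol R (the covariance matrix of e) and the hats on the solutions are notation assumed for this sketch, not taken from the talk; G is the covariance matrix of the random effects g.

```latex
% Penalized quadratic form minimized over b and g
Q(b, g) = (Y - Xb - Zg)' R^{-1} (Y - Xb - Zg) + g' G^{-1} g

% Setting the derivatives to zero gives the mixed model equations
\begin{bmatrix}
  X'R^{-1}X & X'R^{-1}Z \\
  Z'R^{-1}X & Z'R^{-1}Z + G^{-1}
\end{bmatrix}
\begin{bmatrix} \hat{b} \\ \hat{g} \end{bmatrix}
=
\begin{bmatrix} X'R^{-1}Y \\ Z'R^{-1}Y \end{bmatrix}
```

The lower right block, Z'R^{-1}Z + G^{-1}, is the only place where these equations differ from those of a model that treats both b and g as fixed.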
He drew an analogy to a partitioned model with all fixed effects. There is
also an analogy with ridge regression which shrinks the estimates to control
for multicollinearity. The mixed model uses a similar approach, except that
the G matrix is intended not to control for multicollinearity, but to
penalize the fit. The random effects cannot be allowed to vary at will but
they must vary according to the distribution of the random effects.
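The ridge analogy can be checked numerically. The following is a minimal sketch, assuming the simplest case R = sigma2_e * I and G = sigma2_g * I with variance components treated as known; the simulated data, the function names, and the variance values are all hypothetical choices for illustration. It solves the mixed model equations as a partially penalized least squares problem and compares the result with the generalized least squares estimate and the BLUP computed from the marginal covariance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data (all values here are illustrative assumptions).
n, q = 30, 5
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # fixed-effects design
Z = np.eye(q)[np.arange(n) % q]                         # random-intercept design
y = (X @ np.array([1.0, 0.5])
     + Z @ rng.normal(scale=1.5, size=q)
     + rng.normal(size=n))


def solve_mme(X, Z, y, sigma2_g, sigma2_e):
    """Solve Henderson's equations for R = sigma2_e*I and G = sigma2_g*I.

    After multiplying through by sigma2_e, the system is ordinary least
    squares on [X Z] plus a ridge penalty on the Z block only.
    """
    p, q = X.shape[1], Z.shape[1]
    W = np.hstack([X, Z])
    penalty = np.zeros((p + q, p + q))
    penalty[p:, p:] = (sigma2_e / sigma2_g) * np.eye(q)  # lower right corner
    coef = np.linalg.solve(W.T @ W + penalty, W.T @ y)
    return coef[:p], coef[p:]


def gls_and_blup(X, Z, y, sigma2_g, sigma2_e):
    """Same quantities via the marginal covariance V = Z G Z' + R."""
    V = sigma2_g * (Z @ Z.T) + sigma2_e * np.eye(len(y))
    Vinv_X = np.linalg.solve(V, X)
    b = np.linalg.solve(X.T @ Vinv_X, Vinv_X.T @ y)
    g = sigma2_g * Z.T @ np.linalg.solve(V, y - X @ b)
    return b, g


b_mme, g_mme = solve_mme(X, Z, y, sigma2_g=2.0, sigma2_e=1.0)
b_gls, g_blup = gls_and_blup(X, Z, y, sigma2_g=2.0, sigma2_e=1.0)

# A small variance for g means a heavy penalty and strong shrinkage.
_, g_tight = solve_mme(X, Z, y, sigma2_g=0.01, sigma2_e=1.0)
_, g_loose = solve_mme(X, Z, y, sigma2_g=100.0, sigma2_e=1.0)

print("MME matches GLS/BLUP:",
      np.allclose(b_mme, b_gls) and np.allclose(g_mme, g_blup))
print("norm of g, tight vs loose:",
      np.linalg.norm(g_tight), np.linalg.norm(g_loose))
```

The penalty is applied only to the block for g, which is the sense in which the random effects are not allowed to vary at will while the fixed effects remain unpenalized.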
If the objective function in statistical estimation is based on a penalty
function, then you can use a mixed model formulation. A good example of this
is a smoothing spline.
Just because you can draw a link, however, between these approaches does
not mean that you should. There are computational efficiency issues, among
other things, that you need to consider.
The concept of a BLUP (Best Linear Unbiased Prediction) has an
interpretation as an Empirical Bayes estimate. Dr. Schabenberger showed a
correspondence between the traditional mixed model equations and the
equations generated by a Bayesian interpretation of the mixed model.
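One standard way to write this correspondence is sketched here. It assumes normality and uses R for the covariance matrix of e and V for the marginal covariance of Y, neither of which is notation from the talk, so it may differ from what was shown on the slides.

```latex
% Prior on the random effects and conditional model for the data
g \sim N(0, G), \qquad Y \mid g \sim N(Xb + Zg, R), \qquad V = ZGZ' + R

% Posterior mean of g, written two equivalent ways
E[g \mid Y] = G Z' V^{-1} (Y - Xb)
            = (Z'R^{-1}Z + G^{-1})^{-1} Z' R^{-1} (Y - Xb)
```

The second form is what the lower block of the mixed model equations delivers. The estimate is "empirical" in the sense that b, G, and R are replaced by estimates from the data.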
He had a warning, "variation does not imply variance," and cited a dangerous
outlook along the lines of "One of the great advantages of mixed modeling is
that you can treat effects as fixed or random depending on the kind of
analysis that you are interested in."
This troubling development leads to a "shoot from the hip" approach,
such as
- treating all nested effects as random,
- declaring effects random because you WANT to draw inferences regardless
of how the effect came about,
- equating ignorance with randomness,
- equating the ability to estimate a variance component as evidence of
that effect being random.
A dangerous approach is to fit a model in which all effects are random and
then figure out which ones "stick." Dr. Schabenberger noted that he hears
about this approach because the models that include everything will fail to
converge.
How do you justify random effects in observational studies? Is the
observational study the realization of a stochastic process? If so, it is
random. The crux is: what is the effect representative of?
Myth: residuals in mixed models behave like any other residuals. Reality:
they don't, in part because there are two different types of residuals.
Myth: Leverages are bounded by 1/n and 1. Reality: leverage in mixed models
is not well defined. You can look at the gradient of the fitted values with
respect to y, but this is not symmetric, so it cannot represent a projection
matrix. You can get negative leverage values. An interpretation could be made
for this quantity.
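The loss of symmetry is easy to see in a small example. This is a minimal sketch under strong assumptions: the covariance matrix V is treated as known (in practice the estimated covariance parameters also depend on y), only the marginal fitted values X b-hat are considered, and the AR(1)-type covariance with rho = 0.7 and the straight-line design are hypothetical choices for illustration.

```python
import numpy as np

n = 6
# Intercept plus linear trend (illustrative design).
X = np.column_stack([np.ones(n), np.arange(n, dtype=float)])

# AR(1)-type correlation matrix, assumed known for this sketch.
rho = 0.7
idx = np.arange(n)
V = rho ** np.abs(idx[:, None] - idx[None, :])
V_inv = np.linalg.inv(V)

# Gradient of the marginal fitted values X b_hat with respect to y,
# where b_hat is the generalized least squares estimate.
H = X @ np.linalg.solve(X.T @ V_inv @ X, X.T @ V_inv)

is_symmetric = np.allclose(H, H.T)
is_idempotent = np.allclose(H @ H, H)

print("symmetric: ", is_symmetric)
print("idempotent:", is_idempotent)
print("diagonal:  ", np.round(np.diag(H), 3))
print("trace:     ", round(float(np.trace(H)), 6))
```

In this sketch the matrix is still idempotent and its trace equals the number of fixed-effects columns, but it is not the symmetric projection matrix familiar from ordinary least squares, so the usual bounds on its diagonal elements no longer have to hold.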
Myth: if you fit a model with spatial correlation, the residuals are
"Purged" of spatial autocorrelation. Reality: the covariance structure of the
model infects the residuals.
Myth: Least Squares Means are population means that take into account all
other model effects, both fixed and random. Reality: the Least Squares Means
do not involve any function of the random effects. In fact, the formula for
the Least Squares Means is identical for a model with random effects and a
model without random effects. The only reason the Least Squares Means differ
is that the estimated coefficients change. If your random effects have
non-zero means, you have a very big problem.
He mentioned the use of various information criteria. He cited a famous
story about how the answer to the ultimate question of Life, the Universe, and
Everything is 42 (Douglas Adams). You need to look at more than just a single
number. You should examine the pattern of residuals. The use of an
information criterion is especially troublesome with GLMMs that rely on
linearization and pseudo-likelihood methods. Pseudo-likelihoods are not
comparable across models, even if they are not nested.
He spent a fair amount of time talking about spatial applications. Low rank
spatial smoothing can often reduce a very complex problem to something that
is manageable and that can produce an estimate in reasonable time.
He also talked about a joint publication with Gilliland in 2001 that looked
at correlated binary variables. I don't have the full details on this
reference, but the title apparently is "Limits on Pairwise Association for
Equi-Correlated Binary Variables" and it appeared in the Journal of Applied
Statistical Sciences. There are serious constraints on binary variables that
prevent many correlation values from occurring.
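The simplest case of such a constraint can be worked out directly. The sketch below computes the well-known bounds for a single pair of binary variables, which follow from the fact that the joint probability P(Y1 = 1, Y2 = 1) must lie between max(0, p1 + p2 - 1) and min(p1, p2). It is not the result of the cited paper, which concerns sets of equi-correlated variables, and the function name is a hypothetical one chosen for illustration.

```python
import math


def binary_corr_bounds(p1, p2):
    """Range of attainable correlations for two Bernoulli variables.

    p1 and p2 are the success probabilities, each strictly between 0 and 1.
    """
    q1, q2 = 1.0 - p1, 1.0 - p2
    # Feasible range for the joint probability P(Y1 = 1, Y2 = 1).
    joint_min = max(0.0, p1 + p2 - 1.0)
    joint_max = min(p1, p2)
    sd_product = math.sqrt(p1 * q1 * p2 * q2)
    lower = (joint_min - p1 * p2) / sd_product
    upper = (joint_max - p1 * p2) / sd_product
    return lower, upper


for p1, p2 in [(0.5, 0.5), (0.1, 0.5), (0.1, 0.9)]:
    lower, upper = binary_corr_bounds(p1, p2)
    print(f"p1={p1}, p2={p2}: {lower:.3f} <= corr <= {upper:.3f}")
```

Only when the two probabilities are equal can the correlation reach +1; with p1 = 0.1 and p2 = 0.5, for example, it is confined to roughly plus or minus one third.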
Dr. Schabenberger ended his talk with a summary of unfinished and unresolved
issues:
- the degrees of freedom issue,
- diagnostic tests and graphs,
- non-normal random effects,
- and mixture models.
07/08/2008.