Meaningful Multicollinearity Measures
KEY WORDS
Regression
Effective sample size
Extrapolation
Interpolation
Predictability
1. INTRODUCTION
In a recent paper, Hocking [8] discussed methods
for variable selection in linear regression models. His
approach, along with others such as Kendall [11],
Gorman and Toman [5], Hoerl and Kennard [9, 10],
Massey [17], Lott [13], Webster, Gunst and Mason
[19], and Hawkins [7], may be characterized as trying
to produce a model which is robust against changes
in the interdependency relationship amongst the predictor
variables, that is, a model which is robust
against (multi)collinearity effects.
One of the recommendations made by Hocking
was that statisticians should try to provide adequate
guidelines on "what constitutes a serious multicollinearity
problem". This problem has in fact been
discussed thoroughly in the excellent paper by Farrar
and Glauber [4], who gave precise statistical procedures
for detecting and localizing sources of collinearity
within a given data set. However, while these
are useful results, we feel that in many cases they are
too stringent since they are tests based on the definition
of multicollinearity as a departure from orthogonality.
The test statistics used by Farrar and Glauber
are designed to determine the likelihood of
orthogonality in the population based on a random
sample and do not address directly enough the problem
of how to cope with collinearity in the data set at
hand. One of the results of this is that the Farrar and
Glauber approach often indicates rejection of the null
hypothesis of orthogonality in cases where, from a
practical point of view, there is not a severe collinearity
problem.
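The over-sensitivity described above can be illustrated with a minimal sketch. It assumes the overall test in question is the Bartlett-type chi-square statistic on the determinant of the predictor correlation matrix, which is the form usually attributed to Farrar and Glauber; the function name, the simulated data, and the choice of a 5% level are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def farrar_glauber_chi2(X):
    """Chi-square statistic for H0: the predictors are orthogonal.

    Assumed form (Bartlett-type): -(n - 1 - (2p + 5)/6) * ln|R|,
    with p(p - 1)/2 degrees of freedom, where R is the sample
    correlation matrix of the p predictors.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    R = np.corrcoef(X, rowvar=False)
    _, logdet = np.linalg.slogdet(R)  # log-determinant, numerically stable
    stat = -(n - 1 - (2 * p + 5) / 6.0) * logdet
    df = p * (p - 1) // 2
    return stat, df

# Hypothetical data: three predictors with only weak interdependence,
# but a large sample.
rng = np.random.default_rng(0)
n = 2000
X = rng.standard_normal((n, 3))
X[:, 1] += 0.15 * X[:, 0]
X[:, 2] += 0.15 * X[:, 0]

stat, df = farrar_glauber_chi2(X)
R = np.corrcoef(X, rowvar=False)
max_abs_r = np.max(np.abs(R - np.eye(3)))  # largest off-diagonal correlation

CRIT_5PCT_DF3 = 7.815  # upper 5% point of chi-square with 3 d.f.
print(f"chi2 = {stat:.1f} on {df} d.f.; largest |r| = {max_abs_r:.2f}")
print("reject orthogonality:", stat > CRIT_5PCT_DF3)
```

Under these assumptions the hypothesis of orthogonality is rejected even though no pairwise correlation is large, because the statistic grows roughly in proportion to the sample size for any fixed departure from orthogonality.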