Some Notes on Evaluating the Prediction Error for the Generalized Estimating Equations
Dario Gregori
Abstract
In spite of the frequent use of generalized estimating equations
(Liang and Zeger, 1986), in particular for modeling correlated binary
data, the literature has devoted very little attention to
topics such as model checking, outlier detection,
and the evaluation of prediction accuracy. This paper
focuses on the last of these, discussing the applicability of some
common methods to the generalized estimating equation model:
(i) apparent error, naive or adjusted according to several
criteria (Cp, AIC, BIC);
(ii) cross-validation;
(iii) bootstrap-based methods.
The main difficulty in using
cross-validation and the bootstrap arises from the need to retain
the correlation structure in the data. By sampling clusters
instead of individual observations, we retain the correlation among
observations belonging to the same cluster. An advantage of this
technique over more model-dependent techniques, such as bootstrapping
residuals, is that correlation remains a nuisance term, in line
with the spirit of the generalized estimating equations, for which
the correlation structure need not be specified precisely.
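The cluster-level resampling described above can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the toy data, and the "id" and "source_cluster" fields are assumptions introduced only for illustration. Whole clusters are drawn with replacement for the bootstrap, and whole clusters are held out for cross-validation, so observations from the same cluster are never separated.

```python
import random
from collections import defaultdict


def group_by_cluster(rows, cluster_key):
    """Map each cluster id to the list of its rows, preserving input order."""
    clusters = defaultdict(list)
    for row in rows:
        clusters[row[cluster_key]].append(row)
    return dict(clusters)


def cluster_bootstrap(rows, cluster_key, rng):
    """Draw one bootstrap sample by resampling whole clusters with replacement."""
    clusters = group_by_cluster(rows, cluster_key)
    ids = list(clusters)
    drawn = [rng.choice(ids) for _ in ids]
    sample = []
    for new_id, old_id in enumerate(drawn):
        for row in clusters[old_id]:
            copy = dict(row)
            # Relabel so a cluster drawn twice counts as two distinct clusters.
            copy[cluster_key] = new_id
            copy["source_cluster"] = old_id
            sample.append(copy)
    return sample


def cluster_folds(rows, cluster_key, k, rng):
    """Build k (train, test) splits in which every cluster stays intact."""
    ids = list(group_by_cluster(rows, cluster_key))
    rng.shuffle(ids)
    splits = []
    for i in range(k):
        held_out = set(ids[i::k])
        train = [r for r in rows if r[cluster_key] not in held_out]
        test = [r for r in rows if r[cluster_key] in held_out]
        splits.append((train, test))
    return splits


# Hypothetical toy data: 6 clusters with 3 repeated binary observations each.
rows = [{"id": c, "t": t, "y": (c + t) % 2} for c in range(6) for t in range(3)]
rng = random.Random(0)
boot = cluster_bootstrap(rows, "id", rng)
splits = cluster_folds(rows, "id", 3, rng)
```

In such a scheme the model would be refitted on each bootstrap sample or training split and its prediction error assessed on the held-out clusters; the fitting step is omitted here because it depends on the software used.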
Internal and external prediction errors are evaluated using the
proposed methods with reference to a case study of public health