Estimation of a Panel Data Sample Selection Model
We consider the problem of estimation in a panel data sample selection model, where
both the selection and the regression equation of interest contain unobservable individual-
specific effects. We propose a two-step estimation procedure, which "differences out"
both the sample selection effect and the unobservable individual effect from the equation
of interest. In the first step, the unknown coefficients of the "selection" equation are
consistently estimated. The estimates are then used to estimate the regression equation of
interest. The estimator proposed in this paper is consistent and asymptotically normal,
with a rate of convergence that can be made arbitrarily close to n^{-1/2}, depending on the
strength of certain smoothness assumptions. The finite sample properties of the estimator
are investigated in a small Monte Carlo simulation.
KEYWORDS: Sample selection, panel data, individual-specific effects.
1. INTRODUCTION
SAMPLE SELECTION IS A PROBLEM frequently encountered in applied research. It
arises as a result of either self-selection by the individuals under investigation,
or sample selection decisions made by data analysts. A classic example, studied
in the seminal work of Gronau (1974) and Heckman (1976), is female labor
supply, where hours worked are observed only for those women who decide to
participate in the labor force. Failure to account for sample selection is well
known to lead to inconsistent estimation of the behavioral parameters of
interest, as these are confounded with parameters that determine the probability
of entry into the sample. In recent years a vast amount of econometric literature
has been devoted to the problem of controlling for sample selectivity. This
research, however, has focused almost exclusively on the cross-sectional data
case. See Powell (1994) for a review of this literature and for references. In
contrast, this paper focuses on the case where the researcher has panel or
longitudinal data available. Sample selectivity is as acute a problem in panel as
in cross section data. In addition, panel data sets are commonly characterized by
nonrandomly missing observations due to sample attrition.