Contents
Preface xvii
Acknowledgments xxiii
I INTRODUCTION AND BACKGROUND 1
1 Introduction 3
1.1 Causal Relationships and Ceteris Paribus Analysis 3
1.2 The Stochastic Setting and Asymptotic Analysis 4
1.2.1 Data Structures 4
1.2.2 Asymptotic Analysis 7
1.3 Some Examples 7
1.4 Why Not Fixed Explanatory Variables? 9
2 Conditional Expectations and Related Concepts in Econometrics 13
2.1 The Role of Conditional Expectations in Econometrics 13
2.2 Features of Conditional Expectations 14
2.2.1 Definition and Examples 14
2.2.2 Partial Effects, Elasticities, and Semielasticities 15
2.2.3 The Error Form of Models of Conditional Expectations 18
2.2.4 Some Properties of Conditional Expectations 19
2.2.5 Average Partial Effects 22
2.3 Linear Projections 24
Problems 27
Appendix 2A 29
2.A.1 Properties of Conditional Expectations 29
2.A.2 Properties of Conditional Variances 31
2.A.3 Properties of Linear Projections 32
3 Basic Asymptotic Theory 35
3.1 Convergence of Deterministic Sequences 35
3.2 Convergence in Probability and Bounded in Probability 36
3.3 Convergence in Distribution 38
3.4 Limit Theorems for Random Samples 39
3.5 Limiting Behavior of Estimators and Test Statistics 40
3.5.1 Asymptotic Properties of Estimators 40
3.5.2 Asymptotic Properties of Test Statistics 43
Problems 45
II LINEAR MODELS 47
4 The Single-Equation Linear Model and OLS Estimation 49
4.1 Overview of the Single-Equation Linear Model 49
4.2 Asymptotic Properties of OLS 51
4.2.1 Consistency 52
4.2.2 Asymptotic Inference Using OLS 54
4.2.3 Heteroskedasticity-Robust Inference 55
4.2.4 Lagrange Multiplier (Score) Tests 58
4.3 OLS Solutions to the Omitted Variables Problem 61
4.3.1 OLS Ignoring the Omitted Variables 61
4.3.2 The Proxy Variable–OLS Solution 63
4.3.3 Models with Interactions in Unobservables 67
4.4 Properties of OLS under Measurement Error 70
4.4.1 Measurement Error in the Dependent Variable 71
4.4.2 Measurement Error in an Explanatory Variable 73
Problems 76
5 Instrumental Variables Estimation of Single-Equation Linear Models 83
5.1 Instrumental Variables and Two-Stage Least Squares 83
5.1.1 Motivation for Instrumental Variables Estimation 83
5.1.2 Multiple Instruments: Two-Stage Least Squares 90
5.2 General Treatment of 2SLS 92
5.2.1 Consistency 92
5.2.2 Asymptotic Normality of 2SLS 94
5.2.3 Asymptotic Efficiency of 2SLS 96
5.2.4 Hypothesis Testing with 2SLS 97
5.2.5 Heteroskedasticity-Robust Inference for 2SLS 100
5.2.6 Potential Pitfalls with 2SLS 101
5.3 IV Solutions to the Omitted Variables and Measurement Error Problems 105
5.3.1 Leaving the Omitted Factors in the Error Term 105
5.3.2 Solutions Using Indicators of the Unobservables 105
Problems 107
6 Additional Single-Equation Topics 115
6.1 Estimation with Generated Regressors and Instruments 115
6.1.1 OLS with Generated Regressors 115
6.1.2 2SLS with Generated Instruments 116
6.1.3 Generated Instruments and Regressors 117
6.2 Some Specification Tests 118
6.2.1 Testing for Endogeneity 118
6.2.2 Testing Overidentifying Restrictions 122
6.2.3 Testing Functional Form 124
6.2.4 Testing for Heteroskedasticity 125
6.3 Single-Equation Methods under Other Sampling Schemes 128
6.3.1 Pooled Cross Sections over Time 128
6.3.2 Geographically Stratified Samples 132
6.3.3 Spatial Dependence 134
6.3.4 Cluster Samples 134
Problems 135
Appendix 6A 139
7 Estimating Systems of Equations by OLS and GLS 143
7.1 Introduction 143
7.2 Some Examples 143
7.3 System OLS Estimation of a Multivariate Linear System 147
7.3.1 Preliminaries 147
7.3.2 Asymptotic Properties of System OLS 148
7.3.3 Testing Multiple Hypotheses 153
7.4 Consistency and Asymptotic Normality of Generalized Least Squares 153
7.4.1 Consistency 153
7.4.2 Asymptotic Normality 156
7.5 Feasible GLS 157
7.5.1 Asymptotic Properties 157
7.5.2 Asymptotic Variance of FGLS under a Standard Assumption 160
7.6 Testing Using FGLS 162
7.7 Seemingly Unrelated Regressions, Revisited 163
7.7.1 Comparison between OLS and FGLS for SUR Systems 164
7.7.2 Systems with Cross Equation Restrictions 167
7.7.3 Singular Variance Matrices in SUR Systems 167
7.8 The Linear Panel Data Model, Revisited 169
7.8.1 Assumptions for Pooled OLS 170
7.8.2 Dynamic Completeness 173
7.8.3 A Note on Time Series Persistence 175
7.8.4 Robust Asymptotic Variance Matrix 175
7.8.5 Testing for Serial Correlation and Heteroskedasticity after Pooled OLS 176
7.8.6 Feasible GLS Estimation under Strict Exogeneity 178
Problems 179
8 System Estimation by Instrumental Variables 183
8.1 Introduction and Examples 183
8.2 A General Linear System of Equations 186
8.3 Generalized Method of Moments Estimation 188
8.3.1 A General Weighting Matrix 188
8.3.2 The System 2SLS Estimator 191
8.3.3 The Optimal Weighting Matrix 192
8.3.4 The Three-Stage Least Squares Estimator 194
8.3.5 Comparison between GMM 3SLS and Traditional 3SLS 196
8.4 Some Considerations When Choosing an Estimator 198
8.5 Testing Using GMM 199
8.5.1 Testing Classical Hypotheses 199
8.5.2 Testing Overidentification Restrictions 201
8.6 More Efficient Estimation and Optimal Instruments 202
Problems 205
9 Simultaneous Equations Models 209
9.1 The Scope of Simultaneous Equations Models 209
9.2 Identification in a Linear System 211
9.2.1 Exclusion Restrictions and Reduced Forms 211
9.2.2 General Linear Restrictions and Structural Equations 215
9.2.3 Unidentified, Just Identified, and Overidentified Equations 220
9.3 Estimation after Identification 221
9.3.1 The Robustness-Efficiency Trade-off 221
9.3.2 When Are 2SLS and 3SLS Equivalent? 224
9.3.3 Estimating the Reduced Form Parameters 224
9.4 Additional Topics in Linear SEMs 225
9.4.1 Using Cross Equation Restrictions to Achieve Identification 225
9.4.2 Using Covariance Restrictions to Achieve Identification 227
9.4.3 Subtleties Concerning Identification and Efficiency in Linear Systems 229
9.5 SEMs Nonlinear in Endogenous Variables 230
9.5.1 Identification 230
9.5.2 Estimation 235
9.6 Different Instruments for Different Equations 237
Problems 239
10 Basic Linear Unobserved Effects Panel Data Models 247
10.1 Motivation: The Omitted Variables Problem 247
10.2 Assumptions about the Unobserved Effects and Explanatory Variables 251
10.2.1 Random or Fixed Effects? 251
10.2.2 Strict Exogeneity Assumptions on the Explanatory Variables 252
10.2.3 Some Examples of Unobserved Effects Panel Data Models 254
10.3 Estimating Unobserved Effects Models by Pooled OLS 256
10.4 Random Effects Methods 257
10.4.1 Estimation and Inference under the Basic Random Effects Assumptions 257
10.4.2 Robust Variance Matrix Estimator 262
10.4.3 A General FGLS Analysis 263
10.4.4 Testing for the Presence of an Unobserved Effect 264
10.5 Fixed Effects Methods 265
10.5.1 Consistency of the Fixed Effects Estimator 265
10.5.2 Asymptotic Inference with Fixed Effects 269
10.5.3 The Dummy Variable Regression 272
10.5.4 Serial Correlation and the Robust Variance Matrix Estimator 274
10.5.5 Fixed Effects GLS 276
10.5.6 Using Fixed Effects Estimation for Policy Analysis 278
10.6 First Differencing Methods 279
10.6.1 Inference 279
10.6.2 Robust Variance Matrix 282
10.6.3 Testing for Serial Correlation 282
10.6.4 Policy Analysis Using First Differencing 283
10.7 Comparison of Estimators 284
10.7.1 Fixed Effects versus First Differencing 284
10.7.2 The Relationship between the Random Effects and Fixed Effects Estimators 286
10.7.3 The Hausman Test Comparing the RE and FE Estimators 288
Problems 291
11 More Topics in Linear Unobserved Effects Models 299
11.1 Unobserved Effects Models without the Strict Exogeneity Assumption 299
11.1.1 Models under Sequential Moment Restrictions 299
11.1.2 Models with Strictly and Sequentially Exogenous Explanatory Variables 305
11.1.3 Models with Contemporaneous Correlation between Some Explanatory Variables and the Idiosyncratic Error 307
11.1.4 Summary of Models without Strictly Exogenous Explanatory Variables 314
11.2 Models with Individual-Specific Slopes 315
11.2.1 A Random Trend Model 315
11.2.2 General Models with Individual-Specific Slopes 317
11.3 GMM Approaches to Linear Unobserved Effects Models 322
11.3.1 Equivalence between 3SLS and Standard Panel Data Estimators 322
11.3.2 Chamberlain’s Approach to Unobserved Effects Models 323
11.4 Hausman and Taylor-Type Models 325
11.5 Applying Panel Data Methods to Matched Pairs and Cluster Samples 328
Problems 332
III GENERAL APPROACHES TO NONLINEAR ESTIMATION 339
12 M-Estimation 341
12.1 Introduction 341
12.2 Identification, Uniform Convergence, and Consistency 345
12.3 Asymptotic Normality 349
12.4 Two-Step M-Estimators 353
12.4.1 Consistency 353
12.4.2 Asymptotic Normality 354
12.5 Estimating the Asymptotic Variance 356
12.5.1 Estimation without Nuisance Parameters 356
12.5.2 Adjustments for Two-Step Estimation 361
12.6 Hypothesis Testing 362
12.6.1 Wald Tests 362
12.6.2 Score (or Lagrange Multiplier) Tests 363
12.6.3 Tests Based on the Change in the Objective Function 369
12.6.4 Behavior of the Statistics under Alternatives 371
12.7 Optimization Methods 372
12.7.1 The Newton-Raphson Method 372
12.7.2 The Berndt, Hall, Hall, and Hausman Algorithm 374
12.7.3 The Generalized Gauss-Newton Method 375
12.7.4 Concentrating Parameters out of the Objective Function 376
12.8 Simulation and Resampling Methods 377
12.8.1 Monte Carlo Simulation 377
12.8.2 Bootstrapping 378
Problems 380
13 Maximum Likelihood Methods 385
13.1 Introduction 385
13.2 Preliminaries and Examples 386
13.3 General Framework for Conditional MLE 389
13.4 Consistency of Conditional MLE 391
13.5 Asymptotic Normality and Asymptotic Variance Estimation 392
13.5.1 Asymptotic Normality 392
13.5.2 Estimating the Asymptotic Variance 395
13.6 Hypothesis Testing 397
13.7 Specification Testing 398
13.8 Partial Likelihood Methods for Panel Data and Cluster Samples 401
13.8.1 Setup for Panel Data 401
13.8.2 Asymptotic Inference 405
13.8.3 Inference with Dynamically Complete Models 408
13.8.4 Inference under Cluster Sampling 409
13.9 Panel Data Models with Unobserved Effects 410
13.9.1 Models with Strictly Exogenous Explanatory Variables 410
13.9.2 Models with Lagged Dependent Variables 412
13.10 Two-Step MLE 413
Problems 414
Appendix 13A 418
14 Generalized Method of Moments and Minimum Distance Estimation 421
14.1 Asymptotic Properties of GMM 421
14.2 Estimation under Orthogonality Conditions 426
14.3 Systems of Nonlinear Equations 428
14.4 Panel Data Applications 434
14.5 Efficient Estimation 436
14.5.1 A General Efficiency Framework 436
14.5.2 Efficiency of MLE 438
14.5.3 Efficient Choice of Instruments under Conditional Moment Restrictions 439
14.6 Classical Minimum Distance Estimation 442
Problems 446
Appendix 14A 448
IV NONLINEAR MODELS AND RELATED TOPICS 451
15 Discrete Response Models 453
15.1 Introduction 453
15.2 The Linear Probability Model for Binary Response 454
15.3 Index Models for Binary Response: Probit and Logit 457
15.4 Maximum Likelihood Estimation of Binary Response Index Models 460
15.5 Testing in Binary Response Index Models 461
15.5.1 Testing Multiple Exclusion Restrictions 461
15.5.2 Testing Nonlinear Hypotheses about β 463
15.5.3 Tests against More General Alternatives 463
15.6 Reporting the Results for Probit and Logit 465
15.7 Specification Issues in Binary Response Models 470
15.7.1 Neglected Heterogeneity 470
15.7.2 Continuous Endogenous Explanatory Variables 472
15.7.3 A Binary Endogenous Explanatory Variable 477
15.7.4 Heteroskedasticity and Nonnormality in the Latent Variable Model 479
15.7.5 Estimation under Weaker Assumptions 480
15.8 Binary Response Models for Panel Data and Cluster Samples 482
15.8.1 Pooled Probit and Logit 482
15.8.2 Unobserved Effects Probit Models under Strict Exogeneity 483
15.8.3 Unobserved Effects Logit Models under Strict Exogeneity 490
15.8.4 Dynamic Unobserved Effects Models 493
15.8.5 Semiparametric Approaches 495
15.8.6 Cluster Samples 496
15.9 Multinomial Response Models 497
15.9.1 Multinomial Logit 497
15.9.2 Probabilistic Choice Models 500
15.10 Ordered Response Models 504
15.10.1 Ordered Logit and Ordered Probit 504
15.10.2 Applying Ordered Probit to Interval-Coded Data 508
Problems 509
16 Corner Solution Outcomes and Censored Regression Models 517
16.1 Introduction and Motivation 517
16.2 Derivations of Expected Values 521
16.3 Inconsistency of OLS 524
16.4 Estimation and Inference with Censored Tobit 525
16.5 Reporting the Results 527
16.6 Specification Issues in Tobit Models 529
16.6.1 Neglected Heterogeneity 529
16.6.2 Endogenous Explanatory Variables 530
16.6.3 Heteroskedasticity and Nonnormality in the Latent Variable Model 533
16.6.4 Estimation under Conditional Median Restrictions 535
16.7 Some Alternatives to Censored Tobit for Corner Solution Outcomes 536
16.8 Applying Censored Regression to Panel Data and Cluster Samples 538
16.8.1 Pooled Tobit 538
16.8.2 Unobserved Effects Tobit Models under Strict Exogeneity 540
16.8.3 Dynamic Unobserved Effects Tobit Models 542
Problems 544
17 Sample Selection, Attrition, and Stratified Sampling 551
17.1 Introduction 551
17.2 When Can Sample Selection Be Ignored? 552
17.2.1 Linear Models: OLS and 2SLS 552
17.2.2 Nonlinear Models 556
17.3 Selection on the Basis of the Response Variable: Truncated Regression 558
17.4 A Probit Selection Equation 560
17.4.1 Exogenous Explanatory Variables 560
17.4.2 Endogenous Explanatory Variables 567
17.4.3 Binary Response Model with Sample Selection 570
17.5 A Tobit Selection Equation 571
17.5.1 Exogenous Explanatory Variables 571
17.5.2 Endogenous Explanatory Variables 573
17.6 Estimating Structural Tobit Equations with Sample Selection 575
17.7 Sample Selection and Attrition in Linear Panel Data Models 577
17.7.1 Fixed Effects Estimation with Unbalanced Panels 578
17.7.2 Testing and Correcting for Sample Selection Bias 581
17.7.3 Attrition 585
17.8 Stratified Sampling 590
17.8.1 Standard Stratified Sampling and Variable Probability Sampling 590
17.8.2 Weighted Estimators to Account for Stratification 592
17.8.3 Stratification Based on Exogenous Variables 596
Problems 598
18 Estimating Average Treatment Effects 603
18.1 Introduction 603
18.2 A Counterfactual Setting and the Self-Selection Problem 603
18.3 Methods Assuming Ignorability of Treatment 607
18.3.1 Regression Methods 608
18.3.2 Methods Based on the Propensity Score 614
18.4 Instrumental Variables Methods 621
18.4.1 Estimating the ATE Using IV 621
18.4.2 Estimating the Local Average Treatment Effect by IV 633
18.5 Further Issues 636
18.5.1 Special Considerations for Binary and Corner Solution Responses 636
18.5.2 Panel Data 637
18.5.3 Nonbinary Treatments 638
18.5.4 Multiple Treatments 642
Problems 642
19 Count Data and Related Models 645
19.1 Why Count Data Models? 645
19.2 Poisson Regression Models with Cross Section Data 646
19.2.1 Assumptions Used for Poisson Regression 646
19.2.2 Consistency of the Poisson QMLE 648
19.2.3 Asymptotic Normality of the Poisson QMLE 649
19.2.4 Hypothesis Testing 653
19.2.5 Specification Testing 654
19.3 Other Count Data Regression Models 657
19.3.1 Negative Binomial Regression Models 657
19.3.2 Binomial Regression Models 659
19.4 Other QMLEs in the Linear Exponential Family 660
19.4.1 Exponential Regression Models 661
19.4.2 Fractional Logit Regression 661
19.5 Endogeneity and Sample Selection with an Exponential Regression Function 663
19.5.1 Endogeneity 663
19.5.2 Sample Selection 666
19.6 Panel Data Methods 668
19.6.1 Pooled QMLE 668
19.6.2 Specifying Models of Conditional Expectations with Unobserved Effects 670
19.6.3 Random Effects Methods 671
19.6.4 Fixed Effects Poisson Estimation 674
19.6.5 Relaxing the Strict Exogeneity Assumption 676
Problems 678
20 Duration Analysis 685
20.1 Introduction 685
20.2 Hazard Functions 686
20.2.1 Hazard Functions without Covariates 686
20.2.2 Hazard Functions Conditional on Time-Invariant Covariates 690
20.2.3 Hazard Functions Conditional on Time-Varying Covariates 691
20.3 Analysis of Single-Spell Data with Time-Invariant Covariates 693
20.3.1 Flow Sampling 694
20.3.2 Maximum Likelihood Estimation with Censored Flow Data 695
20.3.3 Stock Sampling 700
20.3.4 Unobserved Heterogeneity 703
20.4 Analysis of Grouped Duration Data 706
20.4.1 Time-Invariant Covariates 707
20.4.2 Time-Varying Covariates 711
20.4.3 Unobserved Heterogeneity 713
20.5 Further Issues 714
20.5.1 Cox’s Partial Likelihood Method for the Proportional Hazard Model 714
20.5.2 Multiple-Spell Data 714
20.5.3 Competing Risks Models 715
Problems 715
References 721
Index 737