Economics 551-B: ECONOMETRIC METHODS II:
Professor John Rust
http://gemini.econ.yale.edu/jrust/econ551.html

This course applies the probabilistic and limit-theoretic tools (WLLN, SLLN, CLT, etc.) presented in Economics 551-A to conduct inference in a wide class of econometric models. The course will focus on applications of econometric methods to substantive problems, although we will discuss a number of general ``philosophical'' issues at various points in the course.

The first issue is whether one ought to use Bayesian or classical methods of inference. I will briefly cover Bayesian methods, which have been revitalized by recent developments in Monte Carlo simulation and numerical integration. Nevertheless, Bayesian methods are still computationally burdensome and heavily linked to particular parametric functional forms, limiting their applicability to semi- and nonparametric problems (discussed further below). The primary focus of this course is on classical statistical inference using large-sample asymptotics to derive approximate sampling distributions of various estimators.

The second issue is whether one ought to use parametric, semi-parametric, or non-parametric estimation methods. The issue is best framed as a trade-off between efficient estimation under strong a priori assumptions about the underlying probabilistic structure (with the consequent risk that the estimator will be inconsistent if these assumptions are violated) versus consistent estimation under weak a priori assumptions (at the cost of slower rates of convergence and/or less efficient estimation of any particular probabilistic structure). I argue that we do not face an ``all or nothing'' choice between parametric and nonparametric methods; rather, the problem is to select an appropriate method from an ``estimation possibility'' frontier depending on the strength of the prior assumptions we are willing to impose in any particular problem. A convenient way to trace out this frontier is via parametric ``flexible functional forms'' that are capable of approximating general probabilistic structures arbitrarily well as the number of parameters increases. Examples of this approach include nonlinear regression using series approximations, neural networks with variable numbers of ``hidden units'', and ``sieve'' methods where the parameter space increases at an appropriate rate with sample size, such as maximum likelihood estimation based on Hermite series expansions about a Gaussian kernel. In fact, we will show that these ``flexible'' methods are generally the only feasible way to go about non-parametric estimation, since direct estimation by optimizing an estimation criterion over an infinite-dimensional space is generally an ``ill-posed'' problem.
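To fix ideas, here is a minimal sketch (in Python, using only numpy) of the flexible functional form idea: a polynomial series regression whose number of basis terms grows with the sample size. The polynomial basis, the $n^{1/3}$ growth rate, and the function names are illustrative choices only, not prescriptions from the course.

import numpy as np

def series_regression(x, y, n_terms):
    """OLS fit of y on the first n_terms polynomial basis functions of x."""
    X = np.column_stack([x**j for j in range(n_terms)])   # 1, x, x^2, ...
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return X @ beta

rng = np.random.default_rng(0)
n = 500
x = rng.uniform(-2, 2, n)
y = np.sin(2 * x) + 0.3 * rng.standard_normal(n)           # unknown "true" regression function

# Let the dimension of the parameter space grow slowly with n (an illustrative
# rate, not a recommendation); the approximation improves as n increases.
n_terms = max(2, int(n ** (1 / 3)))
fitted = series_regression(x, y, n_terms)
print(n_terms, np.mean((fitted - np.sin(2 * x)) ** 2))      # in-sample approximation error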

Nearly all of the ``well-posed'' methods rely either on rules fixing the rate at which the dimension of the parameter space increases with sample size (or, in the case of kernels and other smoothing methods, the rate at which ``bandwidths'' and other smoothing parameters tend to zero with sample size), or on penalty functions and other data-driven procedures for determining how many parameters to include in the estimation in order to avoid ``overfitting'' the data. In some sense these procedures are semi-automated methods of ``specification searching'', a practice that has been discredited by Bayesian econometricians such as Edward Leamer. Paradoxically, it turns out that these sorts of specification-searching procedures consistently identify the true model, whereas Bayesian methods run into serious difficulties in semi- and nonparametric contexts. Specifically, if the parameter space is infinite-dimensional the prior can completely overwhelm the data, in the sense that the posterior distribution is not guaranteed to converge to a point mass at the true parameter value as the number of observations tends to infinity.
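As a hedged illustration of the penalty-function idea, the sketch below chooses the number of series terms in the previous example by minimizing a BIC-type criterion rather than fixing it in advance; BIC is used here only as one familiar example of a data-driven penalty, not as the specific rule studied in the course.

import numpy as np

def bic_select(x, y, max_terms=15):
    """Return the number of polynomial terms minimizing a BIC-type criterion."""
    n = len(y)
    best_k, best_bic = 1, np.inf
    for k in range(1, max_terms + 1):
        X = np.column_stack([x**j for j in range(k)])
        resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
        bic = n * np.log(np.mean(resid**2)) + k * np.log(n)  # fit term plus complexity penalty
        if bic < best_bic:
            best_k, best_bic = k, bic
    return best_k

rng = np.random.default_rng(1)
x = rng.uniform(-2, 2, 400)
y = np.sin(2 * x) + 0.3 * rng.standard_normal(400)
print(bic_select(x, y))                                      # data-driven choice of model dimension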

The final issue is whether one ought to be doing structural or reduced-form estimation of econometric models. I review the Haavelmo-Koopmans-Marschak-Lucas arguments for the use of structural econometric models that either have been derived from, or are consistent with, an underlying economic theory. These arguments show that structural models can be used to predict the effects of hypothetical policy or environmental changes, whereas reduced-form models are generally only capable of summarizing responses to existing or historical policy or environmental changes. On the other hand, structural models typically depend on strong, often parametric, a priori identifying assumptions, whereas reduced-form models can employ semi- and non-parametric estimation methods that require much weaker assumptions about the underlying structure. I discuss the identification problem and show that many commonly analyzed structural models are ``non-parametrically unidentified'', which implies that one generally cannot estimate structural models using fully non-parametric methods (although there are many cases where flexible parametric and semi-parametric methods can be used to estimate the structure). Where nonparametric methods can be useful is in specification testing, i.e. comparing the reduced form implied by the structural model with a nonparametric estimate of the reduced form. One possible way to resolve the identification problem is by integrating experimental and survey data. I will discuss this issue in the context of comparing structural vs. experimental predictions of the impact of job training programs.

I use Manski's (1988) ``analogy principle'' as an intuitive unifying concept motivating the main ``classical'' estimation methods. Examples of the approach include estimation of the population mean by the sample mean, or estimation of the population CDF by the sample CDF. The analogy principle is the best way to understand the seemingly bewildering array of econometric estimators, most of which can be classified as extremum or M-estimators: linear and nonlinear least squares, maximum likelihood, generalized method of moments, minimum distance, minimum chi-square, etc.
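A small sketch in this spirit (the data-generating process and variable names below are made up purely for illustration): each population quantity is replaced by its sample analog, and the LAD location estimator at the end is the analog/extremum estimator minimizing the sample counterpart of $E|Y-\theta|$.

import numpy as np

rng = np.random.default_rng(2)
y = rng.normal(loc=1.0, scale=2.0, size=1000)

mean_hat = y.mean()                                   # sample analog of E[Y]
cdf_hat = lambda t: np.mean(y <= t)                   # sample analog of P(Y <= t)

# Analog (extremum) estimator: minimize the sample average of |y - theta|
# over a grid; the minimizer is (approximately) the sample median.
grid = np.linspace(y.min(), y.max(), 2001)
objective = [np.mean(np.abs(y - th)) for th in grid]
lad_hat = grid[int(np.argmin(objective))]

print(mean_hat, cdf_hat(0.0), lad_hat, np.median(y))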

We begin the course by reviewing the theory of parametric estimation and the fundamental efficiency bounds for unbiased least squares and maximum likelihood estimators, namely the Gauss-Markov and Cramér-Rao lower bounds. We also present extensions of the C-R bound to asymptotically unbiased LAN estimators, Hájek's (1972) asymptotic local minimax bound, and briefly discuss Bahadur's (1960, 1967) ``large deviation'' bounds. I then review results on the asymptotic equivalence of a number of different nonlinear estimators including method of moments, maximum likelihood, minimum distance, and minimum chi-square in the special case of multinomial distributions. Since multinomial distributions are dense in the space of all distributions, these results can be applied to derive Chamberlain's (1987, 1992) efficiency bounds for semi-parametric estimators based on conditional moment restrictions. I complete the review of parametric estimation methods with a survey of model specification tests including the standard ``Holy Trinity'', chi-square, information-matrix, Hausman-Wu, and conditional moment tests. I briefly discuss issues of optimality and power of these tests, and the more difficult issues of sequential testing, model revision, and model selection. The literature on ``model selection'' serves as a bridge in moving from parametric to semi-parametric and non-parametric estimation methods. We cover several papers showing that there exist ``automatic'' rules for ``specification searching'' over an appropriately expanding family of parametric models that result in a sequence of selected models converging to an underlying ``true'' data generating process.
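For reference, the Cramér-Rao bound mentioned above takes the familiar form below (stated under the usual regularity conditions): for any unbiased estimator $\hat{\theta}$ of $\theta$ in a correctly specified model with density $f(y;\theta)$,

\[
\mathrm{Var}(\hat{\theta}) \;\geq\; I(\theta)^{-1},
\qquad
I(\theta) \;=\; E\!\left[ \frac{\partial \log f(Y;\theta)}{\partial \theta}\,
\frac{\partial \log f(Y;\theta)}{\partial \theta'} \right],
\]

and under the same regularity conditions the maximum likelihood estimator attains this bound asymptotically.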

The second part of the course focuses on nonparametric estimation of density and regression functions using kernels, nearest neighbor methods, and various ``sieve'' estimation methods including splines, series approximations, and neural networks. I then turn to semiparametric models and flexible ``seminonparametric'' models that represent the middle ground between parametric and non-parametric estimation methods. Some of the semiparametric estimation methods are versions of L-estimates (linear combinations of order statistics) and R-estimates (estimates derived from rank tests). In addition, some recent semi-parametric estimators are functionals of U-statistics, so I briefly review the relevant LLN's, CLT's, and the concept of projections of a U-statistic. In order to compare the relative efficiency of the various methods, I present the Begun-Hwang-Hall-Wellner generalized version of the Cramér-Rao lower bound for semiparametric models, and applications of this result to a variety of models. In certain problems the prior assumptions are so weak (such as the median independence assumption underlying Manski's maximum score estimator) that the information for the parametric component of the model is zero. This implies that a $\sqrt{N}$-consistent estimator does not exist. I review rate of convergence results for other non-parametric and semi-parametric estimators, delineating problems for which standard $\sqrt{N}$ rates are achievable versus problems where convergence occurs at slower rates. In certain cases one can smooth discontinuous semi-parametric objective functions in such a way as to guarantee consistency while still retaining $\sqrt{N}$ (or arbitrarily close to $\sqrt{N}$) convergence rates. Examples include Powell's (1984, 1986) work on LAD and quantile estimation of the censored regression model, and Horowitz's (1992) smoothed maximum score estimator.
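The following sketch (Python with numpy; the Gaussian kernel, the hand-picked bandwidth, and the function names are illustrative assumptions, not the specific choices analyzed in the course) shows the two most basic kernel methods referred to above: a kernel density estimate and a Nadaraya-Watson kernel regression.

import numpy as np

def gaussian_kernel(u):
    return np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)

def kernel_density(x_eval, data, h):
    """f_hat(x) = (1/(n*h)) * sum_i K((x - x_i)/h)."""
    return gaussian_kernel((x_eval[:, None] - data[None, :]) / h).mean(axis=1) / h

def nadaraya_watson(x_eval, x, y, h):
    """m_hat(x) = sum_i K((x - x_i)/h) y_i / sum_i K((x - x_i)/h)."""
    w = gaussian_kernel((x_eval[:, None] - x[None, :]) / h)
    return (w * y[None, :]).sum(axis=1) / w.sum(axis=1)

rng = np.random.default_rng(3)
x = rng.uniform(-2, 2, 500)
y = np.sin(2 * x) + 0.3 * rng.standard_normal(500)
grid = np.linspace(-2, 2, 9)
print(kernel_density(grid, x, h=0.3))                 # density estimate on a coarse grid
print(nadaraya_watson(grid, x, y, h=0.3))             # regression estimate on the same grid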

The final methodological module focuses on simulation estimation, which has proven very useful for avoiding the computational burden of numerical integration that previously posed insurmountable obstacles to estimation of a number of econometric models. I review several different types of simulation estimators for discrete choice problems (where simulation is used to avoid the high-dimensional numerical integrations required to compute choice probabilities with many alternatives) and for macro/time-series applications. The simulation estimators include simulated maximum likelihood (SML), the method of simulated moments (MSM), the method of simulated scores (MSS), and the semi-parametric minimum distance estimator of Gallant and Tauchen. We review various types of simulators, including crude frequency simulators as well as ``smoothed'' probability simulators such as the Geweke-Hajivassiliou-Keane (GHK) simulator, which has the advantage of yielding objective functions that are smooth functions of the model's parameters. We will also discuss the use of antithetic variates, acceptance/rejection, importance sampling, and Gibbs sampling methods (borrowed from the literature on Bayesian pattern recognition) to reduce noise and accelerate convergence of Monte Carlo methods.
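To illustrate why smoothing the simulator matters, here is a bare-bones sketch for a single binary probit choice probability: the crude frequency simulator is a step function of the parameter, while a logistically smoothed version is differentiable in $\beta$ and hence usable inside gradient-based optimizers. The logistic smoother and the parameter values are illustrative stand-ins; this is not the GHK simulator itself.

import numpy as np

rng = np.random.default_rng(4)
x = 1.0
eps = rng.standard_normal(200)                        # common random draws reused across beta values

def freq_simulator(beta):
    """Crude frequency simulator: average of indicators, a step function of beta."""
    return np.mean(beta * x + eps > 0)

def smoothed_simulator(beta, lam=0.1):
    """Logistic smoothing of the same indicator; smooth and differentiable in beta."""
    return np.mean(1.0 / (1.0 + np.exp(-(beta * x + eps) / lam)))

for b in (0.40, 0.45, 0.50, 0.55, 0.60):
    print(b, freq_simulator(b), round(smoothed_simulator(b), 4))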

The methodological principles outlined above will be illustrated in a variety of applied contexts:

• standard linear models, including models with censoring and truncation
• panel data and transition/duration models
• static and dynamic discrete/continuous choice models

GRADES:

The goal of this class is to introduce students to state-of-the-art methods as well as unsolved problems at the frontiers of current research in econometrics. My philosophy is that the best way to learn these methods and to appreciate their problems and limitations is via ``hands-on'' applications. Thus, grades in this course will be based on: 1) periodic take-home problems assigned during lectures (20% of grade), 2) a midterm exam (20% of grade), 3) a final exam (20% of grade), and 4) an original research paper (30 pages maximum) due at the scheduled final exam period for this course, together with a 15-30 minute in-class presentation describing the topic, the data, and the econometric methods to be used (40% of grade). Most students will choose applied topics involving actual estimation of a particular econometric model, although theoretically oriented papers are also welcome.

Main Texts (choose at least one for course)

1. Hayashi, F. (2000) Econometrics, Princeton University Press.
2. Ruud, P. (2000) An Introduction to Classical Econometric Theory, Oxford University Press.
3. Greene, W.H. (2000) Econometric Analysis, Prentice Hall.

Advanced/Specialized Texts (worth consulting but not required)

1. Handbook of Econometrics, Vols. 1-4, Elsevier, North Holland.
2. Amemiya, T. (1985) Advanced Econometrics, Harvard University Press.
3. Berndt, E.R. (1991) The Practice of Econometrics: Classic and Contemporary, Addison Wesley.
4. Brockwell, P. and R. Davis (1991) Time Series: Theory and Methods, Springer-Verlag.
5. Davidson, R. and J.G. MacKinnon (1996) Estimation and Inference in Econometrics, Oxford University Press.
6. Gallant, A.R. (1997) An Introduction to Econometric Theory, Princeton University Press.
7. Gelman, A., J. Carlin, H. Stern and D. Rubin (1995) Bayesian Data Analysis, Chapman and Hall.
8. Manski, C.F. (1988) Analog Estimation Methods in Econometrics, Chapman and Hall.
9. Poirier, D.J. (1995) Intermediate Statistics and Econometrics: A Comparative Approach, MIT Press, Cambridge.
10. Rao, C.R. (1973) Linear Statistical Inference and Its Applications, Wiley.
11. Serfling, R.J. (1980) Approximation Theorems of Mathematical Statistics, Wiley.
12. Spanos, A. (1999) Probability Theory and Statistical Inference, Cambridge University Press.
13. van der Vaart, A.W. (1998) Asymptotic Statistics, Cambridge Series in Statistical and Probabilistic Mathematics, Cambridge University Press.
14. van der Vaart, A.W. and J.A. Wellner (1996) Weak Convergence and Empirical Processes, Springer-Verlag.
15. White, H. (1984) Asymptotic Theory for Econometricians, Academic Press.

John Rust
2001-01-09