Spring 1998 John Rust
Economics 551b 37 Hillhouse, Rm. 27

PROBLEM SET 1

Nonlinear Estimation of Binary Choice Models

QUESTION 1 Extract the data in data3.asc from the

pub/John_Rust/courses/econ551/regression/

directory on gemini.econ.yale.edu (either ftp to gemini.econ.yale.edu, log in as "anonymous", cd pub/John_Rust/courses/econ551/regression and get data3.asc, or click on the hyperlink in the html version of this document). This data file contains n=3000 IID observations that I generated from the binary probability model:

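$$ y = \begin{cases} 1 & \text{with probability } P(x,\theta) \\ 0 & \text{with probability } 1 - P(x,\theta) \end{cases} \qquad (1) $$

where $P(x,\theta)$ is some parametric model of the conditional probability of the binary variable y given x, i.e. $P(x,\theta) = \Pr\{y=1 \mid x\}$. Two standard models for $P(x,\theta)$ are the logit and probit models. In the logit model we have

$$ P(x,\theta) = \frac{\exp\{x'\theta\}}{1+\exp\{x'\theta\}} \qquad (2) $$

and in the probit model we have

$$ P(x,\theta) = \Phi(x'\theta) \qquad (3) $$

where $\Phi$ is the standard normal CDF, i.e.

$$ \Phi(z) = \int_{-\infty}^{z} \phi(t)\,dt $$

where

$$ \phi(t) = \frac{1}{\sqrt{2\pi}} \exp\{-t^2/2\} $$

More generally, $P(x,\theta)$ could take the form

$$ P(x,\theta) = F(x'\theta) $$

where F is an arbitrary continuous CDF.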
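
As a rough illustration of the two standard models, the sketch below evaluates the logit and probit choice probabilities for a scalar index and draws artificial (y, x) pairs from the binary probability model. The function names, the two-parameter index `theta[0] + theta[1] * x`, and the standard normal regressor are assumptions made for illustration only; they are not the design used to generate data3.asc.

```python
import math
import random

def logit_prob(index):
    """Logit choice probability for a scalar index value."""
    # Branch on the sign so that exp() never receives a large positive argument.
    if index >= 0:
        return 1.0 / (1.0 + math.exp(-index))
    e = math.exp(index)
    return e / (1.0 + e)

def probit_prob(index):
    """Probit choice probability: the standard normal CDF at the index."""
    return 0.5 * (1.0 + math.erf(index / math.sqrt(2.0)))

def simulate(theta, n, prob=logit_prob, seed=0):
    """Draw n (y, x) pairs; the index theta[0] + theta[1]*x is a hypothetical choice."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)            # assumed regressor distribution
        p = prob(theta[0] + theta[1] * x)
        y = 1 if rng.random() < p else 0   # y = 1 with probability p, else 0
        data.append((y, x))
    return data
```

Calling `simulate((0.5, 1.0), 3000)` returns a list of 3000 pairs; passing `prob=probit_prob` switches the generating model to the probit.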

1.

Show that versions of the logit and probit models can be derived from an underlying random utility model where a decision maker has utility function of the form:

and takes action y=1 if and takes action y=0 if . Derive the implied choice probability in the case where is a bivariate normal random vector with , and and and . What is the form of in the general case when has an unrestricted bivariate normal distribution with mean vector and covariance matrix ? If the utility function includes a constant term, i.e. are the , and parameters all separately identified if we only have access to data on (y,x) pairs?

2.

Derive the form of the choice probability under the same assumptions as in part 1 above, except that the random vector there has a bivariate Type I extreme value distribution rather than a bivariate normal one, by doing problem 7 of the 1997 Econ 551 problem set 3. By doing this you will have derived the binary logit model from first principles.
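
A Monte Carlo check can be a useful sanity test of the derivation. The sketch below assumes additive utilities of the hypothetical form `v0 + e0` and `v1 + e1` with independent standard Type I extreme value shocks (the independence, the unit scale, and all names are assumptions for illustration), and compares the simulated frequency of choosing action 1 with the logit formula evaluated at `v1 - v0`.

```python
import math
import random

def gumbel(rng):
    """Draw a standard Type I extreme value variate by inverting its CDF exp(-exp(-e))."""
    u = rng.random()
    while u == 0.0:              # avoid log(0); this occurs with negligible probability
        u = rng.random()
    return -math.log(-math.log(u))

def simulated_choice_prob(v0, v1, draws=200_000, seed=0):
    """Share of draws in which action 1 yields the higher utility."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        if v1 + gumbel(rng) > v0 + gumbel(rng):
            wins += 1
    return wins / draws

def logit_prob(v0, v1):
    """Logit probability of action 1 as a function of the utility difference."""
    return 1.0 / (1.0 + math.exp(-(v1 - v0)))
```

With a few hundred thousand draws the two numbers should agree to roughly two or three decimal places; a larger gap would suggest an error in the simulation or in the derived formula.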

3.

Using the artificially generated data in pub/John_Rust/courses/econ551/regression/data3.asc

compute maximum likelihood estimates of the parameters of the logit and probit specifications given in equations (2) and (3) above, where is given by:
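
As a rough illustration of the computation in this part, the following sketch fits the logit and probit models by maximum likelihood using Fisher scoring. It assumes a hypothetical index `theta0 + theta1 * x` with a single scalar regressor, which is an assumption made for illustration and not the specification intended for data3.asc, and it runs on simulated data rather than on the course file. Standard errors are taken from the inverse of the estimated information matrix.

```python
import math
import random
from statistics import NormalDist

_NORMAL = NormalDist()

def logit(v):
    """Return (P, dP/dv) for the logit model."""
    p = 1.0 / (1.0 + math.exp(-v)) if v >= 0 else math.exp(v) / (1.0 + math.exp(v))
    return p, p * (1.0 - p)

def probit(v):
    """Return (P, dP/dv) for the probit model."""
    return _NORMAL.cdf(v), _NORMAL.pdf(v)

def simulate(theta, n, link, seed=0):
    """Artificial (y, x) pairs with a standard normal regressor (an assumption)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        p, _density = link(theta[0] + theta[1] * x)
        data.append((1 if rng.random() < p else 0, x))
    return data

def fit_mle(data, link, iters=100, tol=1e-10):
    """Fisher scoring for P = F(theta0 + theta1 * x); returns (estimates, std. errors)."""
    t0 = t1 = 0.0
    for _ in range(iters):
        s0 = s1 = i00 = i01 = i11 = 0.0
        for y, x in data:
            p, f = link(t0 + t1 * x)
            p = min(max(p, 1e-10), 1.0 - 1e-10)   # keep P(1 - P) away from zero
            w = f / (p * (1.0 - p))
            r = (y - p) * w
            s0 += r                 # score with respect to theta0
            s1 += r * x             # score with respect to theta1
            i00 += f * w            # information matrix entries
            i01 += f * w * x
            i11 += f * w * x * x
        det = i00 * i11 - i01 * i01
        d0 = (i11 * s0 - i01 * s1) / det
        d1 = (i00 * s1 - i01 * s0) / det
        t0 += d0
        t1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    # Information from the last pass, i.e. evaluated just before the final step.
    se = (math.sqrt(i11 / det), math.sqrt(i00 / det))
    return (t0, t1), se
```

For example, `fit_mle(simulate((0.5, 1.0), 3000, logit), logit)` should return estimates close to the generating values. For the logit model Fisher scoring coincides with Newton's method, since the expected and observed Hessians are equal.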

4.

Is it possible to consistently estimate $\theta$ by doing nonlinear least squares estimation of the nonlinear regression formulation of the binary probability model

$$ y = P(x,\theta) + \eta $$

instead of doing maximum likelihood? If so, provide a proof of the consistency of the NLLS estimator. If not, provide a counterexample showing that the NLLS estimator is inconsistent.

5.

Estimate both the probit and logit specifications by nonlinear least squares as suggested in part (4). How do the parameter estimates and standard errors compare to the maximum likelihood estimates computed in part 3?

6.

Is there any problem of heteroscedasticity in the nonlinear regression formulation of the problem in part 4? If so, derive the form of the heteroscedasticity and, using the estimated "first stage" parameters from part 5 above, compute second stage "feasible generalized least squares" (FGLS) estimates of $\theta$.
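
The two-stage computation can be sketched as follows, under stated assumptions: a logit model with a hypothetical index `theta0 + theta1 * x`, simulated data, and a damped Gauss-Newton iteration as the least squares solver (one of several reasonable choices). The second stage weights each observation by the inverse of the estimated variance of a binary outcome, P(1 - P), evaluated at the first stage estimates.

```python
import math
import random

def logit(v):
    """Return (P, dP/dv) for the logit model."""
    p = 1.0 / (1.0 + math.exp(-v)) if v >= 0 else math.exp(v) / (1.0 + math.exp(v))
    return p, p * (1.0 - p)

def simulate(theta, n, seed=0):
    """Artificial logit data with a standard normal regressor (an assumption)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        data.append((1 if rng.random() < logit(theta[0] + theta[1] * x)[0] else 0, x))
    return data

def weighted_ssr(theta, data, weights):
    """Weighted sum of squared residuals y - P(x, theta)."""
    return sum(w * (y - logit(theta[0] + theta[1] * x)[0]) ** 2
               for (y, x), w in zip(data, weights))

def gauss_newton(data, weights, start=(0.0, 0.0), iters=100, tol=1e-10):
    """Minimize sum_i w_i * (y_i - P(x_i, theta))^2 by damped Gauss-Newton steps."""
    t0, t1 = start
    for _ in range(iters):
        a00 = a01 = a11 = b0 = b1 = 0.0
        for (y, x), w in zip(data, weights):
            p, f = logit(t0 + t1 * x)
            r = y - p
            a00 += w * f * f
            a01 += w * f * f * x
            a11 += w * f * f * x * x
            b0 += w * f * r
            b1 += w * f * r * x
        det = a00 * a11 - a01 * a01
        d0 = (a11 * b0 - a01 * b1) / det
        d1 = (a00 * b1 - a01 * b0) / det
        # Halve the step until the objective does not increase.
        step, old = 1.0, weighted_ssr((t0, t1), data, weights)
        while step > 1e-8 and weighted_ssr((t0 + step * d0, t1 + step * d1), data, weights) > old:
            step /= 2.0
        t0 += step * d0
        t1 += step * d1
        if step * (abs(d0) + abs(d1)) < tol:
            break
    return t0, t1

def nlls_then_fgls(data):
    """First stage: unweighted NLLS. Second stage: weights 1 / (P(1 - P)) at first-stage fit."""
    first = gauss_newton(data, [1.0] * len(data))
    weights = []
    for _y, x in data:
        p, _f = logit(first[0] + first[1] * x)
        weights.append(1.0 / max(p * (1.0 - p), 1e-8))   # floor guards against division by ~0
    second = gauss_newton(data, weights, start=first)
    return first, second
```

A probit version would only need `logit` replaced by a function returning the normal CDF and density. The sketch returns point estimates only; standard errors for either stage would have to be computed separately.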

7.

Are the FGLS estimates of $\theta$ consistent and asymptotically normally distributed (assuming the model is correctly specified)? If so, derive the asymptotic distribution of the FGLS estimator, and if not, provide a counterexample showing that the FGLS estimator is inconsistent or not asymptotically normally distributed. If you conclude that the FGLS estimator is asymptotically normally distributed, is it as efficient as the maximum likelihood estimator of $\theta$? Explain your reasoning for full credit.

8.

Is it possible to determine whether the data in the file data3.asc are generated from a logit or probit model? In answering this question, consider whether you could estimate the conditional probability $\Pr\{y=1 \mid x\}$ nonparametrically via nonparametric regression. Is there any way you could use the nonparametric regression estimate of $\Pr\{y=1 \mid x\}$ to help discriminate between the logit and probit specifications?
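
One possible way to set up such a comparison is sketched below. It uses a Nadaraya-Watson kernel regression of y on a scalar x and measures the average squared gap between the kernel estimate and a candidate parametric curve over a grid. The Gaussian kernel, the fixed bandwidth, the grid, and the simulated logit data are all assumptions for illustration; the sketch does not establish whether the two specifications can in fact be told apart.

```python
import math
import random

def logit_prob(v):
    """Logit probability for a scalar index value."""
    return 1.0 / (1.0 + math.exp(-v)) if v >= 0 else math.exp(v) / (1.0 + math.exp(v))

def simulate(theta, n, seed=0):
    """Artificial logit data with a standard normal regressor (an assumption)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        data.append((1 if rng.random() < logit_prob(theta[0] + theta[1] * x) else 0, x))
    return data

def nadaraya_watson(data, x0, bandwidth):
    """Kernel-weighted average of y near x0: a nonparametric estimate of Pr{y=1 | x=x0}."""
    num = den = 0.0
    for y, x in data:
        u = (x - x0) / bandwidth
        k = math.exp(-0.5 * u * u)    # Gaussian kernel; its normalizing constant cancels
        num += k * y
        den += k
    return num / den

def mean_sq_gap(data, curve, grid, bandwidth):
    """Average squared distance between the kernel estimate and a parametric curve."""
    return sum((nadaraya_watson(data, g, bandwidth) - curve(g)) ** 2 for g in grid) / len(grid)
```

In practice `curve` would be the fitted logit or fitted probit probability as a function of x, and the two gaps would be compared. Because the fitted logit and probit curves are typically very close to each other, any difference between the two gaps should be judged against the sampling noise in the kernel estimate.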

John Rust
Sat Mar 21 13:09:15 CST 1998