
Spring 1998 John Rust
Economics 551b 37 Hillhouse, Rm. 27

PROBLEM SET 1

Nonlinear Estimation of Binary Choice Models

QUESTION 1 Extract the data in data3.asc from the

pub/John_Rust/courses/econ551/regression/

directory on gemini.econ.yale.edu (either ftp to gemini.econ.yale.edu, log in as ``anonymous'', cd pub/John_Rust/courses/econ551/regression and get data3.asc, or click on the hyperlink in the html version of this document). This data file contains n=3000 IID observations $(y_i, x_i)$, $i=1,\ldots,n$, that I generated from the binary probability model:

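$$ y = \begin{cases} 1 & \text{with probability } P(x,\theta) \\ 0 & \text{with probability } 1 - P(x,\theta) \end{cases} \qquad (1) $$

where $P(x,\theta)$ is some parametric model of the conditional probability of the binary variable y given x, i.e. $P(x,\theta) = \Pr\{y=1 \mid x\}$. Two standard models for $P(x,\theta)$ are the logit and probit models. In the logit model we have

$$ P(x,\theta) = \frac{\exp\{x'\theta\}}{1+\exp\{x'\theta\}} \qquad (2) $$

and in the probit model we have

$$ P(x,\theta) = \Phi(x'\theta) \qquad (3) $$

where $\Phi$ is the standard normal CDF, i.e.

$$ \Phi(z) = \int_{-\infty}^{z} \phi(t)\,dt $$

where

$$ \phi(t) = \frac{1}{\sqrt{2\pi}} \exp\{-t^2/2\} $$

More generally, $P(x,\theta)$ could take the form

$$ P(x,\theta) = F(x'\theta) $$

where F is an arbitrary continuous CDF.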
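
As a rough illustration of the two standard models, the following Python sketch evaluates the logit and probit choice probabilities and the log-likelihood they imply for a sample of (y,x) pairs. The two-parameter index theta[0] + theta[1]*x, the N(0,1) draws for x, and the parameter values are assumptions made only for this example; they are not the specification or the data used in this problem set.

```python
import math
import random


def logit_prob(v):
    """Logistic CDF at index v, written to avoid overflow for large |v|."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)


def probit_prob(v):
    """Standard normal CDF Phi(v), computed from the error function."""
    return 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))


def simulate(n, theta, prob, rng):
    """Draw n pairs (y, x): x ~ N(0,1) (an assumption), y = 1 w.p. prob(index)."""
    data = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        p = prob(theta[0] + theta[1] * x)  # hypothetical two-parameter index
        y = 1 if rng.random() < p else 0
        data.append((y, x))
    return data


def log_likelihood(theta, data, prob):
    """Sum of y*log(P) + (1-y)*log(1-P) over the sample."""
    total = 0.0
    for y, x in data:
        p = prob(theta[0] + theta[1] * x)
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total


rng = random.Random(0)
theta_true = (0.5, 1.0)  # hypothetical values, for illustration only
sample = simulate(3000, theta_true, logit_prob, rng)
ll_true = log_likelihood(theta_true, sample, logit_prob)
ll_wrong = log_likelihood((0.5, -1.0), sample, logit_prob)
print(ll_true, ll_wrong)
```

Passing probit_prob instead of logit_prob switches the model without changing the rest of the code, which mirrors the general form in which the choice probability is a CDF applied to an index.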

1.
Show that versions of the logit and probit models can be derived from an underlying random utility model where a decision maker has a utility function of the form:

displaymath53

and takes action y=1 if tex2html_wrap_inline82 and action y=0 if tex2html_wrap_inline86. Derive the implied choice probability tex2html_wrap_inline88 in the case where tex2html_wrap_inline90 is a bivariate normal random vector with tex2html_wrap_inline92, tex2html_wrap_inline94, tex2html_wrap_inline96, tex2html_wrap_inline98 and tex2html_wrap_inline100. What is the form of tex2html_wrap_inline64 in the general case when tex2html_wrap_inline90 has an unrestricted bivariate normal distribution with mean vector tex2html_wrap_inline106 and covariance matrix tex2html_wrap_inline108? If the utility function includes a constant term, i.e. tex2html_wrap_inline110, are the tex2html_wrap_inline112, tex2html_wrap_inline106 and tex2html_wrap_inline108 parameters all separately identified if we only have access to data on (y,x) pairs?

2.
Derive the form of the choice probability under the same assumptions as in part 1 above, but when tex2html_wrap_inline90 has a bivariate Type I extreme value distribution, by doing problem 7 of the 1997 Econ 551 problem set 3. By doing this you will have derived the binary logit model from first principles.

3.
Using the artificially generated data in pub/John_Rust/courses/econ551/regression/data3.asc

compute maximum likelihood estimates of the parameters tex2html_wrap_inline122 of the logit and probit specifications given in equations (2) and (3) above, where tex2html_wrap_inline124 is given by:

displaymath54

4.
Is it possible to consistently estimate tex2html_wrap_inline112 by doing nonlinear least squares estimation of the nonlinear regression formulation of the binary probability model

equation42

instead of doing maximum likelihood? If so, provide a proof of the consistency of the NLLS estimator. If not, provide a counterexample showing that the NLLS estimator is inconsistent.

5.
Estimate both the probit and logit specifications by nonlinear least squares as suggested in part (4). How do the parameter estimates and standard errors compare to the maximum likelihood estimates computed in part 3?

6.
Is there any problem of heteroscedasticity in the nonlinear regression formulation of the problem in (4)? If so, derive the form of the heteroscedasticity and, using the estimated ``first stage'' parameters from part 5 above, compute second stage ``feasible generalized least squares'' (FGLS) estimates of tex2html_wrap_inline112 .

7.
Are the FGLS estimates of tex2html_wrap_inline112 consistent and asymptotically normally distributed (assuming the model is correctly specified)? If so, derive the asymptotic distribution of the FGLS estimator, and if not, provide a counterexample showing that the FGLS estimator is inconsistent or not asymptotically normally distributed. If you conclude that the FGLS estimator is asymptotically normally distributed, is it as efficient as the maximum likelihood estimator of tex2html_wrap_inline112? Explain your reasoning for full credit.

8.
Is it possible to determine whether the data in the file data3.asc were generated from a logit or a probit model? In answering this question, consider whether you could estimate tex2html_wrap_inline64 nonparametrically via nonparametric regression. Is there any way you could use the nonparametric regression estimate of tex2html_wrap_inline72 to help discriminate between the logit and probit specifications?
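
One possible way to approach the comparison in part 8 is sketched below in Python: a Nadaraya-Watson kernel regression of y on x estimates the conditional probability without assuming a functional form, and the average squared gap between that estimate and a candidate parametric curve gives a crude measure of fit. The simulated data, the Gaussian kernel, the bandwidth of 0.3, the evaluation grid and the index theta[0] + theta[1]*x are all assumptions chosen for illustration, not part of the assignment.

```python
import math
import random


def logit_prob(v):
    """Logistic CDF at index v, written to avoid overflow for large |v|."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)


def gaussian_kernel(u):
    """Standard normal density, used here as the smoothing kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kernel_regression(x0, data, h):
    """Nadaraya-Watson estimate of E[y | x = x0] with bandwidth h."""
    num = 0.0
    den = 0.0
    for y, x in data:
        w = gaussian_kernel((x - x0) / h)
        num += w * y
        den += w
    return num / den if den > 0.0 else float("nan")


def mean_sq_gap(grid, estimates, curve):
    """Average squared distance between the kernel estimates and a curve."""
    gaps = [(p_hat - curve(x0)) ** 2 for x0, p_hat in zip(grid, estimates)]
    return sum(gaps) / len(gaps)


# Simulated stand-in for the real data: x ~ N(0,1), y drawn from a logit model.
rng = random.Random(1)
theta = (0.0, 1.0)  # hypothetical parameter values
data = []
for _ in range(3000):
    x = rng.gauss(0.0, 1.0)
    y = 1 if rng.random() < logit_prob(theta[0] + theta[1] * x) else 0
    data.append((y, x))

h = 0.3  # arbitrary bandwidth; in practice it would need to be chosen with care
grid = [-1.5 + 0.25 * k for k in range(13)]
estimates = [kernel_regression(x0, data, h) for x0 in grid]
p_hat_at_zero = kernel_regression(0.0, data, h)
gap_logit = mean_sq_gap(grid, estimates,
                        lambda x0: logit_prob(theta[0] + theta[1] * x0))
print(p_hat_at_zero, gap_logit)
```

With the actual data, one would replace the simulated sample with the (y,x) pairs from data3.asc and pass the fitted logit and probit curves as `curve`. Whether the resulting gaps are different enough to favor one specification is the substance of the question and is not settled by this sketch.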





John Rust
Sat Mar 21 13:09:15 CST 1998