Spring 1997 John Rust
Economics 551b 37 Hillhouse, Rm. 27

MIDTERM EXAM: SOLUTIONS

QUESTION 1. The seemingly unrelated regression (SUR) estimator is identical to the OLS estimator when the regressors of all equations are identical. (Therefore you can check your SUR program by comparing the SUR and OLS estimates. You then have your own SUR program which you can use to estimate models with different regressors!)

We know that the GLS estimator of the SUR model is unbiased and efficient and can be written as

\hat{\beta}_{GLS} = (X'\Omega^{-1}X)^{-1} X'\Omega^{-1} y

where, stacking the data observation by observation,

y = X\beta + \epsilon, \qquad E[\epsilon\epsilon'] = \Omega

\Omega = I_N \otimes \Sigma, \qquad \Sigma = E[\epsilon_i \epsilon_i'] \quad (M \times M)

If we instead introduce the regressor matrices stacked equation by equation rather than observation by observation (see the solution of Q4 in Problem Set 3), this estimator can be rewritten as:

\hat{\beta}_{GLS} = (X'\Omega^{-1}X)^{-1} X'\Omega^{-1} y

where

y = (y_1', \dots, y_M')', \qquad \beta = (\beta_1', \dots, \beta_M')', \qquad X = \mathrm{diag}(X_1, \dots, X_M) \ \ \text{(block diagonal)}

\Omega = \Sigma \otimes I_N

Suppose we have identical regressors, X_1 = X_2 = \cdots = X_M \equiv X_0, so that

X = I_M \otimes X_0

Then

\hat{\beta}_{GLS} = [ (I_M \otimes X_0)' (\Sigma^{-1} \otimes I_N) (I_M \otimes X_0) ]^{-1} (I_M \otimes X_0)' (\Sigma^{-1} \otimes I_N) y
                  = [ \Sigma^{-1} \otimes X_0'X_0 ]^{-1} (\Sigma^{-1} \otimes X_0') y
                  = [ \Sigma \otimes (X_0'X_0)^{-1} ] (\Sigma^{-1} \otimes X_0') y
                  = [ I_M \otimes (X_0'X_0)^{-1} X_0' ] y
                  = \hat{\beta}_{OLS}
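As a quick numerical check of this algebra, the following illustrative Python/NumPy sketch (not the Gauss code used in the course; all names and values are made up for the example) simulates a small SUR system with identical regressors and verifies that the GLS estimator built with a non-diagonal $\Sigma$ reproduces equation-by-equation OLS:

import numpy as np

rng = np.random.default_rng(0)
N, M = 200, 3                                                # observations, equations

X0 = np.column_stack([np.ones(N), rng.normal(size=N)])       # identical regressor matrix for every equation
Sigma = np.array([[1.0, 0.5, 0.3],
                  [0.5, 2.0, 0.4],
                  [0.3, 0.4, 1.5]])                           # cross-equation error covariance
beta_true = rng.normal(size=(M, X0.shape[1]))
eps = rng.multivariate_normal(np.zeros(M), Sigma, size=N)     # N x M matrix of errors
Y = X0 @ beta_true.T + eps                                    # N x M matrix of dependent variables

# Equation-by-equation OLS
beta_ols = np.linalg.solve(X0.T @ X0, X0.T @ Y).T             # one row of coefficients per equation

# SUR-GLS with the data stacked equation by equation: X = I_M kron X0, Omega = Sigma kron I_N
y = Y.T.reshape(-1)                                           # stack y_1, ..., y_M
X = np.kron(np.eye(M), X0)
Omega_inv = np.kron(np.linalg.inv(Sigma), np.eye(N))
beta_gls = np.linalg.solve(X.T @ Omega_inv @ X, X.T @ Omega_inv @ y).reshape(M, -1)

print(np.max(np.abs(beta_gls - beta_ols)))                    # ~1e-12: the two estimators coincide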

1. The regressors should be identical across equations.

2. The SUR-GLS estimator coincides with the equation-by-equation OLS estimator, by the argument above.

3. Construct the sample covariance matrix from the OLS residuals.

4. See the calculation above.

5. I apologize for asking you to do Gibbs sampling for the full set of 63 securities: this isn't really feasible on most computers in a reasonable amount of time. With 63 securities it takes special programming to be able to run the Gibbs sampling program without using a great deal of memory, and even then the program still takes quite a bit of CPU time on the ordinary Pentium CPUs in the Statlab. However, if you choose a few securities from the data set, say 3 or 4 stocks, then Gibbs sampling runs reasonably quickly. Any student who did this part using just a subset of the securities in stockdat received full credit. All students who even attempted this part received generous partial credit, and nearly full credit if they developed a computer program to carry out the Gibbs sampling that looked correct and included it in their answer, regardless of whether it actually worked for the problem using all 63 securities. I was more interested in getting students to write code for this problem than in seeing actual output, since the output will differ from person to person by the nature of the Gibbs sampling algorithm. I have written a program surgibbs.gpr that carries out Gibbs sampling for seemingly unrelated regression models, and the code is set up to do Gibbs sampling with 3 stocks from stockdat so you can see how it runs. The program comes with 2 procedures, set_hp.g and get_dat.g, which you can modify to use the software for other problems, including an artificial problem where you know the true $\beta$ and $\Sigma$ parameters in advance (this is useful as a check that the program is coded correctly).
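For students who want to see the structure of the algorithm without working through the Gauss code, here is a minimal illustrative sketch of a Gibbs sampler for the SUR model in Python (it is not surgibbs.gpr itself; the flat prior on $\beta$, the Jeffreys-type prior on $\Sigma$, and all names are assumptions made for the sketch). Under those priors the two conditional posteriors are normal and inverse Wishart, respectively:

import numpy as np
from scipy.stats import invwishart

def sur_gibbs(X_list, y_list, n_draws=1000, seed=0):
    """Gibbs sampler sketch for the SUR model y_m = X_m b_m + e_m, Var(e_i) = Sigma.

    Assumes a flat prior on beta and a Jeffreys prior |Sigma|^{-(M+1)/2} on Sigma.
    X_list, y_list: lists of (N x k_m) regressor matrices and length-N dependent variables.
    """
    rng = np.random.default_rng(seed)
    M, N = len(X_list), len(y_list[0])
    X = np.zeros((M * N, sum(x.shape[1] for x in X_list)))    # block-diagonal regressor matrix
    col = 0
    for m, Xm in enumerate(X_list):
        X[m * N:(m + 1) * N, col:col + Xm.shape[1]] = Xm
        col += Xm.shape[1]
    y = np.concatenate(y_list)                                # stacked equation by equation

    Sigma = np.eye(M)
    betas, Sigmas = [], []
    for _ in range(n_draws):
        # 1. Draw beta | Sigma, data  ~  N( beta_GLS, (X' Omega^{-1} X)^{-1} )
        Omega_inv = np.kron(np.linalg.inv(Sigma), np.eye(N))
        V = np.linalg.inv(X.T @ Omega_inv @ X)
        b_hat = V @ (X.T @ Omega_inv @ y)
        beta = rng.multivariate_normal(b_hat, V)
        # 2. Draw Sigma | beta, data  ~  InverseWishart(df = N, scale = E'E), E the N x M residual matrix
        E = (y - X @ beta).reshape(M, N).T
        Sigma = invwishart.rvs(df=N, scale=E.T @ E, random_state=rng)
        betas.append(beta)
        Sigmas.append(Sigma)
    return np.array(betas), np.array(Sigmas)

In practice one would discard an initial block of burn-in draws before summarizing the remaining draws as the posterior.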

QUESTION 2. Let $n_k$ be the number of times outcome $k$ occurred in the sample,

n_k = \sum_{i=1}^{N} \mathbf{1}\{X_i = k\}, \qquad k = 1, \dots, K

With

\sum_{k=1}^{K} n_k = N, \qquad \sum_{k=1}^{K} \theta_k = 1

the joint distribution of $(n_1, \dots, n_K)$ can be written as

P(n_1, \dots, n_K \mid \theta) = \frac{N!}{n_1! \cdots n_K!} \, \theta_1^{n_1} \cdots \theta_K^{n_K}

1. Since the log likelihood function is

\log L(\theta) = \log N! - \sum_{k=1}^{K} \log n_k! + \sum_{k=1}^{K} n_k \log \theta_k

the MLE $\hat{\theta}_k$ (for $k = 1, \dots, K$) can be derived by maximizing subject to $\sum_{k=1}^{K} \theta_k = 1$, i.e. from the first order conditions

\frac{\partial}{\partial \theta_k} \left[ \sum_{l=1}^{K} n_l \log \theta_l + \lambda \Big( 1 - \sum_{l=1}^{K} \theta_l \Big) \right] = \frac{n_k}{\theta_k} - \lambda = 0, \qquad k = 1, \dots, K

The solution to this system (using $\sum_k n_k = N$, which gives $\lambda = N$) is

\hat{\theta}_k = \frac{n_k}{N}, \qquad k = 1, \dots, K

2. Since

E[n_k] = E\left[ \sum_{i=1}^{N} \mathbf{1}\{X_i = k\} \right] = N \theta_k

we have

E[\hat{\theta}_k] = \frac{E[n_k]}{N} = \theta_k

Therefore $\hat{\theta}_k$ is an unbiased estimator.

3. Since the observations are independent,

\mathrm{Var}(\hat{\theta}_k) = \mathrm{Var}\left( \frac{1}{N} \sum_{i=1}^{N} \mathbf{1}\{X_i = k\} \right) = \frac{1}{N^2} \sum_{i=1}^{N} \mathrm{Var}(\mathbf{1}\{X_i = k\})

with

\mathrm{Var}(\mathbf{1}\{X_i = k\}) = E[\mathbf{1}\{X_i = k\}^2] - \big( E[\mathbf{1}\{X_i = k\}] \big)^2 = \theta_k (1 - \theta_k)

so that

\mathrm{Var}(\hat{\theta}_k) = \frac{\theta_k (1 - \theta_k)}{N}

For $k \neq l$,

\mathrm{Cov}(\hat{\theta}_k, \hat{\theta}_l) = \frac{1}{N^2} \sum_{i=1}^{N} \mathrm{Cov}(\mathbf{1}\{X_i = k\}, \mathbf{1}\{X_i = l\})

Note $\mathbf{1}\{X_i = k\} \mathbf{1}\{X_i = l\} = 0$ since at least one of $\mathbf{1}\{X_i = k\}$ and $\mathbf{1}\{X_i = l\}$ must be zero. Hence

\mathrm{Cov}(\mathbf{1}\{X_i = k\}, \mathbf{1}\{X_i = l\}) = E[\mathbf{1}\{X_i = k\} \mathbf{1}\{X_i = l\}] - \theta_k \theta_l = -\theta_k \theta_l

and

\mathrm{Cov}(\hat{\theta}_k, \hat{\theta}_l) = -\frac{\theta_k \theta_l}{N}

Therefore, writing $\theta = (\theta_1, \dots, \theta_{K-1})'$ for the free parameters,

\mathrm{Var}(\hat{\theta}) = \frac{1}{N} \left[ \mathrm{diag}(\theta) - \theta \theta' \right]
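A quick Monte Carlo check of this covariance formula, as an illustrative Python/NumPy sketch (the values of $\theta$, $N$, and the number of replications are arbitrary choices for the example):

import numpy as np

rng = np.random.default_rng(0)
theta = np.array([0.2, 0.3, 0.4, 0.1])                       # true probabilities, K = 4
N, reps = 100, 20000                                          # sample size, Monte Carlo replications

theta_hat = rng.multinomial(N, theta, size=reps) / N          # reps x K matrix of MLEs
emp_cov = np.cov(theta_hat[:, :3], rowvar=False)              # empirical covariance of the K-1 free components
formula = (np.diag(theta[:3]) - np.outer(theta[:3], theta[:3])) / N

print(np.round(emp_cov, 5))
print(np.round(formula, 5))                                   # the two matrices should be close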

4. Using the normalization $\theta_K = 1 - \sum_{k=1}^{K-1} \theta_k$, the log likelihood (dropping constants) is $\sum_{k=1}^{K-1} n_k \log \theta_k + n_K \log \theta_K$.

For $k = l$,

\frac{\partial^2 \log L(\theta)}{\partial \theta_k^2} = -\frac{n_k}{\theta_k^2} - \frac{n_K}{\theta_K^2}

so

-E\left[ \frac{\partial^2 \log L(\theta)}{\partial \theta_k^2} \right] = \frac{N \theta_k}{\theta_k^2} + \frac{N \theta_K}{\theta_K^2} = N \left( \frac{1}{\theta_k} + \frac{1}{\theta_K} \right)

For $k \neq l$,

\frac{\partial^2 \log L(\theta)}{\partial \theta_k \partial \theta_l} = -\frac{n_K}{\theta_K^2}

so

-E\left[ \frac{\partial^2 \log L(\theta)}{\partial \theta_k \partial \theta_l} \right] = \frac{N}{\theta_K}

Collecting terms, the information matrix for $\theta = (\theta_1, \dots, \theta_{K-1})'$ is

I(\theta) = N \left[ \mathrm{diag}\!\left( \frac{1}{\theta_1}, \dots, \frac{1}{\theta_{K-1}} \right) + \frac{1}{\theta_K} \mathbf{1} \mathbf{1}' \right]

5. The MLE can be shown to be efficient if its variance equals the Cramér-Rao lower bound. You can easily show $\mathrm{Var}(\hat{\theta}) = I(\theta)^{-1}$ by verifying that $I(\theta)\,\mathrm{Var}(\hat{\theta}) = I_{K-1}$.
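This check is also easy to do numerically. The following illustrative Python/NumPy sketch (with arbitrary example values of $\theta$ and $N$) multiplies the information matrix from part 4 by the covariance matrix from part 3 and recovers the identity matrix:

import numpy as np

theta = np.array([0.2, 0.3, 0.4])            # theta_1, ..., theta_{K-1}; here theta_K = 0.1
theta_K = 1.0 - theta.sum()
N = 500

# Covariance matrix from part 3: (1/N) [diag(theta) - theta theta']
V = (np.diag(theta) - np.outer(theta, theta)) / N

# Information matrix from part 4: N [diag(1/theta_k) + (1/theta_K) 1 1']
I_theta = N * (np.diag(1.0 / theta) + np.ones((3, 3)) / theta_K)

print(np.round(I_theta @ V, 10))             # identity matrix => the MLE attains the Cramer-Rao bound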

6. Run the program dirichlet.gpr (which in turn calls a procedure setparm.g) available on the Econ 551 web page. This program does the calculations required in part 6, i.e. it computes the posterior probability that $\theta$ (a draw from the Dirichlet posterior) is within a ball of radius .01 of the true parameter $\theta^*$. I calculate this probability by simulation. Since this posterior probability is simulated, and since the data used to form the posterior are simulated, it makes no sense to report numbers here, since they vary from run to run. The important part of this exercise is to verify that the simulated probability from the exact posterior is very close to the probability calculated from the normal approximation to the posterior.
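For readers who want to see the structure of such a calculation without the Gauss code, here is an illustrative Python/NumPy sketch (it is not the posted dirichlet.gpr; the uniform Dirichlet(1, ..., 1) prior, the true $\theta^*$, and the sample sizes are assumptions made for the example):

import numpy as np

rng = np.random.default_rng(0)
theta_star = np.array([0.2, 0.3, 0.4, 0.1])            # true parameter
N, n_sim = 20000, 50000                                 # sample size, posterior draws

n = rng.multinomial(N, theta_star)                      # simulated multinomial data (counts)

# Exact posterior under a uniform Dirichlet(1, ..., 1) prior: Dirichlet(1 + n)
draws = rng.dirichlet(1 + n, size=n_sim)
p_exact = np.mean(np.linalg.norm(draws - theta_star, axis=1) < 0.01)

# Normal approximation: mean theta_hat, covariance (1/N)[diag(theta_hat) - theta_hat theta_hat']
theta_hat = n / N
V = (np.diag(theta_hat) - np.outer(theta_hat, theta_hat)) / N
normal_draws = rng.multivariate_normal(theta_hat, V, size=n_sim)
p_normal = np.mean(np.linalg.norm(normal_draws - theta_star, axis=1) < 0.01)

print(p_exact, p_normal)                                # the two probabilities should be close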

7. First, notice that the full $K \times 1$ vector $\gamma = (\gamma_1, \dots, \gamma_K)'$ is not identified: you can add a constant $c$ to each component of $\gamma$ and the probability in (5) will be unchanged. Therefore I impose an arbitrary normalization, $\gamma_K = 0$, and our problem reduces to the estimation of the unrestricted $(K-1) \times 1$ vector $(\gamma_1, \dots, \gamma_{K-1})'$. Another arbitrary but convenient normalization is:

\sum_{k=1}^{K} e^{\gamma_k} = 1

Under the first normalization, one can show that there is a one-to-one mapping between the $K-1$ free $\gamma$ parameters and the $K-1$ free $\theta$ parameters. Thus, by the invariance of maximum likelihood, it is easy to see that $\hat{\gamma}$ will be given by the unique solution to the system of $K-1$ equations

\hat{\theta}_k = \frac{e^{\hat{\gamma}_k}}{1 + \sum_{l=1}^{K-1} e^{\hat{\gamma}_l}}, \qquad k = 1, \dots, K-1

Under the second normalization we can get an explicit representation: $\hat{\gamma}_k = \log \hat{\theta}_k$. Since $\hat{\theta}_k$ is an unbiased estimator, as was verified in part 2 above, it follows from Jensen's inequality that $\hat{\gamma}_k = \log \hat{\theta}_k$ will be a downward biased estimator of $\gamma_k$. Due to the presence of bias and the arbitrariness of the identifying normalization, it is difficult to determine whether the MLE $\hat{\gamma}$ attains the generalized Cramér-Rao lower bound in finite samples. However, we know that this is a regular problem, so the MLE is asymptotically unbiased and efficient, and so does attain the Cramér-Rao lower bound asymptotically. Under the first identifying normalization, $\gamma_K = 0$, it is easy to calculate the information matrix and verify that it is the inverse of the (per-observation) information matrix for $\theta$. This should not be surprising, since the $\gamma$ parameters are an inverse transformation of the $\theta$ parameters in this case. I present the derivation below. The information matrix $I(\gamma)$ for a single observation is given by:

I(\gamma) = E\left[ \frac{\partial \log f(X_i \mid \gamma)}{\partial \gamma} \, \frac{\partial \log f(X_i \mid \gamma)}{\partial \gamma'} \right]

But we have

\log f(X_i \mid \gamma) = \sum_{k=1}^{K} \mathbf{1}\{X_i = k\} \log \theta_k(\gamma), \qquad \theta_k(\gamma) = \frac{e^{\gamma_k}}{1 + \sum_{l=1}^{K-1} e^{\gamma_l}}

and

\frac{\partial \log \theta_k(\gamma)}{\partial \gamma} = e_k - \theta(\gamma) \ \ (k = 1, \dots, K-1), \qquad \frac{\partial \log \theta_K(\gamma)}{\partial \gamma} = -\theta(\gamma)

where $e_k$ is a $(K-1) \times 1$ vector of zeros with a 1 in the $k$-th place and $\theta(\gamma) = (\theta_1(\gamma), \dots, \theta_{K-1}(\gamma))'$. Substituting this into the equation for $I(\gamma)$ above we get

I(\gamma) = \sum_{k=1}^{K-1} \theta_k \, (e_k - \theta)(e_k - \theta)' + \theta_K \, \theta \theta' = \mathrm{diag}(\theta) - \theta \theta'

You can verify that this matrix is $N$ times the covariance matrix for $\hat{\theta}$ given above, which by part 5 equals $N I(\theta)^{-1}$. So, observation by observation, the information matrix for $\gamma$ is the inverse of the information matrix for $\theta$.

QUESTION 3

|H_N(\hat{\theta}_N) - H(\theta^*)| \le |H_N(\hat{\theta}_N) - H(\hat{\theta}_N)| + |H(\hat{\theta}_N) - H(\theta^*)| \le \sup_{\theta \in \Theta} |H_N(\theta) - H(\theta)| + |H(\hat{\theta}_N) - H(\theta^*)|

For the first term in the last inequality, $\sup_{\theta \in \Theta} |H_N(\theta) - H(\theta)| \to 0$ with probability 1 by assumption. For the second term, $H$ continuous and $\hat{\theta}_N \to \theta^*$ with probability 1 together imply $H(\hat{\theta}_N) \to H(\theta^*)$ with probability 1 by the continuous mapping theorem.

Therefore $H_N(\hat{\theta}_N) \to H(\theta^*)$ with probability 1.

QUESTION 4. The Econ 551 web page has Gauss programs for computing maximum likelihood estimates of the multinomial logit model, plus the shell programs for running maximum likelihood. I have also posted evalbprob.g, which is the procedure for computing the likelihood and derivatives for the binomial probit model. You can run each of these programs to get the maximum likelihood estimates for the models. I have posted the estimation results for each of these programs on the Econ 551 web page.

A.
The shell program for the estimation is tnlest.gpr, the procedure for calculating the likelihood and derivatives of the alternative-specific version of the trinomial logit model is evalmnl1.g, and the output of the program is in the file data1.est, all on the Econ 551 web page. The file data1.est presents estimation results for the full sample of 2,000 data points and for the subsample of the first 1,500 data points. The coefficient estimates for the two samples are similar, and the standard errors are higher for the 1,500 subsample, as expected. The question about the $R^2$ leads us into hypothesis testing. A $\chi^2$ goodness of fit test statistic would be appropriate here, but I didn't expect students to know anything about this since we haven't covered the topic at this point in Econ 551. To anticipate, if the model is correctly specified then we should have the regression equation:

d_{ij} = P_j(X_i, \theta^*) + \epsilon_{ij}

That is, the indicator $d_{ij}$ for the event that the decision taken by person $i$ is alternative $j$ should equal the choice probability $P_j(X_i, \theta^*)$ plus an error term $\epsilon_{ij}$. Since $P_j(X_i, \theta^*)$ is the conditional expectation of $d_{ij}$, the error term must have mean zero. Therefore we can ``test'' the model both in sample and out of sample by computing the mean prediction errors and seeing how close they are to zero:

\bar{\epsilon}_j(\theta) = \frac{1}{N} \sum_{i=1}^{N} \left[ d_{ij} - P_j(X_i, \theta) \right]

Under standard regularity conditions, if the null hypothesis that the model is correctly specified is true, then $\sqrt{N}\,\bar{\epsilon}_j(\theta^*) \Rightarrow N(0, \sigma_j^2)$, where $\sigma_j^2$ is the unconditional variance of $\epsilon_{ij}$. This can be used to form a test statistic. However, we need to replace the unknown true $\theta^*$ with the maximum likelihood estimate $\hat{\theta}$ and then determine the asymptotic distribution of the corresponding $\bar{\epsilon}_j(\hat{\theta})$ statistic, doing an ``Amemiya correction'' for the fact that we are using an estimated value $\hat{\theta}$ to construct $\bar{\epsilon}_j(\hat{\theta})$: we need to account for this extra estimation error in the derivation of the asymptotic distribution of the test statistic. Once we do this, we can use the $\bar{\epsilon}_j(\hat{\theta})$ statistic as a ``moment condition'' and set up a hypothesis test to see how well this moment condition is satisfied. Indeed, since $E[\epsilon_{ij} \mid X_i] = 0$, we can construct an entire family of moment conditions that should be zero if the model is correctly specified: an entire vector of such test statistics, one for each alternative $j$ and for each instrument formed from $X_i$ that we use in the moment conditions. Later in Econ 551 we will show how to test all these moment conditions simultaneously using a single chi-square goodness of fit statistic. This statistic would be the analog of the $R^2$. It is important to note that we should expect the model to fit better within sample than out of sample, since maximum likelihood is designed to choose parameters to fit a given sample of data as well as possible. In fact, with a full set of alternative-specific dummies, it is easy to show that the mean prediction error for each alternative is identically zero within sample, i.e. $\bar{\epsilon}_j(\hat{\theta}) = 0$. However, $\bar{\epsilon}_j(\hat{\theta})$ is not necessarily zero out of sample, i.e. if we estimate $\hat{\theta}$ using the observations $(d_i, X_i)$ for $i = 1, \dots, 1500$ and then construct $\bar{\epsilon}_j(\hat{\theta})$ using $(d_i, X_i)$ for $i = 1501, \dots, 2000$. The program tnlest.gpr presents the mean prediction errors (by alternative) both in sample and out of sample, where the in-sample $\bar{\epsilon}_j(\hat{\theta})$ was constructed from the first $N = 1500$ observations and the out-of-sample $\bar{\epsilon}_j(\hat{\theta})$ was constructed from $\hat{\theta}$ and the remaining $M = 500$ observations in the data set. The results verify that the mean prediction errors are zero (modulo rounding error) within sample, but are non-zero out of sample. Later in the course we will post software for doing $\chi^2$ specification tests using the residuals from the estimated choice model.
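To make the mechanics concrete, here is an illustrative Python/SciPy sketch of the exercise (it is not the posted tnlest.gpr; the data generating process, the trinomial logit specification with alternative-specific constants and a common slope, and all names are assumptions made for the example):

import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
N, J = 2000, 3                                       # observations, alternatives

# Simulated stand-in for the exam data: one covariate per alternative, Gumbel errors => logit choices
X = rng.normal(size=(N, J))
alpha_true = np.array([0.0, 0.5, -0.5])              # alternative-specific constants, alternative 1 normalized
u = alpha_true + X + rng.gumbel(size=(N, J))         # random utilities (slope on X equal to 1)
d = np.eye(J)[u.argmax(axis=1)]                      # N x J indicator matrix of choices

def probs(theta, X):
    # theta = (alpha_2, alpha_3, slope); alternative 1's constant is normalized to zero
    alpha = np.concatenate(([0.0], theta[:J - 1]))
    v = alpha + theta[J - 1] * X
    ev = np.exp(v - v.max(axis=1, keepdims=True))
    return ev / ev.sum(axis=1, keepdims=True)

def negloglik(theta, X, d):
    return -np.sum(d * np.log(probs(theta, X)))

# Estimate on the first 1,500 observations, as in the exam question
est = minimize(negloglik, x0=np.zeros(J), args=(X[:1500], d[:1500]), method="BFGS")
theta_hat = est.x

in_sample = (d[:1500] - probs(theta_hat, X[:1500])).mean(axis=0)    # ~0 by the first order conditions
out_sample = (d[1500:] - probs(theta_hat, X[1500:])).mean(axis=0)   # not exactly 0 out of sample
print(in_sample, out_sample)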

B.
The Econ 551 Web page has a shell program bpest.gpr and corresponding procedure evalbp.g to carry out the maximum likelihood estimation of the binary probit model. The estimation output is in the file data2.est. The true coefficient vector that was used to generate the data was tex2html_wrap_inline860 (the coefficients for the first alternative were normalized to zero) and the covariance matrix of the tex2html_wrap_inline862 error vector is given by

displaymath631
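The mechanics of probit maximum likelihood are simple enough to sketch directly. The following is an illustrative Python/SciPy version (not the posted bpest.gpr or evalbp.g; the simulated data, the simple one-equation specification, and all names are assumptions for the example):

import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(0)
N = 2000
X = np.column_stack([np.ones(N), rng.normal(size=(N, 2))])    # constant plus two covariates
beta_true = np.array([0.5, 1.0, -1.0])
y = (X @ beta_true + rng.normal(size=N) > 0).astype(float)    # binary choices

def negloglik(beta, X, y):
    p = norm.cdf(X @ beta)
    p = np.clip(p, 1e-10, 1 - 1e-10)                          # guard against log(0)
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

res = minimize(negloglik, x0=np.zeros(X.shape[1]), args=(X, y), method="BFGS")
se = np.sqrt(np.diag(res.hess_inv))                           # crude standard errors from the BFGS inverse Hessian
print(res.x, se)                                              # estimates should be near beta_true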

C.
The Econ 551 Web page has a Gauss file surgibbs.gpr which carries out the Gibbs sampling for the seemingly unrelated regression model. You can use this program to do Gibbs sampling on the data-augmented version of the probit model (i.e. where we treat the utilities as observed) and structure the algorithm as in McCulloch and Rossi, Journal of Econometrics (1994). I gave generous partial credit to any student who attempted this problem, and full credit to any student who developed computer code that looks correct and included it with the answer. By the way, the true coefficient vector used to generate the data was tex2html_wrap_inline864, where these 6 coefficients are the constants and slopes of the 2 X variables for alternatives 2 and 3, and the coefficients for alternative 1 were normalized to zero. The covariance matrix for the error terms in the probit model is given by:

displaymath632
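A full Gibbs sampler for the three-alternative probit is fairly long, but the data-augmentation idea is easy to see in the binary probit case. The following is a minimal illustrative sketch in Python (a simplified illustration of data augmentation for probit models, not a transcription of the posted surgibbs.gpr; the flat prior on $\beta$ and all names are assumptions): alternate between drawing the latent utilities from truncated normals given $\beta$, and drawing $\beta$ from its normal posterior given the latent utilities.

import numpy as np
from scipy.stats import truncnorm

def probit_gibbs(X, y, n_draws=2000, seed=0):
    """Data-augmented Gibbs sampler sketch for binary probit: y_i = 1{X_i beta + e_i > 0}, e_i ~ N(0,1).

    Assumes a flat prior on beta. Returns the array of beta draws.
    """
    rng = np.random.default_rng(seed)
    N, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = np.zeros(k)
    draws = np.zeros((n_draws, k))
    for s in range(n_draws):
        # 1. Draw latent utilities z_i | beta, y_i from N(X_i beta, 1), truncated to
        #    (0, inf) if y_i = 1 and to (-inf, 0] if y_i = 0.
        mu = X @ beta
        lo = np.where(y == 1, -mu, -np.inf)               # truncation bounds in standardized units
        hi = np.where(y == 1, np.inf, -mu)
        z = mu + truncnorm.rvs(lo, hi, random_state=rng)
        # 2. Draw beta | z  ~  N( (X'X)^{-1} X'z , (X'X)^{-1} )  under the flat prior
        b_hat = XtX_inv @ (X.T @ z)
        beta = rng.multivariate_normal(b_hat, XtX_inv)
        draws[s] = beta
    return draws

As with any Gibbs sampler, an initial block of burn-in draws should be discarded before summarizing the posterior.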

QUESTION 5. The fact that $\mathrm{med}(\epsilon \mid X) = 0$ implies that $\mathrm{med}(y \mid X) = X\beta^*$. But from earlier in the semester we know that the median is the solution to the minimization problem

\mathrm{med}(y) = \mathop{\mathrm{argmin}}_{a} \, E\,|y - a|

Conditioning on $X$, it follows that $X\beta^*$ is the solution to the problem

X\beta^* = \mathop{\mathrm{argmin}}_{a} \, E\left[ \, |y - a| \mid X \, \right]

so unconditionally we have:

\beta^* = \mathop{\mathrm{argmin}}_{\beta} \, E\,|y - X\beta|

Under the assumption of no multicollinearity between the columns of $X$, $\beta^*$ will be the unique solution to the above minimization problem and hence is uniquely identified. Assume the observations $(y_i, X_i)$ are IID. By the uniform strong law of large numbers we have, with probability 1,

\sup_{\beta \in B} \left| \frac{1}{N} \sum_{i=1}^{N} |y_i - X_i\beta| - E\,|y - X\beta| \right| \to 0

where $B$ is a compact set in $R^k$ containing $\beta^*$. Since $E\,|y - X\beta|$ is uniquely minimized at $\beta^*$, it follows that $\hat{\beta}_N \to \beta^*$ with probability 1. For further details on the asymptotic properties of the LAD estimator (including a derivation of its asymptotic distribution), see Koenker and Bassett, ``Regression Quantiles,'' Econometrica, January 1978.
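For completeness, here is an illustrative sketch of computing the LAD estimator by minimizing the sum of absolute residuals, written as a linear program (Python/SciPy; the simulated data and names are assumptions for the example). Each residual is split into the difference of two nonnegative slack variables:

import numpy as np
from scipy.optimize import linprog

def lad(X, y):
    """Least absolute deviations: min_b sum_i |y_i - X_i b|, solved as a linear program.

    Decision variables are (b, u_plus, u_minus) with y - X b = u_plus - u_minus and u_plus, u_minus >= 0.
    """
    N, k = X.shape
    c = np.concatenate([np.zeros(k), np.ones(2 * N)])         # objective: sum(u_plus) + sum(u_minus)
    A_eq = np.hstack([X, np.eye(N), -np.eye(N)])              # X b + u_plus - u_minus = y
    bounds = [(None, None)] * k + [(0, None)] * (2 * N)
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
    return res.x[:k]

# Simulated example: heavy-tailed errors with conditional median zero
rng = np.random.default_rng(0)
N = 500
X = np.column_stack([np.ones(N), rng.normal(size=N)])
beta_true = np.array([1.0, 2.0])
y = X @ beta_true + rng.standard_t(df=1, size=N)              # Cauchy errors, median 0
print(lad(X, y))                                              # close to beta_true despite the heavy tails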



