Econ 615 Assignments, Georgetown University, Fall, 2013


John Rust, Georgetown University

Assignment 0 (due Tuesday Dec 3rd)

Assignment 2 (due Tuesday Dec 3rd)

Write a short essay comparing and contrasting the different philosophies toward econometrics reflected in the following two new books, and offering your views on their pros and cons:

Kenneth I. Wolpin (2013) The Limits of Inference without Theory MIT Press
Password protected temporary version here

Charles F. Manski (2013) Public Policy in an Uncertain World Harvard University Press
Password protected temporary version here

My own comments on Manski's and Wolpin's books (I was asked to do a review of Wolpin's book for the June 2014 issue of the Journal of Economic Literature. I have a 5000-7000 word limit, so these comments will have to be dramatically edited down, but I wanted to provide an unabridged version for the students in Econ 615 to see).

Assignment 3 (due Tuesday Dec 3rd)

Use the Matlab/C code and data files in the distribu.zip file below to structurally estimate (by full or partial maximum likelihood) the utility function parameters and the parameters of the wage equation, the probability of dying, the probability of finding a job if not working, and the probability of becoming unemployed. Using the estimated model and parameters, forecast the behavioral response to each of the following three policy changes and its impact on individual welfare, measured as the aggregate willingness of a 20 year old, a 50 year old and a 60 year old to avoid the change (or to adopt it, if the change improves welfare):

Increasing the age of retirement from 62 to 70

Increasing the unemployment benefits replacement rate from 20% to 30% and the retirement benefits replacement rate from 40% to 50% but at the same time increasing the payroll tax rate from 25% to 35%

Abolishing unemployment benefits

zip file with matlab/c code to estimate the retirement problem
vint2d.c (version of vint2d.c file in the zip file above with comments done as /* comments */ instead of // comments --- the Unix C compiler is fussy and does not like the latter type of commenting)
Partial solutions to assignment 1
Plot of similar utility functions
Plot of concentrated log-likelihood as a function of β
Plot of log-likelihood as a function of θ₁
Plot of log-likelihood as a function of θ₂
Plot of log-likelihood as a function of θ₃

I have created Matlab code with artificial data generated by a simulation program, simulate_data.m, that simulates observations on 500 hypothetical people followed from age 20 to their deaths at age 80 (I did not worry about modelling mortality and assumed that everyone survives with probability 1 to age 80 but dies with probability 1 at age 81). The decision problem is a discrete employment decision, work versus no work, that implements a simple static problem of optimal retirement behavior that I discussed in class. These consumers behave myopically: each period they either work or retire, depending on which decision gives them higher utility.

There is a Matlab function uf.m that encodes the utility function I have assumed: a square root function representing the utility of money income (or unemployment benefits if unemployed, or pension/retirement benefits if retired) less an additive disutility of work for individuals who are working. There is also an additive disutility of searching for a job for those who are not working but decide to search for a job and return to work. I assume that if a person searches, they will be 100% successful in finding a job, but they do incur the disutility of searching. I assume that both the disutility of work and the disutility of searching for a job are quadratic functions of the person's age, so that there is a coefficient work_disutility_age and another coefficient search_cost_age that are key determinants of a) when someone decides to retire, and b) if someone becomes unemployed, whether they will try to go back to work, remain on unemployment benefits (if age 55 or younger), or retire and collect their pension (if older than 55, with 55 being the retirement age, i.e. the earliest age at which the person can collect their pension benefits). I assume that there is a 5 percent chance each year that a person becomes involuntarily unemployed, and that this probability is IID over time.
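
As a rough illustration of the utility specification and the myopic decision rule described above, here is a minimal Matlab sketch. The functional form, the coefficient values, the wage offer and the benefit level are all assumptions made for this example; the actual code in uf.m and setup.m may differ.

```matlab
% Hypothetical coefficients, for illustration only (see setup.m for the real ones)
work_disutility_age = 0.002;   % multiplies age^2 when working
search_cost_age     = 0.01;    % multiplies age^2 when searching for a job

% Assumed form: square root of money received, less age-dependent disutilities.
% working and searching are 0/1 indicators.
u = @(money, age, working, searching) sqrt(money) ...
      - working   .* work_disutility_age .* age.^2 ...
      - searching .* search_cost_age     .* age.^2;

y = 25; benefit = 4;           % hypothetical wage offer and benefit level
ages = [30 70];

% Myopic rule for someone already employed (no search cost):
% work if the current-period utility of working exceeds that of not working
d_employed = u(y, ages, 1, 0) > u(benefit, ages, 0, 0);

% Same comparison for someone not employed, who must also pay the search cost
d_unemployed = u(y, ages, 1, 1) > u(benefit, ages, 0, 0);

disp([ages; d_employed; d_unemployed]);
```

With these made-up numbers the employed 30 year old keeps working while the 70 year old does not, and the search cost is large enough that neither would return to work after losing a job.
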
At the start of each year a person makes a binary work/no work choice (with d=0 representing the decision not to work and d=1 representing the decision to work), conditional on the state variables (y,aw,e,age), where y is the wage offer they expect (and get) if they choose to work, aw is their average wage, e is their employment state, and of course age is their age. The simulation program simulate_data.m produces the Matlab data file data.dat, which stores the results of simulating the employment choices of 500 people between ages 20 and 80. The columns of this matrix are defined by line 87 of simulate_data.m, which specifies the recursive formula for building the data.dat matrix:

data=[data; [i working work_state t income pension laid_off]];

so that the binary choice variable d is the Matlab variable working. The variable i is the sequence/ID number of a particular consumer, work_state is the employment status in the previous period (the e state variable above), t is the person's age, income is the state variable y, and pension is the calculated pension benefit a person could receive (if older than 55) or the unemployment benefit a person would receive (if age 55 or younger). Finally, the laid_off variable is 1 if the person is laid off during the year, and 0 otherwise. We assume that a person makes an employment decision or intention at the start of each year but, due to unplanned bad outcomes, may be laid off ex post later in that year. The probability of being laid off is unemp_prob and is assumed to equal 5% in this problem.
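
To make the column layout concrete, here is a small Matlab sketch that unpacks a matrix with the seven columns listed above and computes two simple summaries. The three data rows are invented stand-ins for the contents of data.dat, and the short variable names are my own. Treating the layoff frequency among people who chose to work as an estimate of unemp_prob is an assumption about how layoffs enter the simulation.

```matlab
% Hypothetical stand-in for the matrix stored in data.dat (values invented)
%       i  working work_state  t  income pension laid_off
data = [1     1        1      20   2.5    0.5      0;
        1     1        1      21   2.7    0.5      1;
        1     0        0      22   2.6    0.5      0];

id       = data(:, 1);   % consumer ID
d        = data(:, 2);   % binary choice: 1 = work, 0 = no work
e        = data(:, 3);   % employment status in the previous period
age      = data(:, 4);
y        = data(:, 5);   % wage offer
benefit  = data(:, 6);   % pension (if older than 55) or unemployment benefit
laid_off = data(:, 7);

% Share of people choosing to work at each age
work_rate = accumarray(age - min(age) + 1, d, [], @mean);

% Frequency of layoffs among those who chose to work
layoff_rate = mean(laid_off(d == 1));

disp([unique(age) work_rate]);
fprintf('layoff frequency among workers: %.3f\n', layoff_rate);
```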

Your first task is to try to structurally estimate the three parameters of the utility function, theta, that are set in the file setup.m. I have written a Matlab program estimate.m that estimates these parameters by maximum likelihood using the log-likelihood function programmed in the file lfeval.m. Below are the results of running estimate.m using as starting values the true parameter values that I used to generate the data. You can see that the estimated parameters are close to the true values.

Local minimum possible. Constraints satisfied.

fmincon stopped because the size of the current search direction is less than
twice the default value of the step size tolerance and constraints are 
satisfied to within the default value of the constraint tolerance.


No active inequalities.
Estimation converged, initial likelihood: -417.41 final likelihood: -417.202
estimated vs true parameters

ans =

0.0020224        0.002
0.010182         0.01
0.5015           0.5

However, the estimation program did not report standard errors for the parameters. Your first task on this assignment is to calculate the estimated standard errors and use the covariance matrix for the parameters to test the hypothesis that the estimated parameters are equal to the true values above. Presumably, since this model is correctly specified, we should not be able to reject this hypothesis. I suggest you add code in lfeval.m to calculate the gradient of the log-likelihood function with respect to theta and use this to calculate the information matrix, which can be used to estimate the covariance matrix of the estimated parameters.
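
The calculation suggested above can be sketched on a toy problem. The example below is not the model in lfeval.m: it is a hypothetical two-parameter binary logit, chosen because its score has a simple closed form. It estimates the information matrix as the sum of outer products of the per-observation scores (the BHHH estimator), inverts it to get standard errors, and forms a Wald statistic for the hypothesis that the parameters equal the values used to simulate the data. The p-value uses the base Matlab function gammainc, so no toolbox is assumed.

```matlab
% Toy binary logit; sample size and parameter values are hypothetical
rng(1);                                   % reproducible draws
N      = 2000;
theta0 = [0.5; -1.0];                     % "true" values used to simulate
X      = [ones(N,1) randn(N,1)];          % regressors
p      = @(th) 1 ./ (1 + exp(-X*th));     % P(d=1 | x, theta)
d      = rand(N,1) < p(theta0);           % simulated binary choices

% Negative log-likelihood and its minimizer (fminsearch is in base Matlab)
negll    = @(th) -sum(d .* log(p(th)) + (1-d) .* log(1 - p(th)));
opts     = optimset('TolX', 1e-8, 'TolFun', 1e-8);
thetahat = fminsearch(negll, theta0, opts);

% Per-observation scores for the logit: (d_i - p_i) * x_i, stacked as N x K
S    = bsxfun(@times, d - p(thetahat), X);
Info = S' * S;                            % BHHH estimate of the information matrix
V    = inv(Info);                         % estimated covariance of thetahat
se   = sqrt(diag(V));

% Wald statistic for H0: theta = theta0, chi-square with K degrees of freedom
K    = numel(theta0);
W    = (thetahat - theta0)' * Info * (thetahat - theta0);
pval = 1 - gammainc(W/2, K/2);            % upper tail of the chi-square(K) CDF

disp([thetahat se theta0]);               % estimate, std. error, true value
fprintf('Wald = %.3f, p-value = %.3f\n', W, pval);
```

In a correctly specified model the Wald statistic should look like a typical draw from a chi-square distribution with K degrees of freedom. A very large value is a sign that the gradient, the information matrix or the likelihood itself contains an error.
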
The second part of this assignment is to assume that consumers are actually dynamic decision makers who make decisions to maximize their expected discounted utility, where you can assume that the true discount factor is beta=0.95. Write down the Bellman equation and adapt the simulation program to simulate data from a population of 500 consumers who are maximizing expected discounted utility instead of behaving myopically. How does the labor supply behavior of a forward looking individual change relative to an individual who behaves myopically? Using the simulated data, adapt the routine lfeval.m to estimate the dynamic model of retirement behavior using the same true coefficients theta given above, as well as the discount factor beta and the coefficients of the "wage equation" given in line 32 of simulate_data.m:
income=income_linear_term*t+income_quadratic_term*(t^2)+income_lag*income0+income_std*randn;
so that in addition to the utility function parameters theta and the discount parameter beta, you need to estimate the wage equation parameters alpha=(income_linear_term,income_quadratic_term,income_lag,income_std) and the unemployment probability unemp_prob. You can use the values chosen in setup.m and simulate_data.m as the true values. After simulating your dynamic model and estimating the parameters, do a similar exercise of computing the covariance matrix for the parameters and testing the joint hypothesis that all of your estimated structural parameters equal the true values.
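
One way to handle the wage equation parameters in a partial maximum likelihood approach is to estimate them in a separate first stage. The sketch below simulates wages from an equation of the form shown above and recovers the coefficients by least squares, which coincides with conditional maximum likelihood when the shocks are normal. All numerical values are hypothetical, and the sketch assumes that income0 is the previous period's income and that wages are recorded for everyone in every period.

```matlab
% Hypothetical wage-equation values, not those in setup.m
rng(2);
alpha_true = [0.5; -0.004; 0.8];   % [linear term; quadratic term; lag coefficient]
std_true   = 0.3;
ages = 20:80;  npeople = 500;
n = npeople * numel(ages);
X = zeros(n, 3);  y = zeros(n, 1);  r = 0;

for i = 1:npeople
    income0 = 1;                   % hypothetical starting value of lagged income
    for t = ages
        income = alpha_true(1)*t + alpha_true(2)*t^2 ...
                 + alpha_true(3)*income0 + std_true*randn;
        r = r + 1;
        X(r, :) = [t, t^2, income0];
        y(r)    = income;
        income0 = income;          % assumption: income0 is last period's income
    end
end

alpha_hat = X \ y;                          % least squares coefficients
resid     = y - X*alpha_hat;
std_hat   = sqrt(mean(resid.^2));           % ML estimate of the shock std. deviation
se        = sqrt(diag(std_hat^2 * inv(X'*X)));

disp([alpha_hat se alpha_true]);            % estimate, std. error, true value
fprintf('income_std: estimate %.4f, true %.4f\n', std_hat, std_true);
```

The first-stage estimates can then be held fixed while the likelihood of the observed choices is maximized over theta and beta.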

HINTS: I have updated the code and included the calculation of the gradient of the log-likelihood function in lfeval.m, I have added a new file, information_matrix.m, that calculates the information matrix, and I have added code in estimate.m that conducts a Wald test of the hypothesis that the estimated theta parameters equal the true values used to generate the artificial data. Note that the test rejects the null hypothesis very strongly! Your job (if you choose to accept it) is to find out why and correct any bugs in my code that may be causing the problem. Or you can ignore my code and write your own, and not worry about what bugs there may be in my code.

If you are programming in Matlab, use the function interp2 to do the 2-dimensional interpolation that is needed to interpolate the value functions over (y,aw) values, as per my other hints in lectures. I have also posted qgausl.m, a Matlab program that generates Gaussian quadrature weights and abscissae to numerically integrate a function of one variable over a finite interval [a,b]. Finally, note that Matlab has the functions cdf, to compute the cumulative distribution function of various distributions, and icdf, to compute the inverse CDF of various distributions. These functions, together with the hints I provided in lecture, should enable you to write Matlab code that solves the dynamic labor supply/retirement problem by backward induction.
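
The ingredients above can be combined as in the following sketch, which solves a stripped-down finite-horizon problem by backward induction, V_t(y,aw) = max over d of u_t(y,aw,d) + beta*E[V_{t+1}(y',aw') | y,aw,d], with V equal to zero after the last period. Everything specific here is an assumption for illustration: the utility function, the wage process, the rule for updating the average wage, all parameter values, and the omission of the employment state e. Because the calling convention of qgausl.m is not documented here, the Gauss-Legendre nodes and weights are computed directly with the Golub-Welsch eigenvalue method.

```matlab
% Gauss-Legendre nodes x and weights w on [a,b] (Golub-Welsch method)
n = 20; a = -4; b = 4;
k = 1:n-1;
offd = k ./ sqrt(4*k.^2 - 1);               % off-diagonal of the Jacobi matrix
[Vec, D] = eig(diag(offd, 1) + diag(offd, -1));
[x, idx] = sort(diag(D));
w = 2 * (Vec(1, idx).^2)';                  % weights on [-1,1]
x = (b-a)/2 * x + (a+b)/2;                  % map nodes to [a,b]
w = (b-a)/2 * w;
phi = exp(-x.^2/2) / sqrt(2*pi);            % standard normal density at the nodes

% Hypothetical model parameters (not those in setup.m)
beta = 0.95; t0 = 20; T = 80;
rho = 0.9; ybar = 10; sig = 0.5;            % wage: y' = rho*y + (1-rho)*ybar + sig*eps
c = 0.0002; repl = 0.4;                     % work disutility and replacement rate
ygrid  = linspace(0.1, 20, 15);
awgrid = linspace(0.1, 20, 12);
[Y, AW] = meshgrid(ygrid, awgrid);          % rows index aw, columns index y
clamp = @(z, g) min(max(z, g(1)), g(end));  % keep interpolation points on the grid

Vnext  = zeros(size(Y));                    % value after age T is zero
policy = false([size(Y), T - t0 + 1]);
for t = T:-1:t0
    nyears = t - t0 + 1;
    AW1 = clamp(AW + (Y - AW)/nyears, awgrid);   % average wage updates if working
    EV1 = zeros(size(Y)); EV0 = zeros(size(Y));
    for j = 1:n
        Yn  = clamp(rho*Y + (1-rho)*ybar + sig*x(j), ygrid);
        EV1 = EV1 + w(j)*phi(j) * interp2(ygrid, awgrid, Vnext, Yn, AW1, 'linear');
        EV0 = EV0 + w(j)*phi(j) * interp2(ygrid, awgrid, Vnext, Yn, AW,  'linear');
    end
    v1 = sqrt(Y) - c*t^2 + beta*EV1;        % value of working at age t
    v0 = sqrt(repl*AW)   + beta*EV0;        % value of not working at age t
    policy(:, :, nyears) = v1 >= v0;        % optimal decision rule at age t
    Vnext = max(v1, v0);                    % V_t, used in the next iteration
end

fprintf('share of grid points choosing work at age %d: %.2f\n', t0, mean(mean(policy(:,:,1))));
fprintf('share of grid points choosing work at age %d: %.2f\n', T,  mean(mean(policy(:,:,end))));
```

Clamping the interpolation points to the grid avoids the NaN values that interp2 returns outside the grid, at the cost of some distortion near the edges. A wider grid, or explicit extrapolation, is the usual remedy.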

Assignment 4

This assignment is optional, not for a grade. The purpose is to illustrate the structural estimation of a static game via a nested fixed point, maximum likelihood approach. I have the question and the full answer here. For the Gauss code that produced these answers and can be used to estimate static game models using this approach, see here.

Assignment 5

Though I said it was hard to give a precise definition of what we mean by a model, and of the difference between a structural model and a reduced-form model in econometrics, the best way to understand is to read empirical work and compare and contrast the methodologies employed, the questions asked, and the conclusions reached. In this assignment I would like you to read either the two labor papers by Angrist and coauthors and Robin and coauthors that were presented in the Labor Week conference here at Georgetown on Monday and Tuesday, or the two development papers, one by Townsend and coauthor (which won the Frisch Medal of the Econometric Society this year) and the other by Duflo and coauthor. All 4 papers are by leading people in the profession and represent some of the very best empirical work done in either the reduced-form or the more structural econometric methodologies. However, opinions may still differ about the pros and cons of different approaches and methodologies, and I want you to read one or the other pair of papers critically and write an analysis of several pages comparing and contrasting the papers, their methodologies, and what you learned from them. There is no right or wrong answer here: this is just an attempt to get you into this literature and to encourage you to think independently about important economic research questions, how to analyze them empirically, and to what extent it is necessary, desirable or valuable to have a more or less explicit model in order to reach meaningful conclusions from data, or to make predictions or policy recommendations. I would like you to hand in this assignment by next Tuesday, at the make-up class at 10am at a location to be announced.

Readings: choose to read and write on either the development papers or the labor papers, but not on both

Development papers
R. Chattopadhyay and E. Duflo (2004) Women as Policy Makers: Evidence from a Randomized Policy Experiment in India Econometrica 72-5 1409--1443.
J. Kaboski and R. Townsend (2011) A Structural Evaluation of a Large-Scale Quasi-Experimental Microfinance Initiative Econometrica 79-5 1357--1406.
Labor papers
J. Angrist, P. Pathak, and C. Walters (2011) Explaining Charter School Effectiveness NBER working paper 17332.
Christopher Walters (2012) A Structural Model of Charter School Choice and Academic Achievement
J. Lise, C. Meghir and J. Robin (2012) Matching, Sorting and Wages Working paper. [slides]