SIPA INAF U8145
Spring 2024
Problem Set 4: An Evaluation of the PROGRESA Program in Mexico
Due Fri. April 19, 11:59pm, in a single pdf file on Courseworks
1. Overview of the program
PROGRESA (subsequently called Oportunidades and then Prospera) was a massive program of the Mexican
government to reduce poverty, improve health, and increase educational attainment in the country, initially in rural
areas and now in urban areas as well. In a sample of 506 villages, the phase-in was randomly assigned: 320 villages (the “treatment” villages) got the program in May 1998, and 186 villages (the “control” villages) did not get the program
until December 2000. In this problem set, you will evaluate the impact of the program, taking advantage of this randomized design.
1.1. Read about the program. Read the summary and Chapter 1 of the report, “PROGRESA and Its Impacts on the Welfare of Rural Households in Mexico,” by Emmanuel Skoufias of the International Food Policy Research Institute
(IFPRI), the research organization that did the main evaluation of the PROGRESA program
(http://ebrary.ifpri.org/cdm/ref/collection/p15738coll2/id/80436). You might also read the New York Times article about the program by Alan Krueger. Two other useful references are (1) Parker, Rubalcava and Teruel,
“Evaluating Conditional Schooling and Health Programs”, Handbook of Development Economics, edited by T. Paul Schultz and John Strauss, 2007; and (2) Parker and Todd, “Conditional Cash Transfers: The Case of
Progresa/Oportunidades,” Journal of Economic Literature, 2017.
1.2. It has been said that the goal of the PROGRESA program has been to “break the vicious cycle of inter-generational transmission of poverty.” One possible explanation of this cycle is that poor households are credit constrained. Explain in your own words how credit constraints may generate an inter-generational poverty trap for poor families.
1.3. If credit constraints are the primary cause of the inter-generational transmission of poverty, why would cash grants to poor families help to break this cycle?
1.4. If credit constraints were the only cause of low educational attainment among children in poor households, would the conditions placed on the cash grants – namely, that children attend school and regular medical check-ups – be
necessary? Why or why not?
1.5. The average monthly benefit between Nov. 1998 and Oct. 1999 for a household participating in the program was 238 pesos (in constant July 2000 pesos), about US$25. Of the total, 109 pesos on average were for the educational
grant and 119 pesos were for the nutritional supplements. Average total monthly consumption for the participant households over the same period was 1220 pesos, about $129. The grant thus made up about 19.5% of total
consumption. Based on this figure, would you have expected the program to have had a significant impact on household behavior?
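As a quick check of the arithmetic behind the 19.5% figure, using only the two averages quoted above:

```latex
\[
\frac{238\ \text{pesos}}{1220\ \text{pesos}} \approx 0.195 = 19.5\%
\]
```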
1.6. It has been said that “The PROGRESA program is an effective way to reduce overcrowding in Mexico City.” Explain the logic of the statement with reference to the Harris-Todaro model. Do you agree or disagree with the statement?
2. Descriptive portrait of schooling attendance and child work in the poor rural areas of Mexico
2.1. Get the data. The file ps4.dta on Courseworks contains data at the individual level for 27,588 children ages 6-16 (in 1997 or 1998) in the 505 villages in the evaluation sample (one control village was dropped in the process of data cleaning). The variables are the following:
state: id code for state
town: id code for town
village: id code for village
id: id code for household
ind: identification number for individual child
program: whether village has program (=1 if village has program, =0 otherwise)
poor: whether household is poor enough to qualify for program (=1 if qualifies, =0 otherwise)
male: whether child is male (=1 if male, =0 if female)
age97: age of child in Nov. 1997
age98: age of child in Nov. 1998
enroll97: whether child enrolled in school in 1997-98 academic year (=1 if enrolled, =0 otherwise)
enroll98: whether child enrolled in school in 1998-99 academic year (=1 if enrolled, =0 otherwise)
grade97: last grade completed as of beginning of 1997-98 academic year
grade98: last grade completed as of beginning of 1998-99 academic year
continued98: whether child who completed primary school in 1997 continued on to secondary school (=1 if completed primary school in 1997 and continued on, =0 if completed primary school in 1997 and did not continue on)
work97: whether child worked (part-time, full-time, paid, unpaid) in week previous to Nov. 1997 survey (=1 if worked, =0 otherwise)
work98: whether child worked (part-time, full-time, paid, unpaid) in week previous to Nov. 1998 survey (=1 if worked, =0 otherwise)
2.2. Consider only children in the 185 “control” villages (program=0). Calculate the fraction (for all control villages together) of children that attended school in 1997, separately for each age level from 6 to 16. (Hint: use the “by” option of the summarize command, i.e. “by age97: summarize enroll97.”)
2.3. Considering the same children as in part 2.2, calculate the fraction of children that worked in the week prior to the survey in 1997, separately for each age level from 8 to 16. (The survey did not ask whether children younger than 8
had worked.) Report your findings from 2.2 and 2.3 together in a simple table.
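The calculation asked for in 2.2 and 2.3 is a group-by-age mean of a 0/1 variable, restricted to control villages. The assignment expects Stata; the Python below is only a minimal sketch of the logic. The records are made up for illustration and are not drawn from ps4.dta, and the helper name `fraction_by_age` is hypothetical.

```python
from collections import defaultdict

# Made-up records for illustration only: (program, age97, enroll97, work97).
# work97 is None for children under 8, since the survey did not ask them.
children = [
    (0, 7, 1, None),
    (0, 7, 1, None),
    (0, 12, 1, 0),
    (0, 12, 0, 1),
    (0, 15, 0, 1),
    (0, 15, 1, 0),
    (1, 12, 1, 0),  # treatment village: excluded by the program filter below
]

def fraction_by_age(records, index):
    """Return {age: mean of the 0/1 field at `index`} for control-village children."""
    totals = defaultdict(lambda: [0, 0])  # age -> [sum of values, count]
    for rec in records:
        program, age, value = rec[0], rec[1], rec[index]
        if program != 0 or value is None:
            continue  # skip treatment villages and unasked questions
        totals[age][0] += value
        totals[age][1] += 1
    return {age: s / n for age, (s, n) in sorted(totals.items())}

enrolled = fraction_by_age(children, 2)  # index 2 is enroll97
worked = fraction_by_age(children, 3)    # index 3 is work97

# Print one row per age, mirroring the combined table requested in 2.3.
for age in enrolled:
    print(age, enrolled[age], worked.get(age, "n/a"))
```

Ages with no work data simply have no entry in `worked`, which is why the combined table leaves those cells blank.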
2.4. At what age does the rate of school attendance begin to drop and the fraction of children working begin to rise in these villages? Note that the PROGRESA program gives monetary transfers to households with children in grades 3-9, which correspond to ages 9-15. Suppose that you wanted the program to have the maximum impact on raising the school attendance rate per dollar spent. Would this be the range of grades and ages that you would target? Why or why not?
3. Testing randomization
3.1. Calculate the means and the standard errors of the following variables for children ages 6-16 in 1997, separately
for the group of treatment villages and the group of control villages: (a) age97, (b) grade97, (c) enroll97. Use the
“collapse” command, which will convert the individual-level dataset to a dataset with averages, standard deviations
and counts, i.e. “collapse (mean) age97mean=age97 … (sd) age97sd=age97 … (count) age97count=age97 … ,
by(program)”. (Note that in place of the ellipses (…) you will put other variables (e.g. enroll97mean=enroll97).) After collapsing the dataset, calculate the standard errors of the means from the standard deviations, as discussed in class.
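The last step in 3.1 can be sketched as follows, assuming the usual formula in which the standard error of a mean is the standard deviation divided by the square root of the count. This is an illustration in Python, not the Stata code the assignment asks for; the function name and input values are hypothetical and not taken from ps4.dta.

```python
import math

def standard_error(sd, n):
    """Standard error of a sample mean: sd / sqrt(n)."""
    return sd / math.sqrt(n)

# Hypothetical collapsed row (not from the data): sd = 3.0, count = 10000.
se = standard_error(3.0, 10000)
print(se)
```

In the collapsed dataset the same arithmetic would be applied once per variable and per value of program.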
3.2. Calculate the difference in means for the treatment and control groups for the three variables from part 3.1, and the standard error of this difference. Calculate the 95% confidence interval for the difference in means around your estimate. Does zero lie outside of the 95% confidence interval? In other words, can we reject the hypothesis that the means for the treatment and control groups are the same? Your written answer should include the formula you used to calculate the confidence intervals.
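One common way to carry out the calculation in 3.2 is sketched below in Python. It assumes the two groups are independent samples, so that the variance of the difference is the sum of the two squared standard errors, and it uses the large-sample critical value 1.96 for a 95% interval. Both are assumptions of this sketch, and the formula covered in class may differ (for instance by using a t critical value). The input numbers are hypothetical.

```python
import math

def diff_in_means_ci(mean_t, se_t, mean_c, se_c, z=1.96):
    """Return (difference, SE of difference, lower bound, upper bound).

    Assumes independent groups: SE(diff) = sqrt(se_t^2 + se_c^2).
    z = 1.96 is the large-sample 95% critical value.
    """
    diff = mean_t - mean_c
    se_diff = math.sqrt(se_t ** 2 + se_c ** 2)
    return diff, se_diff, diff - z * se_diff, diff + z * se_diff

# Hypothetical treatment and control means and standard errors.
diff, se_diff, lower, upper = diff_in_means_ci(0.815, 0.003, 0.810, 0.004)

# Equality of means is rejected at the 5% level only if zero is outside the interval.
zero_outside = not (lower <= 0.0 <= upper)
print(diff, se_diff, lower, upper, zero_outside)
```

With these made-up inputs the interval contains zero, so equality of the two means would not be rejected at the 5% level.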
3.3. Test the null hypothesis that the means for age97, grade97, and enroll97 are equal using the Stata “ttest” command. (The syntax is e.g. “ttest age97, by(program) unequal reverse”.) Check that the confidence intervals you constructed
in 3.2 are correct. (Note that you should focus on the reported statistics for Ha: diff != 0, which corresponds to the two-tailed test of the hypothesis that the means are equal against the alternative hypothesis that they are not equal.)
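For intuition about what an unequal-variance two-sample t-test computes, here is a minimal Python sketch of the t-statistic and the Satterthwaite approximation to the degrees of freedom. It is not a substitute for the Stata output. The two samples are made up, the function name is hypothetical, and the sign convention (treatment minus control) is a choice made for this sketch.

```python
import math
from statistics import mean, variance

def unequal_variance_t(x, y):
    """Return (t, df) for a two-sample t-test that does not assume equal variances."""
    nx, ny = len(x), len(y)
    vx = variance(x) / nx  # squared standard error of mean(x)
    vy = variance(y) / ny  # squared standard error of mean(y)
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    # Satterthwaite approximation to the degrees of freedom.
    df = (vx + vy) ** 2 / (vx ** 2 / (nx - 1) + vy ** 2 / (ny - 1))
    return t, df

# Made-up 0/1 enrollment indicators for two small groups.
treated = [1, 1, 0, 1, 1, 0, 1, 1]
control = [1, 0, 0, 1, 1, 0, 1, 0]

t, df = unequal_variance_t(treated, control)
print(t, df)
```

The two-tailed p-value then comes from a t distribution with `df` degrees of freedom, which is omitted here because the Python standard library has no t distribution.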
3.4. Report the results of your calculations in 3.1-3.3 in a simple table. (You can make the table in Excel or Word; you do not have to use Stata to make the table.) One way to organize the table would be to put the three variables age97, grade97, and enroll97 on different rows, and then make columns for the means for the control group, the standard errors for the control group, the means for the treatment group, the standard errors for the treatment group, the differences in means, the standard errors on the differences, the lower bounds of the confidence intervals, the upper bounds of the confidence intervals, the t-statistic for the difference-in-means test, and the p-value for the difference-in-means test.
3.5. Does randomization appear to have been successful? Explain.
3.6. Bonus question (3 points extra credit). If one is testing a large number of variables separately for pre-treatment differences in means, it is possible that some variables will have significant differences even if randomization was carried out successfully. In such cases one should test the joint hypothesis that all the differences in means are zero. There are several ways to do this in Stata, but perhaps the simplest is to use the “sureg” command (including the
“small” and “dfk” options, with “program” as the right-hand side variable in three equations) followed by the “test” command. Implement this procedure, testing the joint hypothesis that the differences in means of age97, grade97, and enroll97 are zero.
4. Evaluating the impact of the program
4.1. Assuming that the randomization process was indeed carried out successfully, consider the following four null
hypotheses: (a) the program had no effect on the school attendance rate of children of primary-school age (ages 6-11 in 1998), (b) the program had no effect on the fraction of children of primary-school age who worked, (c) the program had no effect on the school attendance rate of children of secondary-school age (ages 12-16 in 1998), (d) the program had no effect on the fraction of children of secondary-school age who worked. Test these hypotheses following the same procedure as in part 3.3 above. (Hint: you may find the “if” option for the ttest command useful, i.e. “ttest enroll98 if age98>=6 & age98<=11, by(program) unequal”.) Can we reject any of these hypotheses? Which ones? At what level of confidence?
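The age restriction in the hint amounts to filtering the sample before comparing group means. A minimal Python sketch of that filtering step follows. The records are made up, so the resulting numbers say nothing about the actual program effect, and `enrollment_gap` is a hypothetical helper name.

```python
from statistics import mean

# Made-up records for illustration only: (program, age98, enroll98).
records = [
    (1, 7, 1), (1, 10, 1), (1, 13, 1), (1, 15, 0),
    (0, 8, 1), (0, 11, 1), (0, 12, 0), (0, 16, 0),
]

def enrollment_gap(rows, min_age, max_age):
    """Treatment-minus-control enrollment rate for ages in [min_age, max_age]."""
    treated = [e for p, a, e in rows if p == 1 and min_age <= a <= max_age]
    control = [e for p, a, e in rows if p == 0 and min_age <= a <= max_age]
    return mean(treated) - mean(control)

primary_gap = enrollment_gap(records, 6, 11)     # primary-school ages
secondary_gap = enrollment_gap(records, 12, 16)  # secondary-school ages
print(primary_gap, secondary_gap)
```

The same filter applies to work98, and adding a condition on male gives the split by sex requested in 4.3.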
4.2. Test the null hypothesis that the program had no effect on the likelihood that students finishing primary school continue on to secondary school (as indicated by the variable continued98). Can we reject the hypothesis? At what level of confidence?
4.3. Repeat your calculations from 4.1-4.2, but separately for boys and girls.
4.4. Report your results from 4.1-4.3 in a table. It is recommended that you organize the table similarly to the table in 3.4, with the values of enroll98, work98 and continued98 for different subgroups on different rows on the left-hand side, and the same columns as in the table for 3.4. Again, you can make the table in Excel or Word.
4.5. Provide a verbal summary of your results reported in your table for 4.4. For which subgroups of the population does the program appear to have had the largest effects? For which subgroups do the effects appear to have been smallest? Which variables appear to have been most affected? Which variables appear to have been least affected? To what do you attribute these findings? Feel free to speculate in response to this last question.
4.6. The PROGRESA program has become a model for social policy in Latin America. Argentina, Brazil, Honduras, and Nicaragua are either considering implementing, or have already implemented, similar programs. Based on your findings in this problem set, do you support this set of policy initiatives? Why or why not?
To turn in: written answers as requested above, tables as requested above, your Stata code.