Complete the following set of tasks using the 2003 Chinese General Social Survey (CGSS2003) data. Note: You can copy and paste your Stata commands and results.
1. Previous research has shown that demographic and socioeconomic factors may affect individual income. Those factors include gender, age, age squared, marital status, education, hukou status, employment status, etc. Use multiple linear regression to investigate the determinants of income.
a) Write down the regression model, i.e., Y = β0 + β1X1 + β2X2 + β3X3 + … + βnXn + u (Y is the dependent variable; the Xi are independent variables).
b) Create the variables you need. Summarize descriptive statistics (mean/percentage, standard deviation, etc.) in a table.
c) Use Stata to run the regression model. Explain the goodness of fit (use R squared) and significance (use F test) of the model.
d) Holding other factors fixed, what is the difference in monthly income between men and women? Test whether this difference is statistically significant.
e) Holding other factors fixed, what are the economic returns to education? Test whether this effect is statistically significant.
f) Based on the findings in question “e”, can we conclude that education will lead to economic inequality? If not, why? Explain your reasoning and conduct further analysis to support your argument.
g) Does the economic return to education differ by gender? First, draw a scatterplot to show the relationship between education and income for men and women. Then use regress to test the question and visualize your results. Remember to state your hypothesis clearly.
h) There are some outliers in the variable “income”. Try two different coding strategies to deal with the outlier problem, and rerun your regression from question “f” under each strategy. Compare the results and explain which strategy you think is better.
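The tasks above could be approached in Stata roughly as follows. This is only a sketch under stated assumptions, not a model answer: every variable name (income, female, age, married, educ_yrs, urban_hukou, employed) is hypothetical and would first have to be created from the actual CGSS2003 items using the codebook. The two outlier strategies shown (log transformation and top-coding at the 99th percentile) are illustrative choices; the task does not prescribe them.

```stata
* All variable names below are hypothetical placeholders, not CGSS2003 names.
* Assumed: income (monthly), female (0/1), age, married (0/1),
*          educ_yrs (years of schooling), urban_hukou (0/1), employed (0/1)

* b) descriptive statistics (means of 0/1 variables are proportions)
summarize income age educ_yrs female married urban_hukou employed

* c) baseline model; c.age##c.age enters age and age squared
regress income i.female c.age##c.age i.married c.educ_yrs ///
    i.urban_hukou i.employed

* d), e) the t-tests appear in the regression table; equivalent Wald tests
test 1.female
test educ_yrs

* g) scatterplot by gender, then a gender-by-education interaction
twoway (scatter income educ_yrs if female == 0) ///
       (scatter income educ_yrs if female == 1), ///
       legend(order(1 "Men" 2 "Women"))
regress income i.female##c.educ_yrs c.age##c.age i.married ///
    i.urban_hukou i.employed
* H0: the return to education is the same for men and women
test 1.female#c.educ_yrs
* predicted income by gender across years of schooling, then plot
margins female, at(educ_yrs = (0(3)18))
marginsplot

* h) strategy 1: log transform (zero and negative incomes become missing)
generate ln_income = ln(income) if income > 0 & !missing(income)
regress ln_income i.female##c.educ_yrs c.age##c.age i.married ///
    i.urban_hukou i.employed

* h) strategy 2: top-code income at its 99th percentile
summarize income, detail
generate income_tc = min(income, r(p99)) if !missing(income)
regress income_tc i.female##c.educ_yrs c.age##c.age i.married ///
    i.urban_hukou i.employed
```

The `if !missing(income)` guard in the top-coding step is there because Stata's `min()` ignores missing arguments, so without it respondents with missing income would be assigned the 99th-percentile value. Note also that in the log model the coefficients are interpreted as approximate proportional changes in income, so they are not directly comparable with those from the level models.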