Assignment #4
Download the data from the following link. The data was obtained from Kaggle.
https://www.dropbox.com/sh/33f7zxx8ve6u0qk/AAC2RPmMy_iCiV_KTrW4D205a?dl=0
NOTE: This is not a team project. Do it by yourself. Submit your answers and the R code used for your
analysis by Dec 18, 23:59. You can freely update your answers before the due date. No late submissions will be
accepted.
NOTE 2: Format your document as follows: Times New Roman, 12-point font, double-spaced only
(not 1.5), 1-inch margins all around on 8.5 x 11-inch (or A4) paper, with numbered pages. The
document should not exceed 10 pages including tables and figures. You can use a 10-point font and single
spacing for tables. The score will be determined not only by the accuracy and completeness of the answers but also
by the presentation quality of the document. (Do not simply screenshot the R output!)
Why are our best and most experienced employees leaving prematurely? Try to predict which valuable
employees will leave next. Fields in the dataset include:
Satisfaction Level
Last evaluation
Number of projects
Average monthly working hours
Time spent at the company (years)
Whether they have had a work accident
Whether they have had a promotion in the last 5 years
Department (stored in the column named sales)
Salary
Whether the employee has left
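As a rough illustration of how the fields listed above might look once they are in R, here is a minimal sketch. It uses a tiny made-up data frame rather than the real file, and every column name except `sales` is an assumption chosen for illustration; the actual names and file name come from the downloaded data.

```r
# Toy stand-in for the real data: all values are made up for illustration.
# With the downloaded file, read.csv() on that file would be used instead
# (the file name is not stated in the assignment).
hr <- data.frame(
  satisfaction_level    = c(0.38, 0.80, 0.11, 0.72, 0.45, 0.92),
  last_evaluation       = c(0.53, 0.86, 0.88, 0.87, 0.50, 0.61),
  number_project        = c(2L, 5L, 7L, 5L, 2L, 3L),
  average_monthly_hours = c(157L, 262L, 272L, 223L, 150L, 180L),
  time_spend_company    = c(3L, 6L, 4L, 5L, 3L, 2L),
  work_accident         = c(0L, 0L, 0L, 0L, 1L, 0L),
  promotion_last_5years = c(0L, 0L, 0L, 0L, 0L, 1L),
  sales                 = c("sales", "sales", "technical", "hr", "support", "IT"),
  salary                = c("low", "medium", "medium", "low", "low", "high"),
  left                  = c(1L, 1L, 1L, 0L, 0L, 0L),
  stringsAsFactors = FALSE
)

# Treat the categorical fields as factors so summaries and models handle them properly.
hr$sales  <- factor(hr$sales)
hr$salary <- factor(hr$salary, levels = c("low", "medium", "high"))
hr$left   <- factor(hr$left, levels = c(0, 1), labels = c("stayed", "left"))

n_obs  <- nrow(hr)  # number of observations (rows)
n_vars <- ncol(hr)  # number of variables (columns)

str(hr)      # type of each column
summary(hr)  # basic descriptive statistics per column
```

Converting `left` to a labeled factor early tends to make later tables and plots easier to read, although a 0/1 coding may be more convenient for some modeling functions.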
Q1. Load the data into R. How many variables and observations are in the data?
Q2. Generate the descriptive statistics for each variable.
Q3. Explore the relationships between variables. Can you find any interesting relationship?
Q4. Compare the employees who have left with those who have remained. Visualize the comparison with histograms
by factor.
Q5. Develop the best regression model to explain the employees leaving. Estimate the parameters of your
model. Can you find any meaningful result?
Q6. Develop the best classification model to predict which employees will leave. What are the sensitivity and the
specificity of your model?
Q7. What would you suggest the HR managers do to increase Satisfaction Level of employees? Conduct the
appropriate analysis to derive your suggestions.
Q8. Find other managerial insights from the data.
Q9. (Bonus question) Describe your efforts to be a Susan-like student for this course.
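
For reference, the sensitivity and specificity mentioned in Q6 can be computed from a confusion matrix. The sketch below is only an illustration on simulated data: the variable names, the simulated relationship, the logistic regression model, and the 0.5 cutoff are all assumptions, not the expected answer for the real data set.

```r
# Simulated data for illustration only (not the assignment data).
set.seed(1)
n <- 200
satisfaction <- runif(n)
# Assumed relationship: lower satisfaction makes leaving more likely.
left <- rbinom(n, size = 1, prob = plogis(2 - 5 * satisfaction))
sim <- data.frame(satisfaction, left)

# Logistic regression as one possible classifier.
fit <- glm(left ~ satisfaction, data = sim, family = binomial)

# Predicted probabilities, turned into classes with an assumed 0.5 cutoff.
p_hat <- predict(fit, type = "response")
pred  <- as.integer(p_hat >= 0.5)

# Confusion matrix; fixing the levels keeps it 2 x 2 even if a class is never predicted.
conf <- table(
  actual    = factor(sim$left, levels = c(0, 1)),
  predicted = factor(pred, levels = c(0, 1))
)

TP <- conf["1", "1"]  # leavers predicted to leave
FN <- conf["1", "0"]  # leavers predicted to stay
TN <- conf["0", "0"]  # stayers predicted to stay
FP <- conf["0", "1"]  # stayers predicted to leave

sensitivity <- TP / (TP + FN)  # share of actual leavers that were caught
specificity <- TN / (TN + FP)  # share of actual stayers correctly identified

print(conf)
print(c(sensitivity = sensitivity, specificity = specificity))
```

Here the model is evaluated on the same data it was fitted to, which usually gives optimistic numbers; evaluating on a held-out test set would be the more careful choice.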