Modelling Prediction and Causality with Observational Data, University of Leeds
Arnold KF, Gilthorpe MS (Leeds)
ASSIGNMENT 1
The government has commissioned a study into the performance of A&E departments across
England. Of top priority is creating a model (1) that ‘red flags’ A&E departments – i.e., identifies
those with an unusually high average weekly number of A&E attendances per capita. Such a model
can be used to predict potential ‘red flag’ A&E departments, which will be put on a ‘watch list’ for
special measures and will warrant further investigation into why unusually elevated attendance
rates occur. The long-term aim is to curb such excessive use of A&E facilities by finding ways to
deal with as many health issues preventatively as possible via general practitioners or pharmacies.
The government also wants a model (2) that predicts the average weekly number of A&E
attendances in each department to help target resources and aid resource planning across
departments. This model may utilise as an outcome either A&E attendances or A&E attendances
per capita; as a researcher, you must consider which is the more appropriate to meet the
government’s needs and justify which you have selected.
To create the two models, you have been provided with the dataset ‘AEdata 2018.csv’, which comprises
the following variables:
AEn: average weekly number of A&E attendances
AEpc: average weekly number of A&E attendances per capita (average weekly number of A&E
attendances / population)
Flag: ‘red flag’ department (0 = no, 1 = yes)
Area.Type: area type (0 = rural, 1 = urban)
Area.Size: area size (km²)
Pop: population (in thousands)
Pop.dens: population density (population / area size)
GPn: number of GP practices
PHARMn: number of pharmacies
PHARMpc: number of pharmacies per capita (number of pharmacies / population)
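
The three derived variables above (AEpc, Pop.dens, PHARMpc) are simple ratios of other columns, so their definitions can be checked directly. The following is a minimal sketch on a toy data frame whose values are invented for illustration; it does not use the real dataset. It assumes Pop is used as supplied (in thousands), so the ratios are per thousand people; the brief does not state which scale the real columns use.

```r
# Toy data with the same column names as the brief; all values are invented.
toy <- data.frame(
  AEn       = c(1200, 800, 1500),  # average weekly A&E attendances
  Pop       = c(300, 150, 500),    # population, in thousands
  Area.Size = c(100, 400, 50),     # area size, km^2
  PHARMn    = c(60, 20, 110)       # number of pharmacies
)

# Derived variables, following the definitions in the variable list.
# Assumption: Pop enters as given (thousands), so these are per 1,000 people.
toy$AEpc     <- toy$AEn / toy$Pop
toy$Pop.dens <- toy$Pop / toy$Area.Size
toy$PHARMpc  <- toy$PHARMn / toy$Pop

# Basic summary statistics for every column.
print(summary(toy))
```

Recomputing the ratios in the real data and comparing them with the supplied columns is one way to confirm which population scale was used.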
You must write a formal report (1000 words maximum) that summarises your findings for both
models. Your report should explain which covariates you consider for your models to predict each
of your outcomes, and why you consider these. Your report must also include basic summary
statistics of the data you have been provided, in addition to a more detailed explanation of the two
models you generate. You must justify: (a) which fit criteria you use to select your models; (b)
which continuous outcome you have chosen to model (i.e. A&E attendances or A&E attendances
per capita); and (c) all other decisions you have taken as a researcher to arrive at the final models
you report. You should discuss the strengths and weaknesses of what you have done and propose
potential improvements for future modelling.
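
As an illustration of comparing candidate models with a fit criterion, the sketch below fits two candidate models for each outcome and compares them by AIC. Everything here is an assumption made for the example: the data are simulated, the covariate sets are arbitrary, and logistic regression, linear regression and AIC are only one possible set of choices. The brief does not prescribe any of them, and out-of-sample measures may suit a prediction task better.

```r
set.seed(1)  # reproducible simulated data; this is NOT the real AEdata

n <- 200
sim <- data.frame(
  Pop       = runif(n, 50, 600),   # population, in thousands
  Area.Type = rbinom(n, 1, 0.6),   # 0 = rural, 1 = urban
  GPn       = rpois(n, 30)         # number of GP practices
)

# Invented data-generating rules, for illustration only.
sim$AEn  <- 3 * sim$Pop + 100 * sim$Area.Type + rnorm(n, sd = 80)
sim$Flag <- rbinom(n, 1, plogis(-2 + 1.5 * sim$Area.Type - 0.002 * sim$Pop))

# Model (1): binary outcome, here fitted by logistic regression.
flag_small <- glm(Flag ~ Area.Type, family = binomial, data = sim)
flag_large <- glm(Flag ~ Area.Type + Pop + GPn, family = binomial, data = sim)

# Model (2): continuous outcome, here fitted by linear regression.
ae_small <- lm(AEn ~ Pop, data = sim)
ae_large <- lm(AEn ~ Pop + Area.Type + GPn, data = sim)

# AIC: lower is better; compare only models fitted to the same outcome.
aic_flag <- AIC(flag_small, flag_large)
aic_ae   <- AIC(ae_small, ae_large)
print(aic_flag)
print(aic_ae)
```

AIC values are not comparable across different outcomes, so a model for A&E attendances cannot be ranked against a model for A&E attendances per capita on AIC alone.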
You should pay special attention to presentational issues; how you present your findings is (nearly)
as important as the findings themselves. You might explore the public domain for similar
documents. You must provide clarity of language. The report may have tables and figures in the
main text, but these must not exceed two of each (and will not count towards the word limit).
Attach your annotated R code as an appendix, and include Harvard-style citations, listed in a
bibliography, to justify the decisions you make or to place your work in the wider context (these
will not count towards the word limit).
Marks out of 50 will be awarded as per the following criteria: (a) clarity in the development and
justification of your predictive models; (b) detailed exploration and justification of the criteria used
for assessment of model fit; (c) well-structured presentation and clear language that explains and
discusses all you have done, including appropriate use of citations and appendices; (d) discussion
of the strengths, limitations, and future recommendations.