The University of Nottingham
SCHOOL OF MATHEMATICAL SCIENCES
SPRING SEMESTER 2025
MATH2110 - STATISTICS 3
Coursework 2
Deadline: 3pm, Friday 2/5/2025
Your neat, clearly legible solutions should be submitted electronically as a PDF file via the MATH2110 Moodle
page by the deadline indicated there. As this work is assessed, your submission must be entirely your own
work (see the University’s policy on Academic Misconduct).
Submissions up to five working days late will be subject to a penalty of 5% of the maximum mark per working
day.
Deadline extensions due to Support Plans and Extenuating Circumstances can be requested according to
School and University policies, as applicable to this module. Because of these policies, solutions (where
appropriate) and feedback cannot normally be released earlier than 10 working days after the main cohort
submission deadline.
The page limit is 8 pages and the minimum font size is 11.
THE DATA
As a medical statistician of the 19th century, your task is to assess associations between the fertility of different
Swiss regions and certain social parameters. The goal is to identify the most influential variables, select the
best model, and make predictions using it. You have data for 47 regions with the following variables:
• Fertility, standardised fertility measure.
• Agriculture, percentage of males involved in agriculture as an occupation.
• Examination, percentage of draftees receiving the highest mark on the army examination.
• Education, percentage of draftees with education beyond primary school.
• Catholic, percentage of Catholics.
• Infant.Mortality, normalised proportion of live births who live less than 1 year.
You can load the data by running the R command data(swiss). The only packages that may be used are
“BayesFactor” and “MASS”.
THE TASKS
First divide the data into a training set (70%, 33 observations) and a test set (30%, 14 observations). All
fitting and model selection should be done exclusively on the training set. To avoid introducing correlations in the
train/test division, use the function sample() to choose both groups at random.
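A random 33/14 split of this kind can be sketched in R as follows. This is a minimal illustration rather than a prescribed method: the seed value and the object names train_idx, train and test are assumptions introduced here, not part of the coursework.

```r
data(swiss)                          # built-in dataset with 47 rows

set.seed(2110)                       # assumption: any fixed seed, for reproducibility
n <- nrow(swiss)                     # 47 regions

# Draw 33 distinct row indices at random (sample() samples without replacement by default)
train_idx <- sample(n, size = 33)

train <- swiss[train_idx, ]          # 33 observations used for fitting and selection
test  <- swiss[-train_idx, ]         # remaining 14 observations, held out for prediction
```

Because the test rows are defined as the complement of train_idx, the two groups cannot overlap, and changing the seed yields a different but equally valid split.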
All modelling should use Bayesian Normal linear models with the priors:
$$\beta \mid \sigma^2 \sim N(0, 100 I_p),$$
$$\sigma^2 \sim IG(2, 2),$$
where $I_p$ is the $p \times p$ identity matrix and $IG$ denotes the inverse-gamma distribution.
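To make the prior specification concrete, the sketch below draws parameter values from the stated priors. It rests on two assumptions that are not given in the text: the inverse-gamma distribution is taken in its shape/scale form, so that the reciprocal of the variance is Gamma-distributed with shape 2 and rate 2, and the dimension p = 2 (an intercept and one slope) is a hypothetical choice for illustration. The names n_draws, sigma2 and beta are likewise illustrative.

```r
library(MASS)                        # provides mvrnorm()

set.seed(1)                          # assumption: arbitrary seed
p <- 2                               # hypothetical: intercept plus one covariate
n_draws <- 1000

# Assumption: IG(a, b) means shape a and scale b, so 1 / sigma^2 ~ Gamma(shape = a, rate = b)
sigma2 <- 1 / rgamma(n_draws, shape = 2, rate = 2)

# As written, the prior covariance of beta is 100 * I_p and does not depend on sigma^2
beta <- mvrnorm(n_draws, mu = rep(0, p), Sigma = 100 * diag(p))
```

Each row of beta is one prior draw of the coefficient vector. These are draws from the prior only; no data enter the calculation.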
1. Consider the relationship between Examination and Fertility.
• Perform an exploratory analysis of the relationship between Examination and Fertility.
• Fit a Bayesian Normal linear model with Fertility as the dependent variable and Examination as the
independent variable.
• Write down the selected model posterior.
• Sample 10 sets of parameters from the posterior distribution and plot the resulting linear model for
each set of sampled parameters.
[20 marks]
2. Consider the relationship between Catholic and Fertility.
• Perform an exploratory analysis of the relationship between Catholic and Fertility.
• Create a new variable Catholic.Transform = $(\text{Catholic} - \alpha)^2$
for a suitable choice of $0 \le \alpha \le 100$.
• Fit a Bayesian Normal linear model with Fertility as the dependent variable and Catholic.Transform
as the independent variable.
• Write down the selected model posterior.
• Using the posterior mean of the parameters of the linear model, consider the model fit.
[25 marks]
3. Use Bayes Factors to determine which of the models in 1 and 2 best fits the data. [5 marks]
4. Consider general linear models for modelling Fertility as a function of the covariates.
• Perform model selection to choose a model and justify your choice of model.
• Write down the selected model posterior.
• Draw samples from the corresponding posterior.
• Present histograms (using function hist()) for the samples of each parameter.
• Compute estimates of the parameters and compare them.
• Make predictions for the Fertility values in the test set.
• Compare these with the real values.
[50 marks]
MATH2110 End