You are given some historical data describing the scores attained by a random sample of 50
Scottish school children in a reading test. You know that the original scores were recorded on
a continuous scale (and constrained to take positive values) but the values in the data set are
recorded only to within certain intervals. The data are shown below.
Table 1: Table of scores

Interval    Number of scores
20 - 30      1
30 - 40      8
40 - 50     19
50 - 60     15
60 - 70      3
70 - 80      4

Denote these data as y. Numerous recent studies on other groups have suggested that the
distribution of scores in the test may be modelled by a Gamma(α, β) distribution with shape
parameter α = 20 and rate parameter β = 0.5. You wish to fit a Gamma distribution to
the (censored) scores in the table to investigate whether the historical data are consistent with
current beliefs. As you do not wish the recent data to prejudice your analysis you assign
non-informative, independent priors π(α) ∝ 1 and π(β) ∝ β⁻¹.
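As a starting point for the calculations, the augmented posterior implied by the model and priors above can be written out as follows. This is a sketch under stated assumptions rather than a definitive derivation: the notation n = 50, the intervals I_i (the interval from Table 1 that contains child i's score) and the indicator function are introduced here for illustration, and the improper priors are assumed to give a proper posterior.

```latex
% n = 50 children; I_i is the interval from Table 1 containing child i's score (notation assumed here)
\begin{align*}
\pi(\alpha, \beta, x \mid y)
  &\propto \pi(\alpha)\,\pi(\beta) \prod_{i=1}^{n}
     \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, x_i^{\alpha-1} e^{-\beta x_i}\,
     \mathbf{1}\{x_i \in I_i\} \\
  &\propto \beta^{n\alpha - 1}\, \Gamma(\alpha)^{-n}
     \Bigl(\prod_{i=1}^{n} x_i\Bigr)^{\alpha-1}
     \exp\Bigl(-\beta \sum_{i=1}^{n} x_i\Bigr)
     \prod_{i=1}^{n} \mathbf{1}\{x_i \in I_i\}, \\[4pt]
\beta \mid \alpha, x &\sim \mathrm{Gamma}\Bigl(n\alpha,\ \sum_{i=1}^{n} x_i\Bigr), \\
x_i \mid \alpha, \beta, y &\sim \mathrm{Gamma}(\alpha, \beta)\ \text{truncated to } I_i, \\
\log \pi(\alpha \mid \beta, x) &= n\alpha \log\beta + (\alpha - 1)\sum_{i=1}^{n}\log x_i
     - n\log\Gamma(\alpha) + \text{const}, \qquad \alpha > 0.
\end{align*}
```

The conditionals for β and for each x_i are standard distributions, so they can be updated by Gibbs steps. The conditional for α is not of a standard form, which is why a Metropolis step is a natural choice for it.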
1. Describe how you could use data augmentation and Markov chain Monte Carlo methods
to sample from π(α, β, x|y), where x ∈ ℝ⁵⁰ denotes the precise scores of the 50 children
in the sample. You may wish to use a mixture of Gibbs and Metropolis methods in your
algorithm. [4]
2. Implement your algorithm in R and use it to investigate π(α, β|y). Comment on the
extent to which the historical data set confirms or contradicts the findings of the more
recent analyses. You should present both univariate and bivariate summaries of the
posterior distribution. Use your algorithm to estimate the posterior probability that:
• the highest score achieved is greater than 75;
• the lowest score achieved is less than 25.
[6]
3. Investigate whether your Markov chain mixes well and discuss features of the posterior
distribution that may impact on the mixing of the chain in this case. [3]
4. Discuss any assumptions that are made in your analysis. [2]
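The tasks above could be approached with a Metropolis-within-Gibbs sampler along the following lines. This is a minimal sketch, not a model answer. The function names (rtgamma, log_cond_alpha, run_sampler), the proposal standard deviation sd_prop, and the inverse-CDF method for the truncated Gamma draws are all assumptions made for illustration. The starting values are taken from the values suggested in the brief, and no burn-in or tuning is included.

```r
# Sketch of a Metropolis-within-Gibbs sampler with data augmentation.
# All function and argument names are hypothetical.

# Interval-censored data from Table 1
lower  <- c(20, 30, 40, 50, 60, 70)
upper  <- c(30, 40, 50, 60, 70, 80)
counts <- c(1, 8, 19, 15, 3, 4)
lo <- rep(lower, counts)   # lower bound of the interval for each child
hi <- rep(upper, counts)   # upper bound of the interval for each child
n  <- length(lo)           # 50 children

# Draw from Gamma(shape, rate) truncated to (lo, hi) by inverting the CDF
rtgamma <- function(lo, hi, shape, rate) {
  p_lo <- pgamma(lo, shape = shape, rate = rate)
  p_hi <- pgamma(hi, shape = shape, rate = rate)
  u <- runif(length(lo), p_lo, p_hi)
  x <- qgamma(u, shape = shape, rate = rate)
  pmin(pmax(x, lo), hi)    # guard against numerical rounding at the bounds
}

# Log full conditional of alpha, up to a constant, under a flat prior on alpha
log_cond_alpha <- function(alpha, beta, x) {
  if (alpha <= 0) return(-Inf)
  m <- length(x)
  m * alpha * log(beta) + (alpha - 1) * sum(log(x)) - m * lgamma(alpha)
}

run_sampler <- function(n_iter = 5000, sd_prop = 2, alpha0 = 20, beta0 = 0.5) {
  alpha <- alpha0
  beta  <- beta0
  out <- matrix(NA_real_, n_iter, 4,
                dimnames = list(NULL, c("alpha", "beta", "max_x", "min_x")))
  accepted <- 0
  for (t in seq_len(n_iter)) {
    # Gibbs step: impute the precise scores within their recorded intervals
    x <- rtgamma(lo, hi, shape = alpha, rate = beta)
    # Gibbs step: beta | alpha, x ~ Gamma(n * alpha, sum(x))
    beta <- rgamma(1, shape = n * alpha, rate = sum(x))
    # Metropolis step: symmetric random-walk proposal for alpha
    prop  <- alpha + rnorm(1, mean = 0, sd = sd_prop)
    log_r <- log_cond_alpha(prop, beta, x) - log_cond_alpha(alpha, beta, x)
    if (log(runif(1)) < log_r) {
      alpha <- prop
      accepted <- accepted + 1
    }
    out[t, ] <- c(alpha, beta, max(x), min(x))
  }
  attr(out, "accept_rate") <- accepted / n_iter
  out
}
```

With draws <- run_sampler(), quantities such as mean(draws[, "max_x"] > 75) and mean(draws[, "min_x"] < 25) give Monte Carlo estimates of the two posterior probabilities, and plot(draws[, "alpha"], draws[, "beta"]) gives a bivariate view. Trace plots of the alpha and beta columns, together with the stored acceptance rate, are one way to start examining mixing.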
Your findings should be presented in the form of a short report, which should:
• have a clear and logical structure;
• include an introduction and clearly stated conclusions that can be understood by any
numerate scientist;
• include detail of your mathematical calculations so that your results could be reproduced
by another statistician;
• include clearly labelled and correctly referenced tables and diagrams, as appropriate;
• include the R code you used in an appendix (you do not need to explain individual
R commands but some comments should be included to indicate the purpose of each
section of code);
• include citation and referencing for any material (books, papers, websites etc) used.
Notes
• This assignment counts for 15% of the course assessment.
• You may have face-to-face discussions with me or your colleagues, but your report
must be your own work. Plagiarism is a serious academic offence and carries a range of
penalties, some very serious. Copying a friend’s report or code, or copying text into your
report from another source (such as a book or website) without citing and referencing
that source, is plagiarism. Collusion is also a serious academic offence. You must not
share a copy of your report (as a hard copy or in electronic form) or your computer code
with anyone else. Penalties for plagiarism or collusion can include voiding of your mark
for the course.
• Computer Labs will run at each campus, during which you may work on this assignment
and ask questions. To benefit most from these labs, please spend time working on
the assignment beforehand.
• Your report should be submitted through Turnitin by Friday, March 30, 17:00
(GMT). A link to the submission page is available through the ‘Assessment’ section
of the course Vision page. Please use the submission link appropriate for the
campus where you are studying (Edinburgh or Malaysia). For late submissions, 2
marks will be deducted for each day (or part of a day) late. Submissions that are
more than 5 days late will receive 0 marks.