INSTITUTE AND FACULTY OF ACTUARIES
Curriculum 2019
Subject CS1B – Actuarial Statistics
Time allowed: 1 hour 45 minutes
INSTRUCTIONS TO THE CANDIDATE
1. You have 1 hour and 45 minutes to complete this examination paper.
2. At the end of the examination you should upload a Microsoft Word file including your
answers together with sufficient R code for the Examiners to work out how you arrived at your
answers.
3. Mark allocations are shown in brackets.
4. Attempt all 3 questions, beginning your answer to each question on a new page of your
Word file.
5. The CSV file datatrain.csv accompanies this exam paper.
The filename of your Word file must include your ARN and the paper sat (e.g. "9000000 CS1B"), and each
page of the file should contain your ARN as a header or footer.
Please note that the content of this booklet is confidential and you are not to discuss or reveal the
content under any circumstances, nor is it to be used in a further attempt at the examination.
If you encounter any issues during the examination please contact the Online Education team at
online_exams@actuaries.org.uk T. 0044 (0) 1865 268 255
Question 1
In a medical study conducted to test the suggestion that daily exercise has the effect of
lowering blood pressure, a sample of eight patients with high blood pressure was selected.
Their blood pressure was measured initially and then again a month later after they had
participated in an exercise programme. The results are shown in the table below:
Patient    1    2    3    4    5    6    7    8
Before   155  152  146  153  146  160  139  148
After    145  147  123  137  141  142  140  138
(i) Derive a two-sided 90% confidence interval for the mean difference in the patients’
blood pressure before and after participating in an exercise programme. [8]
(ii) Perform a suitable t-test for the null hypothesis that the mean difference in the
patients’ blood pressure before and after participating in an exercise programme is
less than or equal to 10 units, against the alternative that it is greater than 10 units.
Use a significance level of 1%. [7]
[Total 15]
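For orientation, the calculations asked for in Question 1 could be sketched in R roughly as follows. This is one possible approach rather than the examiners' solution: it assumes the paired differences (before minus after) are approximately normally distributed, and the variable names (`d`, `ci`, `ci_manual`, `test`) are illustrative only.

```r
# Blood pressure readings from the table in Question 1
before <- c(155, 152, 146, 153, 146, 160, 139, 148)
after  <- c(145, 147, 123, 137, 141, 142, 140, 138)

# Paired design: work with the within-patient differences (before - after)
d <- before - after
n <- length(d)

# (i) Two-sided 90% confidence interval for the mean difference
ci <- t.test(d, conf.level = 0.90)$conf.int

# The same interval by hand, using the t distribution with n - 1 degrees of freedom
ci_manual <- mean(d) + c(-1, 1) * qt(0.95, df = n - 1) * sd(d) / sqrt(n)

# (ii) One-sided test of H0: mean difference <= 10 against H1: mean difference > 10
test <- t.test(d, mu = 10, alternative = "greater")

print(ci)
print(test)
# H0 would be rejected at the 1% level only if test$p.value < 0.01
```

Working with the vector of differences keeps the paired structure explicit and lets both parts reuse the same one-sample `t.test` call.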
Question 2
A Bayesian credibility model is used to model annual claim numbers, denoted by X, for the
coming year. These are assumed to have a Poisson distribution with mean λ, where λ itself is
modelled by a gamma distribution with parameters α = 100 and β = 1.
(i) (a) Implement M = 1000 Monte Carlo repetitions of a credibility analysis to
estimate the distribution of the posterior mean of parameter λ using the
credibility factor Z = 1/(β + 1), in the case where the number of past claims
is known only for the last one year.
(b) Provide the histogram of the 1000 Monte Carlo posterior mean estimates
calculated in part (i)(a).
[15]
(ii) (a) Calculate the mean and variance of the Monte Carlo posterior mean estimates
from part (i).
(b) Compare the Monte Carlo mean and variance obtained in part (ii)(a) with
those obtained from samples of size 1000 drawn from a
Gamma(α + x, β + 1) distribution. Round your results to three decimal
places. [12]
(iii) Comment on your findings in parts (i) and (ii). [3]
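A minimal sketch of how the simulation in Question 2 might be set up in R is given below. It is an illustration under stated assumptions, not the official solution. The Poisson mean is called `lambda` and the gamma parameters `alpha` (shape) and `beta` (rate). The credibility estimate is taken as Z times the observed count plus (1 - Z) times the prior mean. The comparison distribution is assumed to be the gamma posterior with shape `alpha + x` and rate `beta + 1`. The seed is arbitrary.

```r
set.seed(2019)  # arbitrary seed, used only for reproducibility

alpha <- 100             # gamma prior shape
beta  <- 1               # gamma prior rate
M     <- 1000            # number of Monte Carlo repetitions
Z     <- 1 / (beta + 1)  # credibility factor with one year of data

# Each repetition: draw lambda from the prior, then one year's claim count
lambda <- rgamma(M, shape = alpha, rate = beta)
x      <- rpois(M, lambda)

# (i)(a) Credibility estimate: Z * observed claims + (1 - Z) * prior mean
post_mean <- Z * x + (1 - Z) * alpha / beta

# (i)(b) Histogram of the Monte Carlo posterior mean estimates
hist(post_mean, main = "Monte Carlo posterior mean estimates",
     xlab = "Posterior mean")

# (ii)(a) Mean and variance of the estimates, to three decimal places
mc_mean <- round(mean(post_mean), 3)
mc_var  <- round(var(post_mean), 3)

# (ii)(b) ASSUMPTION: comparison sample of 1000 draws from the gamma
# posterior, shape alpha + x and rate beta + 1, one draw per simulated x
post_draw <- rgamma(M, shape = alpha + x, rate = beta + 1)
g_mean <- round(mean(post_draw), 3)
g_var  <- round(var(post_draw), 3)

print(c(mc_mean = mc_mean, mc_var = mc_var, g_mean = g_mean, g_var = g_var))
```

Vectorising over the M repetitions avoids an explicit loop; a `for` loop drawing one `lambda` and one `x` per iteration would give an equivalent simulation.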
[Total 30]
Question 3
A general insurance company is building a generalised linear model (GLM) to analyse claim
numbers for a motor insurance policy. For every policy in the past three years, the company
has collected the number of reported claims and the following data:
Age: Age of policyholder, a number between 18 and 100.
Car group: A code representing the car group, a value between 1 and 20.
Area: A description of the area where the policyholder lives.
No claim discount: A number representing the level of no claim discount, a value
between 0 and 5.
Gender: Gender of the policyholder (male or female).
The data set has been loaded into the session for you already as “datatrain.csv”. Typing the name
of the variable will let you see this.
(i) Explain what error structure could be used in this GLM, including in your answer the
R code used to justify your choice. [5]
(ii) Fit a GLM that treats Age as a linear factor and the other four factors as categorical
variables. Your answer should show the coefficient, standard error, and p-value of
each parameter estimate in the model. [10]
(iii) Describe the association between gender and the number of reported claims based on
your output from part (ii). Your answer should include a numerical comparison of this
association between male and female policyholders. [10]
(iv) Comment on the fit of the model fitted in part (ii), based on the deviance value of the
model, with reference to the suitability of the model. [10]
The company considers using a more complex model, including a power transformation of
the factor Age.
(v) (a) Add a variable representing age squared to the data.
(b) Fit an appropriate model including age squared as an explanatory
variable in addition to the variables used in part (ii).
(c) Comment on whether age squared is associated with the number of
reported claims and on whether its inclusion improves the model fitted in part
(ii), based on your output from part (v)(b). [20]
[Total 55]
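The workflow in Question 3 could look roughly like the sketch below. Because datatrain.csv is not reproduced here, the sketch builds a simulated stand-in data set. The column names (`age`, `car_group`, `area`, `ncd`, `gender`, `claims`), the area labels and the coefficients used to generate the claims are all hypothetical. With the real file, the stand-in would be replaced by something like `read.csv("datatrain.csv")`, and the names adjusted to match. A Poisson error structure with log link is assumed as a natural starting point for count data.

```r
set.seed(1)  # arbitrary seed for the simulated stand-in data

# HYPOTHETICAL stand-in for datatrain.csv (real column names may differ)
n <- 2000
datatrain <- data.frame(
  age       = sample(18:100, n, replace = TRUE),
  car_group = sample(1:20, n, replace = TRUE),
  area      = sample(c("Urban", "Suburban", "Rural"), n, replace = TRUE),
  ncd       = sample(0:5, n, replace = TRUE),
  gender    = sample(c("female", "male"), n, replace = TRUE)
)
# Simulated claim counts (made-up coefficients, for illustration only)
datatrain$claims <- rpois(n, exp(-1 - 0.01 * datatrain$age +
                                   0.2 * (datatrain$gender == "male")))

# (i) For counts, a mean close to the variance is consistent with Poisson errors
print(c(mean = mean(datatrain$claims), variance = var(datatrain$claims)))

# (ii) Poisson GLM with age linear and the other four factors categorical
fit1 <- glm(claims ~ age + factor(car_group) + factor(area) + factor(ncd) +
              factor(gender),
            family = poisson(link = "log"), data = datatrain)
coef_table <- summary(fit1)$coefficients  # estimate, std. error, z value, p-value
print(coef_table)

# (iii) Multiplicative effect on expected claims of male relative to female
# (female is the baseline level because factor levels sort alphabetically)
rate_ratio_male <- exp(coef(fit1)["factor(gender)male"])

# (iv) Residual deviance against a chi-squared reference (a rough guide only)
p_dev <- pchisq(deviance(fit1), df.residual(fit1), lower.tail = FALSE)

# (v) Add age squared, refit, and compare the nested models
datatrain$age2 <- datatrain$age^2
fit2 <- glm(claims ~ age + age2 + factor(car_group) + factor(area) +
              factor(ncd) + factor(gender),
            family = poisson(link = "log"), data = datatrain)
print(anova(fit1, fit2, test = "Chisq"))
print(c(AIC_fit1 = AIC(fit1), AIC_fit2 = AIC(fit2)))
```

Comparing the two fits with a chi-squared test on the change in deviance, alongside AIC, is one common way to judge whether the extra term is worthwhile; the p-value of the `age2` coefficient in `summary(fit2)` gives a closely related view.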
END OF PAPER