STAT 440
Homework #3
Goal: Importance sampling, bootstrap, hypothesis testing
For this homework, submit your R code for this assignment electronically
via Canvas. Note the deadline on Canvas. In addition to the R code, you
must also turn in your typed-in answers. Please submit your R code
and the typed up answers as separate documents. Points will be
deducted if R code is very poorly formatted.
1. Monte Carlo basic
The Weibull distribution is often used to model extreme values. A
Weibull random variable with shape parameter k and scale parameter
λ has density given by
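$$f(x) = \frac{k}{\lambda} \left(\frac{x}{\lambda}\right)^{k-1} \exp\left\{-\left(\frac{x}{\lambda}\right)^{k}\right\}, \quad x > 0 \qquad (1)$$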
(a) Find the expected value of the random variable X if X ~ Weibull(k = 3, λ = 5) by using Monte Carlo. The R command for generating
from a Weibull distribution is rweibull. Use a Monte Carlo
sample size of 1000.
(b) Find the Monte Carlo standard error for your Monte Carlo estimate from above. Report a 95% confidence interval for your
estimate.
(c) Now find the Monte Carlo estimate of the expected value of X
using a sample size of 100,000. Again, report your estimate along
with its Monte Carlo standard error and a 95% confidence interval. How do the new estimate and confidence interval compare
to those for a sample size of 1000?
(d) Find P(X > 5) using Monte Carlo. Again, report the Monte
Carlo standard error and a 95% confidence interval.
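The Monte Carlo workflow in this problem can be sketched as follows. This is a minimal illustration, not a required solution format: the helper name mc_summary and the seed are assumptions, and part (d) is computed here from the larger sample because the problem does not state a sample size for it. The closed-form values λΓ(1 + 1/k) and exp(−(5/λ)^k) are included only as a sanity check.

```r
set.seed(440)  # arbitrary seed, used only for reproducibility

# Hypothetical helper: Monte Carlo estimate, standard error, and 95% CI
mc_summary <- function(draws) {
  est <- mean(draws)
  se  <- sd(draws) / sqrt(length(draws))
  c(estimate = est, se = se,
    lower = est - qnorm(0.975) * se,
    upper = est + qnorm(0.975) * se)
}

# rweibull uses the same shape/scale parameterization as equation (1)
x_small <- rweibull(1000,   shape = 3, scale = 5)
x_large <- rweibull(100000, shape = 3, scale = 5)

res_small <- mc_summary(x_small)                   # parts (a) and (b)
res_large <- mc_summary(x_large)                   # part (c)
res_tail  <- mc_summary(as.numeric(x_large > 5))   # part (d): P(X > 5)

# Closed-form values for comparison
exact_mean <- 5 * gamma(1 + 1 / 3)
exact_tail <- exp(-1)

print(rbind(res_small, res_large, res_tail))
```

The standard error shrinks at rate 1/sqrt(n), so moving from 1000 to 100,000 draws should narrow the interval by roughly a factor of 10.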
2. Importance sampling
The Pareto pdf is
$$f(x) = \frac{\beta \alpha^{\beta}}{x^{\beta+1}}, \quad \alpha < x < \infty, \ \alpha > 0, \ \beta > 0 \qquad (2)$$
Suppose X ~ Pareto(α = 3, β = 5). Use importance sampling to
estimate E(X) and E(X^2). Use Gamma(15, 0.25) as the importance function (proposal
distribution). Generate 10,000 samples from the
Gamma(15, 0.25) pdf to obtain your estimate. Use the Gamma pdf
parameterization
$$f(x) = \frac{1}{\Gamma(\alpha) \beta^{\alpha}} x^{\alpha-1} e^{-x/\beta}.$$
You may use the R command
rgamma to generate draws from the Gamma pdf. Be careful to make
sure the parameterization of the Gamma pdf in R is the same as the
one you are using: in R the second parameter is by default the rate, 1/β.
(a) Clearly describe your algorithm in “pseudo-code”, i.e. write out
your algorithm for this problem briefly, systematically, in words,
filling in mathematical details where necessary.
(b) Report your estimates along with the Monte Carlo standard errors of your estimates.
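One way the importance sampling estimator could be organized is sketched below. It estimates E[h(X)] by averaging h(x) f(x) / g(x) over draws from the proposal g, where f is the Pareto density (2). The helper names dpareto and is_estimate are hypothetical (base R has no Pareto density), the seed is arbitrary, and Gamma(15, 0.25) is read as shape 15 and scale 0.25, following the parameterization given in the problem.

```r
set.seed(440)  # arbitrary seed, used only for reproducibility

# Pareto density from equation (2); hypothetical helper, not part of base R
dpareto <- function(x, alpha, beta) {
  ifelse(x > alpha, beta * alpha^beta / x^(beta + 1), 0)
}

# Importance sampling estimate of E[h(X)] and its Monte Carlo standard error
is_estimate <- function(h, x, w) {
  terms <- h(x) * w
  c(estimate = mean(terms), se = sd(terms) / sqrt(length(terms)))
}

n <- 10000
# scale = 0.25 matches the x^(a-1) exp(-x / b) form; rate would be 1 / 0.25
x <- rgamma(n, shape = 15, scale = 0.25)

# Importance weights: target density over proposal density
w <- dpareto(x, alpha = 3, beta = 5) / dgamma(x, shape = 15, scale = 0.25)

est_x  <- is_estimate(function(x) x,   x, w)
est_x2 <- is_estimate(function(x) x^2, x, w)

print(rbind(est_x, est_x2))
```

Because the weights depend on the target only through dpareto, the same draws x can be reweighted for a different target by recomputing w with other values of alpha and beta. Inspecting the largest weights, for example with summary(w), is a quick way to see how much a few draws dominate the estimate.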
3. Now re-use the samples you just obtained (from the Gamma pdf)
above to estimate E(Y) and E(Y^2) for Y ~ Pareto(α = 5, β = 7).
Report your Monte Carlo standard errors. Notice how you can change
the distribution you are interested in without generating any new samples.
4. Re-estimate E(X) and E(X^2) for X ~ Pareto(α = 3, β = 5) using a
different importance function. You are welcome to choose any importance function you like, but you have to select one that works better
than the importance function I provided above. Think about what it
means to “work better.”
(a) Report your estimates along with Monte Carlo standard errors.
(b) Justify why you think the importance function you are using here
is better than the one I provided (the Gamma pdf).
5. Inference for Poisson expectation (λ) using maximum likelihood. Suppose you have 50 independent realizations of (observations from) a
Poisson(λ) distribution.
(a) Find the MLE of λ using the data provided on Canvas:
hw3 prob5 dat.txt.
(b) Find the 95% confidence interval for λ using standard asymptotic
theory (Central Limit Theorem).
(c) Now suppose you want to do a hypothesis test for the null hypothesis that λ = 3 versus the alternative that λ < 3. Use standard
asymptotic theory to conduct the hypothesis test and find the
p-value. What is your conclusion?
(d) Now find the exact p-value for the above test using Monte Carlo
(you will no longer use an asymptotic approximation). Report
your Monte Carlo estimate of the p-value and report the Monte
Carlo standard error for this estimate.
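A hedged sketch of the asymptotic and Monte Carlo tests follows. The course data file is not reproduced here, so y is simulated placeholder data and must be replaced by the 50 observations from Canvas; the seed and the number of replications B are also assumptions. The test statistic is the sample total, which is equivalent to using the MLE (the sample mean) and avoids floating-point ties. Since the total of n Poisson(λ0) draws is Poisson(nλ0), ppois gives a reference value for checking the simulation.

```r
set.seed(440)  # arbitrary seed, used only for reproducibility

# PLACEHOLDER data standing in for the course file; replace with the real data
y <- rpois(50, lambda = 2.5)

lambda0    <- 3          # null value
n          <- length(y)
lambda_hat <- mean(y)    # MLE of lambda for Poisson data

# Asymptotic one-sided test of H0: lambda = 3 vs H1: lambda < 3
z      <- (lambda_hat - lambda0) / sqrt(lambda0 / n)
p_asym <- pnorm(z)

# Monte Carlo p-value: simulate the total under H0 and compare with the data
B        <- 10000
sim_sums <- replicate(B, sum(rpois(n, lambda0)))
p_mc     <- mean(sim_sums <= sum(y))
se_mc    <- sqrt(p_mc * (1 - p_mc) / B)   # binomial standard error

# Reference value: the total of n Poisson(lambda0) draws is Poisson(n * lambda0)
p_exact <- ppois(sum(y), lambda = n * lambda0)

print(c(p_asym = p_asym, p_mc = p_mc, se_mc = se_mc, p_exact = p_exact))
```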
6. Bootstrapping
Estimate the mean μ of a population based on a random sample from
that population. Download the samples “hw3 prob6 dat.txt” on Canvas. The following are different ways to estimate the sampling distribution of your estimate.
(a) Write down the estimate, the sample mean X̄_n. Calculate an
estimate of its standard error. Use standard asymptotic theory
(the Central Limit Theorem) and report a 95% confidence interval
for μ based on the standard error estimate.
(b) Now find the approximate sampling distribution of X̄_n using a
non-parametric bootstrap with B = 1000 bootstrap replications.
You should display a clearly labeled histogram of the sampling
distribution of X̄_n.
(c) Using the bootstrap replications from the previous part, estimate
the standard error of your estimate.
(d) You can now calculate approximate 95% confidence intervals for
μ in two ways: (i) use the bootstrap estimate of the standard error and the usual asymptotic theory to calculate a 95% confidence
interval for μ based on this estimate, and (ii) use the
2.5th and 97.5th percentiles of your bootstrap samples. You
will need to use the command quantile; e.g., if your bootstrap samples of X̄_n are in the vector samplemeanboot, type
quantile(samplemeanboot, c(0.025, 0.975)).
(e) Now find the approximate sampling distribution of X̄_n using a
parametric bootstrap with B = 1000 bootstrap replications. Assume that the samples come from a Normal distribution. You
should display a clearly labeled histogram of the sampling distribution of X̄_n.
(f) Using the parametric bootstrap replications above, estimate the
standard error of your estimate.
(g) Calculate approximate 95% confidence intervals for μ in two ways,
as described in part (d).
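The two bootstrap schemes in this problem can be sketched as below. The vector x is simulated placeholder data, since the course file is not available here, and the seed and variable names other than those in the problem text are assumptions. The nonparametric version resamples the observed values with replacement; the parametric version draws from a Normal distribution fitted to the sample.

```r
set.seed(440)  # arbitrary seed, used only for reproducibility

# PLACEHOLDER sample standing in for the course file; replace with the real data
x <- rnorm(60, mean = 10, sd = 3)
n <- length(x)
B <- 1000

# Nonparametric bootstrap: resample the data with replacement
boot_np <- replicate(B, mean(sample(x, size = n, replace = TRUE)))

# Parametric bootstrap: simulate from Normal(mean(x), sd(x))
boot_par <- replicate(B, mean(rnorm(n, mean = mean(x), sd = sd(x))))

# Bootstrap standard errors of the sample mean
se_np  <- sd(boot_np)
se_par <- sd(boot_par)

# (i) normal-theory interval using the bootstrap standard error
ci_normal <- mean(x) + c(-1, 1) * qnorm(0.975) * se_np

# (ii) percentile interval from the bootstrap replications
ci_percentile <- quantile(boot_np, c(0.025, 0.975))

print(list(se_np = se_np, se_par = se_par,
           ci_normal = ci_normal, ci_percentile = ci_percentile))
```

A labeled histogram can then be drawn with hist(boot_np, main = "Nonparametric bootstrap distribution of the sample mean", xlab = "Bootstrap sample mean"), and likewise for boot_par.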
7. Estimating correlation
(a) Estimate the correlation ρ between midterm grades and homework grades. The data may be downloaded from Canvas and is
called “hw3 prob7 dat.csv”. Use the command
data = read.csv("hw3 prob7 dat.csv") to read the data.
Use the sample correlation ρ̂, for which the R command is cor(·, ·).
(b) Use the nonparametric bootstrap with B = 1000 replicates to
estimate the sampling distribution of ρ̂. You should display a
clearly labeled histogram as well as report the bootstrap estimate
of the standard error of ρ̂.
(c) What is your conclusion based on your work above? Please be
as detailed as possible. (For instance, is there a relationship?
What kind of relationship? Is the relationship significant? How
strong/weak?)
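For the correlation bootstrap, the key point is to resample whole rows so that each midterm grade stays paired with its homework grade. The sketch below illustrates this on simulated placeholder data. The data frame name grades and the column names midterm and homework are assumptions, because the layout of the course file is not shown here.

```r
set.seed(440)  # arbitrary seed, used only for reproducibility

# PLACEHOLDER data; replace with the data frame returned by read.csv.
# The column names midterm and homework are assumed, not taken from the file.
n_students <- 80
homework   <- rnorm(n_students, mean = 80, sd = 10)
midterm    <- 0.6 * homework + rnorm(n_students, mean = 30, sd = 8)
grades     <- data.frame(midterm, homework)

# Sample correlation
rho_hat <- cor(grades$midterm, grades$homework)

# Nonparametric bootstrap: resample row indices to keep the pairs intact
B <- 1000
boot_rho <- replicate(B, {
  idx <- sample(nrow(grades), replace = TRUE)
  cor(grades$midterm[idx], grades$homework[idx])
})

se_rho <- sd(boot_rho)                            # bootstrap standard error
ci_rho <- quantile(boot_rho, c(0.025, 0.975))     # percentile interval

print(c(rho_hat = rho_hat, se_rho = se_rho))
print(ci_rho)
```

If the percentile interval excludes 0, that is informal evidence of a nonzero correlation; the histogram of boot_rho shows how precisely ρ is estimated.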