Problem sheet 5
The next question is optional, yet it will be a bonus to your final score.
2. If the density function is compactly supported, standard kernel density estimators are invalid/inconsistent at or
near boundary points. Below we introduce a new estimator that automatically adapts to the (possibly unknown)
boundaries of the support of the density without requiring specific data modification or additional tuning parameters.
Let $X_1, \dots, X_n$ be a random sample, where $X_i$ is a continuous random variable with a smooth CDF $F$ over its possibly
unknown support $[a, b]$ for $a < b$. If $a = -\infty$ and $b = \infty$, then $X_i$ is supported on the whole real line. The
density function is $f(x) = \frac{d}{dx} \mathbb{P}(X_i \le x)$, where the derivative is interpreted as a one-sided derivative at a boundary
point of $[a, b]$. Let $\hat{F}(x) = (1/n) \sum_{i=1}^{n} I(X_i \le x)$ be the empirical distribution function. For every $x \in \mathbb{R}$, solve
$$\hat{\beta}(x) = (\hat{\beta}_0(x), \hat{\beta}_1(x), \hat{\beta}_2(x)) = \arg\min_{(\beta_0, \beta_1, \beta_2)} \sum_{i=1}^{n} \left\{ \hat{F}(X_i) - \beta_0 - \beta_1 (X_i - x) - \beta_2 (X_i - x)^2 \right\}^2 K\!\left( \frac{X_i - x}{h} \right),$$
where $h > 0$ is a bandwidth, and $K$ denotes a nonnegative, symmetric and continuous kernel function supported on
$[-1, 1]$. The proposed boundary adaptive density estimator is then defined as
$$\hat{f}_{ba}(x) = \hat{\beta}_1(x).$$
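Because the objective is a weighted least squares criterion, the minimizer has a closed form, which may help when implementing the estimator. The notation $r_i(x)$, $w_i(x)$ and $e_1$ below is introduced here for illustration only and is not part of the problem statement.

```latex
% Assumed notation (not from the problem sheet):
%   r_i(x) = (1, X_i - x, (X_i - x)^2)^\top   regressors of the local quadratic fit
%   w_i(x) = K((X_i - x)/h)                   kernel weights
%   e_1    = (0, 1, 0)^\top                   selects the slope coefficient
\[
  \hat{\beta}(x)
  = \Bigl( \sum_{i=1}^{n} w_i(x)\, r_i(x)\, r_i(x)^\top \Bigr)^{-1}
    \sum_{i=1}^{n} w_i(x)\, r_i(x)\, \hat{F}(X_i),
  \qquad
  \hat{f}_{ba}(x) = e_1^\top \hat{\beta}(x).
\]
```

The inverse exists only if at least three distinct observations receive positive weight, so the bandwidth must be large enough for the window around $x$ to contain that many points.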
(a) Write an R function to implement the above estimator, with $h > 0$ as an input parameter. Use the triangular
kernel $K(u) = \max\{0, 1 - |u|\}$, $u \in \mathbb{R}$.
(b) Conduct a simulation study based on the exponential distribution with density function $f(x) = e^{-x}$ if $x \ge 0$ and
$f(x) = 0$ if $x < 0$. This density function is then supported on $[0, \infty)$. We want to estimate the density at
$x \in \{0, 0.5, 1\}$, where $0$ is the boundary, $0.5$ is a near-boundary point and $1$ is an interior point. Take the sample
size $n = 1000$. For various choices of the bandwidth $h$, conduct 5000 Monte Carlo repetitions, and report the
average estimation error $|\hat{f}_{ba}(x) - f(x)|$ and the corresponding standard deviation at $x \in \{0, 0.5, 1\}$.
(c) Use 5-fold cross-validation to select the bandwidth for the above estimator. Again, for the exponential distribution,
take $n = 1000$ and conduct 5000 Monte Carlo repetitions to compare the performance of this estimator with the
standard kernel density estimator (bandwidth chosen by default) at $x \in \{0, 0.5, 1\}$.
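One possible way to approach parts (a) and (b) is sketched below in R. It is a minimal illustration under stated assumptions, not a model answer. The function names `tri_kernel`, `f_ba` and `run_mc` are hypothetical. The bandwidth grid `c(0.2, 0.4, 0.6)` is an arbitrary choice, since the sheet only asks for "various choices". The demo uses 200 repetitions instead of 5000 to keep the run short. The standard deviation is taken over the absolute errors across repetitions, which is one reading of "corresponding standard deviation". The local quadratic fit uses `lm.wfit` from the base `stats` package.

```r
# Triangular kernel K(u) = max{0, 1 - |u|}
tri_kernel <- function(u) pmax(0, 1 - abs(u))

# Boundary adaptive density estimate at each point of x_eval.
# X: numeric sample, h: bandwidth (> 0). Returns NA where the window
# holds fewer than three observations with positive weight.
f_ba <- function(x_eval, X, h) {
  stopifnot(is.numeric(X), length(h) == 1, h > 0)
  Fhat <- ecdf(X)(X)                 # empirical CDF evaluated at each X_i
  vapply(x_eval, function(x) {
    d <- X - x
    w <- tri_kernel(d / h)
    keep <- w > 0                    # only points inside the kernel window
    if (sum(keep) < 3) return(NA_real_)
    # Weighted least squares of Fhat on (1, d, d^2); the slope is the estimate
    fit <- lm.wfit(x = cbind(1, d[keep], d[keep]^2),
                   y = Fhat[keep], w = w[keep])
    unname(fit$coefficients[2])
  }, numeric(1))
}

# Monte Carlo study for Exp(1) data: mean and sd of |f_ba(x) - f(x)|.
run_mc <- function(h, n = 1000, reps = 5000, x_eval = c(0, 0.5, 1)) {
  truth <- exp(-x_eval)              # true density e^{-x} on [0, Inf)
  # One column per repetition, one row per evaluation point
  err <- replicate(reps, abs(f_ba(x_eval, rexp(n), h) - truth))
  data.frame(h = h, x = x_eval,
             mean_abs_error = rowMeans(err),
             sd_abs_error = apply(err, 1, sd))
}

set.seed(2024)
# Assumed bandwidth grid; reps reduced from 5000 for a quick demonstration
results <- do.call(rbind, lapply(c(0.2, 0.4, 0.6), run_mc, reps = 200))
print(results)
```

For the full study, call `run_mc` with its default `reps = 5000`. Part (c) additionally requires a cross-validation criterion for choosing `h`, which the sheet leaves to the reader and which is not sketched here.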