Home > Software design >  Put loop in data frame and then represent in geom_line()
Put loop in data frame and then represent in geom_line()

Time:06-13

"Setting the seed at 747, generate m=500 samples of dimension n=880 from a population X, with Exponential distribution of expected value 1/λ=1/0.52, i.e. X∼Exp(λ=0.52).

For each of the generated samples, construct an approximate confidence interval for λ. Consider the confidence level 1−α=0.99.Indicate the mean of the amplitude of the m=500 confidence intervals obtained"

In this exercise I did:

m_ic <- function(seed, m, n, lambda, gama) {
  set.seed(seed)
  return(mean(replicate(m, (2*(qnorm((1 gama)/2)/sqrt(n)))/(mean(rexp(n ,lambda))))))
}

m_ic(seed=747, m=500, n=880, lambda=0.52, gama=0.99)

But what if n∈{100,200,300,…,4000} for example, how could i do the loop and put it in a data frame? And how could i after represent it in a plot like geom_line() or geom_point()?

CodePudding user response:

You can use sapply() to provide a sequence of different values of n to your m_ic() function; save these in a vector and plot, like this:

n_vals = seq(100,4000,100)
m_ic_values = sapply(n_val, \(n) m_ic(seed=747,m=500,n=n, lambda=0.52, gama=0.99))

ggplot(NULL, aes(x=n_vals, y=m_ic_values))  
  geom_point()   
  geom_line()

Output: m_ic_plot

CodePudding user response:

Your function m_ic is computing Normal confidence intervals but the exponential distribution is far from normal and a better confidence interval are Gamma intervals as you can see here. The function gamma_ic below computes these intervals and then the code to compute its amplitude is repeated in a sapply loop.

First the intervals's mean amplitude with n = 880.

gamma_ic <- function(x, conf = 0.95){
  n <- length(x)
  qlo <- (1 - conf)/2
  qhi <- 1 - (1 - conf)/2
  qq <- qgamma(c(qlo, qhi), n, n)/mean(x)
  c(lower = qq[1], upper = qq[2])
}

n <- 880
m <- 500
lambda <- 0.52

set.seed(747)
x <- replicate(m, rexp(n, rate = lambda))
ci <- apply(x, 2, gamma_ic, conf = 0.99)
mean(apply(ci, 2, diff))
#> [1] 0.09059922

Created on 2022-06-12 by the reprex package (v2.0.1)

Now the amplitudes for n from 100 to 4000 with increments of 100.

n_vec <- seq(100L, 4000L, by = 100L)

ampl <- sapply(n_vec, \(n) {
  set.seed(747)
  x <- replicate(m, rexp(n, rate = lambda))
  ci <- apply(x, 2, gamma_ic, conf = 0.99)
  mean(apply(ci, 2, diff))
})

ampldata <- data.frame(n = n_vec, amplitude = ampl)

library(ggplot2)

ggplot(ampldata, aes(n, amplitude))  
  geom_line()  
  geom_point()  
  theme_bw()

Created on 2022-06-12 by the reprex package (v2.0.1)

  •  Tags:  
  • r
  • Related