Function that calculates a table with confidence interval and mean for any data frame

Time:10-21

I'm trying to write a function that takes three arguments, X, Y and Z, where X is a numeric vector, Y is a factor that groups X, and Z is the alpha for the confidence interval. I've written the code outside a function first, to learn and to check that it works properly. This is what I have so far:

data(chickwts)

groups <- as.factor(chickwts[,2])
nr_groups <- levels(groups)

result <- matrix(0, nrow = length(nr_groups), ncol = 4)
colnames(result) <- c("Lower CI", "Mean", "Upper CI", "Obs.")
rownames(result) <- levels(groups)

result[,4] <- table(chickwts[,2])

group_test <- by(chickwts[, 1], groups, t.test)

for (i in levels(groups)) {
  tva <- (group_test[[i]])
  result[,1] <- tva$conf.int
  result[,3] <- tva$conf.int
  result[,2] <- tva$estimate
  
}

Calling result does not give me exactly what I want:

result
          Lower CI     Mean Upper CI Obs.
casein    297.8875 328.9167 297.8875   12
horsebean 359.9458 328.9167 359.9458   10
linseed   297.8875 328.9167 297.8875   12
meatmeal  359.9458 328.9167 359.9458   11
soybean   297.8875 328.9167 297.8875   14
sunflower 359.9458 328.9167 359.9458   12

As you can see, all values in the columns except Obs. are wrong. I am trying to write the code in base R, without any packages. Can someone please tell me what I am doing wrong and how to proceed? Thanks!

CodePudding user response:

data(chickwts)

groups <- as.factor(chickwts[,2])
nr_groups <- levels(groups)

result <- matrix(0, nrow = length(nr_groups), ncol = 4)
colnames(result) <- c("Lower CI", "Mean", "Upper CI", "Obs.")
rownames(result) <- levels(groups)

result[,4] <- table(chickwts[,2])

group_test <- by(chickwts[, 1], groups, t.test)

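for (i in levels(groups)) {
  tva <- group_test[[i]]
  # Index by row name i so only this group's row is filled,
  # and take one element of conf.int per cell (it has length 2).
  result[i, 1] <- tva$conf.int[1]  # lower bound
  result[i, 3] <- tva$conf.int[2]  # upper bound
  result[i, 2] <- tva$estimate     # group mean
}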

Returns:

          Lower CI     Mean Upper CI Obs.
casein    282.6440 323.5833 364.5226   12
horsebean 132.5687 160.2000 187.8313   10
linseed   185.5610 218.7500 251.9390   12
meatmeal  233.3083 276.9091 320.5099   11
soybean   215.1754 246.4286 277.6818   14
sunflower 297.8875 328.9167 359.9458   12
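
A small sketch of why the original loop went wrong, assuming nothing beyond base R (the matrix m is only an illustration): assigning to result[,1] writes the whole column, and the length-2 conf.int is recycled down all six rows. Each pass of the loop overwrites the previous one, so only the values of the last group (sunflower) remain.

```r
# A 6-row column, like one column of the result matrix
m <- matrix(0, nrow = 6, ncol = 1)

# Assigning a length-2 vector to a whole column recycles it
m[, 1] <- c(1, 2)
m[, 1]
# 1 2 1 2 1 2

# Assigning to a single cell touches only that row
m[3, 1] <- 99
```

That recycling is the alternating 297.8875 / 359.9458 pattern in the question's output.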

CodePudding user response:

Here is an lapply-based approach.

data(chickwts)

groups <- factor(chickwts[, 2])
group_test <- by(chickwts[, 1], groups, t.test)

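# \(tva) is the shorthand for function(tva), available from R 4.1.0
result <- lapply(group_test, \(tva){
  res <- numeric(3)
  res[1] <- tva$conf.int[1]  # lower bound
  res[3] <- tva$conf.int[2]  # upper bound
  res[2] <- tva$estimate     # group mean
  res
})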
result <- do.call(rbind, result)
colnames(result) <- c("Lower CI", "Mean", "Upper CI")
result <- cbind(result, `Obs.` = table(chickwts[, 2]))

result
#          Lower CI     Mean Upper CI Obs.
#casein    282.6440 323.5833 364.5226   12
#horsebean 132.5687 160.2000 187.8313   10
#linseed   185.5610 218.7500 251.9390   12
#meatmeal  233.3083 276.9091 320.5099   11
#soybean   215.1754 246.4286 277.6818   14
#sunflower 297.8875 328.9167 359.9458   12
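
Since the question asks for a function of a numeric vector, a grouping factor and alpha, the same idea could be wrapped up roughly as follows. This is only a sketch: the name ci_table and its argument names are hypothetical, and it assumes alpha maps to t.test's conf.level as 1 - alpha and that every group has at least two observations (t.test stops with an error otherwise).

```r
# Hypothetical helper: ci_table and its argument names are assumptions
ci_table <- function(x, g, alpha = 0.05) {
  g <- factor(g)
  stopifnot(is.numeric(x), length(x) == length(g), alpha > 0, alpha < 1)

  # One t.test per group; conf.level is derived from alpha
  tests <- lapply(split(x, g), t.test, conf.level = 1 - alpha)

  # Pull out lower bound, mean and upper bound for each group
  rows <- lapply(tests, function(tt) {
    c("Lower CI" = tt$conf.int[1],
      "Mean"     = unname(tt$estimate),
      "Upper CI" = tt$conf.int[2])
  })

  # Stack the per-group rows and append the group sizes
  result <- do.call(rbind, rows)
  cbind(result, "Obs." = as.vector(table(g)))
}

res <- ci_table(chickwts$weight, chickwts$feed, alpha = 0.05)
res
```

With alpha = 0.05 this should reproduce the table above, because 0.95 is also t.test's default confidence level. A smaller alpha, for example 0.01, gives wider intervals.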