I want to apply my function to every row of a data.table:
set.seed(13579)
cat1N <- 10
cat2N <- 15
cat3N <- 7
group = c(rep("Group1", cat1N), rep("Group1", cat1N), rep("Group1", cat1N),
rep("Group2", cat2N), rep("Group2", cat3N)) # policyID
year = c(rep(2015, cat1N), rep(2016, cat1N), rep(2017, cat1N),
rep(2016, cat2N),
rep(2017, cat3N))
category = c(rep("cat1", cat1N/2), rep("cat2", cat1N/2), rep("cat1", cat1N/2), rep("cat2", cat1N/2), rep("cat1", cat1N/2), rep("cat2", cat1N/2),
rep("cat2", 7), rep("cat3", 8), rep("cat3", 3), rep("cat1", 4)) # plan
value = c(abs(rnorm(cat1N)*100), abs(rnorm(cat1N)*100), abs(rnorm(cat1N)*100),
abs(rnorm(cat2N)*100), abs(rnorm(cat3N)*100))
require("data.table")
testData <- data.table(group = group,
year = year,
category = category,
value = value)
I aggregated the data as follows:
cohort = c("group" ,"category", "year")
testAgg <- testData[, group := group][, .(values = .(.SD)), by = cohort]
> testAgg
group category year values
1: Group1 cat1 2015 <data.table[5x1]>
2: Group1 cat2 2015 <data.table[5x1]>
3: Group1 cat1 2016 <data.table[5x1]>
4: Group1 cat2 2016 <data.table[5x1]>
5: Group1 cat1 2017 <data.table[5x1]>
6: Group1 cat2 2017 <data.table[5x1]>
7: Group2 cat2 2016 <data.table[7x1]>
8: Group2 cat3 2016 <data.table[8x1]>
9: Group2 cat3 2017 <data.table[3x1]>
10: Group2 cat1 2017 <data.table[4x1]>
and want to use the mapply function to apply the same function over every row:
calculateCI <- function(value){
avg <- mean(value)
s <- sqrt(var(value))
n <- length(value)
error <- qnorm(0.975)*s/sqrt(n)
lower <- avg - error
upper <- avg + error
return(c(lower, upper))
}
> testAgg[, 'lowerCI' := mapply(calculateCI, values[1])[1]]
Warning message:
In mean.default(value) : argument is not numeric or logical: returning NA
> testAgg[, 'upperCI' := mapply(calculateCI, values[1])[2]]
Warning message:
In mean.default(value) : argument is not numeric or logical: returning NA
What is wrong with my mapply call, and how can I fix it?
The idea is to calculate a confidence interval for the values in each row.
Answer:
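We don't need mapply here; lapply is enough. Each element of values is a one-column data.table, so calculateCI has to be given its value column (x$value) rather than the table itself, which is why mean() warned that the argument is not numeric. transpose() then turns the per-row (lower, upper) pairs into two columns: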
testAgg[, c("lowerCI", "upperCI") := transpose(lapply(values,
function(x) calculateCI(x$value)))]
Output:
> testAgg
group category year values lowerCI upperCI
<char> <char> <num> <list> <num> <num>
1: Group1 cat1 2015 <data.table[5x1]> 64.958526 149.68502
2: Group1 cat2 2015 <data.table[5x1]> 13.234171 176.35595
3: Group1 cat1 2016 <data.table[5x1]> 43.562119 72.14915
4: Group1 cat2 2016 <data.table[5x1]> 39.377224 102.42184
5: Group1 cat1 2017 <data.table[5x1]> 34.121206 138.28066
6: Group1 cat2 2017 <data.table[5x1]> -1.573475 124.55888
7: Group2 cat2 2016 <data.table[7x1]> 46.608221 133.32852
8: Group2 cat3 2016 <data.table[8x1]> 41.979619 106.16266
9: Group2 cat3 2017 <data.table[3x1]> 147.817873 171.19777
10: Group2 cat1 2017 <data.table[4x1]> 30.109861 115.44993
Or the same approach with Map:
testAgg[, c("lowerCI", "upperCI") := transpose(Map(function(x)
calculateCI(x$value), values))]
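To see why the transpose() step is needed, here is a minimal sketch on made-up data (the toy table, its id column, and the numbers are assumptions for illustration, not the question's data). lapply returns one c(lower, upper) pair per row, while := with two column names expects one vector per column, so the list has to be flipped first.

```r
library(data.table)

# Toy input (hypothetical values): two groups of different sizes
raw <- data.table(id    = c("a", "a", "a", "b", "b", "b", "b"),
                  value = c(1, 2, 3, 10, 20, 30, 40))

# Nest each group's rows into a list column, as in the question;
# copy(.SD) guards against .SD being reused between groups
toy <- raw[, .(values = .(copy(.SD))), by = id]

# Same helper as in the question; sd(value) equals sqrt(var(value))
calculateCI <- function(value) {
  avg   <- mean(value)
  error <- qnorm(0.975) * sd(value) / sqrt(length(value))
  c(avg - error, avg + error)
}

# One c(lower, upper) pair per row of toy
pairs <- lapply(toy$values, function(x) calculateCI(x$value))

# data.table::transpose() regroups the pairs into
# list(all lower bounds, all upper bounds)
cols <- transpose(pairs)

# Each element of cols becomes one new column
toy[, c("lowerCI", "upperCI") := cols]
print(toy)
```

Without transpose(), the right-hand side would have one element per row rather than one per new column, which is not the shape := expects here.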