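library(dplyr)  # tibble(), group_by(), mutate()
library(purrr)  # map_dbl()
db <- tibble(
  x = runif(1000, 1, 10),
  t = rpois(1000, 5),
  group = rpois(1000, 5)
)
db$l <- NA_real_  # initialise the result column
for (i in seq_len(nrow(db))) {
  # mean of x over rows in the same group with a strictly smaller t
  db$l[i] <- mean(db$x[(db$t < db$t[i]) & (db$group == db$group[i])])
}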
How would you make this run faster? A combination of mutate() and map() should be faster, but
f <- function(lim) mean(db$x[db$t < lim])
db %>% group_by(group) %>%
  mutate(l = map_dbl(t, function(y) mean(db$x[db$t < y])))
does not recognize that db is grouped by group.
CodePudding user response:
You could try
set.seed(20211212)
library(dplyr)
library(purrr)
db %>%
  group_by(group) %>%
  mutate(l = map_dbl(t, ~mean(db$x[db$t < .x])))
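This returns the same l for every row that shares a t, because db$x and db$t refer to the whole tibble rather than to the current group: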
# A tibble: 1,000 x 4
# Groups:   group [14]
       x     t group     l
   <dbl> <int> <int> <dbl>
 1  2.53     6     6  5.48
 2  1.23     4     8  5.37
 3  4.51     3     3  5.40
 4  2.45     8     3  5.49
 5  7.23     6     7  5.48
 6  8.11     5     5  5.35
 7  2.14     4     1  5.37
 8  3.17     4     4  5.37
 9  2.69     3     7  5.40
10  7.85     5     5  5.35
# ... with 990 more rows
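A quick way to confirm that this version ignores the grouping is to count the distinct values of l for each t. The sketch below regenerates data the same way as the question (the seed is only there for reproducibility, and the names ungrouped and check are made up for illustration):

```r
library(dplyr)
library(purrr)

set.seed(20211212)
db <- tibble(
  x = runif(1000, 1, 10),
  t = rpois(1000, 5),
  group = rpois(1000, 5)
)

# db$x and db$t bypass the grouping and always see all 1000 rows.
ungrouped <- db %>%
  group_by(group) %>%
  mutate(l = map_dbl(t, ~ mean(db$x[db$t < .x]))) %>%
  ungroup()

# If group played a role, a single t could map to several values of l.
check <- ungrouped %>%
  group_by(t) %>%
  summarise(n_l = length(unique(l)))

stopifnot(all(check$n_l == 1))
```

Every t maps to exactly one l, so group_by() had no effect on the result.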
But in your case, I believe you want to refer to the bare column names x and t, which mutate() evaluates within each group:
db %>%
  group_by(group) %>%
  mutate(l = map_dbl(t, ~mean(x[t < .x])))
which returns
# A tibble: 1,000 x 4
# Groups:   group [14]
       x     t group     l
   <dbl> <int> <int> <dbl>
 1  2.53     6     6  5.85
 2  1.23     4     8  4.64
 3  4.51     3     3  4.23
 4  2.45     8     3  5.19
 5  7.23     6     7  5.44
 6  8.11     5     5  5.77
 7  2.14     4     1  5.52
 8  3.17     4     4  4.69
 9  2.69     3     7  5.73
10  7.85     5     5  5.77
# ... with 990 more rows
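If map_dbl() is still too slow on larger data, one possible alternative is to avoid the per-row subsetting altogether and use cumulative sums within each group. This is only a sketch, not something proposed in the thread: mean_below is a hypothetical helper name, and the comparison against the original loop is included as a sanity check.

```r
library(dplyr)

set.seed(20211212)  # any seed works for this check
db <- tibble(
  x = runif(1000, 1, 10),
  t = rpois(1000, 5),
  group = rpois(1000, 5)
)

# Hypothetical helper: mean of x over rows with a strictly smaller t.
mean_below <- function(x, t) {
  n_less <- rank(t, ties.method = "min") - 1  # number of rows with strictly smaller t
  cs <- c(0, cumsum(x[order(t)]))             # cs[k + 1] is the sum of x over the k smallest t
  cs[n_less + 1] / n_less                     # 0 / 0 is NaN, like mean(numeric(0))
}

fast <- db %>%
  group_by(group) %>%
  mutate(l = mean_below(x, t)) %>%
  ungroup()

# Reference: the explicit loop from the question.
loop_l <- numeric(nrow(db))
for (i in seq_len(nrow(db))) {
  loop_l[i] <- mean(db$x[db$t < db$t[i] & db$group == db$group[i]])
}

# Equal up to floating-point tolerance, including the NaN positions.
stopifnot(isTRUE(all.equal(fast$l, loop_l)))
```

The helper sorts each group once instead of scanning it for every row, which should scale better for large groups. The summation order differs from mean(), so results agree only up to floating-point tolerance.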