Here's a simple loop in base R:
# prep
x <- sort(round(10 * rnorm(10)))
res.sd <- NULL
res.var <- NULL
res.mad <- NULL
# loop
for (i in -20:20) {
  x[10] <- i
  res.sd <- c(res.sd, sd(x))
  res.var <- c(res.var, var(x))
  res.mad <- c(res.mad, mad(x))
}
I would like to rewrite this using dplyr.
CodePudding user response:
We can use a combination of purrr and tibble.
library(tidyverse)
set.seed(0)
x <- sort(round(10 * rnorm(10)))
map_dfr(-20:20, ~ tibble(
  res.sd = sd(c(x[-10], .)),
  res.var = var(c(x[-10], .)),
  res.mad = mad(c(x[-10], .))
))
#> # A tibble: 41 × 3
#> res.sd res.var res.mad
#> <dbl> <dbl> <dbl>
#> 1 11.7 138. 15.6
#> 2 11.6 134. 15.6
#> 3 11.4 130. 15.6
#> 4 11.2 126. 15.6
#> 5 11.1 122. 15.6
#> 6 10.9 119. 15.6
#> 7 10.8 116. 14.8
#> 8 10.6 113. 14.1
#> 9 10.5 110. 13.3
#> 10 10.4 108. 12.6
#> # … with 31 more rows
Created on 2022-01-09 by the reprex package (v2.0.1)
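Note that in purrr 1.0.0 and later, map_dfr() is superseded in favor of map() followed by list_rbind(). A minimal sketch of the same computation in that style, assuming purrr >= 1.0.0 is installed (the names xi and res are illustrative and not part of the answer above):

```r
library(purrr)   # assumes purrr >= 1.0.0, which provides list_rbind()
library(tibble)

set.seed(0)
x <- sort(round(10 * rnorm(10)))

# Build one single-row tibble per replacement value, then bind the rows.
res <- list_rbind(map(-20:20, function(i) {
  xi <- c(x[-10], i)  # x with its 10th element replaced by i
  tibble(res.sd = sd(xi), res.var = var(xi), res.mad = mad(xi))
}))
```

Computing xi once per iteration also avoids rebuilding the modified vector three times.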
CodePudding user response:
If I am understanding your code correctly, it does the following:
- Generate 10 random numbers stored in x.
- In a loop that iterates over -20:20, replace the 10th value of x with the iterated value.
- Calculate the SD, variance, and median absolute deviation of the modified vector, and store these calculations.
As ekoam points out, this type of operation is ill-suited to dplyr's intended purpose. That said, the ability to store list-columns makes this possible (albeit inefficient, since it requires storing multiple copies of the x vector). The following will produce equivalent results to your code, if you add set.seed(0) before the first line to control randomization.
library(dplyr)  # also re-exports tibble() and %>%
set.seed(0)
df <- tibble(
  x = list(sort(round(10 * rnorm(10)))),
  y = -20:20
) %>%
  rowwise() %>%
  mutate(
    res.sd = sd(c(x[-10], y)),
    res.var = var(c(x[-10], y)),
    res.mad = mad(c(x[-10], y))
  )
df
# A tibble: 41 × 5
# Rowwise:
x y res.sd res.var res.mad
<list> <int> <dbl> <dbl> <dbl>
1 <dbl [10]> -20 11.7 138. 15.6
2 <dbl [10]> -19 11.6 134. 15.6
3 <dbl [10]> -18 11.4 130. 15.6
4 <dbl [10]> -17 11.2 126. 15.6
5 <dbl [10]> -16 11.1 122. 15.6
6 <dbl [10]> -15 10.9 119. 15.6
7 <dbl [10]> -14 10.8 116. 14.8
8 <dbl [10]> -13 10.6 113. 14.1
9 <dbl [10]> -12 10.5 110. 13.3
10 <dbl [10]> -11 10.4 108. 12.6
# … with 31 more rows
Alternatively, we could get a little clever with lapply and sapply, and then store the result in a tibble. Note that there is almost no repeated code here:
library(dplyr)  # for %>% and as_tibble()
set.seed(0)
x <- sort(round(10 * rnorm(10)))
y <- -20:20
lapply(list(sd = sd, var = var, mad = mad), function(func) {
  sapply(y, function(j) {
    func(c(x[-10], j))
  })
}) %>%
  as_tibble()
# A tibble: 41 × 3
sd var mad
<dbl> <dbl> <dbl>
1 11.7 138. 15.6
2 11.6 134. 15.6
3 11.4 130. 15.6
4 11.2 126. 15.6
5 11.1 122. 15.6
6 10.9 119. 15.6
7 10.8 116. 14.8
8 10.6 113. 14.1
9 10.5 110. 13.3
10 10.4 108. 12.6
# … with 31 more rows
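To confirm that a functional rewrite matches the original loop, the two can be compared directly. The following is a rough sketch in base R, not part of the answers above: it preallocates the result vectors instead of growing them with c(), uses vapply in place of sapply, and the names x_loop and stats are illustrative.

```r
set.seed(0)
x <- sort(round(10 * rnorm(10)))
y <- -20:20

# Original approach: overwrite the 10th element on each iteration.
# Results are preallocated here rather than grown with c().
x_loop <- x
res.sd <- res.var <- res.mad <- numeric(length(y))
for (k in seq_along(y)) {
  x_loop[10] <- y[k]
  res.sd[k] <- sd(x_loop)
  res.var[k] <- var(x_loop)
  res.mad[k] <- mad(x_loop)
}

# Functional approach: apply each statistic to x with its 10th element swapped out.
stats <- lapply(list(sd = sd, var = var, mad = mad), function(func) {
  vapply(y, function(j) func(c(x[-10], j)), numeric(1))
})

# Both approaches should agree up to floating-point tolerance.
stopifnot(
  isTRUE(all.equal(stats$sd, res.sd)),
  isTRUE(all.equal(stats$var, res.var)),
  isTRUE(all.equal(stats$mad, res.mad))
)
```

If stopifnot() returns silently, the two versions produce the same values.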