R - multiply dataframe by vectors with different sizes, tidyverse/pipe style

Time:05-26

I have the following code:

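library(dplyr)
library(purrr)

# Split mtcars into two data frames by transmission type (am = 0, am = 1)
dfl <- group_split(mtcars, am)

# Multipliers for each group; note the different lengths
v <- list()
v[[1]] <- seq(1:6)
v[[2]] <- seq(1:8)

x <- list()
for (i in 1:2) {
  x[[i]] <- list()
  for (j in seq_along(v[[i]])) {
    x[[i]][[j]] <- dfl[[i]]$qsec * v[[i]][[j]]
  }
}

# Bind each group's vectors into one data frame, one column per multiplier
x[[1]] <- reduce(x[[1]], bind_cols)
x[[2]] <- reduce(x[[2]], bind_cols)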

I want to produce the same output in tidyverse style, within a single pipeline and without breaking the pipe. This has proved difficult because the vectors v[[1]] and v[[2]] are of different lengths. I tried using the map2 function, but it fails for this precise reason. What would be the best way to approach this?

CodePudding user response:

I'm not exactly sure what you are trying to return. It looks like you just want multiples of the qsec column. You can do that like this:

library(dplyr)
library(purrr)

mtcars %>%
  group_split(am) %>%
  map2(list(1:6, 1:8), ~ as.data.frame(sapply(.y, \(y) .x$qsec * y)))

Output (same values as your x[[1]] and x[[2]]):

[[1]]
      V1    V2    V3    V4     V5     V6
1  19.44 38.88 58.32 77.76  97.20 116.64
2  17.02 34.04 51.06 68.08  85.10 102.12
3  20.22 40.44 60.66 80.88 101.10 121.32
4  15.84 31.68 47.52 63.36  79.20  95.04
5  20.00 40.00 60.00 80.00 100.00 120.00
6  22.90 45.80 68.70 91.60 114.50 137.40
7  18.30 36.60 54.90 73.20  91.50 109.80
8  18.90 37.80 56.70 75.60  94.50 113.40
9  17.40 34.80 52.20 69.60  87.00 104.40
10 17.60 35.20 52.80 70.40  88.00 105.60
11 18.00 36.00 54.00 72.00  90.00 108.00
12 17.98 35.96 53.94 71.92  89.90 107.88
13 17.82 35.64 53.46 71.28  89.10 106.92
14 17.42 34.84 52.26 69.68  87.10 104.52
15 20.01 40.02 60.03 80.04 100.05 120.06
16 16.87 33.74 50.61 67.48  84.35 101.22
17 17.30 34.60 51.90 69.20  86.50 103.80
18 15.41 30.82 46.23 61.64  77.05  92.46
19 17.05 34.10 51.15 68.20  85.25 102.30

[[2]]
      V1    V2    V3    V4    V5     V6     V7     V8
1  16.46 32.92 49.38 65.84 82.30  98.76 115.22 131.68
2  17.02 34.04 51.06 68.08 85.10 102.12 119.14 136.16
3  18.61 37.22 55.83 74.44 93.05 111.66 130.27 148.88
4  19.47 38.94 58.41 77.88 97.35 116.82 136.29 155.76
5  18.52 37.04 55.56 74.08 92.60 111.12 129.64 148.16
6  19.90 39.80 59.70 79.60 99.50 119.40 139.30 159.20
7  18.90 37.80 56.70 75.60 94.50 113.40 132.30 151.20
8  16.70 33.40 50.10 66.80 83.50 100.20 116.90 133.60
9  16.90 33.80 50.70 67.60 84.50 101.40 118.30 135.20
10 14.50 29.00 43.50 58.00 72.50  87.00 101.50 116.00
11 15.50 31.00 46.50 62.00 77.50  93.00 108.50 124.00
12 14.60 29.20 43.80 58.40 73.00  87.60 102.20 116.80
13 18.60 37.20 55.80 74.40 93.00 111.60 130.20 148.80
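
To see why differing multiplier lengths are not an obstacle here, the following is a minimal sketch in base R. It uses Map, which walks two lists in parallel much as map2 does in the answer above. The toy values in groups and mults are hypothetical stand-ins for the qsec columns and the multiplier vectors; only the two outer lists need to have the same length.

```r
# Toy stand-ins for the two groups' qsec columns (hypothetical values)
groups <- list(c(10, 20, 30), c(1, 2))
# Multiplier vectors of different lengths, one per group
mults <- list(1:2, 1:4)

# Map pairs groups[[i]] with mults[[i]]; the inner vectors may differ in length
res <- Map(function(q, m) sapply(m, function(k) q * k), groups, mults)

dim(res[[1]])  # rows = values in group 1, columns = multipliers for group 1
dim(res[[2]])  # rows = values in group 2, columns = multipliers for group 2
```

Each result has one row per value in the group and one column per multiplier, so the shapes differ between groups without causing any error.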

CodePudding user response:

You can also use outer.

library(dplyr)
library(purrr)

mtcars %>% 
  group_split(am) %>% 
  map2(list(6, 8), ~ as.data.frame(outer(.x$qsec, seq.int(.y))))
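
As a small illustration of what outer computes in this answer, here is a sketch in base R using the first three qsec values of the am == 1 group from the output shown earlier. The names qsec, m and m2 are only for this example.

```r
# First three qsec values of the am == 1 group
qsec <- c(16.46, 17.02, 18.61)

# outer() multiplies every element of qsec by every element of 1:4,
# giving one row per qsec value and one column per multiplier
m <- outer(qsec, seq.int(4))

# The same matrix built column by column with sapply
m2 <- sapply(1:4, function(k) qsec * k)

dim(m)
isTRUE(all.equal(m, m2))
```

Because outer is vectorised, it avoids the inner loop over multipliers altogether.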

CodePudding user response:

A direct interpretation would be

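library(dplyr)

v <- list()
v[[1]] <- seq(1:6)
v[[2]] <- seq(1:8)
# Name the elements after the levels of am ("0" and "1") so they can be looked up by group
names(v) <- levels(factor(mtcars$am))
v["0"]  # check: the entry for the am == 0 group

mtcars %>% 
  group_by(am) %>% 
  group_map(function(value, key) {
    # key is a one-row tibble holding this group's am value
    t(tcrossprod(
      v[[as.character(key)]],
      value$qsec
    ))
  }) %>% 
  identity()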

Now, a few things: am is numeric, so to look up the matching vector in v we name the elements of v after the levels of am and index by name. This is why the key has to be converted with as.character. Also, note that identity does nothing, but it is a convenient way to end a pipe: you can comment out any step without having to remove the dangling %>% on the line before it.
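
The two ideas in this answer, indexing by name and the transposed tcrossprod, can be checked on a small case. This is a sketch in base R; q, key, mult, a and b are names made up for the example, and q holds the first three qsec values of the am == 0 group from the output shown earlier.

```r
# First three qsec values of the am == 0 group
q <- c(19.44, 17.02, 20.22)

# Multipliers named after the levels of am
v <- list("0" = 1:6, "1" = 1:8)

# A numeric group key must be converted to a name;
# v[[0]] would not work, since list positions in R start at 1
key <- 0
mult <- v[[as.character(key)]]

# tcrossprod(mult, q) is mult %*% t(q): one row per multiplier.
# Transposing gives one row per qsec value, the same as outer(q, mult).
a <- t(tcrossprod(mult, q))
b <- outer(q, mult)

dim(a)
isTRUE(all.equal(a, b))
```

Note that this approach returns plain matrices rather than data frames; wrapping the result in as.data.frame would match the other answers.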
