How to avoid matrix/dataframe being piped (%>%) into list as an element in R

Time:04-30

I want to create a list of correlation and covariance matrices from a dataframe. I tried piping the dataframe into the list using the magrittr pipe operator (%>%), as shown in the example below. The problem is that the dataframe itself gets inserted as the first list element. I am aware that the pipe operator's default behavior is to inject the object into the first argument position of the function it is piped into. However, I am curious to know whether there is an easy way to pipe the matrix/dataframe into functions inside a list while avoiding the insertion of the dataframe itself.

Code example:

library(magrittr) # alternatively 'dplyr'

matrix(1:16, ncol = 4) %>% as.data.frame() %>%
list(
    a = list(
        cor(.[,1:2]),
        cov(.[,1:2])
    ),
    b = list(
        cor(.[,3:4]),
        cov(.[,3:4])
    )
)

Output:

[[1]]                          # I want to avoid inserting the 
A data.frame: 4 × 4            # dataframe as an element
V1  V2  V3  V4
<int>   <int>   <int>   <int>
1   5   9   13
2   6   10  14
3   7   11  15
4   8   12  16

$a
   A matrix: 2 × 2 of type dbl
   V1   V2
   V1   1   1
   V2   1   1

   A matrix: 2 × 2 of type dbl
   V1   V2
   V1   1.666667    1.666667
   V2   1.666667    1.666667

$b
   A matrix: 2 × 2 of type dbl
   V3   V4
   V3   1   1
   V4   1   1

   A matrix: 2 × 2 of type dbl
   V3   V4
   V3   1.666667    1.666667
   V4   1.666667    1.666667

CodePudding user response:

So, after I posted, I realised I hadn't really thought to check whether there are other operators that might work. According to magrittr's introduction page, "the “exposition” pipe, %$% exposes the names within the left-hand side object to the right-hand side expression." It seems it wasn't necessarily intended for this purpose, but I replaced the final %>% with %$%, and now it works! (I am still unaware of any potential drawbacks to using %$%, so any comments on this are appreciated.)

library(magrittr)

matrix(1:16, ncol = 4) %>% as.data.frame() %$%
list(
    a = list(
        cor(.[,1:2]),
        cov(.[,1:2])
    ),
    b = list(
        cor(.[,3:4]),
        cov(.[,3:4])
    )
)

# $a
#     A matrix: 2 × 2 of type dbl
#     V1    V2
#     V1    1   1
#     V2    1   1

#     A matrix: 2 × 2 of type dbl
#     V1    V2
#     V1    1.666667    1.666667
#     V2    1.666667    1.666667

# $b
#     A matrix: 2 × 2 of type dbl
#     V3    V4
#     V3    1   1
#     V4    1   1

#     A matrix: 2 × 2 of type dbl
#     V3    V4
#     V3    1.666667    1.666667
#     V4    1.666667    1.666667
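
For context on why this works, here is a minimal sketch, assuming (as I read the magrittr documentation) that %$% evaluates the right-hand side much like base R's with() and therefore does not insert the left-hand side as a first argument. The names df, via_pipe and via_with are only illustrative and are not part of the original answer.

```r
library(magrittr)

df <- as.data.frame(matrix(1:16, ncol = 4))

# %$% evaluates the right-hand side with df's columns exposed and does not
# insert df as a first argument, so the list contains only a and b.
via_pipe <- df %$% list(
  a = list(cor(.[, 1:2]), cov(.[, 1:2])),
  b = list(cor(.[, 3:4]), cov(.[, 3:4]))
)

# Roughly the same thing written with base R's with(); here df is named
# explicitly instead of using the dot placeholder.
via_with <- with(df, list(
  a = list(cor(df[, 1:2]), cov(df[, 1:2])),
  b = list(cor(df[, 3:4]), cov(df[, 3:4]))
))

names(via_pipe)  # "a" "b"

# Because the column names are exposed, they can also be used directly:
df %$% cor(V1, V2)
```

The last line shows what %$% is mainly meant for: referring to columns by name on the right-hand side.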

CodePudding user response:

Just add extract to your pipeline to get the result without the first element:

library(magrittr)

matrix(1:30, ncol = 6) %>% as.data.frame() %>%
list(
    a = list(
        cor(.[,1:3]),
        cov(.[,1:3])
    ),
    b = list(
        cor(.[,4:6]),
        cov(.[,4:6])
    )
) %>%
  extract(-1)
#> $a
#> $a[[1]]
#>    V1 V2 V3
#> V1  1  1  1
#> V2  1  1  1
#> V3  1  1  1
#> 
#> $a[[2]]
#>     V1  V2  V3
#> V1 2.5 2.5 2.5
#> V2 2.5 2.5 2.5
#> V3 2.5 2.5 2.5
#> 
#> 
#> $b
#> $b[[1]]
#>    V4 V5 V6
#> V4  1  1  1
#> V5  1  1  1
#> V6  1  1  1
#> 
#> $b[[2]]
#>     V4  V5  V6
#> V4 2.5 2.5 2.5
#> V5 2.5 2.5 2.5
#> V6 2.5 2.5 2.5

Created on 2022-04-29 by the reprex package (v2.0.1)


Addendum

An alternative to extract is [:

matrix(1:30, ncol = 6) %>% as.data.frame() %>%
list(
    a = list(
        cor(.[,1:3]),
        cov(.[,1:3])
    ),
    b = list(
        cor(.[,4:6]),
        cov(.[,4:6])
    )
) %>%
  `[`(-1)

CodePudding user response:

You could pipe it into an anonymous function to tell R explicitly where to use your dataframe in constructing the output list:

library(tidyverse)

matrix(1:30, ncol = 6)  %>%  as.data.frame()  %>%
  (function(df) {
    list(a = list(cor(df[, 1:3]),
                  cov(df[, 1:3])),
         b = list(cor(df[, 4:6]),
                  cov(df[, 4:6])))
  })
#> $a
#> $a[[1]]
#>    V1 V2 V3
#> V1  1  1  1
#> V2  1  1  1
#> V3  1  1  1
#> 
#> $a[[2]]
#>     V1  V2  V3
#> V1 2.5 2.5 2.5
#> V2 2.5 2.5 2.5
#> V3 2.5 2.5 2.5
#> 
#> 
#> $b
#> $b[[1]]
#>    V4 V5 V6
#> V4  1  1  1
#> V5  1  1  1
#> V6  1  1  1
#> 
#> $b[[2]]
#>     V4  V5  V6
#> V4 2.5 2.5 2.5
#> V5 2.5 2.5 2.5
#> V6 2.5 2.5 2.5

It's perhaps one of the limitations of the magrittr pipe that, by default, it passes the LHS in as the first argument unless the dot placeholder (.) appears as a direct argument of the RHS call (as in the lm examples); a dot that only appears inside a nested call, such as cor(.[, 1:3]), does not prevent this. As shocking to the system as the migration to the new pipe (|>) is, maybe its added elements will overcome this issue in the future.

Created on 2022-04-29 by the reprex package (v2.0.1)
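
As a further option related to the first-argument rule discussed above: the magrittr documentation notes that the rule can be overruled by enclosing the right-hand side in braces. A minimal sketch under that assumption, where df and res are illustrative names only:

```r
library(magrittr)

df <- as.data.frame(matrix(1:30, ncol = 6))

# Wrapping the right-hand side in braces makes magrittr treat it as a plain
# expression: the dot is used only where it is written, and df is not
# injected as a first argument of list().
res <- df %>% {
  list(
    a = list(cor(.[, 1:3]), cov(.[, 1:3])),
    b = list(cor(.[, 4:6]), cov(.[, 4:6]))
  )
}

names(res)  # "a" "b"
```

This keeps the original list(...) call unchanged and avoids both the extra extract(-1) step and the explicit anonymous function.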
