How to pull two (or more) variables with |>?

Time:11-30

I have this data frame:

df_1 <- data.frame(
  x = replicate(
    n = 10, expr = runif(n = 30, min = 20, max = 100)
  )
)

This works:

df_1 |> 
  (`$`)("x.1")

But:

df_1 |> 
  (`$`)("x.1", "x.2")

or

df_1 |> 
  (`$`)("x.1", "x.2", "x.3")

Neither of these works. I tried using %in%, [, [[ and c(), but nothing worked.

CodePudding user response:

Does this solve your problem?

library(tidyverse)
df_1 <- data.frame(
  x = replicate(
    n = 10, expr = runif(n = 30, min = 20, max = 100)
  )
)

df_1 |>
  (`$`)("x.1")
#>  [1] 83.58172 81.79497 66.68859 58.34478 70.15752 44.11395 29.70086 91.86101
#>  [9] 75.56647 60.79799 95.70545 67.72019 30.15366 61.11601 35.67833 96.67384
#> [17] 83.88540 41.60109 38.37747 22.11303 93.25775 37.57238 72.21758 46.34140
#> [25] 73.44505 34.72846 85.05045 31.63732 81.01025 39.87864

# To get the same output you can use the dplyr::pull() function
df_1 |>
  pull(x.1)
#>  [1] 83.58172 81.79497 66.68859 58.34478 70.15752 44.11395 29.70086 91.86101
#>  [9] 75.56647 60.79799 95.70545 67.72019 30.15366 61.11601 35.67833 96.67384
#> [17] 83.88540 41.60109 38.37747 22.11303 93.25775 37.57238 72.21758 46.34140
#> [25] 73.44505 34.72846 85.05045 31.63732 81.01025 39.87864

# But I suspect you want to use dplyr::select() to return a dataframe, e.g.
df_1 |>
  select(x.1, x.2)
#>         x.1      x.2
#> 1  83.58172 64.61144
#> 2  81.79497 36.32027
#> 3  66.68859 92.98443
#> 4  58.34478 82.00191
#> 5  70.15752 98.49730
#> 6  44.11395 83.41177
#> 7  29.70086 69.75196
#> 8  91.86101 95.15369
#> 9  75.56647 76.11292
#> 10 60.79799 94.12680
#> 11 95.70545 65.51677
#> 12 67.72019 32.07121
#> 13 30.15366 42.48538
#> 14 61.11601 46.95375
#> 15 35.67833 75.01294
#> 16 96.67384 78.32378
#> 17 83.88540 78.87134
#> 18 41.60109 85.47559
#> 19 38.37747 67.45953
#> 20 22.11303 95.04220
#> 21 93.25775 99.16017
#> 22 37.57238 50.44622
#> 23 72.21758 72.40720
#> 24 46.34140 23.84739
#> 25 73.44505 43.64801
#> 26 34.72846 57.57502
#> 27 85.05045 91.69576
#> 28 31.63732 55.02272
#> 29 81.01025 65.64208
#> 30 39.87864 75.25885

Created on 2021-11-30 by the reprex package (v2.0.1)

Edit

With only base R functions:

df_1 <- data.frame(
  x = replicate(
    n = 10, expr = runif(n = 30, min = 20, max = 100)
  )
)

df_1 |>
  subset(select = c(x.1, x.2))
#>         x.1      x.2
#> 1  24.17058 92.61567
#> 2  62.89304 69.88825
#> 3  52.20762 52.49215
#> 4  73.30312 71.52102
#> 5  53.19998 93.78352
#> 6  45.48347 47.47517
#> 7  38.05251 29.89043
#> 8  24.39346 59.00774
#> 9  88.82060 88.35271
#> 10 51.70573 22.48755
#> 11 94.19495 63.22420
#> 12 22.50503 82.86535
#> 13 87.79638 74.29959
#> 14 49.74795 81.19595
#> 15 50.64456 89.15560
#> 16 68.63600 62.50104
#> 17 69.48949 20.65006
#> 18 55.96310 57.99721
#> 19 39.03249 81.15098
#> 20 25.72585 30.26928
#> 21 36.81509 54.93060
#> 22 45.97520 57.81248
#> 23 29.86099 96.38645
#> 24 92.30959 36.34898
#> 25 83.84972 22.61482
#> 26 22.46339 46.44790
#> 27 90.03101 37.56431
#> 28 52.51286 31.61707
#> 29 97.08071 49.03669
#> 30 33.11007 62.06193

Created on 2021-11-30 by the reprex package (v2.0.1)
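
If the column names are held in a character vector rather than typed in directly, the same base R call should work unchanged, since subset() evaluates select and falls back to the calling environment for names that are not columns. A minimal sketch, assuming R 4.1 or later for the native pipe; wanted and picked are hypothetical names used only for illustration:

```r
df_1 <- data.frame(
  x = replicate(
    n = 10, expr = runif(n = 30, min = 20, max = 100)
  )
)

# Column names stored as strings (hypothetical variable name)
wanted <- c("x.1", "x.2")

# subset() finds `wanted` in the calling environment and uses it
# as a character index into the columns
picked <- df_1 |>
  subset(select = wanted)

names(picked)
```

This relies on no column of the data frame itself being called wanted; if one were, subset() would pick up that column's position instead of the variable.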

CodePudding user response:

The $ operator cannot handle two column names at once, but the "[" operator can. Try:

df_1 |> 
  (`[`)( c("x.1", "x.2") )

The column names do need to be supplied as a single character vector, just as they would if you were writing:

df_1[ c("x.1", "x.2") ]

If you tried with:

df_1 |> (`[`)( "x.1", "x.2" )

The "x.1" would be interpreted as a rowname and the "x.2" as a column name. That's just the way "[" works.
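
The difference between the one-index and two-index forms can be checked directly. This is a minimal sketch, assuming R 4.1 or later for the native pipe; the names small, by_vector, by_bracket and two_args are made up for illustration and are not part of the question:

```r
# A small data frame with the same column naming as in the question
small <- data.frame(
  x = replicate(n = 3, expr = runif(n = 5, min = 20, max = 100))
)

# One character vector: selects the columns x.1 and x.2
by_vector  <- small |> (`[`)(c("x.1", "x.2"))
by_bracket <- small[c("x.1", "x.2")]
identical(by_vector, by_bracket)

# Two separate strings: "x.1" is looked up among the row names
# and "x.2" among the column names
two_args <- small |> (`[`)("x.1", "x.2")
is.na(two_args)
```

Because the row names here are the automatic "1" to "5", no row is called "x.1", so the two-string form should come back as a single NA rather than as two columns.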

CodePudding user response:

Pipes work well when the object you are piping is the first argument of the function, as is the case with subset pointed out by @jared_mammot.

Anonymous (lambda) functions can be used in those cases where we need df_1 to be, say, the 2nd or 3rd argument, or where a function is not vectorized. None of the following is the preferred approach here, but for educational purposes:

df_1 <- data.frame(x = replicate(n = 10, expr = runif(n = 30, min = 20, max = 100)))

One using [

df_1 |>
  (\(., ...) .[, c(...)])("x.1", "x.2")

One using [[

df_1 |>
  (Vectorize(\(., x) .[[x]], c("x")))(c("x.1", "x.2"))

One using %in%

df_1 |>
  names() %in% c("x.1", "x.2") |>
  (\(.) subset(df_1, select = .))()

One using $

df_1 |>
  (\(.) cbind.data.frame(.$"x.1", .$"x.2"))()

One using c

df_1 |>
  (\(., ...) data.frame(c(.)[c(...)]))("x.1", "x.2")
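
As a sketch of the case mentioned at the top of this answer, where the data frame has to land somewhere other than the first argument: lm() takes a formula first and the data second, so an anonymous function can route the piped value into data. The choice of lm() and the names d and fit are assumptions made for illustration only, and the native pipe with \(x) lambdas needs R 4.1 or later:

```r
df_1 <- data.frame(x = replicate(n = 10, expr = runif(n = 30, min = 20, max = 100)))

# lm() expects the formula first and the data second,
# so the lambda places the piped data frame in the `data` argument
fit <- df_1 |>
  (\(d) lm(x.1 ~ x.2, data = d))()

names(coef(fit))
```

Naming the other argument, as in df_1 |> lm(formula = x.1 ~ x.2), should have the same effect, because the piped value then fills the first argument that is still unmatched.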

CodePudding user response:

bselect = `[` # make [ work with pipeOp

mtcars[1:10, ] |> 
     bselect(, 'mpg') # add `drop=FALSE` if you need a 1 column data.frame
[1] 21.0 21.0 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2

mtcars[1:10, ] |> 
   bselect(, c('mpg', 'cyl'))
                   mpg cyl
Mazda RX4         21.0   6
Mazda RX4 Wag     21.0   6
Datsun 710        22.8   4
Hornet 4 Drive    21.4   6
Hornet Sportabout 18.7   8
Valiant           18.1   6
Duster 360        14.3   8
Merc 240D         24.4   4
Merc 230          22.8   4
Merc 280          19.2   6
