It seems like I need a refresher on how column selection works across different tidyverse functions. Why do the map/mutate selections behave differently from dplyr's select?
Question 1: How are we supposed to map/mutate over every column except one, identified by name? Why does '-"ID"' not work the way it does in select? It is completely counterintuitive to me.
library(tidyverse)
df <- data.frame(ID = 1:10,
a = 1:10,
b = 1:10,
d = 1:10,
e = 1:10)
df %>% map_at(-(1:3), as.character) # this works
df %>% select(-"b") # this works
df %>% map_at(-"b", as.character) # Why does this not work?
df %>% map_if(colnames(.)!="b", as.character) # there has to be a better way
df %>% rowwise() %>% mutate(Sum = sum(-ID)) # I get why this does not work, but how do I do this?
df %>% rowwise() %>% mutate(Sum = sum(select(., -ID))) # This way I lose the rowwise operation
df %>% mutate(Sum = apply(df %>% select(-ID), 1, sum)) # there has to be a better way
How about mapping everywhere except the last column?
df %>% map_at(-ncol(.), as.character) # is there a better way?
Question 2: How do I select columns (by name) with the ':' operator?
df %>% select(a:e)
df %>% map_at(a:e, as.character) # doesn't work. How do I do this?
df %>% rowwise() %>% mutate(Sum = sum(a:e)) # misleading - Does it just use a? Why no error?
df %>% rowwise() %>% mutate(Sum = sum(a,b,d,e)) # this won't work with hundreds of variables
Thank you!
CodePudding user response:
According to the docs (?map_at), the .at argument takes:
A character vector of names, positive numeric vector of positions to include, or a negative numeric vector of positions to exclude.
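Hence -c(1:3) works, as it is a negative numeric vector of positions to exclude. But you can't do -"b": outside of a tidyselect context, R tries to negate the string itself, which is an error. For this case you have to wrap the selection in vars() (exported by dplyr), which lets map_at evaluate it with tidyselect.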
Using vars(), you can rewrite the map_at examples from your first question and the first example from your second question:
library(tidyverse)
df %>% map_at(vars(-b), as.character)
#> $ID
#> [1] "1" "2" "3" "4" "5" "6" "7" "8" "9" "10"
#>
#> $a
#> [1] "1" "2" "3" "4" "5" "6" "7" "8" "9" "10"
#>
#> $b
#> [1] 1 2 3 4 5 6 7 8 9 10
#>
#> $d
#> [1] "1" "2" "3" "4" "5" "6" "7" "8" "9" "10"
#>
#> $e
#> [1] "1" "2" "3" "4" "5" "6" "7" "8" "9" "10"
df %>% map_at(vars(-ncol(.)), as.character)
#> $ID
#> [1] "1" "2" "3" "4" "5" "6" "7" "8" "9" "10"
#>
#> $a
#> [1] "1" "2" "3" "4" "5" "6" "7" "8" "9" "10"
#>
#> $b
#> [1] "1" "2" "3" "4" "5" "6" "7" "8" "9" "10"
#>
#> $d
#> [1] "1" "2" "3" "4" "5" "6" "7" "8" "9" "10"
#>
#> $e
#> [1] 1 2 3 4 5 6 7 8 9 10
df %>% map_at(vars(a:e), as.character)
#> $ID
#> [1] 1 2 3 4 5 6 7 8 9 10
#>
#> $a
#> [1] "1" "2" "3" "4" "5" "6" "7" "8" "9" "10"
#>
#> $b
#> [1] "1" "2" "3" "4" "5" "6" "7" "8" "9" "10"
#>
#> $d
#> [1] "1" "2" "3" "4" "5" "6" "7" "8" "9" "10"
#>
#> $e
#> [1] "1" "2" "3" "4" "5" "6" "7" "8" "9" "10"
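If you would rather not depend on vars() (as far as I know, purrr 1.0.0 and later deprecate it inside .at), here is a minimal sketch of two alternatives. It assumes dplyr 1.0.0 or later for across(); the result names (res_list, res_not_b, and so on) are only illustrative. Note that map_at returns a plain list, while mutate(across(...)) keeps the data frame.

```r
library(dplyr)
library(purrr)

df <- data.frame(ID = 1:10, a = 1:10, b = 1:10, d = 1:10, e = 1:10)

# .at as a character vector of names: every column except "b".
# map_at() returns a list, not a data frame.
res_list <- df %>% map_at(setdiff(names(df), "b"), as.character)

# across() accepts tidyselect syntax directly and returns a data frame.
res_not_b <- df %>% mutate(across(-b, as.character))

# Every column except the last one.
res_not_last <- df %>% mutate(across(-last_col(), as.character))

# A range of columns selected by name.
res_range <- df %>% mutate(across(a:e, as.character))
```

The across() versions read much like the select() calls in the question, which may be the least surprising option if you want a data frame back.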
The same applies to your rowwise question. You can't pass several columns to sum using e.g. a:e. Instead, you have to wrap the selection in c_across:
df %>% rowwise() %>% mutate(Sum = sum(c_across(a:e)))
#> # A tibble: 10 × 6
#> # Rowwise:
#>       ID     a     b     d     e   Sum
#>    <int> <int> <int> <int> <int> <int>
#>  1     1     1     1     1     1     4
#>  2     2     2     2     2     2     8
#>  3     3     3     3     3     3    12
#>  4     4     4     4     4     4    16
#>  5     5     5     5     5     5    20
#>  6     6     6     6     6     6    24
#>  7     7     7     7     7     7    28
#>  8     8     8     8     8     8    32
#>  9     9     9     9     9     9    36
#> 10    10    10    10    10    10    40
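To illustrate why sum(a:e) raised no error in the question, and how to sum everything except ID row by row, here is a small sketch. It assumes dplyr 1.0.0 or later; the names wrong and res are only illustrative. Inside rowwise() and mutate(), a and e are single values per row, so a:e is base R's sequence operator rather than a column range. In this data a equals e in every row, so the sequence has one element and the sum is just a.

```r
library(dplyr)

df <- data.frame(ID = 1:10, a = 1:10, b = 1:10, d = 1:10, e = 1:10)

# `a:e` here builds the integer sequence from a to e for each row.
# Because a == e in every row, that sequence is just the value of a.
wrong <- df %>%
  rowwise() %>%
  mutate(Sum = sum(a:e)) %>%
  ungroup()

# c_across() takes tidyselect syntax, so a negative selection works:
# sum every column except ID, one row at a time.
res <- df %>%
  rowwise() %>%
  mutate(Sum = sum(c_across(-ID))) %>%
  ungroup()
```

The ungroup() at the end drops the rowwise grouping so later steps operate on whole columns again.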
A second option for the last operation is to use rowSums with across:
df %>% mutate(Sum = rowSums(across(a:e)))
#>    ID  a  b  d  e Sum
#> 1   1  1  1  1  1   4
#> 2   2  2  2  2  2   8
#> 3   3  3  3  3  3  12
#> 4   4  4  4  4  4  16
#> 5   5  5  5  5  5  20
#> 6   6  6  6  6  6  24
#> 7   7  7  7  7  7  28
#> 8   8  8  8  8  8  32
#> 9   9  9  9  9  9  36
#> 10 10 10 10 10 10  40
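The same rowSums pattern also covers the "every column except ID" case from the question. This is a sketch assuming dplyr 1.0.0 or later; res_rowsums is an illustrative name. On dplyr 1.1.0 and later, pick(-ID) can be used in place of across(-ID) when you only want to select columns.

```r
library(dplyr)

df <- data.frame(ID = 1:10, a = 1:10, b = 1:10, d = 1:10, e = 1:10)

# across() without a function returns the selected columns as a data frame,
# which rowSums() then sums row by row. No rowwise() is needed.
res_rowsums <- df %>% mutate(Sum = rowSums(across(-ID)))
```

Because rowSums() is vectorised over the whole data frame, this tends to scale better than rowwise() when there are many rows.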