Unselect variables in dplyr

Time:11-04

I have a data set with 800 variables and I am trying to dichotomise variables 91 to 166. However, I do not want to drop the other variables, which is what the code below does. Is there a way to "unselect" the variables?

Thanks!

dt_test <- dt %>%
  select(91:166) %>%
  dicho(dich.by = 2)

CodePudding user response:

According to ?dicho, we can pass unquoted column names:

... - Optional, unquoted names of variables that should be selected for further processing. Required, if x is a data frame (and no vector) and only selected variables from x should be processed. You may also use functions like : or tidyselect's select-helpers. See 'Examples'

library(dplyr)
library(sjmisc)
iris %>% 
    dicho(!!! rlang::syms(names(.)[1:4]), dich.by = 2)

-output

  Sepal.Length Sepal.Width Petal.Length Petal.Width    Species Sepal.Length_d Sepal.Width_d Petal.Length_d Petal.Width_d
1            5.1         3.5          1.4         0.2     setosa              1             1              0             0
2            4.9         3.0          1.4         0.2     setosa              1             1              0             0
3            4.7         3.2          1.3         0.2     setosa              1             1              0             0
4            4.6         3.1          1.5         0.2     setosa              1             1              0             0
5            5.0         3.6          1.4         0.2     setosa              1             1              0             0
...

Or, since these are just numeric column positions, specifying the index directly should also work:

iris %>%
     dicho(1:4, dich.by = 2)

CodePudding user response:

This dichotomizes variables 5 through 8 of the built-in anscombe data frame at each column's mean, suffixing _2 to the original names. The unary + converts the logical comparison to 0/1. To overwrite the original columns instead of generating new ones, omit the .names argument.

library(dplyr)
anscombe %>% mutate(across(5:8, ~ +(. > mean(.)), .names = "{col}_2"))

giving:

   x1 x2 x3 x4    y1   y2    y3    y4 y1_2 y2_2 y3_2 y4_2
1  10 10 10  8  8.04 9.14  7.46  6.58    1    1    0    0
2   8  8  8  8  6.95 8.14  6.77  5.76    0    1    0    0
3  13 13 13  8  7.58 8.74 12.74  7.71    1    1    1    1
4   9  9  9  8  8.81 8.77  7.11  8.84    1    1    0    1
5  11 11 11  8  8.33 9.26  7.81  8.47    1    1    1    1
6  14 14 14  8  9.96 8.10  8.84  7.04    1    1    1    0
7   6  6  6  8  7.24 6.13  6.08  5.25    0    0    0    0
8   4  4  4 19  4.26 3.10  5.39 12.50    0    0    0    1
9  12 12 12  8 10.84 9.13  8.15  5.56    1    1    1    0
10  7  7  7  8  4.82 7.26  6.42  7.91    0    0    0    1
11  5  5  5  8  5.68 4.74  5.73  6.89    0    0    0    0
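
As a rough sketch of the overwrite variant with a fixed cut-off instead of the mean (closer to dich.by = 2 in the question): the cut-off value 7.5 and the name anscombe_dicho are illustrative assumptions only, chosen so that the anscombe columns split into both groups.

```r
library(dplyr)

# Hypothetical fixed cut-off, chosen only for illustration
cutoff <- 7.5

# No .names argument, so columns 5 to 8 are overwritten in place;
# columns 1 to 4 pass through unchanged
anscombe_dicho <- anscombe %>%
  mutate(across(5:8, ~ +(.x > cutoff)))
```

The result has the same number of columns as the input, which is the "keep everything else" behaviour the question asks for.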

CodePudding user response:

You can select the columns you want to dichotomize directly in the call to dicho() instead of using select(), so the remaining columns are kept.

install.packages("sjmisc")  # only needed once
library(sjmisc)
# example data: one character column followed by nine numeric columns
dt_test <- data.frame(matrix(nrow = 100, ncol = 10))
dt_test[, 1] <- rep(c('a', 'b', 'c'), length.out = 100)
for (i in 2:10) {
  dt_test[, i] <- runif(100, 0, 5)
}

# dichotomize columns 2 to 5 only; all other columns are kept
dt_test <- dicho(dt_test, 2:5, dich.by = 2)