df1 <- data.frame(x1_modhigh_2020 = 1,
                  x2_modhigh_2030 = 1,
                  x1_low_2020 = 1,
                  x2_low_2030 = 1,
                  x1_high_2020 = 1,
                  x2_high_2030 = 1)
In a for-loop I want to select columns based on whether they contain 'low', 'modhigh' or 'high' and do some operations on them. My method of selecting columns is:
library(dplyr)
df1 %>% dplyr::select(contains("low")) # this works
df1 %>% dplyr::select(contains("modhigh")) # this works
df1 %>% dplyr::select(contains("high")) # does not work: this also selects `modhigh`
How can I modify the selection of 'high' so that 'modhigh' does not get selected as well?
CodePudding user response:
With matches() you can use regex syntax (contains() only matches literal strings and does not accept regex). Here, for example, the pipe | is a regex metacharacter signifying alternation, and the leading underscore in '_high' keeps 'modhigh' from matching:
df1 %>%
select(matches("_high|low"))
  x1_low_2020 x2_low_2030 x1_high_2020 x2_high_2030
1           1           1            1            1
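The question asks for each level separately inside a for-loop, so the same idea can be narrowed to one level at a time. The following is only a sketch under the assumption that the level always sits between two underscores in the column name; the names `groups`, `pattern` and `selected` are made up for illustration and are not part of the original post.

```r
library(dplyr)

df1 <- data.frame(x1_modhigh_2020 = 1,
                  x2_modhigh_2030 = 1,
                  x1_low_2020 = 1,
                  x2_low_2030 = 1,
                  x1_high_2020 = 1,
                  x2_high_2030 = 1)

# Hypothetical loop: one selection per level
groups <- c("low", "modhigh", "high")
selected <- list()

for (grp in groups) {
  # Wrapping the level in underscores means "_high_" cannot match "modhigh"
  pattern <- paste0("_", grp, "_")
  selected[[grp]] <- df1 %>% select(matches(pattern))
}

names(selected$high)
```

Because the pattern here contains no regex metacharacters, contains(pattern) should behave the same way in this case; matches() only becomes necessary once alternation or anchors are involved.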
CodePudding user response:
I would also use the matches() selection helper proposed by @Chris, but if you are interested in alternatives:
# dplyr
dplyr::select(df1, grep("_high|low", colnames(df1)))
# base R
df1[, grep("_high|low", colnames(df1))]
Both result in
  x1_low_2020 x2_low_2030 x1_high_2020 x2_high_2030
1           1           1            1            1
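If you would rather avoid pattern matching altogether, one more base R option is to split the column names on the underscore and compare the middle part exactly. This is a sketch that assumes every name follows the prefix_level_year layout of the example data; `parts`, `level` and `high_cols` are illustrative names.

```r
df1 <- data.frame(x1_modhigh_2020 = 1,
                  x2_modhigh_2030 = 1,
                  x1_low_2020 = 1,
                  x2_low_2030 = 1,
                  x1_high_2020 = 1,
                  x2_high_2030 = 1)

# Split each name into its pieces, e.g. "x1_high_2020" -> "x1", "high", "2020"
parts <- strsplit(colnames(df1), "_", fixed = TRUE)

# Take the second piece (the level) from every column name
level <- vapply(parts, function(p) p[2], character(1))

# Exact comparison, so "modhigh" is never confused with "high";
# drop = FALSE keeps a data frame even if only one column matches
high_cols <- df1[, level == "high", drop = FALSE]
colnames(high_cols)
```

The exact comparison trades flexibility for safety: it cannot accidentally match a substring, but it breaks if a column name does not have the level in the second position.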