I have example data as follows:
dat <- mtcars
dat$add <- dat[,1]
dat$last <- dat[,1]
names(dat) <- c("fraction_low", "fraction_medium", "fraction_high", "fraction_low_2000", "fraction_medium_2000", "fraction_high_2000","fraction_low_2001", "fraction_medium_2001", "fraction_high_2001","fraction_low_comb", "fraction_medium_comb", "fraction_high_comb", "last")
I would like to remove all columns/variable names that have "fraction" and "low", "medium", or "high" in their name, except the columns/variable names that also have "comb" in the name.
I found a lot of good answers for removing columns with a certain pattern here. For example:
library(dplyr)
dat %>%
  select(-(contains("fraction") & (contains("low") | contains("medium") | contains("high"))))
But how should I implement an exception to such a pattern?
Desired output:
desired_output <- dat[,10:13]
               fraction_low_comb fraction_medium_comb fraction_high_comb last
Mazda RX4                      4                    4               21.0 21.0
Mazda RX4 Wag                  4                    4               21.0 21.0
Datsun 710                     4                    1               22.8 22.8
Hornet 4 Drive                 3                    1               21.4 21.4
CodePudding user response:
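We may drop every column that matches any of the patterns and then add the "comb" columns back. Note that this puts last before the comb columns:
library(dplyr)
dat %>%
  select(-matches(c("fraction", "low", "high", "medium")), contains("comb"))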
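Staying closer to the attempt in the question, the exception can also be written directly with the boolean selection helpers. This is only a minimal sketch, assuming a reasonably recent dplyr/tidyselect in which !, & and | work inside select(); res is just an illustrative name.

```r
library(dplyr)

# Rebuild the example data from the question
dat <- mtcars
dat$add <- dat[, 1]
dat$last <- dat[, 1]
names(dat) <- c("fraction_low", "fraction_medium", "fraction_high",
                "fraction_low_2000", "fraction_medium_2000", "fraction_high_2000",
                "fraction_low_2001", "fraction_medium_2001", "fraction_high_2001",
                "fraction_low_comb", "fraction_medium_comb", "fraction_high_comb",
                "last")

# Drop columns whose name contains "fraction" AND one of low/medium/high,
# UNLESS the name also contains "comb" (the exception)
res <- dat %>%
  select(!(contains("fraction") &
             (contains("low") | contains("medium") | contains("high")) &
             !contains("comb")))

names(res)
```

Because nothing is added back afterwards, the remaining columns should stay in their original order.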
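Or we may use a single regular expression with a negative lookahead (hence perl = TRUE), which keeps the original column order:
dat %>%
  select(-matches("fraction_(low|medium|high)_(?!comb)|fraction_(low|medium|high)$",
                  perl = TRUE))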
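To see which names the pattern catches, the same regular expression can be tried on the column names alone with base R's grepl(). This is a small sketch; nms, pattern and to_drop are illustrative names. Unlike matches(), grepl() is case-sensitive by default, which should make no difference for these lowercase names.

```r
# Column names from the question
nms <- c("fraction_low", "fraction_medium", "fraction_high",
         "fraction_low_2000", "fraction_medium_2000", "fraction_high_2000",
         "fraction_low_2001", "fraction_medium_2001", "fraction_high_2001",
         "fraction_low_comb", "fraction_medium_comb", "fraction_high_comb",
         "last")

# First alternative: fraction_<level>_ NOT followed by "comb" (negative lookahead)
# Second alternative: fraction_<level> at the very end of the name
pattern <- "fraction_(low|medium|high)_(?!comb)|fraction_(low|medium|high)$"

# Lookaheads need the PCRE engine, hence perl = TRUE
to_drop <- grepl(pattern, nms, perl = TRUE)

# Names that survive the filter
nms[!to_drop]
```

With the question's dat, dat[, !to_drop, drop = FALSE] should then give the same subset without dplyr.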