Removing columns based on a segment of the column names

I have a dataframe with many columns (close to 100) that I don't need, all of which have "CNT" in the middle of their names. Below is a short example:

id   drink  drink_CNT_v2   sage_CNT_v5
1      12        23             12
2      14        32             13
3      15        12             12
4      16        12             43
5      20        50             23

I want to remove all variables that have CNT in the middle of their names. Does anyone know how I could do that? I tried using mutate in tidyverse, but that didn't work.

CodePudding user response:

We could use contains() inside select() to drop every column whose name contains "_CNT_":

library(dplyr)
df2 <- df1 %>%
   select(-contains("_CNT_"))

-output

> df2
  id drink
1  1    12
2  2    14
3  3    15
4  4    16
5  5    20

data

df1 <- structure(list(id = 1:5, drink = c(12L, 14L, 15L, 16L, 20L), 
    drink_CNT_v2 = c(23L, 32L, 12L, 12L, 50L), sage_CNT_v5 = c(12L, 
    13L, 12L, 43L, 23L)), class = "data.frame", row.names = c(NA, 
-5L))
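
One detail worth knowing about the approach above: contains() ignores case by default (ignore.case = TRUE), so a column such as notes_cnt_v1 would be dropped too. The sketch below illustrates this; the notes_cnt_v1 column is hypothetical and was added only for the demonstration, it is not part of the question's data.

```r
library(dplyr)

# Same example data as in the answer, plus one hypothetical lowercase
# column (notes_cnt_v1) added only to show the case-sensitivity difference.
df1 <- data.frame(
  id           = 1:5,
  drink        = c(12L, 14L, 15L, 16L, 20L),
  drink_CNT_v2 = c(23L, 32L, 12L, 12L, 50L),
  sage_CNT_v5  = c(12L, 13L, 12L, 43L, 23L),
  notes_cnt_v1 = c(1L, 2L, 3L, 4L, 5L)
)

# Default behaviour: ignore.case = TRUE, so the lowercase column goes as well.
loose <- df1 %>% select(-contains("_CNT_"))
names(loose)

# Case-sensitive match: only the uppercase _CNT_ columns are dropped.
strict <- df1 %>% select(-contains("_CNT_", ignore.case = FALSE))
names(strict)
```

If all of the real column names use uppercase CNT, both calls should give the same result and the default is fine.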

CodePudding user response:

In base R, with grepl:

df[!grepl("CNT", colnames(df))]
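
To make the step above easier to follow, here is a minimal sketch that splits it into the logical mask and the subsetting. It assumes df holds the example data from the question; the pattern "XYZ" is a made-up one that matches nothing, used only to compare the logical form with negative integer indexing.

```r
# Example data from the question, rebuilt so the snippet runs on its own.
df <- data.frame(
  id           = 1:5,
  drink        = c(12L, 14L, 15L, 16L, 20L),
  drink_CNT_v2 = c(23L, 32L, 12L, 12L, 50L),
  sage_CNT_v5  = c(12L, 13L, 12L, 43L, 23L)
)

# grepl() returns one TRUE/FALSE per column name.
has_cnt <- grepl("CNT", colnames(df), fixed = TRUE)
has_cnt

# Negating the mask keeps only the columns without "CNT".
kept <- df[!has_cnt]
names(kept)

# With a pattern that matches no column, the logical form keeps everything ...
none <- grepl("XYZ", colnames(df), fixed = TRUE)
n_logical <- ncol(df[!none])

# ... whereas negative integer indexing with grep() ends up with zero columns,
# because -integer(0) is still integer(0).
n_integer <- ncol(df[-grep("XYZ", colnames(df), fixed = TRUE)])
```

The logical mask is the safer choice in base R when the pattern might not match any column.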

The same idea also works with select(), using grep() to get the positions of the matching columns:

df %>% 
  select(-grep("CNT", names(.)))
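
If "in the middle" should be taken literally, a possible variant is matches() with an anchored regular expression, which also avoids the names(.) call. This is only a sketch: the CNT_total column is hypothetical, added to show a name that starts with CNT and is therefore kept, and the pattern assumes the segment is always written as _CNT_ with at least one character on each side.

```r
library(dplyr)

# Example data from the question, plus a hypothetical column (CNT_total)
# whose name starts with CNT instead of having it in the middle.
df <- data.frame(
  id           = 1:5,
  drink        = c(12L, 14L, 15L, 16L, 20L),
  drink_CNT_v2 = c(23L, 32L, 12L, 12L, 50L),
  sage_CNT_v5  = c(12L, 13L, 12L, 43L, 23L),
  CNT_total    = c(35L, 45L, 24L, 55L, 73L)
)

# Drop only names with something before and after "_CNT_".
# ignore.case = FALSE because matches() is case-insensitive by default.
middle_only <- df %>%
  select(-matches("^.+_CNT_.+$", ignore.case = FALSE))

names(middle_only)
```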