Using R to delete columns whose names follow a certain naming sequence

I am trying to use the select function in dplyr to drop columns whose names match a certain pattern.

Specifically, say I have variable names:

p1_first...1

p1_middle...2

p1_last...3

p1_first...7

p1_middle...9

p1_last...11

I want to delete the columns ending with "...7", "...9", & "...11", keeping those with "...1", "...2", "...3". Therefore, the resulting dataset would retain names:

p1_first...1

p1_middle...2

p1_last...3

I have tried the following (with no success):

data %>%
  select(-c(num_range("p1_", 7:11)))

Any help is appreciated, thanks!

CodePudding user response：

library(dplyr)
df %>% 
  select(-ends_with(c("...7", "...9", "...11")))

Example:

my_colnames <- c("p1_first...1", "p1_middle...2", "p1_last...3", 
"p1_first...7", "p1_middle...9", "p1_last...11")

df <- mtcars[,1:6]

colnames(df) <- my_colnames
df %>% 
  select(-ends_with(c("...7", "...9", "...11")))

 
                    p1_first...1 p1_middle...2 p1_last...3
Mazda RX4                   21.0             6       160.0
Mazda RX4 Wag               21.0             6       160.0
Datsun 710                  22.8             4       108.0
Hornet 4 Drive              21.4             6       258.0
Hornet Sportabout           18.7             8       360.0
Valiant                     18.1             6       225.0
Duster 360                  14.3             8       360.0
Merc 240D                   24.4             4       146.7
Merc 230                    22.8             4       140.8
Merc 280                    19.2             6       167.6
Merc 280C                   17.8             6       167.6
Merc 450SE                  16.4             8       275.8
Merc 450SL                  17.3             8       275.8...
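As a side note on why the attempt in the question has no effect: as far as I understand it, num_range("p1_", 7:11) looks for names made of the prefix followed directly by the number (p1_7, p1_8, ...), and none of the columns are named that way. A rough base-R sketch of that lookup, where paste0() is only a stand-in for what num_range() searches for, not its actual implementation:

```r
# Column names from the question
my_colnames <- c("p1_first...1", "p1_middle...2", "p1_last...3",
                 "p1_first...7", "p1_middle...9", "p1_last...11")

# Names a prefix-plus-number lookup would search for
# (assumed to mirror num_range("p1_", 7:11))
wanted <- paste0("p1_", 7:11)
print(wanted)

# None of those names exist among the columns, so nothing is selected for removal
print(any(wanted %in% my_colnames))
```

Because the number in these names comes after the "..." separator rather than right after the prefix, matching on the end of the name (as with ends_with()) fits better.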

CodePudding user response：

You could use the matches function with some regex:

data %>%
  select(-matches('\\.(7|9|11)$'))
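If you want to check what the pattern catches before dropping anything, here is a minimal sketch using base R's grepl() on the column names from the question. The name p1_extra...17 is hypothetical and only there to show that the leading \\. keeps the match anchored to the whole trailing number:

```r
# Column names from the question
my_colnames <- c("p1_first...1", "p1_middle...2", "p1_last...3",
                 "p1_first...7", "p1_middle...9", "p1_last...11")

# A literal dot, then 7, 9 or 11, at the very end of the name
pattern <- "\\.(7|9|11)$"

# TRUE for the columns the pattern would drop
to_drop <- grepl(pattern, my_colnames)

# Names that would be kept
kept <- my_colnames[!to_drop]
print(kept)

# Hypothetical name ending in 17: the pattern does not match it
print(grepl(pattern, "p1_extra...17"))
```

The kept names are p1_first...1, p1_middle...2 and p1_last...3, and the hypothetical ...17 column is left alone because the dot has to sit directly in front of 7, 9 or 11.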

CodePudding user response：

You can specify the numbers you want to drop with a sequence, which might help if you have a more complex situation than just 3 columns. Note that ends_with() expects a character vector, so the numbers have to be turned into strings. paste0() does that and also lets you prepend the "..." separator, so that a bare suffix such as "7" cannot accidentally match a name ending in "...17".

library(tidyverse)

d <- data.frame(p1_first...1 = 1:5, p1_middle...2 = 1:5, p1_last...3 = 1:5, p1_first...7 = 1:5, p1_middle...9 = 1:5, p1_last...11 = 1:5)

d %>% select(-ends_with(paste0("...", seq(7, 11, 2))))
#>   p1_first...1 p1_middle...2 p1_last...3
#> 1            1             1           1
#> 2            2             2           2
#> 3            3             3           3
#> 4            4             4           4
#> 5            5             5           5

Created on 2022-02-11 by the reprex package (v2.0.1)
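One thing to watch when matching on the end of a name: a suffix made of bare digits also matches longer numbers that end in the same digits. A small sketch with base R's endsWith(), used here as a stand-in for a literal suffix match. The column p1_extra...17 is hypothetical and added only for illustration:

```r
# Two names from the question plus one hypothetical name ending in 17
nms <- c("p1_first...1", "p1_first...7", "p1_extra...17")

# A bare "7" matches both ...7 and ...17
bare <- endsWith(nms, "7")
print(nms[bare])

# Including the "..." separator matches only ...7
full <- endsWith(nms, "...7")
print(nms[full])
```

With the six columns in the question both forms give the same result, so this only matters once the suffix numbers get larger.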