I am trying to use the select function in dplyr to drop columns whose names match a certain pattern.
Specifically, say I have variable names:
p1_first...1
p1_middle...2
p1_last...3
p1_first...7
p1_middle...9
p1_last...11
I want to delete the columns ending with "...7", "...9", & "...11", keeping those with "...1", "...2", "...3". Therefore, the resulting dataset would retain names:
p1_first...1
p1_middle...2
p1_last...3
I have tried the following (with no success):
data %>%
select(-c(num_range("p1_", 7:11)
))
Any help is appreciated. Thanks!
CodePudding user response:
library(dplyr)
df %>%
select(-ends_with(c("...7", "...9", "...11")))
Example:
my_colnames <- c("p1_first...1", "p1_middle...2", "p1_last...3",
"p1_first...7", "p1_middle...9", "p1_last...11")
df <- mtcars[,1:6]
colnames(df) <- my_colnames
df %>%
select(-ends_with(c("...7", "...9", "...11")))
p1_first...1 p1_middle...2 p1_last...3
Mazda RX4 21.0 6 160.0
Mazda RX4 Wag 21.0 6 160.0
Datsun 710 22.8 4 108.0
Hornet 4 Drive 21.4 6 258.0
Hornet Sportabout 18.7 8 360.0
Valiant 18.1 6 225.0
Duster 360 14.3 8 360.0
Merc 240D 24.4 4 146.7
Merc 230 22.8 4 140.8
Merc 280 19.2 6 167.6
Merc 280C 17.8 6 167.6
Merc 450SE 16.4 8 275.8
Merc 450SL 17.3 8 275.8...
CodePudding user response:
You could use the matches() function with some regex:
data %>%
select(-matches('\\.(7|9|11)$'))
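To see what that pattern actually catches, here is a small sketch in base R that applies the same regex to the column names from the question. The vector name `cols` and the variables `dropped` and `kept` are just illustrative; `grepl()` stands in for what `matches()` does on the column names.

```r
# Column names taken from the question
cols <- c("p1_first...1", "p1_middle...2", "p1_last...3",
          "p1_first...7", "p1_middle...9", "p1_last...11")

# Same pattern as in matches(): a literal dot, then 7, 9 or 11,
# anchored at the end of the name
pattern <- "\\.(7|9|11)$"

dropped <- cols[grepl(pattern, cols)]   # names the pattern matches
kept    <- cols[!grepl(pattern, cols)]  # names that would remain

print(dropped)
print(kept)
```

Because the pattern requires a dot immediately before the number, a name ending in "...17" would not be matched by the "7" alternative.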
CodePudding user response:
You can specify the numbers you want to drop with a sequence, which might help if you have a more complex situation than just 3 columns. Note that ends_with() inside dplyr::select() needs a character vector, so the numeric sequence must be converted with as.character().
library(tidyverse)
d <- data.frame(p1_first...1 = 1:5, p1_middle...2 = 1:5, p1_last...3 = 1:5, p1_first...7 = 1:5, p1_middle...9 = 1:5, p1_last...11 = 1:5)
d %>% select(-ends_with(as.character(seq(7, 11, 2))))
#> p1_first...1 p1_middle...2 p1_last...3
#> 1 1 1 1
#> 2 2 2 2
#> 3 3 3 3
#> 4 4 4 4
#> 5 5 5 5
Created on 2022-02-11 by the reprex package (v2.0.1)
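One thing to watch with bare digits: ends_with("7") also matches a name ending in "...17". A possible variation, assuming the suffixes always have the "...N" form, is to build the full suffix with paste0(). The data frame `d2` and its `p1_extra...17` column are made up for illustration.

```r
library(dplyr)

# Hypothetical data: one column ends in "...17" and should be kept
d2 <- data.frame(p1_first...1 = 1:3, p1_last...7 = 1:3, p1_extra...17 = 1:3)

# Bare digits: "7" is a suffix of "...17", so that column is dropped too
bare <- d2 %>% select(-ends_with(as.character(seq(7, 11, 2))))

# Full suffixes "...7", "...9", "...11": only the exact endings are dropped
suffixes <- paste0("...", seq(7, 11, 2))
full <- d2 %>% select(-ends_with(suffixes))

names(bare)
names(full)
```

ends_with() matches its argument literally, so the dots in the suffixes do not need escaping.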