I have a data frame with 50 columns whose names follow this structure:
X..Age <- c(23, 34, 24, 10)
..Region <- c("A", "B","C","D")
X.span.style..display.none..Salary <- c(100,200, 300, 400)
X.....code <- c(14, 12, 13, 15)
DF <- data.frame(X..Age, ..Region, X.span.style..display.none..Salary, X.....code)
I want to remove the strings X.., .., X.span.style..display.none.. and X..... from the column names. How do I go about this?
CodePudding user response:
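Use gsub (or sub) and the regex alternation operator "|" to combine all the unwanted prefixes into a single pattern. The dots must be escaped, otherwise "." matches any character and ".." would strip every pair of characters from every name. Anchoring with "^" limits the match to the start of the name.
df <- data.frame(X..Age=NA, ..Region=NA, X.span.style..display.none..Salary=NA, X.....code=NA,col_I_want=NA,col_to_keep=NA)
# longest prefixes first; "\\." matches a literal dot
remove_pattern <- c("X\\.span\\.style\\.\\.display\\.none\\.\\.","X\\.{5}","X\\.\\.","\\.\\.")
# anchor the alternation at the start of the name
remove_pattern <- paste0("^(",paste0(remove_pattern,collapse="|"),")")
names(df) <- sub(remove_pattern,"",names(df))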
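One thing to keep in mind with this approach: in a regular expression an unescaped `.` matches any character, so a pattern such as `..` matches any two characters, not two literal dots. A small sketch of the difference (`nms` is a made-up vector for illustration, not the asker's real data):

```r
# Hypothetical column names, for illustration only
nms <- c("X..Age", "..Region", "col_to_keep")

# Unescaped: ".." matches ANY two characters, so nearly everything is removed
unescaped <- gsub("..", "", nms)

# Escaped and anchored: only two literal dots at the start of the name
# (optionally preceded by "X") are removed
escaped <- sub("^X?\\.\\.", "", nms)

print(unescaped)
print(escaped)
```

The escaped version leaves names without a leading `..` or `X..` prefix untouched.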
CodePudding user response:
Posting this as an answer rather than a comment under Soren's answer: it follows the same principle but uses a generalised pattern instead (delete everything up to and including the last ..):
names(DF) <- gsub(".*\\.\\.", "", names(DF))
DF
Age Region Salary code
1 23 A 100 14
2 34 B 200 12
3 24 C 300 13
4 10 D 400 15
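With 50 columns, stripping prefixes could in principle leave two columns with the same name. A minimal sketch of a guard around the same pattern, assuming duplicates should be treated as an error (`strip_prefix` is a hypothetical helper name, and the data frame is a shortened version of the one in the question):

```r
# Hypothetical helper: drop everything up to and including the last ".."
# and stop if the cleaned names are no longer unique
strip_prefix <- function(nms) {
  cleaned <- sub(".*\\.\\.", "", nms)
  if (anyDuplicated(cleaned) > 0) {
    stop("cleaned names are not unique: ",
         paste(unique(cleaned[duplicated(cleaned)]), collapse = ", "))
  }
  cleaned
}

# check.names = FALSE keeps the column names exactly as written
DF <- data.frame(X..Age = c(23, 34), ..Region = c("A", "B"),
                 X.....code = c(14, 12), check.names = FALSE)
names(DF) <- strip_prefix(names(DF))
print(names(DF))
```

Whether to stop, warn, or disambiguate (for example with `make.unique()`) is a design choice that depends on the data.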
CodePudding user response:
Because these strings are the column names of the data.frame, you need some kind of name to replace them.
names(DF) <- NULL
removes the names; the columns are then displayed as "V1", "V2", ...
You could also assign a character vector with your new names, like
names(DF) <- c("col_name1","col_name2","col_name3","col_name4")
Or, if you want to remove them completely, you could use a matrix instead of a data.frame, like this (note that c() coerces all values to character here):
DF <- matrix(c(X..Age, ..Region, X.span.style..display.none..Salary, X.....code),ncol=4)
CodePudding user response:
You could use str_extract to capture everything after the last dot, e.g.:
names(DF) <- stringr::str_extract(names(DF), "[^.]+$")
Output:
[1] "Age" "Region" "Salary" "code"
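The same idea can be sketched in base R without stringr, assuming the intended pattern is `[^.]+$` (a trailing run of non-dot characters). The vector `nms` simply repeats the column names from the question:

```r
# Column names taken from the question
nms <- c("X..Age", "..Region", "X.span.style..display.none..Salary", "X.....code")

# regexpr() locates the trailing run of non-dot characters;
# regmatches() extracts the matched text
last_part <- regmatches(nms, regexpr("[^.]+$", nms))
print(last_part)
```

One difference to be aware of: `regmatches()` drops elements that do not match (for instance a name ending in a dot), whereas `str_extract()` returns `NA` for them, so the base R result can be shorter than the input.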