Removing numbers and characters from column names in R

Time:10-07

I'm trying to remove specific numbers and characters from the column names of a data frame in R, but I am only able to remove the numbers. I have tried different approaches, but the characters at the end remain.

Each column name consists of letters followed by a number in parentheses, e.g. ASE (232).

DataFrame

Subject ASE (232) ASD (121) AFD (313)
   1        1.1.     1.2     1.3

Desired Data Frame

Subject ASE ASD AFD
   1    1.1 1.2 1.3

Code

colnames(data) <- gsub("[A-Z] ([0-9]+)", "", colnames(data))

CodePudding user response:

You can do this:

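sub("(\\w+).*", "\\1", colnames(data))

This uses the capture group (\\w+) to "remember" the leading run of word characters (letters, digits, or underscores). The pattern then consumes the rest of the string with .*, and the backreference \\1 in sub's replacement argument replaces the whole string with just that remembered part.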
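
As a minimal sketch of the capture-group approach, the call can be tried on a plain character vector. The vector `nms` is a hypothetical stand-in for `colnames(data)`, built from the column names in the question.

```r
# Hypothetical stand-in for colnames(data), taken from the question
nms <- c("Subject", "ASE (232)", "ASD (121)", "AFD (313)")

# (\\w+) captures the leading run of word characters;
# .* consumes everything after it, and \\1 keeps only the captured part
cleaned <- sub("(\\w+).*", "\\1", nms)
print(cleaned)
```

Note that this keeps only the first word, so a name such as "Body Mass (12)" would be reduced to "Body". If multi-word names are possible, a pattern that targets the parenthesised suffix is likely the safer choice.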

CodePudding user response:

We can change the pattern to match one or more whitespace characters (\\s+), followed by an opening parenthesis (\\(), one or more digits (\\d+), and any remaining characters (.*), and replace the match with an empty string (""):

colnames(data) <- sub("\\s+\\(\\d+.*", "", colnames(data))
colnames(data)
[1] "Subject" "ASE"     "ASD"     "AFD"    

Another option is trimws from base R (its whitespace argument requires R 3.6.0 or later):

trimws(colnames(data), whitespace = "\\s+\\(.*")
[1] "Subject" "ASE"     "ASD"     "AFD"    

In the OP's code, the pattern matches one uppercase letter followed by a space. The ( is a regex metacharacter and is not escaped, so ([0-9]+) is read as a capture group around one or more digits rather than as literal parentheses. This does not match the column names, because there the space is followed by a literal (, not a digit, so gsub returns the strings unchanged:

gsub("[A-Z] ([0-9]+)", "", colnames(data))
[1] "Subject"   "ASE (232)" "ASD (121)" "AFD (313)"
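
To illustrate the point about escaping, here is a small sketch that contrasts the unescaped pattern with one where the parentheses are escaped as literals. The vector `nms` is a hypothetical stand-in for `colnames(data)`, and the escaped pattern is one possible fix, not necessarily the only one.

```r
# Hypothetical stand-in for colnames(data), taken from the question
nms <- c("Subject", "ASE (232)", "ASD (121)", "AFD (313)")

# Unescaped: ( ) form a capture group, so the pattern expects
# "uppercase letter, space, digits" and never matches these names
unchanged <- gsub("[A-Z] ([0-9]+)", "", nms)

# Escaped: \\( and \\) match literal parentheses, so the
# " (232)"-style suffix is removed
fixed <- gsub(" \\([0-9]+\\)", "", nms)

print(unchanged)
print(fixed)
```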

data

data <- structure(list(Subject = 1L, `ASE (232)` = "1.1.", `ASD (121)` = 1.2, 
    `AFD (313)` = 1.3), class = "data.frame", row.names = c(NA, 
-1L))

CodePudding user response:

We could use word from the stringr package along with rename_with from dplyr:

library(stringr)
library(dplyr)
data %>% 
  rename_with(~word(., 1))
  Subject  ASE ASD AFD
1       1 1.1. 1.2 1.3
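
For readers without stringr and dplyr installed, taking the first space-separated word can be sketched in base R as well. This is an assumed rough equivalent for names like the ones in the question, not a drop-in replacement for word in general. The data frame below is rebuilt from the question's data.

```r
# Data frame rebuilt from the question; check.names = FALSE keeps
# the spaces and parentheses in the column names
data <- data.frame(Subject = 1L, `ASE (232)` = "1.1.", `ASD (121)` = 1.2,
                   `AFD (313)` = 1.3, check.names = FALSE)

# Split each name on single spaces and keep the first piece
first_words <- vapply(strsplit(names(data), " ", fixed = TRUE),
                      `[`, character(1), 1)

# Assign the cleaned names back; the column values are untouched
names(data) <- first_words
print(names(data))
```

Like rename_with, this only changes the names, so the first ASE value stays the character string "1.1." from the original data.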