gsub not working for dataframe's variable R

Time:10-12

I do not understand what I am doing wrong. I have a dataframe, and one of its variables looks like this:

ss <- c("F00020 ", "F13975 ", "F13976 ", "F15334 ", "F12490 ", "F09787 ", "F14675 ",
        "F12129 ", "F04641 ", "F04680 ", "F04715 ", "F04753 ", "F08868 ", "F14031 ",
        "F14033 ", "F12585 ", "F14663 ")

I want to omit the extra blank spaces.

gsub("[[:space:]]","",ss)

The code above works, but when I apply it directly to the variable in the dataframe, it does not work.

gsub("[[:space:]]","",df$Variable)

I also checked the types of the vector and the variable: both are character vectors. So what is happening here?

CodePudding user response:

I cannot reproduce your error:

ss <- c("F00020 ", "F13975 ", "F13976 ", "F15334 ", "F12490 ", "F09787 ", "F14675 ",
        "F12129 ", "F04641 ", "F04680 ", "F04715 ", "F04753 ", "F08868 ", "F14031 ",
        "F14033 ", "F12585 ", "F14663 ")

gsub("[[:space:]]","",ss)

[1] "F00020" "F13975" "F13976" "F15334" "F12490" "F09787" "F14675" "F12129" "F04641" "F04680" "F04715"
[12] "F04753" "F08868" "F14031" "F14033" "F12585" "F14663"

df <- data.frame(Variable = ss)
gsub("[[:space:]]","",df$Variable)

[1] "F00020" "F13975" "F13976" "F15334" "F12490" "F09787" "F14675" "F12129" "F04641" "F04680" "F04715"
[12] "F04753" "F08868" "F14031" "F14033" "F12585" "F14663"
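
Since the call itself works, one possible explanation (an assumption, as the question does not show how the result was used) is that the output of gsub was never assigned back to the column. gsub returns a new vector and leaves the dataframe untouched. A minimal sketch, using a shortened, hypothetical copy of the data:

```r
# Hypothetical three-row data frame mirroring the question's column
df <- data.frame(Variable = c("F00020 ", "F13975 ", "F13976 "),
                 stringsAsFactors = FALSE)

# gsub() returns a cleaned copy; df$Variable itself is not modified
cleaned <- gsub("[[:space:]]", "", df$Variable)
still_padded <- identical(df$Variable, c("F00020 ", "F13975 ", "F13976 "))

# Assign the result back to make the change persist in the data frame
df$Variable <- cleaned
```

After the assignment, printing df$Variable should show the values without trailing spaces.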

CodePudding user response:

An easy solution for your use case is with trimws:

trimws(ss)
 [1] "F00020" "F13975" "F13976" "F15334" "F12490" "F09787" "F14675" "F12129" "F04641" "F04680" "F04715"
[12] "F04753" "F08868" "F14031" "F14033" "F12585" "F14663"

As others have noted, your solution works too, as does this shorter one:

sub("\\s", "", ss) # no `gsub` needed **iff** there's always just one whitespace per string (in whatever position)
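
To make the caveat in that comment concrete, here is a small sketch comparing the three functions on strings with more than one whitespace character. The vector x is made up for illustration and is not taken from the question:

```r
# Hypothetical inputs: trailing, leading + trailing, and interior whitespace
x <- c("F00020 ", " F13975 ", "F 13 976")

# sub() replaces only the first match in each string
first_only <- sub("\\s", "", x)

# gsub() replaces every match in each string
all_matches <- gsub("\\s", "", x)

# trimws() strips leading and trailing whitespace but keeps interior spaces
trimmed <- trimws(x)
```

For the data in the question, where each string has exactly one trailing space, all three give the same result. They differ only when a string contains several whitespace characters or whitespace in the middle.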