How to remove "()" from data frame cells?

Time:02-25

I am trying to clean the values in a data frame column. I want to remove certain substrings, but gsub somehow leaves the "()" behind. My code:

getridof <- c("(a)", "(40X)", "(5X)", "(10X_a)", "(10X)", "(_)")

for (i in 1:length(getridof)) {
  df2$Sample <- gsub(getridof[i], "", df2$Sample)  
}

However, "()" is still left in the cells after running the script. Why?

CodePudding user response:

A possible solution, but I am not sure whether you only want to remove parentheses:

library(tidyverse)

getridof <- c("(a)", "(40X)", "(5X)", "(10X_a)", "(10X)", "(_)")

getridof %>% 
  str_remove("^\\(") %>% 
  str_remove("\\)$") 

#> [1] "a"     "40X"   "5X"    "10X_a" "10X"   "_"

CodePudding user response:

This uses reduce and the fixed = TRUE argument of gsub:

library(purrr)
data <- c("(a)100", "(40X)33", "nothing")

getridof <- c("(a)", "(40X)", "(5X)", "(10X_a)", "(10X)", "(_)")

purrr::reduce(getridof,
              ~gsub(.y, "", .x, fixed = TRUE),
              .init = data)

# [1] "100"     "33"      "nothing" 
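
If you would rather not depend on purrr, the same fold can probably be written with base R's Reduce. This is only a sketch of the same idea; `data` is the example vector from above, and the anonymous function's argument names (`acc`, `pattern`) are my own:

```r
data <- c("(a)100", "(40X)33", "nothing")
getridof <- c("(a)", "(40X)", "(5X)", "(10X_a)", "(10X)", "(_)")

# Reduce() starts from `data` and applies one literal removal per pattern
cleaned <- Reduce(
  function(acc, pattern) gsub(pattern, "", acc, fixed = TRUE),
  getridof,
  data
)

cleaned
# [1] "100"     "33"      "nothing"
```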

CodePudding user response:

Using gsub:

gsub("[()]", "", getridof)

[1] "a"     "40X"   "5X"    "10X_a" "10X"   "_"  

Using stringr:

library(stringr)
str_remove_all(getridof, "[()]")

[1] "a"     "40X"   "5X"    "10X_a" "10X"   "_"
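
Note that the character class "[()]" strips only the parentheses themselves and keeps whatever was between them. A small sketch on made-up sample values (the `data` vector is an assumption, not the asker's real column) shows how that differs from removing the whole "(a)"-style token:

```r
data <- c("(a)100", "(40X)33", "nothing")

# Removes every "(" and ")" but leaves the enclosed text in place
parens_only <- gsub("[()]", "", data)
parens_only
# [1] "a100"    "40X33"   "nothing"

# Removes the whole literal token "(a)", parentheses included
token_removed <- gsub("(a)", "", data, fixed = TRUE)
token_removed
# [1] "100"     "(40X)33" "nothing"
```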

CodePudding user response:

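Adding the argument fixed = TRUE did the job. Without it, gsub treats each pattern as a regular expression, where "(" and ")" delimit a group instead of matching literal parentheses: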

df2$Sample <- gsub(getridof[i], "", df2$Sample, fixed = TRUE)
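
For anyone landing here later, this is a minimal sketch of why the original loop left "()" behind and what the corrected loop looks like. The contents of `df2` are invented for illustration; only the column name `Sample` and the `getridof` vector come from the question:

```r
# Hypothetical stand-in for the real data frame
df2 <- data.frame(Sample = c("(a)100", "(40X)33", "nothing"),
                  stringsAsFactors = FALSE)
getridof <- c("(a)", "(40X)", "(5X)", "(10X_a)", "(10X)", "(_)")

# As a regex, "(a)" is a group containing "a", so only the "a" is removed
regex_result <- gsub("(a)", "", df2$Sample)
regex_result
# [1] "()100"   "(40X)33" "nothing"

# With fixed = TRUE each pattern is matched as a literal string
for (pattern in getridof) {
  df2$Sample <- gsub(pattern, "", df2$Sample, fixed = TRUE)
}
df2$Sample
# [1] "100"     "33"      "nothing"
```

Looping over the patterns directly (or using seq_along(getridof)) also avoids the 1:length(x) pitfall when the vector is empty.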