Removing special characters in a string in R

Time:12-29

I have a data frame and want to remove all the round brackets found in a specific column, in this case column Z. The code I have written removes only some of them: it removes the first brackets it encounters in each string and leaves the rest. How can I remove all the round brackets? Thanks

library(tidyverse)
FRUIT <- data.frame("X" = c(1,2,3), 
                 "Y" = c("A1", "A2", "A3"),
                 "Z" = c('[LEMON(ORANGE)xxx[LEMON(GRAPE)]', "[PEAR(APPLE)]xxxORANGE(APPLE)", "PEACHxxx[APR(ICOT)]"), 
                 stringsAsFactors = FALSE)

The resulting data frame is:

X Y   Z
1 A1   [LEMON(ORANGE)xxx[LEMON(GRAPE)]
2 A2   [PEAR(APPLE)]xxxORANGE(APPLE)
3 A3   PEACHxxx[APR(ICOT)]

The code that I have tried to remove the brackets is below:

fruit2<-FRUIT%>%
  mutate(Z=str_remove(Z,"\\("))%>%
  mutate(Z=str_remove(Z,"\\)"))

The problem with this code is that it does not remove all the brackets. It removes only the first ones it encounters in each string and leaves the rest. The output is below:

 X Y   Z
 1 A1   [LEMONORANGExxx[LEMON(GRAPE)]
2 A2   [PEARAPPLE]xxxORANGE(APPLE)
3 A3   PEACHxxx[APRICOT]

My desired output is:

 X Y   Z
 1 A1   [LEMONORANGExxx[LEMONGRAPE]
2 A2   [PEARAPPLE]xxxORANGEAPPLE
3 A3   PEACHxxx[APRICOT]

Answer:

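You just need to replace str_remove with str_remove_all. str_remove removes only the first match in each string, whereas str_remove_all removes every match, so you'll get the desired output: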

FRUIT %>%
  mutate(Z = str_remove_all(Z, "\\(")) %>%
  mutate(Z = str_remove_all(Z, "\\)"))

#>   X  Y                           Z
#> 1 1 A1 [LEMONORANGExxx[LEMONGRAPE]
#> 2 2 A2   [PEARAPPLE]xxxORANGEAPPLE
#> 3 3 A3           PEACHxxx[APRICOT]
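
As a possible simplification, the two calls can likely be collapsed into one by matching both brackets with a character class. The sketch below is not part of the original answer; it works on a plain character vector (the names z, cleaned and cleaned_base are made up for illustration) and also shows a base R equivalent with gsub:

```r
library(stringr)

# Same sample strings as column Z in the question (vector name is illustrative)
z <- c("[LEMON(ORANGE)xxx[LEMON(GRAPE)]",
       "[PEAR(APPLE)]xxxORANGE(APPLE)",
       "PEACHxxx[APR(ICOT)]")

# The character class [()] matches either "(" or ")";
# inside a class the parentheses need no backslash escaping
cleaned <- str_remove_all(z, "[()]")

# Base R equivalent: gsub() replaces every match, while sub() replaces only the first
cleaned_base <- gsub("[()]", "", z)

print(cleaned)
```

Inside the pipeline from the answer, the same pattern would be used as mutate(Z = str_remove_all(Z, "[()]")), which removes both kinds of round bracket in a single step and leaves the square brackets untouched.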