How can I remove inner parentheses from an R string?

Time:11-22

I am processing strings in R which are supposed to contain zero or one pair of parentheses. If there are nested parentheses I need to delete the inner pair. Here is an example where I need to delete the parentheses around big bent nachos but not the other/outer parentheses.

test <- c(
  "Record ID", 
  "What is the best food? (choice=Nachos)", 
  "What is the best food? (choice=Tacos (big bent nachos))", 
  "What is the best food? (choice=Chips with stuff)", 
  "Complete?"
) 

I know I can kill all the parentheses with the stringr package using str_remove_all():

test |>
  stringr::str_remove_all(stringr::fixed(")")) |> 
  stringr::str_remove_all(stringr::fixed("("))

but I don't have the regex skills to pick out only the inner parentheses. I found an SO post that is close, but it removes the outer parentheses and I can't untangle it to remove the inner ones instead.

CodePudding user response:

Here is a solution using gsub from base R. It is broken down into 2 steps for readability and debugging.

test <- c(
   "Record ID", 
   "What is the best food? (choice=Nachos)", 
   "What is the best food? (choice=Tacos (big bent nachos))", 
   "What is the best food? (choice=Chips with stuff)", 
   "Complete?"
) 

test <- gsub("(\\(.*)\\(", "\\1", test)
# (\\(.*)  - first group: the first '(' followed by as many characters as possible
#  \\(     - then another '(' (the inner one), which is left out of the group
# "\\1"    - replace the whole match with the first group, dropping the inner '('

test <- gsub("\\)(.*\\))", "\\1", test)
# Similar to the first step: match a ')' followed by a group that ends in a later ')',
# then keep only the group, dropping the inner ')'
test

[1] "Record ID"                                            
[2] "What is the best food? (choice=Nachos)"               
[3] "What is the best food? (choice=Tacos big bent nachos)"
[4] "What is the best food? (choice=Chips with stuff)"     
[5] "Complete?"  
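
The two gsub() steps can also be wrapped in a small helper so they are easy to reuse and check. This is only a sketch: the function name remove_inner_parens is made up for illustration, and it assumes each string has at most one inner pair of parentheses, since each greedy pattern removes a single parenthesis per string.

```r
# Hypothetical helper wrapping the two gsub() steps.
# Assumption: at most one inner pair of parentheses per string.
remove_inner_parens <- function(x) {
  # Drop the last "(" in the string, provided an earlier "(" exists
  x <- gsub("(\\(.*)\\(", "\\1", x)
  # Drop the first ")" in the string, provided a later ")" exists
  gsub("\\)(.*\\))", "\\1", x)
}

test <- c(
  "Record ID",
  "What is the best food? (choice=Nachos)",
  "What is the best food? (choice=Tacos (big bent nachos))",
  "What is the best food? (choice=Chips with stuff)",
  "Complete?"
)

result <- remove_inner_parens(test)
print(result)
```

Strings with zero or one pair of parentheses pass through unchanged, because both patterns need two parentheses of the same kind to match.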

CodePudding user response:

Assuming there is at most one nested pair of parentheses, we could use a gsub() approach. Note that this pattern removes the inner parentheses together with their contents:

output <- gsub("\\(\\s*(.*?)\\s*\\(.*?\\)(.*?)\\s*\\)", "(\\1\\2)", test)
output

[1] "Record ID"                                       
[2] "What is the best food? (choice=Nachos)"          
[3] "What is the best food? (choice=Tacos)"           
[4] "What is the best food? (choice=Chips with stuff)"
[5] "Complete?"

Data:

test <- c(
  "Record ID", 
  "What is the best food? (choice=Nachos)", 
  "What is the best food? (choice=Tacos (big bent nachos))", 
  "What is the best food? (choice=Chips with stuff)", 
  "Complete?"
)
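
If the text inside the inner parentheses should be kept, a variant of the same single-call idea might look like the sketch below. It is not the pattern from the answer above: it swaps the lazy .*? for the negated class [^()]*, adds a capture group for the inner text, and does not trim whitespace. It still assumes at most one nested pair.

```r
test <- c(
  "Record ID",
  "What is the best food? (choice=Nachos)",
  "What is the best food? (choice=Tacos (big bent nachos))",
  "What is the best food? (choice=Chips with stuff)",
  "Complete?"
)

# [^()]* matches a run of characters that contains no parentheses, so the
# pattern only matches "( text ( text ) text )" with exactly one inner pair.
pattern <- "\\(([^()]*)\\(([^()]*)\\)([^()]*)\\)"

# Rebuild the match with only the outer parentheses around the three captured pieces
output_keep <- gsub(pattern, "(\\1\\2\\3)", test)
print(output_keep)
```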

CodePudding user response:

Here you go.

test |>
  stringr::str_replace_all("(\\(.*)\\(", "\\1") |> # remove inner open brackets
  stringr::str_remove_all("\\)(?=.*\\))") # remove inner closed brackets

[1] "Record ID"                                            
[2] "What is the best food? (choice=Nachos)"               
[3] "What is the best food? (choice=Tacos big bent nachos)"
[4] "What is the best food? (choice=Chips with stuff)"     
[5] "Complete?"
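
The regex approaches in this thread assume at most one inner pair of parentheses. If deeper or repeated nesting can occur, one option is to skip regex and track the nesting depth character by character. The following base R sketch does that; the name strip_inner_parens is hypothetical, and it assumes the input has no NA values.

```r
# Hypothetical helper: keep only parentheses at nesting depth 1.
# Assumption: x is a character vector without NA values.
strip_inner_parens <- function(x) {
  vapply(x, function(s) {
    chars <- strsplit(s, "", fixed = TRUE)[[1]]
    depth <- 0L
    keep <- logical(length(chars))
    for (i in seq_along(chars)) {
      ch <- chars[i]
      if (ch == "(") {
        depth <- depth + 1L
        keep[i] <- depth == 1L        # keep only the outermost "("
      } else if (ch == ")") {
        keep[i] <- depth == 1L        # keep only the ")" closing the outermost "("
        depth <- max(depth - 1L, 0L)  # never go below zero on a stray ")"
      } else {
        keep[i] <- TRUE               # all other characters are kept
      }
    }
    paste(chars[keep], collapse = "")
  }, character(1), USE.NAMES = FALSE)
}

nested <- strip_inner_parens("What is the best food? (choice=Tacos (big bent nachos))")
deeper <- strip_inner_parens("a (b (c (d)) e)")
print(nested)
print(deeper)
```

The loop is slower than a vectorised gsub() call on long inputs, but it does not depend on how many inner pairs a string contains.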