R: How to remove part of a string with a specific start and end in an R data frame?

Time:05-07

I have a data frame like this:

df = data.frame(order = c(1,2,3), info = c("an apple","a banana[12],","456[Ab]"))

I want to clean up df$info by removing the square brackets and everything inside them (and the comma that follows), so that df$info becomes "an apple" "a banana" "456".

Please help...

CodePudding user response:

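Use gsub with a non-greedy pattern. Note that this removes the brackets and their content but leaves the comma after "banana" in place: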

df$info <- gsub("\\[.*?\\]", "", df$info)
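As a minimal base-R sketch on the question's data: the non-greedy pattern removes each bracketed part on its own. Adding an optional comma to the pattern and trimming whitespace is an assumption about the desired cleanup, not part of the answer above, and gives the exact expected result. The names brackets_only and cleaned are only illustrative.

```r
# Sample data from the question
df <- data.frame(order = c(1, 2, 3),
                 info = c("an apple", "a banana[12],", "456[Ab]"),
                 stringsAsFactors = FALSE)

# Non-greedy match: each match stops at the first closing bracket,
# so the comma after "banana" is kept
brackets_only <- gsub("\\[.*?\\]", "", df$info)

# Assumed variant: also drop a comma directly after the closing bracket,
# then trim any leftover leading or trailing whitespace
cleaned <- trimws(gsub("\\[.*?\\],?", "", df$info))

print(brackets_only)
print(cleaned)
```

Assigning cleaned back with df$info <- cleaned updates the data frame in place.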

CodePudding user response:

1.) This gives the expected output and also removes the comma:

library(dplyr)
library(stringr)

df %>% 
  mutate(info = str_trim(str_replace_all(info, "\\[.*\\],?", "")))
  order     info
1     1 an apple
2     2 a banana
3     3      456

2.) This removes only the brackets and their content, keeping the comma:

\\[ ... matches [

.* ... matches any following characters

\\] ... matches ]

library(dplyr)
library(stringr)

df %>% 
  mutate(info = str_replace_all(info, "\\[.*\\]", ""))
  order      info
1     1  an apple
2     2 a banana,
3     3       456
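
One caveat with the greedy .* used in these patterns, sketched on a hypothetical string with two bracketed parts (not from the question): the greedy form matches from the first [ to the last ], so any text between the brackets is lost. A non-greedy .*? or a negated character class removes each pair separately.

```r
library(stringr)

# Hypothetical input with two bracketed parts
x <- "a[1] b[2] c"

# Greedy: one match spanning "[1] b[2]", so "b" is removed too
greedy <- str_replace_all(x, "\\[.*\\]", "")

# Non-greedy: each match ends at the nearest closing bracket
lazy <- str_replace_all(x, "\\[.*?\\]", "")

# Negated class: match anything except "]" between the brackets
negated <- str_replace_all(x, "\\[[^\\]]*\\]", "")

print(c(greedy = greedy, lazy = lazy, negated = negated))
```

For the sample data in the question, each string has at most one bracketed part, so all three patterns behave the same there.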