Create Regex to remove string and character from row values

Time:07-07

I have a column in my dataframe that looks like this:

branching_loc <- c("([preliminary_arm_1][antibiotic_arm] = '1') and [was_review_done]='1'",
                   "[preliminary_arm_1][antibiotic_arm]  = '1' and [was_review_done]=='1'",
                   "[preliminary_arm_1][antibiotic_arm]  = '1' and [was_review_done]=='1'",
                   "[preliminary_arm_1][antibiotic_arm]  = '1' and [was_review_done]=='1'")
                                                 
df <- data.frame(branching_loc)

Now I would like to remove only [preliminary_arm_1] from these row values. I am struggling to write a regular expression in R to do this. Kindly assist.

CodePudding user response:

A possible solution:

library(tidyverse)

df %>% 
  mutate(branching_loc = str_remove(branching_loc, "\\[preliminary_arm_1\\]"))

#>                                        branching_loc
#> 1 ([antibiotic_arm] = '1') and [was_review_done]='1'
#> 2 [antibiotic_arm]  = '1' and [was_review_done]=='1'
#> 3 [antibiotic_arm]  = '1' and [was_review_done]=='1'
#> 4 [antibiotic_arm]  = '1' and [was_review_done]=='1'
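The brackets are escaped in the pattern above because an unescaped `[...]` is read as a character class rather than as literal text. A small base-R sketch of the difference, using a made-up single value `x` that is shaped like the rows in the question (it is an illustration, not part of the original data):

```r
# Hypothetical single value, shaped like the rows in the question
x <- "[preliminary_arm_1][antibiotic_arm] = '1'"

# Unescaped: [ ... ] is a character class, so every individual character
# from that set (p, r, e, l, i, m, n, a, y, _, 1) is deleted wherever it occurs
unescaped <- gsub("[preliminary_arm_1]", "", x)

# Escaped: \\[ and \\] match literal brackets, so only the whole tag is removed
escaped <- gsub("\\[preliminary_arm_1\\]", "", x)

print(unescaped)
print(escaped)
```

Note that `str_remove()` drops only the first match in each string; if the tag could occur more than once per row, `str_remove_all()` would be the variant to reach for.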

CodePudding user response:

A gsub option; fixed = TRUE makes the pattern a literal string, so the brackets need no escaping:

branching_loc <- c("([preliminary_arm_1][antibiotic_arm] = '1') and [was_review_done]='1'",
                   "[preliminary_arm_1][antibiotic_arm]  = '1' and [was_review_done]=='1'",
                   "[preliminary_arm_1][antibiotic_arm]  = '1' and [was_review_done]=='1'",
                   "[preliminary_arm_1][antibiotic_arm]  = '1' and [was_review_done]=='1'")

df <- data.frame(branching_loc)

library(dplyr)
df %>%
  mutate(branching_loc = gsub("[preliminary_arm_1]", "", branching_loc, fixed = TRUE))
#>                                        branching_loc
#> 1 ([antibiotic_arm] = '1') and [was_review_done]='1'
#> 2 [antibiotic_arm]  = '1' and [was_review_done]=='1'
#> 3 [antibiotic_arm]  = '1' and [was_review_done]=='1'
#> 4 [antibiotic_arm]  = '1' and [was_review_done]=='1'

Created on 2022-07-06 by the reprex package (v2.0.1)
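For anyone who prefers not to load dplyr, the same replacement can be assigned straight back to the column. The sketch below uses a two-row subset of the question's data and also checks that `fixed = TRUE` gives the same result as the escaped regular expression; the names `literal` and `as_regex` are only illustrative.

```r
# Two rows taken from the question's data
branching_loc <- c("([preliminary_arm_1][antibiotic_arm] = '1') and [was_review_done]='1'",
                   "[preliminary_arm_1][antibiotic_arm]  = '1' and [was_review_done]=='1'")
df <- data.frame(branching_loc)

# fixed = TRUE: the pattern is treated as a literal string
literal <- gsub("[preliminary_arm_1]", "", df$branching_loc, fixed = TRUE)

# Same removal written as a regular expression with escaped brackets
as_regex <- gsub("\\[preliminary_arm_1\\]", "", df$branching_loc)

# Both approaches should agree
print(identical(literal, as_regex))

# Assign back to the column without dplyr
df$branching_loc <- literal
print(df)
```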

CodePudding user response:

To get a somewhat tidier output, without the presumably unwanted parentheses:

df %>%
  mutate(branching_loc = gsub("^\\(?\\[\\w+\\]|\\)(?=\\sand)", "", branching_loc, perl = TRUE))
                                       branching_loc
1   [antibiotic_arm] = '1' and [was_review_done]='1'
2 [antibiotic_arm]  = '1' and [was_review_done]=='1'
3 [antibiotic_arm]  = '1' and [was_review_done]=='1'
4 [antibiotic_arm]  = '1' and [was_review_done]=='1'
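The pattern in this answer is an alternation of two parts. The sketch below applies each part separately to the first row of the question's data, assuming the intended quantifier is `\\w+` (one or more word characters); `step1` and `step2` are illustrative names only.

```r
# First row from the question's data
x <- "([preliminary_arm_1][antibiotic_arm] = '1') and [was_review_done]='1'"

# Part 1: an optional "(" at the start of the string, then the first [tag]
step1 <- sub("^\\(?\\[\\w+\\]", "", x, perl = TRUE)

# Part 2: a ")" only when it is followed by whitespace and "and";
# the lookahead (?=...) requires perl = TRUE
step2 <- gsub("\\)(?=\\sand)", "", step1, perl = TRUE)

print(step1)
print(step2)
```

One design point to be aware of: part 1 removes whichever bracketed tag comes first in the string, not specifically `[preliminary_arm_1]`, so it relies on that tag always being at the start of the value.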