Replace a string that has + in it in R

Time:12-21

I am trying to replace a string that includes + in it. However, the function I used failed to do so. Here is a sample dataset.

testdata <- data.frame(id = c(1,2,3),
                   v1 = c("A+B", "C","D"),
                   v2 = c("N","M","A+B"),
                   v3 = c("D","E","T"))

> testdata
  id  v1  v2 v3
1  1 A+B   N  D
2  2   C   M  E
3  3   D A+B  T

The call below did nothing.

testdata %>%
  mutate_all(funs(str_replace(., "A+B", "")))

I would like to remove A+B anywhere in the dataframe.

How can I reach the desired dataset below?

> testdata
  id  v1  v2 v3
1  1  NA   N  D
2  2   C   M  E
3  3   D  NA  T

CodePudding user response:

Use na_if if we want to replace the fixed string "A+B" with NA:

library(dplyr)
testdata <- testdata %>% 
     mutate(across(where(is.character), ~ na_if(.x,  "A B")))

-output

testdata
  id   v1   v2 v3
1  1 <NA>    N  D
2  2    C    M  E
3  3    D <NA>  T

Or, if we are checking for the + symbol anywhere in the string:

library(stringr)
testdata <- testdata %>%
   mutate(across(where(is.character),
     ~ case_when(str_detect(.x, fixed("+"), negate = TRUE) ~ .x)))
testdata
  id   v1   v2 v3
1  1 <NA>    N  D
2  2    C    M  E
3  3    D <NA>  T

CodePudding user response:

You need to escape the + with \\, because an unescaped + is a regex quantifier:

library(dplyr)
library(stringr)

testdata |>
  mutate(across(everything(), ~str_remove(., "A\\+B")))

#>   id v1 v2 v3
#> 1  1     N  D
#> 2  2  C  M  E
#> 3  3  D     T

Created on 2022-12-20 with reprex v2.0.2
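
Note that removing the match leaves an empty string rather than NA, which differs from the desired output in the question. A minimal base R sketch of the extra step, assuming the string to remove is "A+B" (the vector x is a hypothetical stand-in for one character column):

```r
# Hypothetical stand-in for one character column of testdata
x <- c("A+B", "C", "D")

# Escaped regex: removes the literal "A+B", leaving "" behind
removed <- sub("A\\+B", "", x)

# Convert the leftover empty strings to NA
removed[removed == ""] <- NA_character_
```

This only turns fully emptied cells into NA; a cell such as "A+B C" would keep its remaining text.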

CodePudding user response:

A base R option to get rid of strings containing + would be:

testdata[] <- lapply(testdata, \(x) ifelse(grepl("\\+", x), NA, x))

Result:

testdata
#>   id   v1   v2 v3
#> 1  1 <NA>    N  D
#> 2  2    C    M  E
#> 3  3    D <NA>  T

CodePudding user response:

I think the problem is that + is being interpreted as a regular expression quantifier, so you'll need to put two backslashes (\\) in front of it to escape it. Then you can use str_replace. Also, mutate_all has been superseded by mutate(across(...)).

library(dplyr)
library(stringr)

testdata %>%
  mutate(across(everything(), ~ str_replace(.x, "A\\+B", replacement = NA_character_)))

#  id   v1   v2 v3
#1  1 <NA>    N  D
#2  2    C    M  E
#3  3    D <NA>  T
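
To see why the unescaped pattern fails, here is a small base R sketch, assuming the target string is "A+B". In a regular expression, + means "one or more of the preceding character", so the pattern A+B matches strings like "AB" or "AAB" but not the literal text "A+B". The vector x is made up for illustration.

```r
# Illustrative inputs: the literal target plus strings the quantifier would match
x <- c("A+B", "AB", "AAB", "C")

unescaped <- grepl("A+B", x)                # + acts as a quantifier
escaped   <- grepl("A\\+B", x)              # \\+ matches a literal plus sign
literal   <- grepl("A+B", x, fixed = TRUE)  # fixed = TRUE disables regex matching
```

Here unescaped is TRUE only for "AB" and "AAB", while escaped and literal are TRUE only for "A+B". stringr exposes the same choice through fixed(), as used with str_detect earlier in this thread.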

CodePudding user response:

Try this:

for (i in 2:length(testdata)) {
  testdata[, i] <- ifelse(testdata[, i] == "A+B", NA, testdata[, i])
}
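
The same exact-match replacement can also be written without an explicit loop. A sketch, assuming the cells to blank out are exactly "A+B" and the data frame has the layout from the question: comparing the whole data frame to the string yields a logical matrix, which can index the assignment directly.

```r
testdata <- data.frame(id = c(1, 2, 3),
                       v1 = c("A+B", "C", "D"),
                       v2 = c("N", "M", "A+B"),
                       v3 = c("D", "E", "T"))

# TRUE wherever a cell is exactly "A+B" (== compares literally, no regex involved)
hits <- testdata == "A+B"

# Replace every matching cell with NA in one step
testdata[hits] <- NA
```

Because == is a literal comparison, no escaping is needed here, but cells that merely contain "A+B" alongside other text are left untouched.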