Replace strings other than matched with "TBD" in a dataframe in R

Time:04-13

Considering the example dataframe:

df <- data.frame(A = seq(1,10,1), B = c("Type A", "9383", "Type B", "Duplicate", "No",
                                        "Type B", "No", "others", "Type A", "Duplicate"))

Let's say I have already made some mutations to the data frame, as below:

library(dplyr)
df <- df %>% mutate(A = paste(.$A, "hours"))

I want to add another mutate call that changes every element of column B that does not match a value in the vector plan_types to "TBD".

plan_types <- c("Duplicate", "Type A", "Type B", "No")

Desired output will be:

> df
          A         B
1   1 hours    Type A
2   2 hours       TBD
3   3 hours    Type B
4   4 hours Duplicate
5   5 hours        No
6   6 hours    Type B
7   7 hours        No
8   8 hours       TBD
9   9 hours    Type A
10 10 hours Duplicate

CodePudding user response:

We may use replace, which swaps in "TBD" wherever B is not in plan_types:

library(dplyr)
df <-  df %>%
    mutate(B = replace(B, ! B %in% plan_types, "TBD"))

Or in base R

df$B[! df$B %in% plan_types] <- "TBD"
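
To see what the `%in%` test is doing, here is a minimal self-contained sketch in base R. It rebuilds the example data from the question; the intermediate name `keep` is only for illustration, and `stringsAsFactors = FALSE` is set explicitly on the assumption that the code may also be run on R versions older than 4.0, where strings became factors by default.

```r
# Example data from the question
df <- data.frame(A = seq(1, 10, 1),
                 B = c("Type A", "9383", "Type B", "Duplicate", "No",
                       "Type B", "No", "others", "Type A", "Duplicate"),
                 stringsAsFactors = FALSE)
plan_types <- c("Duplicate", "Type A", "Type B", "No")

# TRUE where B is exactly one of the allowed plan types
keep <- df$B %in% plan_types

# Replace everything that is not an allowed plan type
df$B <- replace(df$B, !keep, "TBD")

which(!keep)  # rows that were replaced: 2 8
```

Note that `%in%` returns FALSE for NA, so with this approach any missing values in B would also become "TBD".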

CodePudding user response:

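Another strategy is to use str_detect from the stringr package. First we have to build the pattern from plan_types. The pattern is anchored with ^ and $ so that only whole-string matches count:

library(dplyr)
library(stringr)

# Anchor the alternation so that only exact matches are kept
pattern <- paste0("^(", paste(plan_types, collapse = "|"), ")$")

df %>% 
  mutate(A = paste(A, "hours")) %>% 
  mutate(B = ifelse(str_detect(B, pattern), B, "TBD"))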
          A         B
1   1 hours    Type A
2   2 hours       TBD
3   3 hours    Type B
4   4 hours Duplicate
5   5 hours        No
6   6 hours    Type B
7   7 hours        No
8   8 hours       TBD
9   9 hours    Type A
10 10 hours Duplicate
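
One thing to watch with a regex built from plan_types: an unanchored alternation matches substrings, not whole values. The sketch below compares a loose pattern, an anchored pattern, and `%in%`. It uses base R's `grepl` in place of `str_detect` so that it runs without extra packages, and the values in `x` are made up for illustration (they are not in the original data).

```r
plan_types <- c("Duplicate", "Type A", "Type B", "No")

# Hypothetical values chosen to show the difference
x <- c("No", "Nothing", "Type A+", "Type B")

loose  <- paste(plan_types, collapse = "|")   # matches anywhere in the string
strict <- paste0("^(", loose, ")$")           # must match the whole string

grepl(loose, x)     # TRUE  TRUE  TRUE  TRUE
grepl(strict, x)    # TRUE FALSE FALSE  TRUE
x %in% plan_types   # TRUE FALSE FALSE  TRUE
```

For exact matching, `%in%` gives the same answer as the anchored pattern and does not need any escaping if a plan type ever contains a regex metacharacter.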

CodePudding user response:

Probably better ways but I'll post anyway.

df <- df %>% mutate(B = if_else(B %in% plan_types, B, "TBD"))
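
If the same replacement is needed for several columns, it could be wrapped in a small helper. This is only a sketch: `replace_unmatched` is a hypothetical name, not a function from dplyr or base R, and converting to character first is an assumption made so that factor columns (the default for strings in data.frame before R 4.0) are handled too.

```r
# Hypothetical helper, not part of dplyr or base R
replace_unmatched <- function(x, allowed, fill = "TBD") {
  x <- as.character(x)           # factors become plain strings
  x[!(x %in% allowed)] <- fill   # anything not in `allowed` (including NA) becomes `fill`
  x
}

plan_types <- c("Duplicate", "Type A", "Type B", "No")
b <- factor(c("Type A", "9383", "No", "others"))

replace_unmatched(b, plan_types)
# "Type A" "TBD" "No" "TBD"
```

Inside a pipeline it would be called as mutate(B = replace_unmatched(B, plan_types)).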