How can I detect a word in a column variable and mutate it in a new column in R using dplyr?

Time:10-22

I have a data frame that looks like this:

var
A_CAT
B_DOG
A_CAT
F_HORSE
GEORGE_DOG
HeLeN_CAT

and I want it to look like this:

var var_new
A_CAT CAT
B_DOG DOG
A_CAT CAT
F_HORSE HORSE
GEORGE_DOG DOG
HeLeN_CAT CAT

How can I do this in R?

library(tidyverse)
var = c("A_CAT","B_DOG","A_CAT","F_HORSE","GEORGE_DOG","HeLeN_CAT")
df = tibble(var);df

CodePudding user response:

df %>%
   mutate(var_new = str_remove(var, '.*_'))

# A tibble: 6 × 2
  var        var_new
  <chr>      <chr>  
1 A_CAT      CAT    
2 B_DOG      DOG    
3 A_CAT      CAT    
4 F_HORSE    HORSE  
5 GEORGE_DOG DOG    
6 HeLeN_CAT  CAT    

CodePudding user response:

Using R base sub

> df$var_new <- sub(".*_(.*)$", "\\1", df$var)
> df
# A tibble: 6 × 2
  var        var_new
  <chr>      <chr>  
1 A_CAT      CAT    
2 B_DOG      DOG    
3 A_CAT      CAT    
4 F_HORSE    HORSE  
5 GEORGE_DOG DOG    
6 HeLeN_CAT  CAT    
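
As a rough check of how this pattern behaves on inputs the question does not show, here is a minimal sketch. The vector x and its values are hypothetical, chosen to cover more than one underscore and no underscore at all:

```r
# Hypothetical extra values, not from the question
x <- c("A_CAT", "BIG_RED_DOG", "HORSE")

# Greedy ".*_" consumes everything up to the LAST underscore;
# the capture group (.*) keeps whatever follows it.
# A string with no underscore does not match and is returned unchanged.
res <- sub(".*_(.*)$", "\\1", x)
res
```

So the approach keeps only the final segment, and values without an underscore pass through as they are.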

CodePudding user response:

We could use str_extract() to extract the desired strings. Prefixing the pattern with (?i) makes the search case-insensitive:

library(dplyr)
library(stringr)

df %>% 
  mutate(var_new = str_extract(var, "(?i)CAT|Dog|Horse"))
         var var_new
1      A_CAT     CAT
2      B_DOG     DOG
3      A_CAT     CAT
4    F_HORSE   HORSE
5 GEORGE_DOG     DOG
6  HeLeN_CAT     CAT
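
Two possible variations on this idea, sketched under the assumption that stringr is installed: the case-insensitive flag can be passed through regex(ignore_case = TRUE) instead of the inline (?i), and if the animal names are not known in advance, the pattern "[^_]+$" (everything after the last underscore) avoids listing them. The names by_word and by_position are only illustrative.

```r
library(stringr)

var <- c("A_CAT", "B_DOG", "A_CAT", "F_HORSE", "GEORGE_DOG", "HeLeN_CAT")

# Same idea as "(?i)CAT|Dog|Horse", with the flag passed as an argument
by_word <- str_extract(var, regex("cat|dog|horse", ignore_case = TRUE))

# Alternative that does not list the words:
# one or more non-underscore characters anchored at the end of the string
by_position <- str_extract(var, "[^_]+$")

# For the sample data both approaches give the same result
identical(by_word, by_position)
```

The word list is stricter (anything not listed becomes NA), while the positional pattern accepts any suffix, so which one fits depends on how much the data can be trusted.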