How to use regex with the same prefix but different suffix?

Let's say that my data has the following structure:

Data<-structure(list(Date = structure(c(17955, 17955, 17954, 17954, 
17953), class = "Date"), name = c("QLD to SA", "QLD.NSW to SA.NSW", 
"QLD to SA", "QLD.NSW to SA.NSW", "QLD to SA"), value = c(-2.33611657245688, 
-1.48768446629906, -2.36699803453011, -1.46423011205677, -2.32284554692339
)), row.names = c(NA, -5L), class = c("tbl_df", "tbl", "data.frame"
))

I want to create a new column called Group, whose value depends on which regular expression the name column matches. The output I want looks like this:

 Date       name              value Group
2019-02-28 QLD to SA         -2.34 QLD  
2019-02-28 QLD.NSW to SA.NSW -1.49 QLD-NSW  
2019-02-27 QLD to SA         -2.37 QLD  
2019-02-27 QLD.NSW to SA.NSW -1.46 QLD-NSW  
2019-02-26 QLD to SA         -2.32 QLD

I think that something like this could work:

Data%>%mutate(Group=case_when(
    str_detect(name, regex("QLD", ignore_case=TRUE)) ~ "QLD",
    str_detect(name, regex("^QLD.NSW", ignore_case=TRUE)) ~ "QLD-NSW",
                         T ~ "number"))

It fails because every row is assigned the first case, QLD, so the second case, QLD-NSW, is never reached.

CodePudding user response:

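You just need to swap the two lines of code starting with str_detect. case_when() uses the first condition that matches, and the pattern "QLD" also matches "QLD.NSW to SA.NSW", so the more specific pattern has to be tested first. The dot is also escaped as \\. so that it matches a literal period rather than any character.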

Please find below a reprex.

Reprex

Code

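library(tidyverse)

# Data is the tibble defined in the question.
# case_when() takes the first matching condition, so the more specific pattern comes first.
Data %>% mutate(Group = case_when(
  str_detect(name, regex("^QLD\\.NSW", ignore_case = TRUE)) ~ "QLD-NSW",
  str_detect(name, regex("QLD", ignore_case = TRUE)) ~ "QLD",
  TRUE ~ "number"))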

Output

#> # A tibble: 5 x 4
#>   Date       name              value Group  
#>   <date>     <chr>             <dbl> <chr>  
#> 1 2019-02-28 QLD to SA         -2.34 QLD    
#> 2 2019-02-28 QLD.NSW to SA.NSW -1.49 QLD-NSW
#> 3 2019-02-27 QLD to SA         -2.37 QLD    
#> 4 2019-02-27 QLD.NSW to SA.NSW -1.46 QLD-NSW
#> 5 2019-02-26 QLD to SA         -2.32 QLD

Created on 2022-03-22 by the reprex package (v2.0.1)
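
As a possible alternative to relying on branch order, the two patterns can be made mutually exclusive. The sketch below is only an illustration and not part of the original answer: the two-element `name` vector is a hypothetical sample mirroring the question's column, and the negative lookahead `(?!\\.)` is a swapped-in technique that assumes stringr's default ICU regex engine. It first reproduces the order problem, then shows patterns for which the order no longer matters.

```r
library(dplyr)
library(stringr)

# Hypothetical sample values mirroring the `name` column in the question.
name <- c("QLD to SA", "QLD.NSW to SA.NSW")

# Broad pattern first: both values contain "QLD", so the first branch
# claims every element and the second branch is never used.
broad_first <- case_when(
  str_detect(name, "QLD") ~ "QLD",
  str_detect(name, "^QLD\\.NSW") ~ "QLD-NSW",
  TRUE ~ "number"
)

# Mutually exclusive patterns: "^QLD(?!\\.)" matches a leading QLD only
# when it is not followed by a literal dot, so branch order is irrelevant.
exclusive <- case_when(
  str_detect(name, "^QLD(?!\\.)") ~ "QLD",
  str_detect(name, "^QLD\\.NSW") ~ "QLD-NSW",
  TRUE ~ "number"
)

print(broad_first)
print(exclusive)
```

Swapping the lines, as in the answer above, is the simpler fix. Exclusive patterns may be worth considering when there are many branches and keeping them in the right order becomes error-prone.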