Multiple necessary conditions in grepl

Time:04-07

I need to create a categorical variable (new_var) based on some conditions. The variable containing those conditions (var) is stored as text, for example:

var                                      new_var
L: 06:00-22:00 (A)                       A
L: 00:00-07:59 (D), 07:59-23:59 (A)      MIXED
L-V: 08:00-21:00 (A), 07:59-23:59 (A)    A
S: 08:00-19:00 (D)                       D

So the conditions for creating the new variable are the codes in parentheses. The result can be A, D, or MIXED (both A and D).

I tried the following code:

library(dplyr)

var <- as.character(c('L: 06:00-22:00 (A)', 'L: 00:00-07:59 (D), 07:59-23:59 (A)', 'L-V: 08:00-21:00 (A), 07:59-23:59 (A)', 'S: 08:00-19:00 (D)'))
df <- as.data.frame(var)

df <- df %>%
  mutate(new_var = case_when(
    grepl("(D).*(A)", var) ~ "MIXED",
    grepl("(A)", var) ~ "A",
    grepl("(D)", var) ~ "D",
    TRUE ~ "N/A"))

But this creates new_var imprecisely, with errors.

CodePudding user response:

Does it have to use tidyverse piping?

If not, here are some steps you could try:

x <- c('L: 06:00-22:00 (A)', 'L: 00:00-07:59 (D), 07:59-23:59 (A)', 'L-V: 08:00-21:00 (A), 07:59-23:59 (A)', 'S: 08:00-19:00 (D)')

# Match an A or D only when it sits directly between "(" and ")"
m <- gregexpr('(?<=\\()[AD](?=\\))', x, perl = TRUE)
regmatches(x, m)
# [[1]]
# [1] "A"
#
# [[2]]
# [1] "D" "A"
#
# [[3]]
# [1] "A" "A"
#
# [[4]]
# [1] "D"

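# Keep one copy of each code found in every element
new_var <- lapply(regmatches(x, m), unique)

# More than one distinct code means both A and D are present
new_var[lengths(new_var) > 1] <- 'MIXED'
unlist(new_var)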
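
If you would rather stay with grepl, here is a minimal sketch of one way to combine the two conditions. In a regular expression, unescaped parentheses are grouping operators, so "(A)" matches any "A" in the string and "(D).*(A)" matches any "D" followed later by any "A". That pattern also misses rows where (A) comes before (D). This may be what goes wrong on the full data. The sketch assumes the codes always appear literally as "(A)" or "(D)". The names has_A and has_D are just illustrative helpers, and returning NA for rows with neither code is my own choice rather than something from the question.

```r
# Example data copied from the question.
var <- c("L: 06:00-22:00 (A)",
         "L: 00:00-07:59 (D), 07:59-23:59 (A)",
         "L-V: 08:00-21:00 (A), 07:59-23:59 (A)",
         "S: 08:00-19:00 (D)")
df <- data.frame(var = var, stringsAsFactors = FALSE)

# fixed = TRUE matches "(A)" and "(D)" as literal text, so the
# parentheses are not treated as regex grouping operators.
has_A <- grepl("(A)", df$var, fixed = TRUE)  # hypothetical helper name
has_D <- grepl("(D)", df$var, fixed = TRUE)  # hypothetical helper name

# Both conditions must hold for MIXED, regardless of the order in
# which the codes appear. Rows with neither code become NA (assumption).
df$new_var <- ifelse(has_A & has_D, "MIXED",
              ifelse(has_A, "A",
              ifelse(has_D, "D", NA_character_)))

print(df)
```

The same two logical tests can be dropped into case_when() inside mutate() if you want to keep the pipe.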