I have a data frame in R. I have this working splendidly at the moment as a test of my initial regex. For reference, I have dplyr and magrittr installed, largely for other reasons, and I am following some project-wide conventions as far as whitespace and closing parentheses are concerned:
frame %<>% mutate(
columnA = case_when(
grepl("WXYZ *[1-9]{1,2}", columnB) == TRUE ~'HOORAY'
)
)
The thing is, I would like to replace 'HOORAY' with whatever grepl actually found. Right now, I am of course searching for strings containing WXYZ followed by any number of spaces (0 included) and then a single- or double-digit integer.
If, for example, grepl found the string "WXYZ 22", I want the corresponding entry in columnA to be written as "WXYZ 22". But then if it finds "WXYZ5" later, I want it to write "WXYZ5" in its own corresponding entry.
In pseudocode, I want: TRUE ~ <what grepl found>
Can I do this with case_when? If so, is there a better way?
Answer:
If the case_when structure is necessary, this solution using stringr works:
grepl("WXYZ *[1-9]{1,2}", columnB) ~ str_extract(columnB, "WXYZ *[1-9]{1,2}")
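For context, here is a minimal sketch of how that line might sit inside the full mutate call. The sample rows are made up for illustration, and the pattern variable is a hypothetical convenience so the regex is written only once; only the column names and the regex come from the question.

```r
library(dplyr)
library(stringr)

# Hypothetical sample data; only the column names come from the question.
frame <- data.frame(
  columnB = c("abc WXYZ 22 def", "WXYZ5", "no match here"),
  stringsAsFactors = FALSE
)

# Store the regex once so the test and the extraction cannot drift apart.
pattern <- "WXYZ *[1-9]{1,2}"

frame <- frame %>%
  mutate(
    columnA = case_when(
      # grepl() already returns TRUE/FALSE, so no "== TRUE" is needed.
      grepl(pattern, columnB) ~ str_extract(columnB, pattern)
    )
  )
```

Rows that satisfy no case_when condition are left as NA, so the third row here ends up with NA in columnA.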
Depending on what the bigger problem setup looks like, you could also just do:
mutate(columnA = str_extract(columnB, "WXYZ *[1-9]{1,2}"))
Note that columnA would be NA wherever the pattern fails to match. Also note that while grep and grepl expect the pattern first and then the target string, stringr functions expect the opposite.
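A small sketch of those two points on a plain character vector may help. The vector x and its values are invented for illustration and are not from the original data.

```r
library(stringr)

# Hypothetical input strings for illustration only.
x <- c("WXYZ 22", "WXYZ5", "nothing")
pattern <- "WXYZ *[1-9]{1,2}"

# Base R: pattern first, then the strings to search.
base_hits <- grepl(pattern, x)

# stringr: strings first, then the pattern.
stringr_hits <- str_detect(x, pattern)

# Non-matching elements come back as NA rather than being dropped,
# so the result has the same length as the input.
extracted <- str_extract(x, pattern)

# The digit class [1-9] excludes 0, so only "WXYZ 1" is captured here.
zero_case <- str_extract("WXYZ 10", pattern)
```

The last line is worth checking against the real data: if values such as 10 or 20 can occur and should be captured whole, a digit class like [1-9][0-9]? is one possible adjustment, though that depends on what the inputs actually look like.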