I haven't been able to find any answers to this specific problem:
I have a factor variable with over 500 levels that I need to relevel to just 2 levels (1/0).
Many of the levels start with the same character string, e.g. "Woman's mother or sister:"
Is there a way to use starts_with to relevel all of these levels at the same time, instead of doing them one by one as I have been doing with this code:
levels(DF1$MedicalCondition)[levels(DF1$MedicalCondition) == "Woman's mother or sister: sister"] <- "1"
Any help appreciated, thank you!
CodePudding user response:
tidyselect::starts_with is specifically written for use on column names within dplyr-type functions, but you can use the base R startsWith:
levels(DF1$MedicalCondition)[
startsWith(levels(DF1$MedicalCondition), "Woman's mother or sister")
] <- "1"
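To get all the way down to two levels, the remaining levels also need to become "0". Here is a minimal sketch of one way to do both steps at once, assuming a small toy DF1 (the "Diabetes" and "Asthma" values are made up for illustration) and relying on the fact that assigning duplicated names to levels() merges those levels:

```r
# Toy data standing in for the real DF1 (hypothetical values)
DF1 <- data.frame(
  MedicalCondition = factor(c(
    "Woman's mother or sister: sister",
    "Woman's mother or sister: mother",
    "Diabetes",
    "Asthma"
  ))
)

# TRUE for every level that begins with the prefix
is_match <- startsWith(levels(DF1$MedicalCondition), "Woman's mother or sister")

# Matching levels become "1", everything else "0";
# duplicated level names are merged into a single level
levels(DF1$MedicalCondition) <- ifelse(is_match, "1", "0")

table(DF1$MedicalCondition)
```

It is probably worth running this on a copy of the column first, since the original level labels are overwritten.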
You can also use general regex patterns with grepl or stringr::str_detect, which can be very powerful.
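For example, a regex lets you match several prefixes in one go. This is only a sketch: the factor f and the second prefix ("Woman's daughter") are hypothetical, not taken from your data.

```r
# Hypothetical factor with two different prefixes of interest
f <- factor(c(
  "Woman's mother or sister: sister",
  "Woman's daughter: eldest",
  "Asthma"
))

# "^" anchors the match at the start of the level;
# "|" separates the alternative prefixes
pattern <- "^(Woman's mother or sister|Woman's daughter)"

# Levels matching either prefix become "1", all others "0"
levels(f) <- ifelse(grepl(pattern, levels(f)), "1", "0")
```

If a prefix contains regex metacharacters such as "(" or ".", either escape them or stick with startsWith, which compares the text literally.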