Suppose I have the following data frame:
> survey
  workshop gender q1 q2 q3 q4
1        R Female  4  3  4  5
2     SPSS   Male  3  4  3  4
3     <NA>   <NA>  3  2 NA  3
4     SPSS Female  5  4  5  3
5    STATA Female  4  4  3  4
6     SPSS Female  5  4  3  5
I would like to change "Female" to "F" and "Male" to "M". Of course I could rebuild the column by hand as c("F", "M", NA, "F", "F", "F"), but is it possible to write a function that takes "Female" and "Male" as inputs and returns "F" and "M"?
CodePudding user response:
We can use substring in base R:
survey$gender <- substring(survey$gender, 1, 1)
Or with sub:
survey$gender <- sub("^(.).*", "\\1", survey$gender)
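To try this on the data from the question, here is a minimal sketch that rebuilds the sample data frame and keeps only the first letter. It assumes gender is stored as a character column (not a factor); the name first_letter is only for illustration.

```r
# Rebuild the sample data from the question (assumed to be character columns)
survey <- data.frame(
  workshop = c("R", "SPSS", NA, "SPSS", "STATA", "SPSS"),
  gender   = c("Female", "Male", NA, "Female", "Female", "Female"),
  q1 = c(4, 3, 3, 5, 4, 5),
  q2 = c(3, 4, 2, 4, 4, 4),
  q3 = c(4, 3, NA, 5, 3, 3),
  q4 = c(5, 4, 3, 3, 4, 5),
  stringsAsFactors = FALSE
)

# Keep only the first character; missing values stay NA
first_letter <- substring(survey$gender, 1, 1)
print(first_letter)
```

Note that this shortcut only works because the desired codes happen to be the first letters of the original values.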
CodePudding user response:
Explicit replacement in base R
# Assumes gender is a character column; note the closing ]] on the inner index
survey[["gender"]][survey[["gender"]] == "Male"] <- "M"
survey[["gender"]][survey[["gender"]] == "Female"] <- "F"
Using tidyverse packages
require(stringr)
require(dplyr)
survey <- mutate(survey, gender = str_sub(gender, 1, 1))
## You might also consider encoding gender as a factor:
survey <- mutate(survey, gender = factor(gender))
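If gender is already a factor, one possible approach is to relabel its levels rather than the values themselves. The sketch below uses a small stand-alone vector (an assumption, not the full survey data) and base R's levels<- with a named list.

```r
# Stand-alone example vector, mirroring the first rows of the gender column
gender <- factor(c("Female", "Male", NA, "Female"))

# Map each new level name to the old level it replaces
levels(gender) <- list(F = "Female", M = "Male")

print(gender)
```

The NA entry is untouched, since it is not a level of the factor.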
CodePudding user response:
A couple extra options for posterity:
#option 1
survey$gender <- as.character(factor(survey$gender,
levels = c("Female", "Male"),
labels = c("F", "M")))
#option 2
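# The unnamed last argument is the default, so NA or unexpected values give NA
survey$gender <- unname(sapply(as.character(survey$gender),
\(x) switch(x, "Female" = "F", "Male" = "M", NA_character_)))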
#option 3
survey$gender <- ifelse(survey$gender == "Male", "M",
ifelse(survey$gender == "Female", "F", NA))
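Since the question asks for a function that takes "Female" and "Male" and returns "F" and "M", another option is a lookup through a named vector. This is a minimal sketch; recode_gender and its map argument are hypothetical names, not part of any package.

```r
# Hypothetical helper: recode values through a named lookup vector
recode_gender <- function(x, map = c(Female = "F", Male = "M")) {
  # as.character() lets factor input work too;
  # NA and values missing from the map come back as NA
  unname(map[as.character(x)])
}

recoded <- recode_gender(c("Female", "Male", NA, "Female"))
print(recoded)
```

It could then be applied as survey$gender <- recode_gender(survey$gender), and extending the mapping only requires adding entries to map.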