Home > Software engineering >  How to combine two columns using some conditions?
How to combine two columns using some conditions?

Time:11-30

I know that this is not the big trouble but I'm juts new to this. I have this output obtained from merging two dataframes. Each one has a column that corresponds to the sex for each participant of an event.

Sex.x Sex.y
M M
F F
F M
M M
F F
M M
NA M
F F

Desired output: the two columns mixed in one that has "?" when their two values doesn't match and that conserves the only value if there is a NA in the adjacent cell.

F_Sex
M
F
?
M
F
M
M
F

I was trying to do it with dplyr package but I just get to this code. I know I need to use if_else but after many tries, I have nothing.

all_data1 <- all_data %>% unite(F_sexo, c(sexo.x, sexo.y), sep = "-", remove = TRUE)

Thanks a lot in advance.

CodePudding user response:

Here is one idea. Use coalesce first to get the rows with only one NA to have the correct sex. And then use an ifelse to change those rows with different sexes to ?.

Notice that if you have a row with both columns are NA, this solution will return NA. Please make sure this is the behavior you want.

library(dplyr)

dat2 <- dat %>%
  mutate(Sex = coalesce(.$Sex.x, .$Sex.y)) %>%
  mutate(Sex = ifelse(Sex.x != Sex.y & !is.na(Sex.x) & !is.na(Sex.y), "?", Sex))
dat2
#   Sex.x Sex.y Sex
# 1     M     M   M
# 2     F     F   F
# 3     F     M   ?
# 4     M     M   M
# 5     F     F   F
# 6     M     M   M
# 7  <NA>     M   M
# 8     F     F   F

DATA

dat <- read.table(text = "Sex.x  Sex.y 
M  M                
F  F
F  M
M  M
F  F
M  M
NA M
 F F", header = TRUE)

CodePudding user response:

Check this solution. The data is assigned as df.

df %>% mutate(F_sex = case_when(Sex.x == Sex.y ~ Sex.x,
                                TRUE ~"?"))

or

df %>% mutate(F_sex = case_when(is.na(Sex.x) ~ Sex.y,
                                is.na(Sex.y) ~ Sex.x,
                                Sex.x == Sex.y ~ Sex.x,
                                TRUE ~"?"))
  • Related