How to mearge two column in a table based on certain condition in R-CodePudding

I have a table in R like this:

      x     Y
1     2     1
2     1     1
3    NA     1
4     2    NA
5     1     2
6     2     2
7     1     1

and what I'm hoping to do is make a new column called xy which bases on if there is a 1 exist in either x or y.

For example, if x is 1 and y is 2 then the xy should be 1 ; if x is NAand y is 1 then the xyshould be 1. If both x and y is 2 then xyshould be 2.

The priority of the categorical variables 1, 2 and NA is 1>2>NA. In short what my desired output looks like this:

      x     Y     XY
1     2     1     1
2     1     1     1
3    NA     1     1
4     2    NA     2
5     NA   NA     NA
6     2     2     2
7     1     1     1

I'm new to R and trying to trim my data. Thank you for your help! I'm really appreciated:)

CodePudding user response：

Just a simple case_when, too long for a comment. Let me know if it works:

library(dplyr)

df %>%
  mutate(XY = case_when(
    x == 1 | Y == 1 ~ 1,
    x == 2 | Y == 2 ~ 2,
    TRUE ~ NA
    )
  )

CodePudding user response：

Try this

library(dplyr)

df |> rowwise() |>
   mutate(z1 = coalesce(c_across(x) , 0) , z2 = coalesce(c_across(Y) , 0)) |>
   mutate(XY = case_when(any(c_across(z1:z2) == 1) ~ 1 , any(c_across(z1:z2) == 2) ~ 2)) |>
   select(-z1 , -z2) |> ungroup() -> ans

output

# A tibble: 7 × 3
      x     Y    XY
  <int> <int> <dbl>
1     2     1     1
2     1     1     1
3    NA     1     1
4     2    NA     2
5    NA    NA    NA
6     2     2     2
7     1     1     1

data

df <- structure(list(x = c(2L, 1L, NA, 2L, NA, 2L, 1L), Y = c(1L, 1L, 
1L, NA, NA, 2L, 1L)), row.names = c("1", "2", "3", "4", "5", 
"6", "7"), class = "data.frame")