How to sum across rows with all NAs to be 0/NA

Time:05-17

I have a dataframe:

dat <- data.frame(X1 = c(0, NA, NA),
                  X2 = c(1, NA, NA),
                  X3 = c(1, NA, NA),
                  Y1 = c(1, NA, NA),
                  Y2 = c(NA, NA, NA),
                  Y3 = c(0, NA, NA))

I want to create a composite score for X and Y variables. This is what I have so far:

library(dplyr)

clean_dat <- dat %>%
  rowwise() %>%
  mutate(X = sum(c(X1, X2, X3), na.rm = TRUE),
         Y = sum(c(Y1, Y2, Y3), na.rm = TRUE))

However, I want the composite score for the rows where every value is NA (i.e. rows 2 and 3) to be 0 in columns X and Y. Does anyone know how to do this?

Edit: I'd like to know how I can make X and Y in rows 2 and 3 NA too.

Thanks so much!

CodePudding user response:

With na.rm = TRUE, both sum and rowSums return 0 when all the elements are NA, so the code in the question already gives 0 for rows 2 and 3. To get NA for those rows instead, use an if/else or case_when approach: check with if_any whether the row has any non-NA element, and take the rowSums of the relevant columns only when it does. Rows that match no condition in case_when get NA by default.
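The zero for an all-NA row can be checked directly in base R. This is a minimal sketch; the variable names are illustrative and not taken from the question:

```r
# With na.rm = TRUE every NA is dropped first, and the sum of nothing is 0
all_na <- c(NA_real_, NA_real_, NA_real_)

sum_dropping_na <- sum(all_na, na.rm = TRUE)                         # 0
sum_keeping_na  <- sum(all_na)                                       # NA
row_sum_dropped <- rowSums(matrix(all_na, nrow = 1), na.rm = TRUE)   # 0
```

This is why any approach that should yield NA has to test for "no observed values" separately from the sum itself.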

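library(dplyr)

# Sum each group of columns only for rows with at least one non-NA value;
# case_when() returns NA for the rows where the condition is FALSE
dat %>%
  mutate(
    X = case_when(if_any(starts_with('X'), complete.cases) ~
                    rowSums(across(starts_with('X')), na.rm = TRUE)),
    Y = case_when(if_any(starts_with('Y'), complete.cases) ~
                    rowSums(across(starts_with('Y')), na.rm = TRUE))
  )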

Output:

  X1 X2 X3 Y1 Y2 Y3  X  Y
1  0  1  1  1 NA  0  2  1
2 NA NA NA NA NA NA NA NA
3 NA NA NA NA NA NA NA NA
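
The same idea can be sketched in base R without dplyr. The helper below, row_sum_or, is a hypothetical function written for illustration (it is not part of any package or of the answer above). It assumes the columns to combine share a name prefix, and its empty argument chooses what an all-NA row should become, NA or 0:

```r
# Hypothetical helper: row-wise sum over the columns whose names start with
# `prefix`; rows with no observed values get `empty` (NA by default).
row_sum_or <- function(df, prefix, empty = NA_real_) {
  cols  <- df[startsWith(names(df), prefix)]     # select the matching columns
  total <- rowSums(cols, na.rm = TRUE)           # all-NA rows come out as 0 here
  total[rowSums(!is.na(cols)) == 0] <- empty     # overwrite rows that had no data
  total
}

dat <- data.frame(X1 = c(0, NA, NA),
                  X2 = c(1, NA, NA),
                  X3 = c(1, NA, NA),
                  Y1 = c(1, NA, NA),
                  Y2 = c(NA, NA, NA),
                  Y3 = c(0, NA, NA))

x_na   <- row_sum_or(dat, "X")             # all-NA rows become NA
y_na   <- row_sum_or(dat, "Y")
y_zero <- row_sum_or(dat, "Y", empty = 0)  # all-NA rows become 0
```

Counting the non-NA values per row with rowSums(!is.na(cols)) keeps the "is there any data" check separate from the sum, so switching between 0 and NA is a one-argument change.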