aggregate function in R, sum of NAs are 0

Time:07-24

I have seen several questions on Stack Overflow about this, but none of them got a satisfactory answer. This is a follow-up to the question "Blend of na.omit and na.pass using aggregate?"

> test <- data.frame(name = rep(c("A", "B", "C"), each = 4),
  var1 = rep(c(1:3, NA), 3),
  var2 = 1:12,
  var3 = c(rep(NA, 4), 1:8))

> test
   name var1 var2 var3
1     A    1    1   NA
2     A    2    2   NA
3     A    3    3   NA
4     A   NA    4   NA
5     B    1    5    1
6     B    2    6    2
7     B    3    7    3
8     B   NA    8    4
9     C    1    9    5
10    C    2   10    6
11    C    3   11    7
12    C   NA   12    8

When I try the solution given there, but compute the sum instead of the mean:

aggregate(. ~ name, test, FUN = sum, na.action=na.pass, na.rm=TRUE)

the solution no longer behaves as expected. For a group whose values are all NA, the sum is reported as 0 instead of NaN.

Why doesn't this work for FUN = sum, and how can I make it work?

CodePudding user response:

Use an anonymous function with a condition that returns NaN when all elements are NA:

aggregate(. ~ name, test, FUN = function(x) if(all(is.na(x))) NaN
     else sum(x, na.rm = TRUE), na.action=na.pass)

-output

  name var1 var2 var3
1    A    6   10  NaN
2    B    6   26   10
3    C    6   42   26
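
If the same rule is needed in several calls, the condition can be pulled out into a named helper. The sketch below is one way to do that; the name sum_or_nan is hypothetical and not part of base R or of the original answer. It rebuilds the test data from the question so it runs on its own.

```r
# Hypothetical helper (the name is an assumption for illustration):
# returns NaN for an all-NA vector, otherwise the sum ignoring NAs.
sum_or_nan <- function(x) {
  if (all(is.na(x))) NaN else sum(x, na.rm = TRUE)
}

# Same data as in the question.
test <- data.frame(name = rep(c("A", "B", "C"), each = 4),
                   var1 = rep(c(1:3, NA), 3),
                   var2 = 1:12,
                   var3 = c(rep(NA, 4), 1:8))

# na.pass keeps rows containing NA so the helper sees them per group.
res <- aggregate(. ~ name, test, FUN = sum_or_nan, na.action = na.pass)
res
```

Note that the helper also returns NaN for a zero-length vector, because all(is.na(x)) is TRUE when x is empty.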

This is expected behavior for sum with na.rm = TRUE: once the NAs are dropped, an all-NA group is an empty vector. According to ?sum

the sum of an empty set is zero, by definition.

> sum(c(NA, NA), na.rm = TRUE)
[1] 0
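
To see why the same call gives NaN with mean but 0 with sum, it may help to compare both functions on the empty vector that is left after the NAs are removed. This is a small illustrative sketch; the variable names x and empty are assumptions made for the example.

```r
# An all-NA group, as in var3 for name "A".
x <- c(NA_real_, NA_real_)

# What remains once na.rm = TRUE drops the NAs: a zero-length vector.
empty <- x[!is.na(x)]
length(empty)

# sum of nothing is 0; mean of nothing is 0 / 0, which is NaN.
sum(empty)
mean(empty)

# The na.rm = TRUE calls reduce to exactly those two cases.
sum(x, na.rm = TRUE)
mean(x, na.rm = TRUE)
```

So the na.rm = TRUE approach only signals an all-NA group when the summary function itself is undefined on an empty input, which is true for mean but not for sum.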