How do I average values using group_by and summarise where some entries are NA?

Time:10-04

I have the following dummy data.

Country <- c("Afghanistan", "Afghanistan", "Afghanistan", "Albania", "Albania", "Albania")
Year <- c(2001, 2002, 2003, 2001, 2002, 2003)
Count <- c(15, 18, NA, 12, 17, 19)
# Combine the vectors into the data frame used below
df <- data.frame(Country, Year, Count)

I want to find the mean of Count by Country. My code so far is:

df %>% 
  group_by(Country, .drop = FALSE) %>% 
  summarise(Average_Count = mean(Count))

But this returns an NA value for Afghanistan, when I'm hoping for a value of 16.5.

I tried setting the .drop argument in group_by(), as you can see, but perhaps I haven't done it right. Or maybe the problem is with mean(), given that one of the values is NA? Any help appreciated!

CodePudding user response:

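Just add na.rm = TRUE inside mean(). By default, mean() returns NA when any of its inputs is NA. The .drop argument only controls whether empty factor levels are kept as groups, so it has no effect here.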

> df %>% 
    group_by(Country, .drop = FALSE) %>% 
    summarise(Average_Count = mean(Count, na.rm=TRUE))
# A tibble: 2 × 2
  Country     Average_Count
  <chr>               <dbl>
1 Afghanistan          16.5
2 Albania              16
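
For anyone who wants to run this from scratch, here is a minimal self-contained sketch. It assumes dplyr is installed and rebuilds the dummy data from the question as a data frame. The N_Used column is a name made up for this illustration, not part of the original answer; it shows how many non-missing values went into each mean.

```r
library(dplyr)

# Rebuild the dummy data from the question as a data frame
df <- data.frame(
  Country = c("Afghanistan", "Afghanistan", "Afghanistan",
              "Albania", "Albania", "Albania"),
  Year    = c(2001, 2002, 2003, 2001, 2002, 2003),
  Count   = c(15, 18, NA, 12, 17, 19)
)

result <- df %>%
  group_by(Country) %>%
  summarise(
    # na.rm = TRUE drops missing values before averaging
    Average_Count = mean(Count, na.rm = TRUE),
    # N_Used is a hypothetical helper column: non-missing values per group
    N_Used = sum(!is.na(Count))
  )

print(result)
```

Reporting the number of values used alongside the mean makes it visible that the Afghanistan average is based on two observations rather than three.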