Home > Back-end >  Difference between my aggregate() output and tidyverse output
Difference between my aggregate() output and tidyverse output

Time:09-22

I am using a dataset of flights. I try to calculate the average arrival and departure delay for different airports (origin). I tried it with the aggregate function:

average_delay <- aggregate(cbind(arr_delay,dep_delay) ~ origin, FUN = mean, data = flights)
print(average_delay, digits = 3)

After that I tried it also with tidyverse:

library(tidyverse)
average_delay_tidy = flights %>%
  group_by(origin) %>% 
  summarise(arr_delay = mean(arr_delay, na.rm = TRUE),
            dep_delay = mean(dep_delay, na.rm = TRUE)) %>%
  mutate_if(is.numeric, round, digits = 3) %>%
  print

However I got different outputs:

Aggregate:

|origin|arr_delay |dep_delay |
:------|:--------:|:--------:|
|EWR   | 9.56     | **15.0**     |
|JFK   | 5.85     | **12.0**     |
|LGA   | 6.11     | 10.3     |  

Tidyverse:

  <chr>      <dbl>     <dbl>
|origin|arr_delay |dep_delay |
:------|:--------:|:--------:|
|EWR   | 9.56     | **15.1**     |
|JFK   | 5.85     | **12.1**     |
|LGA   | 6.11     | 10.3     | 

De difference is small, however, I do not understand how it is possible? Can someone explain to me why these outputs are different? Is it due to the NA values?

Thanks in advance!

CodePudding user response:

The difference is so small that it seemed to be related to rounding as tibble does some print formatting for display, but in aggregate, if there are NAs, specify the na.action argument which by default is na.omit i.e. it will remove the row completely if there is a single NA element in the row. Change it to NULL

aggregate(cbind(arr_delay,dep_delay) ~ origin, FUN = mean, 
    na.rm = TRUE,  data = flights,
        na.action = NULL)

-output

 origin arr_delay dep_delay
1    EWR  9.107055  15.10795
2    JFK  5.551481  12.11216
3    LGA  5.783488  10.34688

To confirm if there are NA elements

 colSums(is.na(flights[c('arr_delay', 'dep_delay')]))
arr_delay dep_delay 
     9430      8255 
  • Related