R: How to group by and apply different summary functions to different fields in a dataset

Time:10-16

I'm working on a dataset that I want to group by 'GpsProvider', applying a count to 'BookingID' and a sum to "Distance Travelled".

GpsService_Bokn_Dist <- truck_log %>%
  select(!Minimum_kms_to_be_covered_in_a_day) %>%
  group_by(GpsProvider) %>%
  summarise(across(BookingID, count), across(TRANSPORTATION_DISTANCE_IN_KM, sum))

I also tried the following code, but got a different error:

GpsService_Bokn_Dist <- truck_log %>%
  select(!Minimum_kms_to_be_covered_in_a_day) %>%
  group_by(GpsProvider) %>%
  summarise(across(BookingID, n()), across(TRANSPORTATION_DISTANCE_IN_KM, sum()))

In both cases the error comes from the summarise() call.

CodePudding user response:

You should always provide a reproducible example of your data so that others can help; you can either simulate one or share yours with dput().

library(dplyr)

truck_log <- data.frame(
  booking_id = 1:20,
  gps_provider = sample(1:3, 20, replace = TRUE),
  distance_in_km = runif(n = 20, min = 1, max = 100)
)

head(truck_log)
#>   booking_id gps_provider distance_in_km
#> 1          1            1       72.02301
#> 2          2            3       45.72558
#> 3          3            2       16.43956
#> 4          4            3       94.85217
#> 5          5            2       40.65340
#> 6          6            3       79.95299

truck_log %>%
  group_by(gps_provider) %>%
  summarise(trips = n(), total_km = sum(distance_in_km, na.rm = TRUE))
#> # A tibble: 3 × 3
#>   gps_provider trips total_km
#>          <int> <int>   <dbl>
#> 1            1     5    289.
#> 2            2     3    143.
#> 3            3    12    648.

Created on 2022-10-15 with reprex v2.0.2
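
As for why the original attempts probably fail: across() expects a function (or a list of functions) as its second argument. dplyr's count() works on a data frame rather than on a single column, and n() and sum() written with parentheses are calls, not functions, so in neither version does across() receive something it can apply to the column. Here is a minimal sketch using the column names from the question. The data values are made up for illustration, and the output names (bookings, unique_bookings, total_km) are my own choices, not anything from your dataset.

```r
library(dplyr)

# Hypothetical stand-in for the asker's data; the values are invented.
truck_log <- data.frame(
  BookingID = c("B1", "B2", "B3", "B4", "B5"),
  GpsProvider = c("A", "A", "B", "B", "B"),
  TRANSPORTATION_DISTANCE_IN_KM = c(10, 20, 5, NA, 15)
)

# Option 1: name each summary explicitly inside summarise().
GpsService_Bokn_Dist <- truck_log %>%
  group_by(GpsProvider) %>%
  summarise(
    bookings = n(),                           # number of rows per provider
    unique_bookings = n_distinct(BookingID),  # distinct booking IDs per provider
    total_km = sum(TRANSPORTATION_DISTANCE_IN_KM, na.rm = TRUE)
  )

# Option 2: keep across(), but pass functions instead of calls.
via_across <- truck_log %>%
  group_by(GpsProvider) %>%
  summarise(
    across(BookingID, n_distinct),
    across(TRANSPORTATION_DISTANCE_IN_KM, ~ sum(.x, na.rm = TRUE))
  )

print(GpsService_Bokn_Dist)
print(via_across)
```

With a single column per summary, option 1 is usually easier to read; across() tends to pay off when the same function is applied to several columns at once.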
