I am trying to find the five shortest minimum distances, called min_dist, by origin/destination in the nycflights13 package in RStudio. The result should be a tibble with 5 rows and 3 columns (origin, dest, and min_dist).
I am a beginner and this is what I have so far:
Q3 <- flights %>%
arrange(flights, distance)
group_by(origin) %>%
summarise(min_dist = origin/dest)
I am getting the error: Error in group_by(origin) : object 'origin' not found. Any hints on what to do? A lot of the other questions are similar to this, so I want to figure out how to do these. Thank you.
CodePudding user response:
This can be done by selecting the columns of interest, keeping the distinct rows, and applying slice_min with n = 5:
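library(nycflights13)  # provides the flights data
library(dplyr)
flights %>%
  select(origin, dest, min_distance = distance) %>%  # keep and rename the columns of interest
  distinct() %>%                                     # one row per origin/dest/distance combination
  slice_min(n = 5, order_by = min_distance, with_ties = FALSE)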
-output
# A tibble: 5 × 3
  origin dest  min_distance
  <chr>  <chr>        <dbl>
1 EWR    LGA             17
2 EWR    PHL             80
3 JFK    PHL             94
4 LGA    PHL             96
5 EWR    BDL            116
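As for the error itself: the arrange() line in the question has no %>% after it, so group_by(origin) runs as a separate statement with no data frame, and R looks for a standalone object called origin. arrange(flights, distance) also passes flights a second time even though the pipe already supplies it, and origin/dest tries to divide one character column by another. A minimal sketch of the original group_by/summarise idea, assuming "minimum distance" means the smallest distance recorded for each origin/dest pair:

```r
library(nycflights13)
library(dplyr)

Q3 <- flights %>%
  group_by(origin, dest) %>%                                   # one group per route
  summarise(min_dist = min(distance), .groups = "drop") %>%    # smallest distance per route
  slice_min(min_dist, n = 5, with_ties = FALSE)                # keep the five smallest

Q3
```

This names the column min_dist as the question asks, and should give the same five routes as the output above.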
CodePudding user response:
We could use top_n with a negative n, which keeps the rows with the smallest values:
library(nycflights13)
library(dplyr)
flights %>%
  select(origin, dest, distance) %>%
  distinct() %>%
  top_n(-5, distance) %>%  # negative n keeps the 5 smallest distances
  arrange(distance)
  origin dest  distance
  <chr>  <chr>    <dbl>
1 EWR    LGA         17
2 EWR    PHL         80
3 JFK    PHL         94
4 LGA    PHL         96
5 EWR    BDL        116