How to add a total distance column in 'flights' dataset?

Time:03-29

I am working with the 'flights' dataset from the 'nycflights13' package in R.

I want to add a column that holds the total distance covered by each 'carrier' in 2013. I have already computed the total distance per carrier and stored it in a new variable.
There are 16 carriers, so how do I bind a set of 16 numbers to a data frame with many more rows?

carrier <- flights %>%
  group_by(carrier) %>%
  select(distance) %>%
  summarize(TotalDistance = sum(distance)) %>%
  arrange(desc(TotalDistance))

How can I add these per-carrier totals as a new column in the flights dataset?

Thank you for your time and effort here.

PS. I tried running a for loop, but it doesn't work. I am new to programming.

CodePudding user response:

Use mutate instead of summarize. On grouped data it keeps every row and repeats the group total on each one:

flights %>%
  group_by(carrier) %>%
  mutate(TotalDistance = sum(distance)) %>%
  ungroup() -> carrier
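
To see why this works, here is a minimal sketch on a small made-up data frame. The name toy and its values are hypothetical stand-ins for flights, chosen so the result is easy to check by hand:

```r
library(dplyr)

# Hypothetical toy data standing in for flights (values are made up)
toy <- tibble(
  carrier  = c("AA", "AA", "UA", "UA", "UA"),
  distance = c(100, 200, 50, 60, 70)
)

toy_total <- toy %>%
  group_by(carrier) %>%
  mutate(TotalDistance = sum(distance)) %>%  # group sum is repeated on every row of its group
  ungroup()

toy_total$TotalDistance
# 300 300 180 180 180
```

The row count is unchanged (5 rows in, 5 rows out), whereas summarize would collapse the data to one row per carrier.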

We can also use left_join.

library(nycflights13)
data("flights")
library(dplyr)

# Build the per-carrier totals, then join them back onto every flight row
flights %>%
  left_join(flights %>%
              group_by(carrier) %>%
              select(distance) %>%
              summarize(TotalDistance = sum(distance)) %>%
              arrange(desc(TotalDistance)), by = 'carrier') -> carrier

This will work even if you don't use arrange at the end: left_join matches rows on carrier, so the order of the summary table does not affect the result.
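
As a rough check, the sketch below compares the join and mutate approaches on a small made-up data frame. The names toy, totals, via_join and via_mutate are hypothetical and the values are invented for illustration:

```r
library(dplyr)

# Hypothetical toy data standing in for flights (values are made up)
toy <- tibble(
  carrier  = c("AA", "UA", "AA", "UA", "UA"),
  distance = c(100, 50, 200, 60, 70)
)

# One row per carrier, sorted by total; the sort is optional for the join
totals <- toy %>%
  group_by(carrier) %>%
  summarize(TotalDistance = sum(distance)) %>%
  arrange(desc(TotalDistance))

# Approach 1: join the summary back onto the original rows
via_join <- toy %>%
  left_join(totals, by = "carrier")

# Approach 2: grouped mutate
via_mutate <- toy %>%
  group_by(carrier) %>%
  mutate(TotalDistance = sum(distance)) %>%
  ungroup()

identical(via_join$TotalDistance, via_mutate$TotalDistance)
# TRUE
```

Both approaches keep the original row order and attach the same totals. The join is mainly useful when the summary table is needed on its own as well.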
