Modify data.table in function and aggregate

Time:10-28

I'd like to have a function that first modifies the data.table (if certain criteria are fulfilled) and then aggregates some values.

library(data.table)
library(nycflights13)

df.flights <- flights
setDT(df.flights)


aggregate_flights <- function(x, test = FALSE) {
  
  
  if (test == TRUE) {
    df_flights_red <- df.flights[tailnum == "N9EAMQ" | tailnum == "N950UW" | tailnum == "N460WN"]
  } else {
    df_flights_red <- df.flights
  }
  
  y <- df_flights_red[, .(air_time = sum(air_time, na.rm = TRUE),
             distance = sum(distance, na.rm = TRUE)), 
         by = .(month, x)]
  return(y)
}

agg <- aggregate_flights(df_flights_red[[tailnum]], TRUE)

I always get the error message that object "df_flights_red" can't be found. It seems that my call of the function isn't correct:

agg <- aggregate_flights(df_flights_red[[tailnum]], TRUE)

How do I have to make this call?

CodePudding user response:

The error message makes sense: there is no object named df_flights_red in your global environment. df_flights_red is created inside the function and exists only there, so you cannot access it from outside.
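To illustrate the scoping point, here is a minimal base R sketch. The function make_subset and the variable local_result are made-up names for this illustration only, they are not part of the question's code.

```r
# Hypothetical function, used only to illustrate local scope
make_subset <- function() {
  local_result <- c(1, 2, 3)  # created inside the function body
  sum(local_result)           # only the returned value leaves the function
}

total <- make_subset()
total                    # 6
exists("local_result")   # FALSE: the name is not visible to the caller
```

The same applies to df_flights_red: the call aggregate_flights(df_flights_red[[tailnum]], TRUE) tries to evaluate df_flights_red in the calling environment, where it was never defined.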

Provided I have understood you correctly, here is what you can do:

  • Pass the data as the first argument to the function. This is not strictly needed, but it is good practice.

  • Pass the column name as a string ('tailnum').

  • test == TRUE is redundant; use only test.

  • A == 'a' | A == 'b' | A == 'c' can be changed to A %in% c('a', 'b', 'c').
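As a quick check of the last point, the following base R sketch compares both forms on a small made-up vector (the values and the names wanted, with_or and with_in are assumptions for illustration). The two forms agree except on missing values, where == yields NA and %in% yields FALSE. Since data.table treats NA in a logical i as FALSE, both should select the same rows when used for subsetting.

```r
# Made-up tail numbers, including one missing value
tailnum <- c("N9EAMQ", "N950UW", "N460WN", "N0EGMQ", NA)
wanted  <- c("N9EAMQ", "N950UW", "N460WN")

with_or <- tailnum == "N9EAMQ" | tailnum == "N950UW" | tailnum == "N460WN"
with_in <- tailnum %in% wanted

with_or  # TRUE TRUE TRUE FALSE NA
with_in  # TRUE TRUE TRUE FALSE FALSE
```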

library(data.table)

df.flights <- nycflights13::flights
setDT(df.flights)


aggregate_flights <- function(data, x, test = FALSE) {
  
  if (test) {
    df_flights_red <- data[tailnum %in% c("N9EAMQ", "N950UW", "N460WN")]
  } else {
    df_flights_red <- data
  }
  
  y <- df_flights_red[, .(air_time = sum(air_time, na.rm = TRUE),
                          distance = sum(distance, na.rm = TRUE)), 
                      by = c('month', x)]
  return(y)
}

agg <- aggregate_flights(df.flights, 'tailnum', TRUE)
agg

#    month tailnum air_time distance
# 1:     1  N9EAMQ     2543    15944
# 2:     1  N950UW      597     2608
# 3:     1  N460WN      471     3040
# 4:    10  N9EAMQ     1346     9040
# 5:    10  N460WN      569     3966
# 6:    10  N950UW      728     3404
# 7:    11  N950UW      481     1856
#...
#...
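
If you want to see the grouping step in isolation, here is a small sketch on made-up data (the table toy and the variable group_col are assumptions, not taken from nycflights13). It shows that by accepts a character vector, so a column name stored in a variable can be combined with a fixed column name.

```r
library(data.table)

# Made-up toy data, only for illustration
toy <- data.table(
  month    = c(1L, 1L, 1L, 2L),
  tailnum  = c("A", "A", "B", "A"),
  air_time = c(10, NA, 5, 7),
  distance = c(100, 200, 50, 70)
)

group_col <- "tailnum"  # column name held in a variable, like the argument x

# Sum both columns per month and per the column named in group_col
res <- toy[, .(air_time = sum(air_time, na.rm = TRUE),
               distance = sum(distance, na.rm = TRUE)),
           by = c("month", group_col)]
res
```

With this toy input, res should have three rows, one per month/tailnum combination, and the NA in air_time is dropped by na.rm = TRUE.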