Operating across a list of data.tables

Time:11-26

Although this may seem like a straightforward task to some, as a beginner in R it has been frustrating! The task is as follows. I have a table with the following columns:

colnames(gov_data)
 [1] "year"               "quarter"            "employed"          
 [4] "newhires"           "separations"        "jobscreated"       
 [7] "jobsdestroyed"      "state"              "mw"                
[10] "teen_wage"          "teen_pop"           "adult_wage"        
[13] "teen_share_working" "unemp_primemale"    "recession"         
[16] "period" 

Using state_list<-split(gov_data, gov_data$state) I now have a list of data.tables corresponding to each state. Within each of these data.tables, I want to order by date. Here is how I did that. If this is inefficient, I welcome your alternatives!

orderfun <- function(x) {
  x[order(period)]
}

# lapply() returns a new list and leaves state_list unchanged, so assign the result
state_list <- lapply(state_list, orderfun)

I now want to add a column labeled "change_mw" that holds the period-to-period change in the "mw" column. I know how to do that for a single data.table: I would create a lagged column holding the value of "mw" at t-1, take the difference between the two columns, and then drop the helper column:

one_table[, mw_t_minus_1 := shift(mw, n = 1, type = "lag")][, change_mw := mw - mw_t_minus_1][, mw_t_minus_1 := NULL]

How can I do this across multiple data.tables in a list? Is it even possible to use the data.table [i, j, by] syntax in this instance? How would you go about this task? Once again, your help is very much appreciated!

CodePudding user response:

Here is an example that does something similar; I could get closer with proper demo data.

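library(data.table)
dtCars <- data.table(mtcars, keep.rownames=TRUE)

# Within each cyl group, walk the rows in order of hp and store
# the difference from the previous row's hp (NA for the first row)
dtCars[order(hp), change:= hp-shift(hp), by=cyl]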
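
Applied to the question's columns, the same idea might look like the sketch below. The small `gov_data` here is made-up stand-in data (only `state`, `period` and `mw`, with invented values), and `by_state` is just an illustrative name, so treat this as a starting point rather than a tested solution for the real table. It shows two routes: grouping with `by = state` on the unsplit table, and running the same `[i, j]` expression on each element of the list with `lapply`.

```r
library(data.table)

# Hypothetical stand-in for gov_data: two states, three periods each
gov_data <- data.table(
  state  = rep(c("AL", "AK"), each = 3),
  period = rep(3:1, times = 2),
  mw     = c(7.25, 5.85, 5.15, 8.00, 7.25, 7.15)
)

# Route 1: no split needed. Sort once, then lag within each state.
by_state <- copy(gov_data)              # copy so gov_data itself is untouched
setorder(by_state, state, period)       # sorts by reference
by_state[, change_mw := mw - shift(mw, n = 1, type = "lag"), by = state]

# Route 2: keep the list and apply the same steps to every element.
state_list <- split(gov_data, by = "state")   # data.table's split method
state_list <- lapply(state_list, function(dt) {
  dt <- dt[order(period)]               # new, ordered data.table
  dt[, change_mw := mw - shift(mw)]     # shift() defaults to a lag of 1
  dt                                    # return the table explicitly
})
```

If the list is only needed for this calculation, the first route is probably simpler, since `by` already does the per-state grouping; `rbindlist(state_list)` will stack the list back into one table if you do go the second way.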