I have a dataframe consisting of 515 integer columns and 2 643 246 rows, from which I would like to subset an unkown number of columns and aggregate the data to a single column showing the sum, by two group-columns.
To do the first part I've used the selection-function from data.table like this,
TestData[,c(Kattegori_Henter("Medicine"), "id", "year"), with = FALSE]
Where Kattegori_Henter
is a function returning the name of the columns I would like to select, from a different dataset. From this selection I then want to do the aggregation.
I have attempted a couple different solutions in data.table to perform this aggregation, without getting a result. Given the intro-data.table vignette I believed the solution would be to add
TestData[,c(Kattegori_Henter("Medicine"), "id", "year"), with = FALSE, lapply(.SD,sum, na.rm =
TRUE), by = c(id, year)]
However, this returns the error Provide either by= or keyby= but not both
, which I do not understand the meaning of, and without google giving any good results.
I then attempted:
TestData[,c(Kattegori_Henter("Medicine"), "id", "year"), with = FALSE, a := sum(1.ncol), by = c(id, year)]
Which didn't result in anything at all, other than returning the subsetted dataframe.
The reasoning behind doing this is that I would like to do use lapply on the kattegori_henter
function, aggregating the 525 collumns into a set of cattegories.
Thanks in advance for all help!
Edit: Attempted
TestData[,c(Kattegori_Henter("Medicine"), "id", "year"), with =
FALSE][, lapply(.SD, sum, na.rm = TRUE), by = c("id", "year")]
As mentioned in the comments. The results was the same as the 2nd code above, returning an unchanged dataframe.
Edit 2: Removed this from the question, due to a comment on it not producing the wanted results: ", which would be equal to the tidyverse-code:
Test2 %>%
group_by(id, year) %>%
summarise(a = sum(1:ncol(.), na.rm = TRUE)) "
CodePudding user response:
I think the code you're looking for is likely:
TestData[, .(a = sum(.SD)), by = .(id, year), .SDcols = Kattegori_Henter("Medicine")]