I'm having trouble building a data table that matches numbers based on two conditions (ID and date). Below is an example of a table snippet containing batch data.
ID | Power | Fuel | Starting_date | Shutting_down_date |
---|---|---|---|---|
El_Bel | 344 | WB | 1983 | 2030 |
El_Opo | 256 | WK | 1987 | 2027 |
El_Tur | 400 | WB | 2019 | 2049 |
The question is how do I effectively match this data so that the data in the "Power" column is matched until the last year of operation by column "Shutting_down_date" is reached.
Date | El_Bel | El_Opo | El_Tur |
---|---|---|---|
2017 | |||
2018 | |||
2019 | |||
2020 | |||
2021 |
Many thanks for any suggestions.
CodePudding user response:
Let us call the first dataframe x
and the second data frame y
and let us further assume that each ID only occurs once in the first table. The problem is that you have a different number of years for each ID which means that they cannot be stored in a data.frame
(requires all columns to have the same length). You can use a list
, though:
result <- list()
for (i in 1:nrow(x)) {
id <- x[i,"ID"]
end_date <- x[i,"Shutting_down_date"]
result[[id]] <- subset(y[,c("Date",id)], Date <= end_date)
}
Then you can query the results as result[["El_Bel"]]
or result$El_Bel
etc.