id hire_month sep_month wage_jan wage_feb wage_mar wage_apr
1 1 2 3 740 780 780 780
2 1 4 0 890 890 890 890
3 2 3 5 550 550 550 550
4 2 5 10 890 250 250 400
wage_may wage_jun wage_jul wage_aug wage_sep wage_oct wage_nov
1 780 780 780 780 780 780 780
2 890 890 890 890 890 790 250
3 550 550 550 550 550 550 550
4 500 890 600 750 890 300 300
wage_dec
1 780
2 300
3 550
4 300
structure(list(id = c(1L, 1L, 2L, 2L), hire_month = c(2L, 4L,
3L, 5L), sep_month = c(3L, 0L, 5L, 10L), wage_jan = c(740L, 890L,
550L, 890L), wage_feb = c(780L, 890L, 550L, 250L), wage_mar = c(780L,
890L, 550L, 250L), wage_apr = c(780L, 890L, 550L, 400L), wage_may = c(780L,
890L, 550L, 500L), wage_jun = c(780L, 890L, 550L, 890L), wage_jul = c(780L,
890L, 550L, 600L), wage_aug = c(780L, 890L, 550L, 750L), wage_sep = c(780L,
890L, 550L, 890L), wage_oct = c(780L, 790L, 550L, 300L), wage_nov = c(780L,
250L, 550L, 300L), wage_dec = c(780L, 300L, 550L, 300L)), class = "data.frame", row.names = c(NA,
-4L))
I would like to take the differences between wages based on hire_month and lag(sep_month) -- separation month for the previous row and same id. For example, if lag(sep_month) is 4 (april) and hire_month is 7 (july), I would like to take the differences between wage_jul and wage_apr (wage_apr being the one on the previous row). So the wages I need to take differences between are actually in different rows, because each row in my dataset is a contract.
I have a large dataset, so I would like a way to automate this association.
CodePudding user response:
Updated
The OP clarified that the sep month value comes from the row above, which means that there are n-1 rows for each ID, where n is the number of rows for that ID. In the example above, there are two rows per ID, we will estimate the difference for the 2nd (and last) row only for each ID
data$diff = as_tibble(data) %>%
mutate(sep_wage = apply(.,1,function(x) x[x[3] 3])) %>%
group_by(id) %>%
mutate(sep_wage = lag(sep_wage)) %>% ungroup() %>%
apply(.,MARGIN = 1, function(x) x[x[2] 3] - x[16])
Output
id hire_month sep_month wage_jan wage_feb wage_mar wage_apr wage_may wage_jun wage_jul wage_aug wage_sep wage_oct wage_nov wage_dec diff
1 1 2 3 740 780 780 780 780 780 780 780 780 780 780 780 NA
2 1 4 0 890 890 890 890 890 890 890 890 890 790 250 300 110
3 2 3 5 550 550 550 550 550 550 550 550 550 550 550 550 NA
4 2 5 10 890 250 250 400 500 890 600 750 890 300 300 300 -50