Home > Software design >  convert Days in occurences in R
convert Days in occurences in R

Time:11-09

I have a data of this form :

ID    FIRM   PAY_START    PAY_end
1      A        1         359
1      A        360       500
1      B        720       780
1      C        800       930
1      D        934       1200
2      E        1         316
2      E        320       360
3      F        1         339
3      F        340       342
3      G        345       600

I want First to convert PAY_START( which are in Days) into occurences in the Panel 1, 2, 3 like this:

ID    FIRM   PAY_START    PAY_end  Occurence
1      A        1         359      1
1      A        360       500      2
1      B        720       780      3
1      C        800       930      4
1      D        934       1200     5
2      E        1         316      1
2      E        320       360      2
3      F        1         339      1
3      F        340       342      2
3      G        345       600      3

And I want to compute the lags of Pay_end:

ID    FIRM   PAY_START    PAY_end  Occurence   Lag_Pay_end
1      A        1         359      1           0
1      A        360       500      2           141
1      B        720       780      3           280
1      C        800       930      4           150
1      D        934       1200     5           270
2      E        1         316      1           0
2      E        320       360      2           44
3      F        1         339      1           0
3      F        340       342      2           3
3      G        345       600      3           258

Thank you for your help!

CodePudding user response:

a data.table approach

library(data.table)
DT <- fread("ID    FIRM   PAY_START    PAY_end
1      A        1         359
1      A        360       500
1      B        720       780
1      C        800       930
1      D        934       1200
2      E        1         316
2      E        320       360
3      F        1         339
3      F        340       342
3      G        345       600")

DT[, Occurence := rowid(ID)]
DT[, Lag_Pay_end := PAY_end - shift(PAY_end, type = "lag", fill = PAY_end[1]),
   by = .(ID)]
#    ID FIRM PAY_START PAY_end Occurence Lag_Pay_end
# 1:  1    A         1     359         1           0
# 2:  1    A       360     500         2         141
# 3:  1    B       720     780         3         280
# 4:  1    C       800     930         4         150
# 5:  1    D       934    1200         5         270
# 6:  2    E         1     316         1           0
# 7:  2    E       320     360         2          44
# 8:  3    F         1     339         1           0
# 9:  3    F       340     342         2           3
#10:  3    G       345     600         3         258
   

CodePudding user response:

Performing group-wise calculations using ave.

transform(dat, Occurence=ave(PAY_START, ID, FUN=\(x) as.numeric(as.factor(x))),
          Lag_Pay_end=ave(PAY_end, ID, FUN=\(x) c(0, diff(x))))
#    ID FIRM PAY_START PAY_end Occurence Lag_Pay_end
# 1   1    A         1     359         1           0
# 2   1    A       360     500         2         141
# 3   1    B       720     780         3         280
# 4   1    C       800     930         4         150
# 5   1    D       934    1200         5         270
# 6   2    E         1     316         1           0
# 7   2    E       320     360         2          44
# 8   3    F         1     339         1           0
# 9   3    F       340     342         2           3
# 10  3    G       345     600         3         258

Data:

dat <- structure(list(ID = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L), 
    FIRM = c("A", "A", "B", "C", "D", "E", "E", "F", "F", "G"
    ), PAY_START = c(1L, 360L, 720L, 800L, 934L, 1L, 320L, 1L, 
    340L, 345L), PAY_end = c(359L, 500L, 780L, 930L, 1200L, 316L, 
    360L, 339L, 342L, 600L)), class = "data.frame", row.names = c(NA, 
-10L))
  • Related