I am working on data frames (with 2000≤AY≤2005 and i=P0,P1,P2,P3,P4,P5) that may contain all possible AY like this one (a)
AY P0 P1 P2 P3 P4 P5
2000 250 0 0 0 0 45
2001 0 50 0 0 70 30
2002 20 0 0 0 0 60
2003 500 0 0 10 0 0
2004 10 45 90 0 5 0
2005 70 0 0 55 0 30
or may not like this one (b)
AY P0 P1 P2 P3 P4 P5
2001 750 50 0 0 70 30
2002 20 30 0 0 0 60
2003 60 0 50 10 0 0
2005 50 65 0 55 0 30
Again, i=P0,P1,P2,P3,P4,P5. Let P0=0, P1=1, P2=2, P3=3, P4=4, and P5=5. I want to keep only the column values for which column/s i satisfies/y AY i≤2005, and equate the rest to 0.
Desired output
(a)
AY P0 P1 P2 P3 P4 P5
2000 250 0 0 0 0 45
2001 0 50 0 0 70 0
2002 20 0 0 0 0 0
2003 500 0 0 0 0 0
2004 10 45 0 0 0 0
2005 70 0 0 0 0 0
(b)
AY P0 P1 P2 P3 P4 P5
2001 750 50 0 0 70 0
2002 20 30 0 0 0 0
2003 60 0 50 0 0 0
2005 50 0 0 0 0 0
Here, if AY=2005, then only column value for P0 is kept as it is. If AY=2003, then only column values for P0:P2 are kept as they are. And so on.
How do I do this?
CodePudding user response:
tab1[-1] <- Map(function(P, AY, val) ifelse((P AY) <= 2005, val, 0),
as.integer(sub("\\D", "", names(tab1)[-1])),
list(tab1$AY), tab1[,-1])
tab2[-1] <- Map(function(P, AY, val) ifelse((P AY) <= 2005, val, 0),
as.integer(sub("\\D", "", names(tab2)[-1])),
list(tab2$AY), tab2[,-1])
tab1
# AY P0 P1 P2 P3 P4 P5
# 1 2000 250 0 0 0 0 45
# 2 2001 0 50 0 0 70 0
# 3 2002 20 0 0 0 0 0
# 4 2003 500 0 0 0 0 0
# 5 2004 10 45 0 0 0 0
# 6 2005 70 0 0 0 0 0
tab2
# AY P0 P1 P2 P3 P4 P5
# 1 2001 750 50 0 0 70 0
# 2 2002 20 30 0 0 0 0
# 3 2003 60 0 50 0 0 0
# 4 2005 50 0 0 0 0 0
Data
tab1 <- structure(list(AY = 2000:2005, P0 = c(250L, 0L, 20L, 500L, 10L, 70L), P1 = c(0L, 50L, 0L, 0L, 45L, 0L), P2 = c(0L, 0L, 0L, 0L, 90L, 0L), P3 = c(0L, 0L, 0L, 10L, 0L, 55L), P4 = c(0L, 70L, 0L, 0L, 5L, 0L), P5 = c(45L, 30L, 60L, 0L, 0L, 30L)), class = "data.frame", row.names = c(NA, -6L))
tab2 <- structure(list(AY = c(2001L, 2002L, 2003L, 2005L), P0 = c(750L, 20L, 60L, 50L), P1 = c(50L, 30L, 0L, 65L), P2 = c(0L, 0L, 50L, 0L), P3 = c(0L, 0L, 10L, 55L), P4 = c(70L, 0L, 0L, 0L), P5 = c(30L, 60L, 0L, 30L)), class = "data.frame", row.names = c(NA, -4L))