Home > Net >  Retain/zero some column values based on a column value in R
Retain/zero some column values based on a column value in R

Time:12-30

I am working on data frames (with 2000≤AY≤2005 and i=P0,P1,P2,P3,P4,P5) that may contain all possible AY like this one (a)

   AY    P0   P1  P2  P3  P4  P5
  2000  250    0   0   0   0  45
  2001    0   50   0   0  70  30
  2002   20    0   0   0   0  60
  2003  500    0   0  10   0   0
  2004   10   45  90   0   5   0
  2005   70    0   0  55   0  30

or may not like this one (b)

   AY    P0   P1  P2  P3  P4  P5
  2001  750   50   0   0  70  30
  2002   20   30   0   0   0  60
  2003   60    0  50  10   0   0
  2005   50   65   0  55   0  30

Again, i=P0,P1,P2,P3,P4,P5. Let P0=0, P1=1, P2=2, P3=3, P4=4, and P5=5. I want to keep only the column values for which column/s i satisfies/y AY i≤2005, and equate the rest to 0.

Desired output

(a)

   AY    P0   P1  P2  P3  P4  P5
  2000  250    0   0   0   0  45
  2001    0   50   0   0  70   0
  2002   20    0   0   0   0   0
  2003  500    0   0   0   0   0
  2004   10   45   0   0   0   0
  2005   70    0   0   0   0   0

(b)

   AY    P0   P1  P2  P3  P4  P5
  2001  750   50   0   0  70   0
  2002   20   30   0   0   0   0
  2003   60    0  50   0   0   0
  2005   50    0   0   0   0   0

Here, if AY=2005, then only column value for P0 is kept as it is. If AY=2003, then only column values for P0:P2 are kept as they are. And so on.

How do I do this?

CodePudding user response:

tab1[-1] <- Map(function(P, AY, val) ifelse((P AY) <= 2005, val, 0), 
                as.integer(sub("\\D", "", names(tab1)[-1])), 
                list(tab1$AY), tab1[,-1])
tab2[-1] <- Map(function(P, AY, val) ifelse((P AY) <= 2005, val, 0), 
                as.integer(sub("\\D", "", names(tab2)[-1])), 
                list(tab2$AY), tab2[,-1])

tab1
#     AY  P0 P1 P2 P3 P4 P5
# 1 2000 250  0  0  0  0 45
# 2 2001   0 50  0  0 70  0
# 3 2002  20  0  0  0  0  0
# 4 2003 500  0  0  0  0  0
# 5 2004  10 45  0  0  0  0
# 6 2005  70  0  0  0  0  0

tab2
#     AY  P0 P1 P2 P3 P4 P5
# 1 2001 750 50  0  0 70  0
# 2 2002  20 30  0  0  0  0
# 3 2003  60  0 50  0  0  0
# 4 2005  50  0  0  0  0  0

Data

tab1 <- structure(list(AY = 2000:2005, P0 = c(250L, 0L, 20L, 500L, 10L, 70L), P1 = c(0L, 50L, 0L, 0L, 45L, 0L), P2 = c(0L, 0L, 0L, 0L, 90L, 0L), P3 = c(0L, 0L, 0L, 10L, 0L, 55L), P4 = c(0L, 70L, 0L, 0L, 5L, 0L), P5 = c(45L, 30L, 60L, 0L, 0L, 30L)), class = "data.frame", row.names = c(NA, -6L))
tab2 <- structure(list(AY = c(2001L, 2002L, 2003L, 2005L), P0 = c(750L, 20L, 60L, 50L), P1 = c(50L, 30L, 0L, 65L), P2 = c(0L, 0L, 50L, 0L), P3 = c(0L, 0L, 10L, 55L), P4 = c(70L, 0L, 0L, 0L), P5 = c(30L, 60L, 0L, 30L)), class = "data.frame", row.names = c(NA, -4L))
  • Related