Home > Blockchain >  selecting rows with condition in data table R
selecting rows with condition in data table R

Time:10-17

I have a data table as follows:

DT <- data.table(ID = rep(c('A','B','C'),3), Month = rep(c(1,2,3),3),
             Tier = c(rep('Reg',3),'Reg','Reg','Late',rep('Reg',3)))

I want to select the IDs that in certain months (in this case let's say all of months 1, 2, and 3) have the Tier value of Reg. So in this example it should only return IDs A and B, whose Tier is Reg in all the Month. It seems to be simple but I am somehow stuck. I tried:

unique(DT[Month %in% c(1,2,3) & Tier == 'Reg',ID])

But it returns A,B,C.

Thanks in advance.

CodePudding user response:

One possible solution:

DT[,.(select=all(Tier=='Reg')),by=ID][select==T,ID]

[1] "A" "B"

CodePudding user response:

Does this satisfy your needs?

library(data.table)
DT <- data.table(ID = rep(c('A','B','C'),3), Month = rep(c(1,2,3),3),
                 Tier = c(rep('Reg',3),'Reg','Reg','Late',rep('Reg',3)))

DTX <- DT[, .(KEEP = length(unique(Tier)) == 1), by = ID]
DTX
#>    ID  KEEP
#> 1:  A  TRUE
#> 2:  B  TRUE
#> 3:  C FALSE
DTX[KEEP == TRUE, ID]
#> [1] "A" "B"

In case you want to manipulate your data for one case only, replace . by :=.

  • Related