I have a data.table that is supposed to remove all rows per group
until a negative number is met in value
(including the row with the negative number itself). However, if there is no negative number in value
I would like to keep all rows from that group.
# Example data
group = rep(1:4,each=3)
value = c(1,2,3,1,-2,3,1,2,-3,-1,2,3)
DT = data.table(group,value)
> DT
group value row_idx
1: 1 1 1
2: 1 2 2
3: 1 3 3
4: 2 1 1
5: 2 -2 2
6: 2 3 3
7: 3 1 1
8: 3 2 2
9: 3 -3 3
10: 4 -1 1
11: 4 2 2
12: 4 3 3
My attempt so far:
DT[,row_idx := seq_len(.N), by = "group"] #append row index per group
DT[,.SD[row_idx > (which(sign(value) == -1))], by = "group"]
group value row_idx
1: 2 3 3
2: 4 2 2
3: 4 3 3
In this example group 1
is being deleted although I would like to keep it as no negative number is present in this group. I can check for the presence/absence of negative signs in value
by DT[,(-1) %in% sign(value), by = "group"]
but I do not know how to use this to achieve what I want.
CodePudding user response:
We may use a if/else
condition
library(data.table)
DT[DT[, if(any(sign(value) < 0))
.I[row_idx > (which(sign(value) == -1))] else .I, by = group]$V1]
-output
group value row_idx
<int> <num> <int>
1: 1 1 1
2: 1 2 2
3: 1 3 3
4: 2 3 3
5: 4 2 2
6: 4 3 3
Or slightly more compact option
DT[DT[, .I[seq_len(.N) > match(-1, sign(value), nomatch = 0)], group]$V1]
group value
<int> <num>
1: 1 1
2: 1 2
3: 1 3
4: 2 3
5: 4 2
6: 4 3
CodePudding user response:
DT[, .SD[if (min(value) > 0) TRUE else -(1:which.max(value < 0))], by = group]
# group value
# <int> <num>
# 1: 1 1
# 2: 1 2
# 3: 1 3
# 4: 2 3
# 5: 4 2
# 6: 4 3