I have data that looks like the table below. I would like to remove the 2 rows that follow the last occurrence (max index) of certain types (3 and 4). For example, there are two 4s in my table, but I only need to remove the 2 rows after the second 4. The same goes for 3: I only need to remove the 2 rows after the third 3.
-----------------
| grade | type |
-----------------
| 93 | 2 |
-----------------
| 90 | 2 |
-----------------
| 54 | 2 |
-----------------
| 36 | 4 |
-----------------
| 31 | 4 |
-----------------
| 94 | 1 |
-----------------
| 57 | 1 |
-----------------
| 16 | 3 |
-----------------
| 11 | 3 |
-----------------
| 12 | 3 |
-----------------
| 99 | 1 |
-----------------
| 99 | 1 |
-----------------
The desired output would be:
-----------------
| grade | type |
-----------------
| 93 | 2 |
-----------------
| 90 | 2 |
-----------------
| 54 | 2 |
-----------------
| 36 | 4 |
-----------------
| 31 | 4 |
-----------------
| 16 | 3 |
-----------------
| 11 | 3 |
-----------------
| 12 | 3 |
-----------------
Here is the code of my example:
data <- data.frame(grade = c(93,90,54,36,31,94,57,16,11,12,99,99), type = c(2,2,2,4,4,1,1,3,3,3,1,1))
Could anyone give me some hints on how to approach this in R? Thanks a bunch in advance for your help and your time!
CodePudding user response:
library(dplyr)
data.frame(grade = c(93, 90, 54, 36, 31, 94, 57, 16, 11, 12, 99, 99),
           type = c(2, 2, 2, 4, 4, 1, 1, 3, 3, 3, 1, 1)) %>%
  mutate(large_shadow = slider::slide_dbl(type, ~sum(.x >= 3), .before = 2, .after = -2)) %>%
  filter(large_shadow < 1)
grade type large_shadow
1 93 2 0
2 90 2 0
3 54 2 0
4 36 4 0
5 31 4 0
6 16 3 0
7 11 3 0
Or a base R approach, where df is the data frame from the question:
df$large = df$type >= 3
df$shadow = c(0,0,df$large[1:(nrow(df)-2)])
df <- df[df$shadow == 0, 1:2]
CodePudding user response:
Using some indexing:
data[-(nrow(data) - match(c(3,4), rev(data$type)) + 1 + rep(1:2, each=2)),]
# grade type
#1 93 2
#2 90 2
#3 54 2
#4 36 4
#5 31 4
#8 16 3
#9 11 3
#10 12 3
Or more generically:
vals <- c(3,4)
data[-(nrow(data) - match(vals, rev(data$type)) + 1 + rep(1:2, each=length(vals))),]
The logic is to find the first match of each value in the reversed column (which is its last occurrence in the original order), convert that position back to the original row index, add 1 and 2 to each row index, and then drop those rows.
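To make those steps concrete, here is the same one-liner split into intermediate variables. The names pos_from_end, last_idx and drop_idx are only illustrative and are not part of the answer above; the values in the comments are for the example data from the question.

```r
data <- data.frame(
  grade = c(93, 90, 54, 36, 31, 94, 57, 16, 11, 12, 99, 99),
  type  = c(2, 2, 2, 4, 4, 1, 1, 3, 3, 3, 1, 1)
)
vals <- c(3, 4)

# Position of the first match of each value in the reversed column
pos_from_end <- match(vals, rev(data$type))            # 3 8

# Flip back to the original row index: the last occurrence of each value
last_idx <- nrow(data) - pos_from_end + 1              # 10 5

# The two rows after each last occurrence (last_idx is recycled)
drop_idx <- last_idx + rep(1:2, each = length(vals))   # 11 6 12 7

# Negative indexing drops those rows
result <- data[-drop_idx, ]
```

Here result keeps rows 1 to 5 and 8 to 10, which matches the desired output in the question.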
CodePudding user response:
data[-c(max(which(data$type==3)) + 1:2, max(which(data$type==4)) + 1:2),]
# grade type
# 1 93 2
# 2 90 2
# 3 54 2
# 4 36 4
# 5 31 4
# 8 16 3
# 9 11 3
# 10 12 3
CodePudding user response:
Similar to Ric, but I find it a bit easier to read (way more verbose, though):
library(dplyr)
idx = data %>% mutate(id = row_number()) %>%
  filter(type %in% 3:4) %>% group_by(type) %>% filter(id == max(id)) %>% pull(id)
data[-c(idx + 1, idx + 2),]
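One caveat with negative indexing in general: if none of the target values occur, the index vector is empty and data[-integer(0), ] returns zero rows rather than all of them. Below is a minimal sketch of a base R helper that avoids this by selecting the rows to keep. The function drop_after_last and its arguments are hypothetical names made up for illustration, not an existing function.

```r
data <- data.frame(
  grade = c(93, 90, 54, 36, 31, 94, 57, 16, 11, 12, 99, 99),
  type  = c(2, 2, 2, 4, 4, 1, 1, 3, 3, 3, 1, 1)
)

# Hypothetical helper: drop the `n` rows that follow the last occurrence
# of each value in `vals` within column `col` of `df`.
drop_after_last <- function(df, col, vals, n = 2) {
  # Last row index of each value; NA if the value never occurs
  last_idx <- vapply(vals, function(v) {
    hits <- which(df[[col]] == v)
    if (length(hits) == 0) NA_integer_ else max(hits)
  }, integer(1))
  last_idx <- last_idx[!is.na(last_idx)]

  # Rows to drop, clipped to the number of rows in the table
  drop_idx <- as.integer(unlist(lapply(last_idx, function(i) i + seq_len(n))))
  drop_idx <- drop_idx[drop_idx <= nrow(df)]

  # Select the rows to keep, so an empty drop_idx keeps everything
  df[setdiff(seq_len(nrow(df)), drop_idx), , drop = FALSE]
}

result <- drop_after_last(data, "type", c(3, 4))
```

With the example data, result has the same eight rows as the desired output, and a call such as drop_after_last(data, "type", 7) returns the data unchanged because 7 never occurs.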