I have this type of data, where Sequ is a grouping variable:
df <- data.frame(
  Sequ = c(1,1,1,1,
           2,2,2,
           3,3,3,
           4,4,4,4,
           5,5,5,
           6,6,6,6),
  Speaker = c("A","B",NA,"A",
              "B",NA,"C",
              "A",NA,"A",
              "A","C",NA,"A",
              "A",NA,"C",
              "B","A",NA,"C")
)
For each Sequ, I want to remove the second row on the condition that its Speaker value is not NA. I've tried this, but it removes the whole Sequ:
library(dplyr)
df %>%
  group_by(Sequ) %>%
  filter(!is.na(nth(Speaker, 2)))
How can I obtain this desired output?
df
1 1 A
2 1 <NA>
3 1 A
4 2 B
5 2 <NA>
6 2 C
7 3 A
8 3 <NA>
9 3 A
10 4 A
11 4 <NA>
12 4 A
13 5 A
14 5 <NA>
16 5 C
17 6 B
18 6 <NA>
19 6 C
CodePudding user response:
With dplyr, keep every row that is either not the second row of its group or has an NA Speaker:
library(dplyr)
df %>%
  group_by(Sequ) %>%
  filter(row_number() != 2 | is.na(Speaker))
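It may help to see why this works where the attempt in the question does not. nth(Speaker, 2) returns a single value per group, which filter() recycles across every row of that group, so whole groups are kept or dropped. row_number() is evaluated per row, so only the row in position 2 can be removed. A minimal sketch, assuming the df from the question (rebuilt here so the block runs on its own; the names `attempt` and `result` are only illustrative):

```r
library(dplyr)

# Data from the question, rebuilt so the block is self-contained
df <- data.frame(
  Sequ = c(1,1,1,1, 2,2,2, 3,3,3, 4,4,4,4, 5,5,5, 6,6,6,6),
  Speaker = c("A","B",NA,"A", "B",NA,"C", "A",NA,"A",
              "A","C",NA,"A", "A",NA,"C", "B","A",NA,"C")
)

# The attempt from the question: one TRUE/FALSE per group,
# so groups whose 2nd Speaker is NA disappear entirely
attempt <- df %>%
  group_by(Sequ) %>%
  filter(!is.na(nth(Speaker, 2))) %>%
  ungroup()
unique(attempt$Sequ)  # 1 4 6

# Row-wise condition: keep a row unless it is the 2nd row
# of its group AND its Speaker is not NA
result <- df %>%
  group_by(Sequ) %>%
  filter(row_number() != 2 | is.na(Speaker)) %>%
  ungroup()
nrow(result)  # 18
```

The trailing ungroup() is optional; it only drops the grouping so that later steps operate on the whole table.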
CodePudding user response:
In base R:
subset(df, ave(Sequ, Sequ, FUN=seq_along) != 2 | is.na(Speaker))
Sequ Speaker
1 1 A
3 1 <NA>
4 1 A
5 2 B
6 2 <NA>
7 2 C
8 3 A
9 3 <NA>
10 3 A
11 4 A
13 4 <NA>
14 4 A
15 5 A
16 5 <NA>
17 5 C
18 6 B
20 6 <NA>
21 6 C
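The one-liner above can be unpacked into steps, which may make the logic easier to check. This is only a sketch under the assumption that df is the data frame from the question (rebuilt here so the block is self-contained); the names `pos`, `drop` and `result` are illustrative and not part of the original answer:

```r
# Data from the question, rebuilt so the block is self-contained
df <- data.frame(
  Sequ = c(1,1,1,1, 2,2,2, 3,3,3, 4,4,4,4, 5,5,5, 6,6,6,6),
  Speaker = c("A","B",NA,"A", "B",NA,"C", "A",NA,"A",
              "A","C",NA,"A", "A",NA,"C", "B","A",NA,"C")
)

# Position of each row within its Sequ group: 1,2,3,4, 1,2,3, ...
pos <- ave(df$Sequ, df$Sequ, FUN = seq_along)

# Flag a row for removal only when it is in position 2
# and its Speaker is not NA
drop <- pos == 2 & !is.na(df$Speaker)
which(drop)  # 2 12 19

# Keep everything that was not flagged
result <- df[!drop, ]

# Optional: renumber the rows 1..18 instead of keeping the original row names
rownames(result) <- NULL
```

Negating the flag (`!drop`) is equivalent to the condition `pos != 2 | is.na(Speaker)` used in subset() above.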