How to aggregate groups of rows by an irregular interval in R?

Time:10-11

I have a data frame containing the lines of a transcribed conversation, in which each speaker's turn is separated by an empty line. I need to aggregate the lines so that each turn becomes a single row, but the number of lines per turn is irregular. How can I aggregate this data?

The data are like this:

Speech                   Sep line
Was in Augoust           0
Don't you remember?      0
                         1
Yes, i did               0
It was a hot Saturday    0
we were in the park      0
                         1
That's right             0
it was a fun day         0

I want the data to look like this:

speech
Was in Augoust, Don't you remember?
Yes, i did. It was a hot Saturday, we were in the park
That's right,it was a fun day

CodePudding user response:

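Here's a way with dplyr:

library(dplyr)

# Assumes the columns are named `speech` and `sep_line`
df %>% 
  mutate(group = cumsum(sep_line)) %>%   # running count of separators = turn id
  filter(sep_line == 0) %>%              # drop the separator rows
  group_by(group) %>% 
  summarise(speech = paste(speech, collapse = " ")) %>%  # one row per turn
  select(speech)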
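
To see why the cumulative sum works as a grouping key, here is a minimal base-R sketch of the same idea. It is only an illustration, not the answerer's code: the sample data frame is reconstructed from the question, and the column names `speech` and `sep_line` are assumptions (the question shows them as "Speech" and "Sep line").

```r
# Sample data reconstructed from the question; column names are assumed.
df <- data.frame(
  speech = c("Was in Augoust", "Don't you remember?", "",
             "Yes, i did", "It was a hot Saturday", "we were in the park", "",
             "That's right", "it was a fun day"),
  sep_line = c(0, 0, 1, 0, 0, 0, 1, 0, 0),
  stringsAsFactors = FALSE
)

# Each separator row bumps the running total, so every line of one turn
# shares the same group id: 0 0 1 1 1 1 2 2 2
df$group <- cumsum(df$sep_line)

# Drop the separator rows, then paste each group's lines together.
kept <- df[df$sep_line == 0, ]
result <- data.frame(
  speech = as.vector(tapply(kept$speech, kept$group, paste, collapse = " ")),
  stringsAsFactors = FALSE
)

print(result)
```

Because the separator row itself falls into the group it opens, it has to be filtered out before pasting; otherwise each turn after the first would start with an empty string. Changing `collapse = " "` to `collapse = ", "` would join the lines with commas, closer to the output sketched in the question.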