I have a situation where I need to group Year values (stored as numbers) and, within each Org, keep only the first row of every run of consecutive years. Example data:
Org | Year | Value |
---|---|---|
A | 2011 | 1 |
A | 2012 | 1 |
A | 2013 | 2 |
A | 2016 | 2 |
A | 2017 | 2 |
A | 2018 | 2 |
A | 2019 | 2 |
A | 2022 | 5 |
B | 2007 | 1 |
B | 2008 | 1 |
B | 2009 | 1 |
B | 2015 | 1 |
B | 2016 | 1 |
B | 2019 | 3 |
B | 2021 | 4 |
B | 2022 | 5 |
Expected Output:
Org | Year | Value |
---|---|---|
A | 2011 | 1 |
A | 2016 | 2 |
A | 2022 | 5 |
B | 2007 | 1 |
B | 2015 | 1 |
B | 2019 | 3 |
B | 2021 | 4 |
Thank You!
Answer:
library(dplyr)
df <- read.table(header = TRUE, text= "Org Year Value
A 2011 1
A 2012 1
A 2013 2
A 2016 2
A 2017 2
A 2018 2
A 2019 2
A 2022 5
B 2007 1
B 2008 1
B 2009 1
B 2015 1
B 2016 1
B 2019 3
B 2021 4
B 2022 5")
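# Keep a row if it is the very first row, starts a new Org,
# or its Year does not directly follow the previous row's Year.
df |>
  filter(Org != lag(Org) | Year != lag(Year) + 1 | dplyr::row_number() == 1)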
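A grouped variant may be easier to read. This is a minimal sketch, assuming the data should be treated per Org and sorted by Year first (the `arrange()` step and the name `result` are additions for illustration, not part of the original answer). Inside each group, `lag(Year)` is `NA` on the first row, so that row is always kept and no separate `row_number()` check is needed.

```r
library(dplyr)

# Same example data as in the question, built without read.table()
df <- data.frame(
  Org   = rep(c("A", "B"), each = 8),
  Year  = c(2011, 2012, 2013, 2016, 2017, 2018, 2019, 2022,
            2007, 2008, 2009, 2015, 2016, 2019, 2021, 2022),
  Value = c(1, 1, 2, 2, 2, 2, 2, 5,
            1, 1, 1, 1, 1, 3, 4, 5)
)

result <- df |>
  arrange(Org, Year) |>   # assumption: rows may not arrive sorted
  group_by(Org) |>
  # Keep the first row of each Org (lag is NA there) and every row
  # whose Year is not exactly one more than the previous Year.
  filter(is.na(lag(Year)) | Year != lag(Year) + 1) |>
  ungroup()

print(result)
```

Grouping by Org means the comparison never crosses from one organisation into the next, which is what the `Org != lag(Org)` condition handles in the ungrouped version.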