I am trying to create a new column (Yrgroup) that puts individual years into two-year groups, like so:
Yrs TS Yrgroup
2011 2 11/12
2011 2 11/12
2012 4 11/12
2012 8 11/12
2013 2 13/14
2013 1 13/14
2014 3 13/14
2014 7 13/14
Yr = c(2011,2011,2012,2012,2013,2013,2014,2014)
Yr
Tranship = c(2,5,8,2,2,2,7,8)
df = data.frame(Yr, Tranship)
df
df$Yrgroup = NA
#library(dplyr)
df %>%
  group_by(Yr + 1)
This is what I have tried so far, but I cannot fill in the year group column.
CodePudding user response:
You can do this as follows:
f <- function(y) if_else(y %% 2 == 0, paste0(y - 1, "/", y), paste0(y, "/", y + 1))
mutate(df, Yrsgroup = f(Yrs %% 100))
Output:
Yrs TS Yrsgroup
1: 2011 2 11/12
2: 2011 2 11/12
3: 2012 4 11/12
4: 2012 8 11/12
5: 2013 2 13/14
6: 2013 1 13/14
7: 2014 3 13/14
8: 2014 7 13/14
Note that my use of Yrs %% 100 is not as generalizable as the alternative below, which produces the same output but works for a wider set of years:
mutate(df, Yrsgroup = f(as.numeric(substr(Yrs,3,4))))
Finally, this version of f() handles more cases (for example, it correctly handles the year 2000; I've changed the input data below to show this) and makes the call simpler:
f <- function(y) {
  substr(if_else(y %% 2 == 0, paste0(y - 1, "/", substr(y, 3, 4)), paste0(y, "/", substr(y + 1, 3, 4))), 3, 7)
}
mutate(df, Yrsgroup = f(Yrs))
Output:
Yrs TS Yrsgroup
1: 2000 2 99/00
2: 2011 2 11/12
3: 2012 4 11/12
4: 2012 8 11/12
5: 2013 2 13/14
6: 2013 1 13/14
7: 2014 3 13/14
8: 2014 7 13/14
CodePudding user response:
It looks like the groups always have the format [odd year]/[even year]. You can test whether a year is odd with modulo 2 (Yr %% 2) and build the Yrgroup from that.
Yr = c(2011,2011,2012,2012,2013,2013,2014,2014)
Tranship = c(2,5,8,2,2,2,7,8)
df = data.frame(Yr, Tranship)
df$Yrgroup <- ifelse(df$Yr %% 2 == 1,
                     yes = paste(substr(df$Yr, 3, 4),
                                 as.numeric(substr(df$Yr, 3, 4)) + 1,
                                 sep = "/"),
                     no = paste(as.numeric(substr(df$Yr, 3, 4)) - 1,
                                substr(df$Yr, 3, 4),
                                sep = "/"))
df
#> Yr Tranship Yrgroup
#> 1 2011 2 11/12
#> 2 2011 5 11/12
#> 3 2012 8 11/12
#> 4 2012 2 11/12
#> 5 2013 2 13/14
#> 6 2013 2 13/14
#> 7 2014 7 13/14
#> 8 2014 8 13/14
EDIT: However, this will not work for the year 2000, because 00 - 1 = -1.
To handle this, you might want to use actual dates. lubridate is a package that is useful for handling dates.
Yr = c(1999, 2000, 2011,2011,2012,2012,2013,2013,2014,2014)
Tranship = c(8,5,2,5,8,2,2,2,7,8)
df = data.frame(Yr, Tranship)
library(lubridate)
df$Yrgroup <- ifelse(df$Yr %% 100 %% 2 == 1,
                     paste(substr(df$Yr, 3, 4),
                           format(ymd(df$Yr * 10000 + 101) + years(1), "%y"),
                           sep = "/"),
                     paste(format(ymd(df$Yr * 10000 + 101) - years(1), "%y"),
                           substr(df$Yr, 3, 4),
                           sep = "/"))
df
#> Yr Tranship Yrgroup
#> 1 1999 8 99/00
#> 2 2000 5 99/00
#> 3 2011 2 11/12
#> 4 2011 5 11/12
#> 5 2012 8 11/12
#> 6 2012 2 11/12
#> 7 2013 2 13/14
#> 8 2013 2 13/14
#> 9 2014 7 13/14
#> 10 2014 8 13/14
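If you would rather avoid both string slicing and date objects, the century wrap-around can probably also be handled with integer arithmetic alone. The following is only a sketch in base R: year_group() is a hypothetical helper name (it does not appear in either answer above), and it assumes the groups always start on an odd year, as in the question.

```r
# Hypothetical helper (not from the answers above): label each year with its
# odd/even two-year group using integer arithmetic only.
year_group <- function(yr) {
  yr <- as.integer(yr)
  # The odd year that opens the pair: an odd year stays, an even year steps back one.
  start <- ifelse(yr %% 2 == 1, yr, yr - 1L)
  # Keep the last two digits of each year, zero-padded so that 2000 prints as "00".
  sprintf("%02d/%02d", start %% 100L, (start + 1L) %% 100L)
}

df <- data.frame(Yr = c(1999, 2000, 2011, 2012, 2013, 2014),
                 Tranship = c(8, 5, 2, 8, 2, 7))
df$Yrgroup <- year_group(df$Yr)
df
```

Because the subtraction happens on the full four-digit year before the last two digits are taken, 1999 and 2000 both end up in "99/00" rather than producing a negative number.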