Home > Software engineering >  R - Create Row for Each Year of a Record based on column subtraction
R - Create Row for Each Year of a Record based on column subtraction

Time:12-29

I have a dataframe in R of volunteerships. I want a row for each fiscal year of a volunteership. So my data comes in like this:

ID  Volunteership   FYStart FYEnd
1   Fabulousness    2019    2021
2   Graciousness    2021    2021
3   Loveliness  1995    1999

I need a new row for each fiscal year involved, and a need a “VolFY” column that labels each. I need to end up with:

ID  Volunteership   FYStart FYEnd   VolFY
1   Fabulousness    2019    2021    2019
1   Fabulousness    2019    2021    2020
1   Fabulousness    2019    2021    2021
2   Graciousness    2021    2021    2021
3   Loveliness  1995    1999    1995
3   Loveliness  1995    1999    1996
3   Loveliness  1995    1999    1997
3   Loveliness  1995    1999    1998
3   Loveliness  1995    1999    1999

So far, I’ve played with seq() and with rep(). I’ve also played with a for loop and a while loop, but haven’t quite gotten the exact trick.

If you’ve done this sort of thing, please share how you did it. It’s a conundrum.

--Marianne

CodePudding user response:

Here's a way to do it in tidyverse -

library(tidyverse)

df %>%
  mutate(VolFY = map2(FYStart, FYEnd, seq)) %>%
  unnest(VolFY)

#     ID Volunteership FYStart FYEnd VolFY
#  <int> <chr>           <int> <int> <int>
#1     1 Fabulousness     2019  2021  2019
#2     1 Fabulousness     2019  2021  2020
#3     1 Fabulousness     2019  2021  2021
#4     2 Graciousness     2021  2021  2021
#5     3 Loveliness       1995  1999  1995
#6     3 Loveliness       1995  1999  1996
#7     3 Loveliness       1995  1999  1997
#8     3 Loveliness       1995  1999  1998
#9     3 Loveliness       1995  1999  1999

CodePudding user response:

Try this:

newdat <- do.call(rbind, Map(function(ID, S, E) data.frame(ID=ID, VolFY=seq(S, E)), dat$ID, dat$FYStart, dat$FYEnd))
merge(dat, newdat, by = "ID")
#   ID Volunteership FYStart FYEnd VolFY
# 1  1  Fabulousness    2019  2021  2019
# 2  1  Fabulousness    2019  2021  2020
# 3  1  Fabulousness    2019  2021  2021
# 4  2  Graciousness    2021  2021  2021
# 5  3    Loveliness    1995  1999  1995
# 6  3    Loveliness    1995  1999  1996
# 7  3    Loveliness    1995  1999  1997
# 8  3    Loveliness    1995  1999  1998
# 9  3    Loveliness    1995  1999  1999

This works because we first iterate over each row and generate a mini frame of just the new years (before mergeing back in to the original data):

Map(function(ID, S, E) data.frame(ID=ID, VolFY=seq(S, E)), dat$ID, dat$FYStart, dat$FYEnd)
# [[1]]
#   ID VolFY
# 1  1  2019
# 2  1  2020
# 3  1  2021
# [[2]]
#   ID VolFY
# 1  2  2021
# [[3]]
#   ID VolFY
# 1  3  1995
# 2  3  1996
# 3  3  1997
# 4  3  1998
# 5  3  1999

Data

dat <- structure(list(ID = 1:3, Volunteership = c("Fabulousness", "Graciousness", "Loveliness"), FYStart = c(2019L, 2021L, 1995L), FYEnd = c(2021L, 2021L, 1999L)), class = "data.frame", row.names = c(NA, -3L))
  • Related