Home > Software design >  R: How can I convert my dataframe using two columns to create a consecutive row
R: How can I convert my dataframe using two columns to create a consecutive row

Time:10-03

I would like to transform my data frame so that two columns (MinYear, MaxYear) merge together into a new column (Year) that consists of consecutive sequences limited by MinYear and MaxYear.

My data frame looks like this:

Value MinYear MaxYear
5,2 1997 2000
3,3 1996 2000

I want to get my data frame like this:

Value Year
5,2 2000
5,2 1999
5,2 1998
5,2 1997
3,3 2000
3,3 1999
3,3 1998
3,3 1997
3,3 1996
3,3 1995

Do you have any glues on how to solve that issue?

Thanks for your help!

CodePudding user response:

You can group by Value and use dplyr::summarize:

library(dplyr)

df %>% 
  group_by(Value) %>% 
  summarize(Year = MaxYear:MinYear, .groups="drop")

Output:

  Value  Year
  <chr> <int>
1 3,3    2000
2 3,3    1999
3 3,3    1998
4 3,3    1997
5 3,3    1996
6 5,2    2000
7 5,2    1999
8 5,2    1998
9 5,2    1997

If Value is not unique, you can use rowwise() to ensure that this is executed separately for each row of the original frame:

df %>% 
  rowwise() %>% 
  summarize(Value = Value, Year = MaxYear:MinYear)

Input:

df = structure(list(Value = c("5,2", "3,3"), MinYear = 1997:1996, 
    MaxYear = c(2000L, 2000L)), row.names = c(NA, -2L), class = "data.frame")

CodePudding user response:

Another approach is to use map2 from purrr. For each row, it will create a sequence of Years based on MaxYear and MinYear.

library(tidyverse)

df %>%
  transmute(Value, Year = map2(MaxYear, MinYear, seq)) %>%
  unnest(cols = Year)

Output

  Value  Year
  <chr> <int>
1 5,2    2000
2 5,2    1999
3 5,2    1998
4 5,2    1997
5 3,3    2000
6 3,3    1999
7 3,3    1998
8 3,3    1997
9 3,3    1996
  • Related