Home > Software engineering >  How to get date from a condition on certain column
How to get date from a condition on certain column

Time:09-27

I am working of data that has wave number and they represent the data for a certain month of a certain year. The data is bi-monthly so an increment in wave column means the increase of 2 months The column of wave looks something like this: |wave| |----| |1 | |1 | |1 | |2 | |2 | |3 | |3 | |3 | |4 | |5 | |5 | |5 | |7 | |8 |

Now from this I want to create a date column that corresponds to the wave number. I have values of wave ranging from 1 to 64. I want to add the column of date in my data frame as follows

wave date
1 jan2012
1 jan2012
1 jan2012
2 mar2012
2 mar2012
3 may2012
3 may2012
3 may2012
4 jul2012
5 sep2012
5 sep2012
5 sep2012
6 nov2012
7 jan2013

CodePudding user response:

I used dplyr and zoo which has a year-month date format:

library(dplyr)
library(zoo)

df <- data.frame(wave = c(1,1,1,2,2,3,3,3,4,5,5,5,6,7))

opener <- as.yearmon("2021-01")

df %>% mutate(date = opener   ((wave-1)*2/12))

Which gives us:

   wave     date
1     1 Jan 2021
2     1 Jan 2021
3     1 Jan 2021
4     2 Mar 2021
5     2 Mar 2021
6     3 May 2021
7     3 May 2021
8     3 May 2021
9     4 Jul 2021
10    5 Sep 2021
11    5 Sep 2021
12    5 Sep 2021
13    6 Nov 2021
14    7 Jan 2022

Explanation: Each wave after 1 adds 2/12 (i.e. two months) to the date.

Edit: I put the wrong opening year/month, sorry, but the principle remains the same! Make sure to correct the "opener" if you try this.

CodePudding user response:

For a solution that does not actually use Date objects at all, you can use integer division to turn your wave into month and year, then paste these together:

df$date = paste(
    month.name[((df$wave - 1) %% 6) * 2   1],   # the modulo picks the month name
    2012   (df$wave - 1) %/% 6                  # the integer quotient picks the year
)

> df
   wave           date
1     1   January 2012
2     1   January 2012
3     1   January 2012
4     2     March 2012
5     2     March 2012
6     3       May 2012
7     3       May 2012
8     3       May 2012
9     4      July 2012
10    5 September 2012
11    5 September 2012
12    5 September 2012
13    6  November 2012
14    7   January 2013

In this case, I am using the built-in month.name array for convenience, but to replicate exactly your example, you could set up your own array

months <- c("jan", "feb", "mar", "apr", "may", "jun", "jul", "aug", "sep", "oct", "nov", "dec")

df$date = paste0(                              # paste0 omits the space by default
    months[((df$wave - 1) %% 6) * 2   1],   
    2012   (df$wave - 1) %/% 6                  
)

> df
   wave    date
1     1 jan2012
2     1 jan2012
3     1 jan2012
4     2 mar2012
5     2 mar2012
6     3 may2012
7     3 may2012
8     3 may2012
9     4 jul2012
10    5 sep2012
11    5 sep2012
12    5 sep2012
13    6 nov2012
14    7 jan2013
  • Related