Find all rows with January as the month and update the year column so that it is accounted for in th

Time:12-30

I have daily time series data. I want to identify all rows in the data that correspond to the month of January. For these rows, I want to update the year column so that it is shifted back by one year. This will allow the January rows to be accounted for in the previous year's season rather than the current year.

Here is reproducible code that generates data resembling mine:

library(dplyr)
library(tibble)
library(lubridate)  # provides year(), month() and day()

# Set the seed for reproducibility
set.seed(123)

# Create a sequence of dates from 2001 to 2005
dates <- seq(as.Date("2001-01-01"), as.Date("2005-12-31"), by = "day")
# Create a tibble with the dates and random numbers for var1 to var4
df <- tibble(year = year(dates), month = month(dates), day = day(dates),
             var1 = runif(length(dates)), var2 = runif(length(dates)),
             var3 = runif(length(dates)), var4 = runif(length(dates)))

df

Any thoughts please?

CodePudding user response:

With dplyr, you can use mutate() with case_when(). I added a new variable (countyear) to demonstrate; assign the result to year instead if you really want to overwrite it.

library(dplyr)
library(tibble)
library(lubridate)

# Set the seed for reproducibility
set.seed(123)

# Create a sequence of dates from 2001 to 2005
dates <- seq(as.Date("2001-01-01"), as.Date("2005-12-31"), by = "day")
# Create a tibble with the dates and random numbers for var1 to var4
df <- tibble(year = year(dates), month = month(dates), day = day(dates),
             var1 = runif(length(dates)), var2 = runif(length(dates)),
             var3 = runif(length(dates)), var4 = runif(length(dates)))

# Add a new grouping variable: January rows count toward the previous year
df <- df %>% mutate(countyear = case_when(month == 1 ~ year - 1, TRUE ~ year))
> head(df)
# A tibble: 6 x 8
   year month   day   var1  var2  var3  var4 countyear
  <dbl> <dbl> <int>  <dbl> <dbl> <dbl> <dbl>     <dbl>
1  2001     1     1 0.576  0.455 0.517 0.857      2000
2  2001     1     2 0.741  0.934 0.381 0.593      2000
3  2001     1     3 0.0914 0.264 0.717 0.907      2000
4  2001     1     4 0.541  0.818 0.981 0.910      2000
5  2001     1     5 0.603  0.118 0.768 0.586      2000
6  2001     1     6 0.222  0.888 0.614 0.716      2000
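
Once the shifted year exists, it can serve as a grouping key for per-season summaries. The following is only a minimal sketch, assuming the same column layout as in the question; the names season_year, n_days and mean_var1 are illustrative and do not appear in the original post. It uses if_else() rather than case_when(), which is enough for a two-way condition.

```r
library(dplyr)
library(tibble)
library(lubridate)

set.seed(123)

# Same date range as in the question, with a single value column for brevity
dates <- seq(as.Date("2001-01-01"), as.Date("2005-12-31"), by = "day")
df <- tibble(year = year(dates), month = month(dates), day = day(dates),
             var1 = runif(length(dates)))

# season_year (hypothetical name): January rows are assigned to the previous year
season_means <- df %>%
  mutate(season_year = if_else(month == 1, year - 1, year)) %>%
  group_by(season_year) %>%
  summarise(n_days = n(), mean_var1 = mean(var1), .groups = "drop")

season_means
```

With this data, the first group (2000) contains only the 31 days of January 2001, so the first and last seasons are partial and may need to be dropped before comparing seasons.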

CodePudding user response:

You can store the year as an integer, select the January rows (month "01"), and subtract one year from them:

library(dplyr)
library(tibble)

# Set the seed for reproducibility
set.seed(123)

# Create a sequence of dates from 2001 to 2005
dates <- seq(as.Date("2001-01-01"), as.Date("2005-12-31"), by = "day")
# Create a tibble with the dates and random numbers for var1 to var4
df <- tibble(year = as.integer(format(dates, format="%Y")), month = format(dates, format="%m"), day = format(dates, format="%d"),
             var1 = runif(length(dates)), var2 = runif(length(dates)),
             var3 = runif(length(dates)), var4 = runif(length(dates)))

df$year[df$month == "01"] <- df$year[df$month == "01"] - 1
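
A possible variant of the same base R idea, shown here as a sketch rather than a definitive version: it keeps month as an integer instead of a character string, computes the logical mask once, and checks the result. The names is_jan and original_year are introduced here for illustration only.

```r
# Same date range as in the question; value columns are omitted for brevity
dates <- seq(as.Date("2001-01-01"), as.Date("2005-12-31"), by = "day")
df <- data.frame(year  = as.integer(format(dates, "%Y")),
                 month = as.integer(format(dates, "%m")),
                 day   = as.integer(format(dates, "%d")))

is_jan <- df$month == 1L        # logical mask for January rows, computed once
original_year <- df$year        # copy kept only for the checks below

# Shift January rows back by one year; 1L keeps the column an integer
df$year[is_jan] <- df$year[is_jan] - 1L

# Sanity checks: January rows moved back one year, all other rows unchanged
stopifnot(all(df$year[is_jan] == original_year[is_jan] - 1L),
          all(df$year[!is_jan] == original_year[!is_jan]))
```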