Extract year from date with weird date format

Time:07-25

I have a date format as follows: yyyymmdd. So 10 March 2022 is formatted as 20220310, with no separator between the day, month and year. Now I want to replace the column with all those dates with a column that only contains the year. Normally I would use the following code:

df <- df %>%
  mutate(across(contains("Date"), ~(parse_date_time(., orders = c('ymd')))))

And then separate the column into three different columns with year, month and day, and then delete the month and day columns. But somehow the code above doesn't work. I hope someone can help me out.

CodePudding user response:

Not as fancy, but you could simply get the year from a substring of the whole date:

df$Year <- as.numeric(substr(as.character(df$Date),1,4))

CodePudding user response:

You can try this:

df$column_with_date <- as.integer(x = substr(x = df$column_with_date, start = 1, stop = 4))

The as.integer call is optional; without it the year is stored as a character string. Converting it to an integer uses less memory.
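The question mentions several columns whose names contain "Date", so here is a minimal base R sketch of the same substring idea applied to all of them at once. The data frame and its column names (id, StartDate, EndDate) are made up for illustration, and it assumes every value is an eight-digit yyyymmdd number or string.

```r
# Hypothetical data: two date columns stored as yyyymmdd numbers
df <- data.frame(
  id        = 1:3,
  StartDate = c(20220310, 20211231, 20200101),
  EndDate   = c(20230115, 20220630, 20210228)
)

# Names of every column that contains "Date"
date_cols <- grep("Date", names(df), value = TRUE)

# Overwrite each of those columns with the first four characters (the year)
df[date_cols] <- lapply(df[date_cols], function(x) {
  as.integer(substr(as.character(x), 1, 4))
})

print(df)
```

Columns that do not match the pattern, such as id here, are left untouched.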

CodePudding user response:

Your code works if the data is in the format below. You can then use mutate_at with a list of year, month, and day to create the three columns like this:

df <- data.frame(Date = c("20220310"))

library(lubridate)
library(dplyr)
df %>%
  mutate(across(contains("Date"), ~(parse_date_time(., orders = c('ymd'))))) %>%
  mutate_at(vars(Date), list(year = year, month = month, day = day))
#>         Date year month day
#> 1 2022-03-10 2022     3  10

Created on 2022-07-25 by the reprex package (v2.0.1)
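If only the year is needed and loading extra packages is not desired, the parse-then-extract approach can also be sketched in base R. This is an alternative to the lubridate code above, not a replacement for it. The vector name dates and its values are made up for illustration, and the last value is deliberately invalid to show that parsing flags it, which the plain substring approach would not.

```r
# Hypothetical yyyymmdd values; the last one has month 13 and is not a real date
dates <- c(20220310, 20211231, 20221340)

# Parse with an explicit format, since there are no separators to guess from
parsed <- as.Date(as.character(dates), format = "%Y%m%d")

# Pull out the year; values that failed to parse stay NA
years <- as.integer(format(parsed, "%Y"))

print(years)
```

Parsing first costs a little more than taking a substring, but it surfaces malformed values as NA instead of silently returning their first four digits.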

Tags: r