Home > Net >  Extract information for date from other columns
Extract information for date from other columns

Time:06-14

I working with monthly data. Data is in a specific format in two columns Month and Year. Below you can see a sample of data:

df<-data.frame(
              Month=c("m1","m2","m3","m4","m5","m6","m7","m8","m9","m10","m11","m12"),
       Year=c("2020","2020","2020","2020","2020","2020","2020","2020","2020","2020","2020","2020"))

Now I want to convert this data, from that format into the format shown below or more precisely in column Date

enter image description here

So can anybody help me how to solve this problem?

CodePudding user response:

In base R you can do:

df$Date <- as.Date(paste0(df$Year, gsub("m", "-", df$Month, fixed = TRUE), "-01"))

CodePudding user response:

Here is an option using parse_number and my function -

library(dplyr)
library(readr)
library(lubridate)

df %>%
  mutate(Month = parse_number(Month), 
         Date = my(paste(Month, Year)))

#   Month Year       Date
#1      1 2020 2020-01-01
#2      2 2020 2020-02-01
#3      3 2020 2020-03-01
#4      4 2020 2020-04-01
#5      5 2020 2020-05-01
#6      6 2020 2020-06-01
#7      7 2020 2020-07-01
#8      8 2020 2020-08-01
#9      9 2020 2020-09-01
#10    10 2020 2020-10-01
#11    11 2020 2020-11-01
#12    12 2020 2020-12-01

Base R option -

transform(df, Date = as.Date(paste(1, sub('m', '', Month), Year), '%d %m %Y'))

CodePudding user response:

Try

as.Date(paste0(df$Year, '-', gsub('\\D ', '', df$Month), '-01'))

#[1] "2020-01-01" "2020-02-01" "2020-03-01" "2020-04-01" "2020-05-01" "2020-06-01" "2020-07-01" "2020-08-01" "2020-09-01" "2020-10-01" "2020-11-01" "2020-12-01"
  • Related