Numeric year but month as character. How to change months into numeric?

Time:10-15

I have a dataset where the "date" column holds a two-digit year (covering 2011-2017) and a month, but the month is written as an abbreviation rather than a number. For example:

date: 11-Jan

I would like to make the months numeric so I get:

date: 11-01

Any suggestions on how I can tackle this problem?

Kind regards!

CodePudding user response:

Make your input proper dates, parse them, then format them.

x <- c("11-Jan", "12-Feb")
Sys.setlocale("LC_TIME", "C") #parsing of months depends on locale
format(
  as.Date(paste0(x, "-1"), format = "%y-%b-%d"),
  "%y-%m"
)
#[1] "11-01" "12-02"

See help("strptime") for details on format strings.
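If changing the locale is undesirable, a locale-independent variant is possible with R's built-in `month.abb` constant, which always holds the English abbreviations. The following is only a sketch: the input vector `x` and the helper names `parts`, `yr`, `mon`, and `result` are made up for illustration, and it assumes every value has the shape "yy-Mon".

```r
# Hypothetical input in the "yy-Mon" shape from the question
x <- c("11-Jan", "12-Feb", "17-Dec")

# Split each value into its year part and its month-abbreviation part
parts <- strsplit(x, "-", fixed = TRUE)
yr  <- vapply(parts, `[`, character(1), 1)
mon <- vapply(parts, `[`, character(1), 2)

# match() gives the position of each abbreviation in month.abb (1 to 12);
# sprintf("%02d") zero-pads that position to two digits
result <- paste(yr, sprintf("%02d", match(mon, month.abb)), sep = "-")
result
#[1] "11-01" "12-02" "17-12"
```

An abbreviation that is not in `month.abb` yields `NA` from `match()`, so misspelled months show up as "NA" in the result instead of being silently converted.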

CodePudding user response:

Assuming your data looks like this:

df1 <- structure(list(day_mon = c("16-Dec", "18-Nov", "12-Oct", "8-Oct", 
"15-May", "29-Jun", "22-Feb", "25-May", "23-Jan", "24-Oct", "23-May", 
"27-Sep", "9-Apr", "28-Oct", "18-Jan", "8-Apr", "7-Jan", "13-Dec", 
"28-Nov", "24-May"), year = c(2012L, 2014L, 2011L, 2015L, 2015L, 
2015L, 2011L, 2015L, 2012L, 2015L, 2011L, 2012L, 2014L, 2012L, 
2013L, 2011L, 2017L, 2016L, 2014L, 2014L)), 
row.names = c(
   1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 
   13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L), class = "data.frame")

You can:

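# Format the month and day as zero-padded "mm-dd": mon_day_fmt => character vector
df1$mon_day_fmt <- paste(
   # Month: extract the abbreviation, look up its position in month.abb
   sprintf(
      "%02d",
      match(
         gsub(
            "\\d+\\-(\\w+)",
            "\\1",
            with(
               df1,
               day_mon
            )
         ),
         month.abb
      )
   ),
   # Day: extract the leading digits and zero-pad them
   sprintf(
      "%02d",
      as.integer(
         gsub(
            "^(\\d+)\\-\\w+$",
            "\\1",
            with(
               df1,
               day_mon
            )
         )
      )
   ),
   sep = "-"
)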

# Create a date vector: date => Date Vector
df1$date <- as.Date(
   paste(
      df1$year,
      df1$mon_day_fmt,
      sep = "-"
   )
)
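Once a Date column exists, `format()` can produce the numeric year-month string the question asks for. Below is a minimal sketch on a small made-up data frame `df` in the same shape as `df1`. The column name `yy_mm` is an assumption, and the parsing step uses `%b`, so it sets the `LC_TIME` locale to "C" so that English abbreviations are recognised.

```r
# Small hypothetical data frame in the same shape as df1
df <- data.frame(
  day_mon = c("16-Dec", "8-Oct", "7-Jan"),
  year    = c(2012L, 2015L, 2017L)
)

# %b matches abbreviated month names in the current LC_TIME locale,
# so use the "C" locale to get English names
invisible(Sys.setlocale("LC_TIME", "C"))

# Build "yyyy-dd-Mon" strings and parse them with an explicit format
df$date <- as.Date(paste(df$year, df$day_mon, sep = "-"), format = "%Y-%d-%b")

# Two-digit year and numeric month, as a character vector
df$yy_mm <- format(df$date, "%y-%m")
df$yy_mm
#[1] "12-12" "15-10" "17-01"
```

Keeping `date` as a real Date and deriving the display string from it leaves the column sortable and usable in date arithmetic, while `yy_mm` is only for presentation.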