I have a dataset in which the column "date" holds a two-digit year (covering 2011-2017) and a month, but the months are written as abbreviated names rather than numbers. For example:
date: 11-Jan
I would like to make the months numeric so I get:
date: 11-01
Any suggestions on how I can tackle this problem?
Kind regards!
CodePudding user response:
Turn your input into complete dates by appending a day, parse them, then format them.
x <- c("11-Jan", "12-Feb")
Sys.setlocale("LC_TIME", "C") #parsing of months depends on locale
format(
as.Date(paste0(x, "-1"), format = "%y-%b-%d"),
"%y-%m"
)
#[1] "11-01" "12-02"
See help("strptime") for details on format strings.
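If changing the session locale is undesirable, a locale-independent variant is possible with the built-in constant month.abb, which always holds the English abbreviations. The following is only a minimal sketch: the data frame df and the new column date_num are hypothetical names, and it assumes every value has exactly the form "yy-Mon".

```r
# Hypothetical data frame; the column name `date` follows the question.
df <- data.frame(date = c("11-Jan", "12-Feb", "17-Dec"),
                 stringsAsFactors = FALSE)

# Split each "yy-Mon" string into its year part and its month part.
parts <- strsplit(df$date, "-", fixed = TRUE)
yy    <- vapply(parts, `[`, character(1), 1)
mon   <- vapply(parts, `[`, character(1), 2)

# match() returns the position of each abbreviation in month.abb (1 to 12);
# "%02d" pads that position with a leading zero.
df$date_num <- sprintf("%s-%02d", yy, match(mon, month.abb))
df$date_num
#[1] "11-01" "12-02" "17-12"
```

A month name that is not in month.abb makes match() return NA, so misspelled input shows up as "NA" in the result instead of being silently converted.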
CodePudding user response:
Assuming your data is like:
df1 <- structure(list(day_mon = c("16-Dec", "18-Nov", "12-Oct", "8-Oct",
"15-May", "29-Jun", "22-Feb", "25-May", "23-Jan", "24-Oct", "23-May",
"27-Sep", "9-Apr", "28-Oct", "18-Jan", "8-Apr", "7-Jan", "13-Dec",
"28-Nov", "24-May"), year = c(2012L, 2014L, 2011L, 2015L, 2015L,
2015L, 2011L, 2015L, 2012L, 2015L, 2011L, 2012L, 2014L, 2012L,
2013L, 2011L, 2017L, 2016L, 2014L, 2014L)),
row.names = c(
1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L,
13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L), class = "data.frame")
You can:
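# Format the month and day as zero-padded "mm-dd": mon_day_fmt => character vector
df1$mon_day_fmt <- paste(
  # Month: extract the abbreviation and look up its position in month.abb
  sprintf(
    "%02d",
    match(
      gsub(
        "\\d+\\-(\\w+)",
        "\\1",
        with(
          df1,
          day_mon
        )
      ),
      month.abb
    )
  ),
  # Day: extract the leading digits and zero-pad them
  sprintf(
    "%02d",
    as.integer(
      gsub(
        "^(\\d+)\\-\\w+$",
        "\\1",
        with(
          df1,
          day_mon
        )
      )
    )
  ),
  sep = "-"
)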
# Create a date vector: date => Date Vector
df1$date <- as.Date(
paste(
df1$year,
df1$mon_day_fmt,
sep = "-"
)
)
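Once a column is of class Date, the numeric year-month string from the question can be produced with format(). This is a small sketch on a hypothetical vector named dates rather than on df1 itself:

```r
# Hypothetical Date vector standing in for a column such as df1$date.
dates <- as.Date(c("2012-12-16", "2014-11-18", "2015-10-08"))

# "%y" is the two-digit year, "%m" the zero-padded month number.
yy_mm <- format(dates, "%y-%m")
yy_mm
#[1] "12-12" "14-11" "15-10"

# "%Y" gives the four-digit year instead.
yyyy_mm <- format(dates, "%Y-%m")
yyyy_mm
#[1] "2012-12" "2014-11" "2015-10"
```

Both format strings produce numbers only, so the output does not depend on the locale.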