I have a data frame (df) with a column of dates stored as character strings, and I want to extract the month from each one. For this I use the following:
df %>%
  mutate(
    Date = as.Date(str_remove(Timestamp, "_.*")),
    Month = month(Date, label = FALSE)
  )
However, October, November and December are stored with an extra zero in front of the month number (for example 2021-010-01), and the lubridate library doesn't recognise these values. How can I adjust the code above to fix this? This is my Timestamp column:
c("2021-010-01_00h39m", "2021-010-01_01h53m", "2021-010-01_02h36m",
"2021-010-01_10h32m", "2021-010-01_10h34m", "2021-010-01_14h27m"
)
CodePudding user response:
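First convert the values to Date, then use format to get the month from them. The literal 0 in the format string '%Y-0%m-%d' absorbs the extra zero in front of the month.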
format(as.Date(x, '%Y-0%m-%d'), '%b')
#[1] "Oct" "Oct" "Oct" "Oct" "Oct" "Oct"
%b gives the abbreviated month name; you may also use %B for the full name or %m for the two-digit month number, depending on what you need.
format(as.Date(x, '%Y-0%m-%d'), '%B')
#[1] "October" "October" "October" "October" "October" "October"
format(as.Date(x, '%Y-0%m-%d'), '%m')
#[1] "10" "10" "10" "10" "10" "10"
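To tie this back to the question, here is a minimal sketch of the same format string used inside the original mutate() call. It assumes dplyr and lubridate are installed; the two-row df is a stand-in built from the sample Timestamp values, not the asker's real data.

```r
library(dplyr)
library(lubridate)

# Stand-in data frame using two of the sample timestamps from the question
df <- data.frame(
  Timestamp = c("2021-010-01_00h39m", "2021-010-01_14h27m"),
  stringsAsFactors = FALSE
)

df <- df %>%
  mutate(
    # The literal "0" in the format consumes the extra zero before the month
    Date  = as.Date(Timestamp, format = "%Y-0%m-%d"),
    # Numeric month, as in the question's label = F
    Month = month(Date, label = FALSE)
  )

print(df)
```

With an explicit format the str_remove() step is not needed for these sample values, since the answer above already parses x with the "_00h39m" suffix still attached.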
CodePudding user response:
One way would be to use strsplit to split each value on "-" and extract the second element:
month.abb[readr::parse_number(sapply(strsplit(x, split = '-'), "[[", 2))]
which will return:
#"Oct" "Oct" "Oct" "Oct" "Oct" "Oct"
data:
c("2021-010-01_00h39m", "2021-010-01_01h53m", "2021-010-01_02h36m",
"2021-010-01_10h32m", "2021-010-01_10h34m", "2021-010-01_14h27m"
) -> x
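If the column mixes padded months (such as 010) with ordinary two-digit ones, another option is to normalise the string before parsing. The following is a rough base R sketch, not part of the answers above: the second timestamp is a hypothetical September value added for illustration, and the names ts, clean, dates and months are placeholders.

```r
# First value is from the question; the second is a hypothetical
# timestamp with an ordinary two-digit month, added for illustration.
ts <- c("2021-010-01_00h39m", "2021-09-15_01h53m")

# Drop a zero only when it is followed by two more month digits,
# so "010" becomes "10" while "09" is left untouched.
clean <- sub("^([0-9]{4})-0([0-9]{2})-", "\\1-\\2-", ts)

# Parse the leading date part with an explicit format
dates <- as.Date(clean, format = "%Y-%m-%d")

# Month as an integer
months <- as.integer(format(dates, "%m"))
print(months)
```

Because the pattern requires two digits after the zero, it should leave already well-formed months alone, which the '%Y-0%m-%d' format relies on strptime's lenient digit parsing to handle.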