Remove extra 0 in front of numeric month

Time:03-01

I have a data frame with a column of dates stored as character strings, and I want to extract the month from each one. For this I use the following:

mutate(
    Date = as.Date(str_remove(Timestamp, "_.*")),
    Month = month(Date, label = FALSE)
  )

However, October, November and December are stored with an extra zero in front of the month (for example 2021-010-01), which lubridate doesn't recognise. How can I adjust the code above to fix this? This is my Timestamp column:

c("2021-010-01_00h39m", "2021-010-01_01h53m", "2021-010-01_02h36m", 
"2021-010-01_10h32m", "2021-010-01_10h34m", "2021-010-01_14h27m"
)

CodePudding user response:

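First convert the values to dates, treating the extra 0 as a literal character in the format string, and then use format to get the month from the result (x here is the Timestamp vector).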

format(as.Date(x, '%Y-0%m-%d'), '%b')
#[1] "Oct" "Oct" "Oct" "Oct" "Oct" "Oct"

%b gives the abbreviated month name; you can also use %B for the full month name or %m for the two-digit month number.

format(as.Date(x, '%Y-0%m-%d'), '%B')
#[1] "October" "October" "October" "October" "October" "October"

format(as.Date(x, '%Y-0%m-%d'), '%m')
#[1] "10" "10" "10" "10" "10" "10"
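
To connect this back to the code in the question, here is a minimal sketch of the same format string used inside mutate. It assumes dplyr and lubridate are loaded and that the data frame is called df (a hypothetical name; the question does not show it). as.Date ignores the text after the day, so str_remove is not needed here.

```r
library(dplyr)
library(lubridate)

# Hypothetical data frame holding two of the Timestamp values from the question
df <- data.frame(
  Timestamp = c("2021-010-01_00h39m", "2021-010-01_14h27m")
)

df <- df %>%
  mutate(
    # The literal 0 in the format string consumes the extra zero before the month
    Date = as.Date(Timestamp, format = "%Y-0%m-%d"),
    # Numeric month, as in the original code
    Month = month(Date, label = FALSE)
  )

df$Month
```

The sample rows above only cover October, so it is worth checking a few rows from other months in the real data before relying on this format string.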

CodePudding user response:

One way would be to use strsplit to split each value on "-" and extract the second element, which is the month:

month.abb[readr::parse_number(sapply(strsplit(x, split = '-'), "[[", 2))]

which will return:

#"Oct" "Oct" "Oct" "Oct" "Oct" "Oct"

data:

c("2021-010-01_00h39m", "2021-010-01_01h53m", "2021-010-01_02h36m", 
  "2021-010-01_10h32m", "2021-010-01_10h34m", "2021-010-01_14h27m"
) -> x
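
The one-liner above can be unpacked step by step. This is a sketch in base R only, swapping readr::parse_number for as.integer (a substitution, not what the answer uses); the intermediate variable names are made up for illustration.

```r
# Timestamp values from the question
x <- c("2021-010-01_00h39m", "2021-010-01_01h53m", "2021-010-01_02h36m")

# Split each string on "-": every element becomes c("2021", "010", "01_00h39m")
parts <- strsplit(x, split = "-", fixed = TRUE)

# Take the second piece of each split, i.e. the month field with its extra zero
month_field <- sapply(parts, "[[", 2)

# Converting to integer drops the leading zero: "010" becomes 10
month_num <- as.integer(month_field)

# Use the number to index the built-in vector of abbreviated month names
month.abb[month_num]
```

Keeping month_num around is handy if you want the numeric month rather than its name.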