I was trying to use extract() from tidyr to split a date column into day, month, and year:
library(dplyr)
library(tidyr)
library(stringr)
# Dataframe has "Date" column and date in the format "dd/mm/yyyy" or "dd/m/yyyy"
df <- data.frame(Date = c("10/1/2001", "15/01/2010", "15/2/2010", "20/02/2010", "25/3/2010", "31/03/2010"))
# extract into three columns
df %>% extract(Date, c("Day", "Month", "Year"), "([^/]+), ([^/]+), ([^)]+)")
But the above code returns:
Day Month Year
1 <NA> <NA> <NA>
2 <NA> <NA> <NA>
3 <NA> <NA> <NA>
4 <NA> <NA> <NA>
5 <NA> <NA> <NA>
6 <NA> <NA> <NA>
How can I correctly extract the dates to get the expected result:
Day Month Year
1 10 1 2001
2 15 1 2010
3 15 2 2010
4 20 2 2010
5 25 3 2010
6 31 3 2010
CodePudding user response:
It might be easier to use separate() in this case:
df %>%
  separate("Date", into = c("Day", "Month", "Year"), sep = "/") %>%
  mutate(Month = str_replace(Month, "^0", ""))
That will keep everything as character values. If you want the values to be numeric, use
df %>%
  separate("Date", into = c("Day", "Month", "Year"), sep = "/", convert = TRUE)
CodePudding user response:
Your regex pattern is off: it expects a comma and a space between the groups, but the dates are separated by slashes. Use this version:
df %>% extract(Date, c("Day", "Month", "Year"), "(\\d+)/(\\d+)/(\\d+)")
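To see why every row came back as NA, it can help to test both kinds of pattern against the strings directly. The following is a minimal sketch on a shortened copy of the question's data; `old_pattern`, `new_pattern`, and `result` are names chosen here for illustration, and `convert = TRUE` is added as one possible way to turn "01" into 1 so the months match the expected output.

```r
library(tidyr)
library(stringr)

df <- data.frame(Date = c("10/1/2001", "15/01/2010", "15/2/2010"))

# A pattern that expects ", " between the groups, like the one in the question
old_pattern <- "([^/]+), ([^/]+), ([^)]+)"
# A pattern that expects "/" between groups of digits
new_pattern <- "(\\d+)/(\\d+)/(\\d+)"

# str_detect() shows whether each pattern matches the strings at all
str_detect(df$Date, old_pattern)
str_detect(df$Date, new_pattern)

# convert = TRUE runs type conversion on the new columns,
# so the character "01" becomes the integer 1
result <- extract(df, Date, c("Day", "Month", "Year"), new_pattern, convert = TRUE)
result
```

When the pattern fails to match a string, extract() fills that row with NA, which is the behaviour shown in the question.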
CodePudding user response:
We could use lubridate:
library(lubridate)
library(dplyr)
df %>%
  mutate(Date = dmy(Date), # parse the character column as dd/mm/yyyy dates
         # a named list of functions gives Date_year, Date_month, Date_day
         across(Date, list(year = year, month = month, day = day)))
Date Date_year Date_month Date_day
1 2001-01-10 2001 1 10
2 2010-01-15 2010 1 15
3 2010-02-15 2010 2 15
4 2010-02-20 2010 2 20
5 2010-03-25 2010 3 25
6 2010-03-31 2010 3 31
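If the goal is exactly the Day, Month, and Year columns from the question, the lubridate accessors can also be called one at a time. This is a minimal sketch on a shortened copy of the question's data, not the only way to do it; `result` is a name chosen here for illustration.

```r
library(dplyr)
library(lubridate)

df <- data.frame(Date = c("10/1/2001", "15/01/2010", "31/03/2010"))

result <- df %>%
  mutate(Date = dmy(Date),       # parse "dd/mm/yyyy" strings into Date objects
         Day = day(Date),        # day of the month
         Month = month(Date),    # month as a number, without a leading zero
         Year = year(Date)) %>%
  select(Day, Month, Year)       # keep only the three new columns
result
```

Because the values come from parsed dates rather than from the text, leading zeros in the input do not matter here.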