Home > OS >  Convert dates into numbers
Convert dates into numbers

Time:10-23

I need your support while working with dates. While importing an .xls file, the column of dates was correctly converted into numbers by R. Unfortunately some dates are still there in the format: dd/mm/yyyy or d/mm/yyyy or dd/mm/yy. Probably this results from different settings of different os. I don't know. Is there a way to manage this?

Thank you in advance

 my_data <- read_excel("my_file.xls")
born_date     
 18520            
 30859                     
 16/04/1972       
 26612            
 30291            
 24435 
11/02/1964       
26/09/1971       
 18427            
 23688    

Original_dates
14/9/1950
26/6/1984
16/04/1972
9/11/1972
6/12/1982
24/11/1966
11/02/1964
26/09/1971
13/6/1950

CodePudding user response:

Here is one way how we could solve it: First we define the numeric values only by exlcuden those containing the string /. Then we use excel_numeric_to_date function from janitor package. Finally with coalesce we combine both:

library(dplyr)
library(janitor)
library(lubridate)
df %>%
  mutate(x = ifelse(str_detect(born_date, '\\/'), NA_real_, born_date),
         x = excel_numeric_to_date(as.numeric(as.character(x)), date_system = "modern"),
         born_date = dmy(born_date)) %>% 
  mutate(born_date = coalesce(born_date, x), .keep="unused")
    born_date
1  1950-09-14
2  1984-06-26
3  1972-04-16
4  1972-11-09
5  1982-12-06
6  1966-11-24
7  1964-02-11
8  1971-09-26
9  1950-06-13
10 1964-11-07

data:

df <- structure(list(born_date = c("18520", "30859", "16/04/1972", 
"26612", "30291", "24435", "11/02/1964", "26/09/1971", "18427", 
"23688")), class = "data.frame", row.names = c(NA, -10L))

CodePudding user response:

This translates the two types of dates. Each returns an NA for those elements not of that type. Then we use coalesce to combine them. This only needs dplyr and no warnings are produced.

library(dplyr)

my_data %>%
 mutate(born_date = coalesce(
  as.Date(born_date, "%d/%m/%Y"), 
  as.Date(as.numeric(ifelse(grepl("/",born_date),NA,born_date)), o="1899-12-30"))
  )
##     born_date
## 1  1950-09-14
## 2  1984-06-26
## 3  1972-04-16
## 4  1972-11-09
## 5  1982-12-06
## 6  1966-11-24
## 7  1964-02-11
## 8  1971-09-26
## 9  1950-06-13
## 10 1964-11-07

HEre is a base R version.

my_data |>
 transform(born_date = pmin(
  as.Date(born_date, "%d/%m/%Y"), 
  as.Date(as.numeric(ifelse(grepl("/",born_date),NA,born_date)), o="1899-12-30"),
  na.rm = TRUE)
 )

Note

The input in reproducible form.

my_data <-
structure(list(born_date = c("18520", "30859", "16/04/1972", 
"26612", "30291", "24435", "11/02/1964", "26/09/1971", "18427", 
"23688")), class = "data.frame", row.names = c(NA, -10L))
  •  Tags:  
  • r
  • Related