Home > front end >  How to replace numbers with dates or remove rows with numbers from data frame?
How to replace numbers with dates or remove rows with numbers from data frame?

Time:12-07

I have data frame like this:

[701] "38760"      "38788"      "38819"      "38849"      "38880"      "38910"      "38941"     
[708] "38972"      "39002"      "39033"      "39063"      "12/13/2006" "12/14/2006" "12/15/2006"
[715] "12/16/2006" "12/17/2006" "12/18/2006" "12/19/2006" "12/20/2006" "12/21/2006" "12/22/2006"
[722] "12/23/2006" "12/24/2006" "12/25/2006" "12/26/2006" "12/27/2006" "12/28/2006" "12/29/2006"
[729] "12/30/2006" "12/31/2006"

and I would like to have only dates (replacing or removing numbers). Anyone know some reasonable quick way to do this?

CodePudding user response:

Assuming that we have a 'dates' column in the data.frame which includes only two formats - 1) %m/%d/%Y and 2) numeric values - then we apply the as.Date with the format %m/%d/%Y on the whole 'dates'. If the values are numeric, it will return NA, which we do separately and update in base R

dates1 <- with(df1, as.Date(dates, format = "%m/%d/%Y"))
dates1[is.na(dates1)] <- with(df1, as.Date(as.integer(dates[is.na(dates1)]), 
       origin = "1899-12-30"))
df1$dates <- dates1

-output

> df1$dates
 [1] "2006-02-12" "2006-03-12" "2006-04-12" "2006-05-12" "2006-06-12" "2006-07-12" "2006-08-12" "2006-09-12" "2006-10-12" "2006-11-12" "2006-12-12" "2006-12-13"
[13] "2006-12-14" "2006-12-15" "2006-12-16" "2006-12-17" "2006-12-18" "2006-12-19" "2006-12-20" "2006-12-21" "2006-12-22" "2006-12-23" "2006-12-24" "2006-12-25"
[25] "2006-12-26" "2006-12-27" "2006-12-28" "2006-12-29" "2006-12-30" "2006-12-31"

data

df1 <- structure(list(dates = c("38760", "38788", "38819", "38849", 
"38880", "38910", "38941", "38972", "39002", "39033", "39063", 
"12/13/2006", "12/14/2006", "12/15/2006", "12/16/2006", "12/17/2006", 
"12/18/2006", "12/19/2006", "12/20/2006", "12/21/2006", "12/22/2006", 
"12/23/2006", "12/24/2006", "12/25/2006", "12/26/2006", "12/27/2006", 
"12/28/2006", "12/29/2006", "12/30/2006", "12/31/2006")),
 class = "data.frame", row.names = c(NA, 
-30L))

CodePudding user response:

I assume it is a Excel date. So now you can use the origin argument of as.Date which December 30, 1899.

library(dplyr)
df %>% 
  mutate(date = as.Date(date, origin = "1899-12-30"))
       date Przedmiescie Centrum
1 2005-01-01            0       0
2 2005-02-01            0       0
3 2005-03-01            0       0
4 2005-04-01            0       0
5 2005-05-01            0       2
6 2005-06-01            0       0

data:

structure(list(date = c(38353L, 38384L, 38412L, 38443L, 38473L, 
38504L), Przedmiescie = c(0L, 0L, 0L, 0L, 0L, 0L), Centrum = c(0L, 
0L, 0L, 0L, 2L, 0L)), class = "data.frame", row.names = c("1", 
"2", "3", "4", "5", "6"))
  •  Tags:  
  • r
  • Related