I have a column in my data frame that contains a mix of dates and other string values. I want to convert only the dates to UNIX timestamps and leave the other strings as they are. How can this be accomplished?
Sample data
|column1|
---------
|2020-12-21 00:00:00|
|test1|
|test2|
|test3|
|2021-12-21 00:00:00|
Expected Result
|Column1|
---------------
|1608508800|
|test1|
|test2|
|test3|
|1640044800|
CodePudding user response:
x = read.table(text = 'column1
2020-12-21 00:00:00
test1
test2
test3
2021-12-21 00:00:00', sep = ";", header = TRUE)
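# Parse every value as a date-time; values that do not match the format become NA
uts = as.numeric(as.POSIXct(x$column1, format = "%Y-%m-%d %H:%M:%S", tz = "UTC"))
# Positions of the values that parsed successfully
uts_i = which(!is.na(uts))
# Overwrite only those positions; the numbers are coerced to character
x$column1[uts_i] = uts[uts_i]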
x
# column1
# 1 1608508800
# 2 test1
# 3 test2
# 4 test3
# 5 1640044800
Or with dplyr:
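library(dplyr)

x %>%
  mutate(
    # NA for values that do not parse as a date-time
    uts = as.numeric(as.POSIXct(column1, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")),
    # Keep the timestamp where there is one, otherwise the original string
    column1 = coalesce(as.character(uts), column1)
  ) %>%
  select(-uts)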
# column1
# 1 1608508800
# 2 test1
# 3 test2
# 4 test3
# 5 1640044800
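If you need this for more than one column, the same idea can be wrapped in a small function. This is only a sketch: `to_unix_or_keep` is a made-up name, not something from base R or dplyr, and it assumes every date in the column uses the single format shown in the question. It uses `sprintf("%.0f", ...)` instead of relying on implicit coercion, because round values such as 1600000000 may otherwise be rendered in scientific notation.

```r
# Hypothetical helper (name is an assumption, not an existing function):
# converts the elements of a vector that parse as "YYYY-MM-DD HH:MM:SS"
# to UNIX timestamps (as strings) and leaves all other elements unchanged.
to_unix_or_keep <- function(v, format = "%Y-%m-%d %H:%M:%S", tz = "UTC") {
  v <- as.character(v)  # also covers factor columns
  # NA wherever the value does not match the format
  uts <- as.numeric(as.POSIXct(v, format = format, tz = tz))
  is_date <- !is.na(uts)
  # Fixed notation, no decimals, so no "1.6e+09"-style output
  v[is_date] <- sprintf("%.0f", uts[is_date])
  v
}

x <- data.frame(column1 = c("2020-12-21 00:00:00", "test1", "test2",
                            "test3", "2021-12-21 00:00:00"))
x$column1 <- to_unix_or_keep(x$column1)
x$column1
# "1608508800" "test1" "test2" "test3" "1640044800"
```

The column stays a character column either way, since a single vector cannot hold both numbers and strings.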