I have a question. I have a set of numeric values that are a date, but apparently the date is wrongly formatted and coming out of SAS. For example, I have the value 5893 that is in SAS 19.02.1976 when formatted correctly. I want to achieve this in Python/PySpark. From what I've found until now, there is a function fromtimestamp
.
However, when I do this, it gives a wrong date:
value = 5893
date = datetime.datetime.fromtimestamp(value)
print(date)
1970-01-01 02:38:13
Any proposals to get the correct date? Thank you! :-) EDIT: And how would the code look like when this operation is imposed on a dataframe column rather than a variable?
CodePudding user response:
The Epoch, as far as SAS is concerned, is 1st January 1960. The number you have (5893) is the number of elapsed days since that Epoch. Therefore:
from datetime import timedelta, date
print(date(1960, 1, 1) timedelta(days=5893))
...will give you the desired result
CodePudding user response:
import numpy as np
import pandas as pd
ser = pd.Series([19411.0, 19325.0, 19325.0, 19443.0, 19778.0])
ser = pd.to_timedelta(ser, unit='D') pd.Timestamp('1960-1-1')