I have a DataFrame, which you can reproduce by running this code:
import pandas as pd
from io import StringIO
dfs = """
BornDate
2 19850100
3 19000100
5 19850100
6 19000100
7 19820100
8 19850100
9 19000100
10 19790100
11 19850100
"""
df = pd.read_csv(StringIO(dfs.strip()), sep=r'\s+',
                 dtype={"BornDate": int})
df
The output is:
BornDate
2 19850100
3 19000100
5 19850100
6 19000100
7 19820100
8 19850100
9 19000100
10 19790100
11 19850100
What I need is to convert BornDate to an age based on a variable ValuationDate, where the age equals ValuationDate minus BornDate:
ValuationDate = 20201231
ValuationDate=pd.to_datetime(ValuationDate)
df['BornDate']=pd.to_datetime(df['BornDate'])
df['BornDate']=ValuationDate-df['BornDate']
df['BornDate']
But it returns:
2 00:00:00.000351
3 00:00:00.001201
5 00:00:00.000351
6 00:00:00.001201
7 00:00:00.000381
8 00:00:00.000351
9 00:00:00.001201
10 00:00:00.000411
11 00:00:00.000351
Name: BornDate, dtype: timedelta64[ns]
The output should be:
2 35.1
3 120.1
5 35.1
6 120.1
7 38.1
8 35.1
9 120.1
10 41.1
11 35.1
CodePudding user response:
Suppose your BornDate column is in the same YYYYmmdd format as ValuationDate. Then you should change your code as follows to handle this date string format:
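import numpy as np

ValuationDate = 20201231
# Parse both values as YYYYmmdd dates instead of nanoseconds since the epoch
ValuationDate = pd.to_datetime(ValuationDate, format='%Y%m%d')
df['BornDate'] = pd.to_datetime(df['BornDate'], format='%Y%m%d')
# Divide the timedelta by one (average) year to get the age in years
df['BornDate'] = (ValuationDate - df['BornDate']) / np.timedelta64(1, 'Y')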
Data Input
Modified the day of each date from the invalid 00 to 01:
BornDate
2 19850101
3 19000101
5 19850101
6 19000101
7 19820101
8 19850101
9 19000101
10 19790101
11 19850101
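If you would rather not edit the input by hand, the same day fix can be done in pandas. This is only a sketch: it assumes the dates are stored as YYYYmmdd integers and that a day of 00 should become 01. The names born and fixed are made up for illustration.

```python
import pandas as pd

# A few of the original values, with the invalid day 00
born = pd.Series([19850100, 19000100, 19820100, 19790100],
                 index=[2, 3, 7, 10], name="BornDate")

# The last two digits are the day; where they are 00, add 1 to get day 01.
# Series.where keeps values where the condition is True and uses the
# replacement where it is False.
fixed = born.where(born % 100 != 0, born + 1)
print(fixed)
```

Rows that already have a valid day are left unchanged.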
Output
BornDate
2 35.998001
3 120.999062
5 35.998001
6 120.999062
7 38.998747
8 35.998001
9 120.999062
10 41.999493
11 35.998001
CodePudding user response:
Use dt.microseconds on the timedelta column that your code produced:
>>> df['BornDate'].dt.microseconds.div(10)
2 35.1
3 120.1
5 35.1
6 120.1
7 38.1
8 35.1
9 120.1
10 41.1
11 35.1
Name: BornDate, dtype: float64
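Why this works: without a format, pd.to_datetime treats a bare integer as nanoseconds since the epoch, so the subtraction in the question is just the integer difference of the two YYYYmmdd numbers, expressed in nanoseconds. The sketch below reproduces the same result with plain integer arithmetic. Note that this is digit arithmetic on the numbers, not a calendar difference, and the variable names are illustrative.

```python
import pandas as pd

born = pd.Series([19850100, 19000100, 19820100, 19790100],
                 index=[2, 3, 7, 10], name="BornDate")
valuation_date = 20201231

# Integer difference, e.g. 20201231 - 19850100 = 351131
diff = valuation_date - born
# Dropping the last three digits mirrors nanoseconds -> microseconds,
# then dividing by 10 gives 351 -> 35.1
age = (diff // 1000) / 10
print(age)
```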