How is a DataFrame of encoded dates converted to a DataFrame of datetime in pandas?
Starting dataFrame:
| year | month | day |
| ---- | ----- | ---
| 2022 | 'F' | 9 |
| 2022 | 'J' | 1 |
| 2022 | 'K' | 5 |
code for month:
'F' = January,
'J' = April,
'K' = May
Desired dataFrame:
| datetime |
| ---------- |
| 2022-01-09 |
| 2022-04-01 |
| 2022-05-05 |
What's the right way to do this?
This function translates a single code; how do I apply this function to the DataFrame?
def datefromcode(df):
match month:
case 'F': mo = 1
case 'J': mo = 4
case 'K': mo = 5
return datetime.date(year, mo, day)
CodePudding user response:
Here's one way:
pd.to_datetime(df.assign(month=df["month"].map({
"F": 1,
"J": 2,
"K": 3
})))
CodePudding user response:
You can use a dictionary to map the month numbers and feed it to pandas.to_datetime
that accepts a DataFrame as input with the year/month/day column names:
codes = {'F': 1, 'J': 4, 'K': 5}
pd.to_datetime(df.assign(month=df['month'].map(codes)))
output:
0 2022-01-09
1 2022-04-01
2 2022-05-05
dtype: datetime64[ns]
Or using string concatenation:
months = {'F': 'January', 'J': 'April', 'K': 'May'}
pd.to_datetime(df['year'].astype(str) df['month'].map(months) df['day'].astype(str), format='%Y%B%d')
CodePudding user response:
If you use a dictionary for the codes, you can map them first.
Then, using datetime, you can transform these to dates.
import pandas as pd
import datetime as dt
df = pd.DataFrame({'year': [2022]*3,
'month': list('FJK'),
'day': [9,1,5]})
months = {'F': 'January', 'J': 'April', 'K': 'May'}
df["month"] = df["month"].map(months)
df.apply(lambda x: dt.datetime.strptime(str(x.year) "-" x.month "-" str(x.day),"%Y-%B-%d"), axis=1)
#Out:
#0 2022-01-09
#1 2022-04-01
#2 2022-05-05
#dtype: datetime64[ns]