I have year, month and day in three columns. I am concatenating them into one column and then trying to convert that column to YYYY/mm/dd format as follows:
dfyz_m_d['dt'] = '01'  # use the first day of each month
dfyz_m_d['CalendarWeek1'] = dfyz_m_d['year'].map(str) + dfyz_m_d['mon'].map(str) + dfyz_m_d['dt'].map(str)
dfyz_m_d['CalendarWeek'] = pd.to_datetime(dfyz_m_d['CalendarWeek1'], format='%Y%m%d')
But for both month 1 (Jan) and month 10 (Oct) I get only Oct in the final result: the CalendarWeek column doesn't contain any Jan dates. All records are retained, but the Jan rows are also being converted to Oct.
CodePudding user response:
The issue is that Jan is a single digit numerically, so you end up with something like 2021101, which will be interpreted as Oct instead of Jan. Make sure your mon column is always converted to two-digit months, with leading zeros if needed, using .zfill(2):
dfyz_m_d['year'].astype(str) + dfyz_m_d['mon'].astype(str).str.zfill(2) + dfyz_m_d['dt'].astype(str)
zfill example:
df = pd.DataFrame({'mon': [1,2,10]})
df.mon.astype(str).str.zfill(2)
0 01
1 02
2 10
Name: mon, dtype: object
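To show how the padding changes the concatenated string before it reaches pd.to_datetime, here is a minimal sketch. The sample values and the names unpadded, padded and parsed are assumptions made up for illustration; only the column names year, mon and dt come from the question.

```python
import pandas as pd

# Hypothetical sample data; column names follow the question.
df = pd.DataFrame({'year': [2021, 2021], 'mon': [1, 10]})
df['dt'] = '01'

# Without padding, January gives '2021101', the ambiguous string that
# the question reports being read as October.
unpadded = df['year'].astype(str) + df['mon'].astype(str) + df['dt']

# With zfill(2), every month is two digits, so each string has 8 characters.
padded = df['year'].astype(str) + df['mon'].astype(str).str.zfill(2) + df['dt']

parsed = pd.to_datetime(padded, format='%Y%m%d')

print(unpadded.tolist())
print(padded.tolist())
print(parsed.dt.strftime('%Y/%m/%d').tolist())
```

Because every padded string has exactly eight characters, the %Y%m%d format can split it in only one way.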
CodePudding user response:
I usually do
pd.to_datetime(df.mon,format='%m').dt.strftime('%m')
0 01
1 02
2 10
Name: mon, dtype: object
Also, if you name the columns correctly (note the names year, month and day), pd.to_datetime can build the date directly from the DataFrame:
df['day'] = '01'
df['new'] = pd.to_datetime(df.rename(columns={'mon':'month'})).dt.strftime('%m/%d/%Y')
df
year mon day new
0 2020 1 1 01/01/2020
1 2020 1 1 01/01/2020
CodePudding user response:
I like str.pad :)
dfyz_m_d['year'].astype(str) + dfyz_m_d['mon'].astype(str).str.pad(2, 'left', '0') + dfyz_m_d['dt'].astype(str)
It pads zeros on the left to ensure that each string has a length of at least two. So 1 becomes 01, but 10 stays 10.
CodePudding user response:
You should be able to use pandas.to_datetime with your input dataframe. You may need to rename your columns.
import pandas as pd
df = pd.DataFrame({'year': [2015, 2016],
                   'month': [2, 3],
                   'dt': [4, 5]})
print(pd.to_datetime(df.rename(columns={"dt": "day"})))
Output
0 2015-02-04
1 2016-03-05
dtype: datetime64[ns]
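Applied to the column names from the question, the same idea might look like the sketch below. The sample values and the intermediate name parts are assumptions for illustration; the day is fixed to 1 because the question wants one date per month.

```python
import pandas as pd

# Hypothetical sample data; 'year' and 'mon' are the question's column names.
df = pd.DataFrame({'year': [2021, 2021], 'mon': [1, 10]})

# pd.to_datetime assembles dates from columns named year, month and day,
# so rename 'mon' and add a constant day of 1.
parts = df.rename(columns={'mon': 'month'})[['year', 'month']].assign(day=1)
df['CalendarWeek'] = pd.to_datetime(parts)

print(df['CalendarWeek'].dt.strftime('%Y/%m/%d').tolist())
```

This avoids string concatenation entirely, so no zero-padding is needed.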