converting str to YYYYmmdd format in python

Time:09-27

I have year, month and day in three columns. I am concatenating them into one column and then trying to convert that column to YYYY/mm/dd format as follows:

dfyz_m_d['dt'] = '01'  # use the first day of each month
dfyz_m_d['CalendarWeek1'] = dfyz_m_d['year'].map(str) + dfyz_m_d['mon'].map(str) + dfyz_m_d['dt'].map(str)
dfyz_m_d['CalendarWeek'] = pd.to_datetime(dfyz_m_d['CalendarWeek1'], format='%Y%m%d')

But months 1 (Jan) and 10 (Oct) both come out as October in the final result: the CalendarWeek column has no January dates at all. All records are retained, but the January rows are also being parsed as October.

CodePudding user response:

The issue is that January is a single digit, so you end up with a string like 2021101, which is interpreted as October instead of January. Make sure your mon column is always converted to a two-digit month, with a leading zero where needed, using .zfill(2):

dfyz_m_d['year'].astype(str) + dfyz_m_d['mon'].astype(str).str.zfill(2) + dfyz_m_d['dt'].astype(str)

zfill example:

df = pd.DataFrame({'mon': [1,2,10]})

df.mon.astype(str).str.zfill(2)
0    01
1    02
2    10
Name: mon, dtype: object
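
To see the whole fix in one place, here is a minimal sketch. The sample frame is made up and only stands in for dfyz_m_d; the column names year, mon and dt are taken from the question.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's dfyz_m_d frame.
df = pd.DataFrame({"year": [2021, 2021, 2021], "mon": [1, 10, 12]})
df["dt"] = "01"  # first day of each month, as in the question

# Without padding, January 2021 collapses to the 7-character string "2021101".
unpadded = df["year"].astype(str) + df["mon"].astype(str) + df["dt"]

# With zfill(2) on the month, every string is exactly 8 characters (YYYYmmdd).
padded = df["year"].astype(str) + df["mon"].astype(str).str.zfill(2) + df["dt"]

df["CalendarWeek"] = pd.to_datetime(padded, format="%Y%m%d")

print(unpadded.tolist())  # ['2021101', '20211001', '20211201']
print(df["CalendarWeek"].dt.strftime("%Y/%m/%d").tolist())
# ['2021/01/01', '2021/10/01', '2021/12/01']
```

The strftime call at the end is only for display in YYYY/mm/dd form; the CalendarWeek column itself stays a datetime column.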

CodePudding user response:

I usually do

pd.to_datetime(df.mon,format='%m').dt.strftime('%m')
0    01
1    02
2    10
Name: mon, dtype: object

Also, if you name the columns correctly (note the names year, month and day), pd.to_datetime can build the date straight from the columns:

df['day'] =  '01'
df['new'] = pd.to_datetime(df.rename(columns={'mon':'month'})).dt.strftime('%m/%d/%Y')
df
   year    mon  day         new
0  2020      1    1  01/01/2020
1  2020      1    1  01/01/2020

CodePudding user response:

I like str.pad :)

dfyz_m_d['year'].astype(str) + dfyz_m_d['mon'].astype(str).str.pad(2, 'left', '0') + dfyz_m_d['dt'].astype(str)

It pads zeros on the left so that every string is two characters long. So 1 becomes 01, but 10 stays 10.
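
A quick check of the padding on its own, as a small sketch with a made-up Series. For plain month numbers like these, str.pad with fillchar '0' should give the same strings as str.zfill(2).

```python
import pandas as pd

# Made-up month values, converted to strings before padding.
mon = pd.Series([1, 2, 10]).astype(str)

# Pad on the left with "0" up to a width of 2.
padded = mon.str.pad(2, side="left", fillchar="0")

print(padded.tolist())  # ['01', '02', '10']

# Same result as zfill for these inputs.
print(padded.equals(mon.str.zfill(2)))  # True
```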

CodePudding user response:

You should be able to use pandas.to_datetime with your input dataframe. You may need to rename your columns.

import pandas as pd

df = pd.DataFrame({'year': [2015, 2016],
                   'month': [2, 3],
                   'dt': [4, 5]})
print(pd.to_datetime(df.rename(columns={"dt": "day"})))

Output

0   2015-02-04
1   2016-03-05
dtype: datetime64[ns]
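
Applied to the question's layout (a year column, a mon column, and always the first day of the month), this approach might look like the sketch below. The sample data is invented, and adding a numeric day column with assign is my assumption about how to supply the constant day.

```python
import pandas as pd

# Invented sample data using the question's column names.
df = pd.DataFrame({"year": [2021, 2021], "mon": [1, 10]})

# pd.to_datetime expects columns named year, month and day,
# so rename mon and add a constant day of 1.
parts = df.rename(columns={"mon": "month"}).assign(day=1)

df["CalendarWeek"] = pd.to_datetime(parts[["year", "month", "day"]])

print(df["CalendarWeek"].dt.strftime("%Y/%m/%d").tolist())
# ['2021/01/01', '2021/10/01']
```

Because no strings are concatenated here, the single-digit month problem never comes up.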