I have a table with four columns of dates stored as plain numbers (without any separator). Some rows in the 4th column contain 0 instead of a date. I want to leave those rows as they are (not drop them) and change the format of all the other dates to DD/MM/YYYY.
Date1. | date2 | date3 | date4 |
---|---|---|---|
20200122 | 20201012 | 20210424 | 20180710 |
20190315 | 20210120 | 20180710 | 0 |
20220514 | 20200122 | 20200122 | 0 |
20180317 | 20200122 | 20201003 | 20200122 |
20201003 | 20200122 | 20200122 | 20171203 |
20171203 | 20200122 | 20200122 | 0 |
20210423 | 20200122 | 20200122 | 20190315 |
The output should be:
Date1. | date2 | date3 | date4 |
---|---|---|---|
22/01/2020 | 12/10/2020 | 24/04/2021 | 10/07/2018 |
15/03/2019 | 20/01/2021 | 10/07/2018 | 0 |
14/05/2022 | 22/01/2020 | 22/01/2020 | 0 |
17/03/2018 | 22/01/2020 | 03/10/2020 | 22/01/2020 |
03/10/2020 | 22/01/2020 | 22/01/2020 | 03/12/2017 |
03/12/2017 | 22/01/2020 | 22/01/2020 | 0 |
23/04/2021 | 22/01/2020 | 22/01/2020 | 15/03/2019 |
Logic: every date in the first table should be converted to DD/MM/YYYY format, and any 0 in the 4th column should be left unchanged.
Thanks.
CodePudding user response:
You can convert the `date4` column to datetime with `pd.to_datetime` and `errors='coerce'`, which turns values that do not match the format (such as 0) into `NaT`. Then convert the result to strings with `.dt.strftime`, and finally fill the missing entries with the original values.
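import pandas as pd

# Parse YYYYMMDD values; entries that do not match (such as 0) become NaT.
df['date4'] = (pd.to_datetime(df['date4'], format='%Y%m%d', errors='coerce')
               .dt.strftime('%d/%m/%Y')   # format valid dates as DD/MM/YYYY
               .fillna(df['date4']))      # restore the original value (or use fillna(0))
print(df)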
     Date1.     date2     date3       date4
0  20200122  20201012  20210424  10/07/2018
1  20190315  20210120  20180710           0
2  20220514  20200122  20200122           0
3  20180317  20200122  20201003  22/01/2020
4  20201003  20200122  20200122  03/12/2017
5  20171203  20200122  20200122           0
6  20210423  20200122  20200122  15/03/2019
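The snippet above only touches `date4`, while the question asks for all four columns. Here is a minimal sketch of the same idea applied to every column. It assumes the columns hold integers as in the question's table. The helper name `to_ddmmyyyy` is made up for illustration, and it uses `Series.where` instead of `fillna` to put the original values back.

```python
import pandas as pd

# Sample data copied from the question's input table.
df = pd.DataFrame({
    "Date1.": [20200122, 20190315, 20220514, 20180317, 20201003, 20171203, 20210423],
    "date2": [20201012, 20210120, 20200122, 20200122, 20200122, 20200122, 20200122],
    "date3": [20210424, 20180710, 20200122, 20201003, 20200122, 20200122, 20200122],
    "date4": [20180710, 0, 0, 20200122, 20171203, 0, 20190315],
})


def to_ddmmyyyy(col: pd.Series) -> pd.Series:
    """Reformat YYYYMMDD values as DD/MM/YYYY; keep unparseable values (e.g. 0)."""
    # Cast to str so integer input is matched against the format as text.
    parsed = pd.to_datetime(col.astype(str), format="%Y%m%d", errors="coerce")
    # Object dtype lets formatted strings and original integers share a column.
    formatted = parsed.dt.strftime("%d/%m/%Y").astype(object)
    # Where parsing failed (NaT), fall back to the original value.
    return formatted.where(parsed.notna(), col)


# DataFrame.apply calls the helper once per column.
result = df.apply(to_ddmmyyyy)
print(result)
```

With this, 20210120 in `date2` becomes 20/01/2021 and the zeros in `date4` stay 0. The converted columns hold strings (mixed with integers in `date4`), so keep the datetime version around if you still need to sort or compare by date.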