I'm trying to unify dates in a column as they come in different formats; current date entries:
[... '18-Aug-21' '16-Aug-21' '17-Aug-21' '22-Aug-21' '21-Aug-21' '20-Aug-21' '19-Aug-21' '23-Aug-21' '24-Aug-21' '25-Aug-21' '28-Aug-21' '26-Aug-21' '27-Aug-21' '31-Aug-21' '30-Aug-21' '29-Aug-21' '06 Sep 2021' '07 Sep 2021' '23 Sep 2021' '17 Sep 2021' '18 Sep 2021' '30 Sep 2021' '11 Sep 2021' '12 Sep 2021' '20 Sep 2021' '15 Sep 2021' '16 Sep 2021' '08 Sep 2021' '09 Sep 2021' '24 Sep 2021' '25 Sep 2021' '03 Sep 2021' '10 Sep 2021' '19 Sep 2021' '01 Sep 2021' '29 Sep 2021' '26 Sep 2021' '27 Sep 2021' '13 Sep 2021' '14 Sep 2021' '02 Sep 2021' '04 Sep 2021' '05 Sep 2021' ...
#1: trying to replace the dash here doesn't work on all dates
#2: when the year is YY as in '6-Aug-21', how can I format it?
for date in DF_all["SALES_DATE"]:
date = date.replace("-"," ")
DF_all["SALES_DATE"] = pd.to_datetime(DF_all["SALES_DATE"], format='%d
%b%Y', errors='ignore')
print(DF_all["SALES_DATE"].unique())
Output:
[...'18-Aug-21' '16-Aug-21' '17-Aug-21' '22-Aug-21' '21-Aug-21' '20-Aug-21' '19-Aug-21' '23-Aug-21' '24-Aug-21' '25-Aug-21' '28-Aug-21' '26-Aug-21' '27-Aug-21' '31-Aug-21' '30-Aug-21' '29-Aug-21' '06 Sep 2021' '07 Sep 2021' '23 Sep 2021' '17 Sep 2021' '18 Sep 2021' '30 Sep 2021' '11 Sep 2021' '12 Sep 2021' '20 Sep 2021' '15 Sep 2021' '16 Sep 2021' '08 Sep 2021' '09 Sep 2021' '24 Sep 2021' '25 Sep 2021' '03 Sep 2021' '10 Sep 2021' '19 Sep 2021' '01 Sep 2021' '29 Sep 2021' '26 Sep 2021' '27 Sep 2021' '13 Sep 2021' '14 Sep 2021' '02 Sep 2021' '04 Sep 2021' ...]
Is there a preferred method in python that solves this issue?
CodePudding user response:
I recommend dateutil
for this:
import dateutil
DF_all["SALES_DATE"] = DF_all["SALES_DATE"].apply(dateutil.parser.parse)
Output:
>>> DF_all
0 2021-08-18
1 2021-08-16
2 2021-08-17
3 2021-08-22
4 2021-08-21
5 2021-08-20
6 2021-08-19
7 2021-08-23
8 2021-08-24
9 2021-08-25
...
Name: 0, dtype: datetime64[ns]
You might need to install dateutil
first. Run the following in a terminal:
pip install python-dateutil
Or, in IPython or a Jupyter Notebook, run:
!pip install python-dateutil