I have a list of dates like the one below:
date_list = ['1. Okt 2021', '2. Okt 2021', '3. Okt 2021', '4. Okt 2021', '5. Okt 2021', '6. Okt 2021', '24. Sep 2021', '25. Sep 2021', '26. Sep 2021']
I want to convert them to datetime objects:
dates = [datetime.strptime(x,"%d %b %Y") for x in date_list]
Output is:
Traceback (most recent call last):
File "c:/Users/Benutzt/Desktop/web_scraping/main.py", line 27, in <module>
dates = [datetime.strptime(x,"%d %M %Y") for x in date_list]
File "c:/Users/Benutzt/Desktop/web_scraping/main.py", line 27, in <listcomp>
dates = [datetime.strptime(x,"%d %M %Y") for x in date_list]
File "C:\Users\Benutzt\anaconda3\lib\_strptime.py", line 568, in _strptime_datetime
tt, fraction, gmtoff_fraction = _strptime(data_string, format)
File "C:\Users\Benutzt\anaconda3\lib\_strptime.py", line 349, in _strptime
raise ValueError("time data %r does not match format %r" %
ValueError: time data '1. Okt 2021' does not match format '%d %b %Y'
CodePudding user response:
For language-specific month (or day) names, you can set the locale, e.g. to German:
import locale
locale.setlocale(locale.LC_TIME, 'de_de') # the locale name (2nd argument) is platform-specific!
With the locale set, and with a format string that includes the literal dot after the day ("%d. %b %Y"), a list of valid date strings parses as expected:
from datetime import datetime
date_list = ['1. Okt 2021', '2. Okt 2021', '3. Okt 2021', '4. Okt 2021', '5. Okt 2021', '6. Okt 2021', '30. Sep 2021']
dates = [datetime.strptime(x, "%d. %b %Y") for x in date_list]
print(dates)
[datetime.datetime(2021, 10, 1, 0, 0), datetime.datetime(2021, 10, 2, 0, 0), datetime.datetime(2021, 10, 3, 0, 0), datetime.datetime(2021, 10, 4, 0, 0), datetime.datetime(2021, 10, 5, 0, 0), datetime.datetime(2021, 10, 6, 0, 0), datetime.datetime(2021, 9, 30, 0, 0)]
Side-note: The locale setting also makes it work in pandas:
import pandas as pd
df = pd.DataFrame({'dates': date_list})
df['dates'] = pd.to_datetime(df['dates'], format="%d. %b %Y")
df['dates']
0 2021-10-01
1 2021-10-02
2 2021-10-03
3 2021-10-04
4 2021-10-05
5 2021-10-06
6 2021-09-30
Name: dates, dtype: datetime64[ns]
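Since locale.setlocale changes the setting for the whole process, it can be safer to switch LC_TIME only while parsing and restore it afterwards. Below is a minimal sketch of that idea. The helper name time_locale is hypothetical, and the candidate locale names ("de_DE.UTF-8", "de_DE", "German") are assumptions about what common platforms accept; check which names are actually installed on your system.

```python
import locale
from contextlib import contextmanager
from datetime import datetime


@contextmanager
def time_locale(*candidates):
    """Temporarily set LC_TIME to the first candidate name the platform accepts."""
    saved = locale.setlocale(locale.LC_TIME)  # query the current setting
    try:
        for name in candidates:
            try:
                locale.setlocale(locale.LC_TIME, name)
                break
            except locale.Error:
                continue  # this name is not available here, try the next one
        else:
            raise locale.Error(f"none of {candidates} is available")
        yield
    finally:
        # Always restore the previous setting, even if parsing fails.
        locale.setlocale(locale.LC_TIME, saved)


try:
    # Assumed locale names; which one works depends on the platform.
    with time_locale("de_DE.UTF-8", "de_DE", "German"):
        print(datetime.strptime("1. Okt 2021", "%d. %b %Y"))
except locale.Error as exc:
    print("No German locale available:", exc)
```

Note that setlocale is not thread-safe, so this is only a reasonable pattern for single-threaded scripts.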
CodePudding user response:
You can use the dateparser package:
# Python env: pip install dateparser
# Anaconda env: conda install dateparser
import pandas as pd
from dateparser import parse
df = pd.DataFrame({'Date': ['1. Okt 2021', '2. Okt 2021', '3. Okt 2021',
'4. Okt 2021', '5. Okt 2021', '6. Okt 2021',
'24. Sep 2021', '25. Sep 2021', '26. Sep 2021']})
df['Date'] = df['Date'].apply(parse, languages=['de'])
print(df['Date'])
# Output:
0 2021-10-01
1 2021-10-02
2 2021-10-03
3 2021-10-04
4 2021-10-05
5 2021-10-06
6 2021-09-24
7 2021-09-25
8 2021-09-26
Name: Date, dtype: datetime64[ns]
For a list:
date_list = ['1. Okt 2021', '2. Okt 2021', '3. Okt 2021',
'4. Okt 2021', '5. Okt 2021', '6. Okt 2021',
'24. Sep 2021', '25. Sep 2021', '26. Sep 2021']
dates = [parse(d, languages=['de']) for d in date_list]
print(dates)
# Output:
[datetime.datetime(2021, 10, 1, 0, 0),
datetime.datetime(2021, 10, 2, 0, 0),
datetime.datetime(2021, 10, 3, 0, 0),
datetime.datetime(2021, 10, 4, 0, 0),
datetime.datetime(2021, 10, 5, 0, 0),
datetime.datetime(2021, 10, 6, 0, 0),
datetime.datetime(2021, 9, 24, 0, 0),
datetime.datetime(2021, 9, 25, 0, 0),
datetime.datetime(2021, 9, 26, 0, 0)]
CodePudding user response:
It looks like the first part of each string is an ID, e.g. the position of the item in a list. If so, you'll need to remove it before converting the dates. Also, Okt will not match the %b format under Python's default C locale, which uses English month names, so you'll need to convert it to Oct first.
dates = [datetime.strptime(x.split(".")[-1].strip().replace("Okt", "Oct"), "%b %Y") for x in date_list]
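If you would rather not touch the locale or replace month names one by one, the Okt/Oct mismatch can also be handled with an explicit lookup table. The following is a rough sketch, not a complete solution: the function name parse_german_date, the table GERMAN_MONTHS, and the regular expression are all made up for illustration. Unlike the one-liner above, it assumes the leading number is the day of the month, which is how the sample data in the question reads.

```python
import re
from datetime import datetime

# German month abbreviations mapped to month numbers.
# Both "Mär" and "Mrz" are included because either may appear for March.
GERMAN_MONTHS = {
    "Jan": 1, "Feb": 2, "Mär": 3, "Mrz": 3, "Apr": 4, "Mai": 5, "Jun": 6,
    "Jul": 7, "Aug": 8, "Sep": 9, "Okt": 10, "Nov": 11, "Dez": 12,
}

# Matches strings such as "1. Okt 2021": day, dot, month abbreviation, year.
_DATE_RE = re.compile(r"(\d{1,2})\.\s*([A-Za-zÄÖÜäöü]+)\.?\s+(\d{4})")


def parse_german_date(text):
    """Parse a 'D. Mon YYYY' string with a German month abbreviation."""
    match = _DATE_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognised date: {text!r}")
    day, month_name, year = match.groups()
    try:
        month = GERMAN_MONTHS[month_name]
    except KeyError:
        raise ValueError(f"unknown month abbreviation: {month_name!r}") from None
    return datetime(int(year), month, int(day))


date_list = ['1. Okt 2021', '24. Sep 2021']
dates = [parse_german_date(x) for x in date_list]
print(dates)
# [datetime.datetime(2021, 10, 1, 0, 0), datetime.datetime(2021, 9, 24, 0, 0)]
```

This behaves the same on every platform because it never consults the system locale, at the cost of maintaining the table yourself.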