I am trying to convert a string column to a datetime type in Python using pandas. I scraped the data from a webpage; a sample is given below. When I convert the column with pd.to_datetime I get NaT values and I'm not sure why, although the dtype is changed to datetime successfully.
I have two values which should be "2021" but are "20201". I have replaced these and then converted to datetime:
df['Date'] = df['Date'].replace("20201", "2021", inplace=True)
df['Date'] = pd.to_datetime(df['Date'])
I have also tried the below:
df['Date'] = pd.to_datetime(df['Date'], format = "%d/%m/%Y", errors = "coerce")
df['Date'] = pd.to_datetime(df['Date'], format = "%d/%m/%Y")
If I skip the replacement and convert to datetime directly, suppressing the out-of-range error caused by "20201", the code works fine and does not produce NaT values.
df['Date'] = pd.to_datetime(df['Date'], errors = "ignore")
Date | A | B | C |
---|---|---|---|
07/01/20201 | a | b | 2 |
08/01/20201 | b | c | 2 |
09/01/2022 | c | d | 1 |
10/01/2022 | d | e | 1 |
13/01/2022 | e | f | 3 |
14/01/2022 | f | g | 3 |
17/01/2022 | g | h | 3 |
Updated Dict:
{'Unnamed: 0': {351: 351, 352: 352},
'Date': {351: '17/4/20201', 352: '17/4/20201'},
'Selection': {351: 'Pour La Victoire', 352: 'Wiley Post'},
'Stake': {351: 1.0, 352: 1.0},
'Odds Advised': {351: 6.5, 352: 2.5},
'Profit / Loss': {351: -1.0, 352: -1.0},
'Bet Type': {351: 'Win', 352: 'Win'}}
CodePudding user response:
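The problem is most likely that you use inplace=True and also assign the result back. With inplace=True, Series.replace is documented to return None, so df['Date'] = df['Date'].replace(..., inplace=True) overwrites the whole column with None, and pd.to_datetime then gives NaT.
In addition, Series.replace matches whole values by default, not substrings, so "20201" never matches a value such as "07/01/20201".
To replace part of a string you can use Series.str.replace, or Series.replace with regex=True.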
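To make the whole-value versus substring difference concrete, here is a minimal sketch. The two-row Series named dates is a made-up sample in the question's format, not the real data.

```python
import pandas as pd

# Hypothetical two-row sample in the question's dd/mm/yyyy format
dates = pd.Series(["07/01/20201", "09/01/2022"])

# Series.replace compares whole values by default,
# so "20201" matches nothing and the data is unchanged
whole_value = dates.replace("20201", "2021")

# Series.str.replace works on substrings;
# regex=False treats the pattern as a literal string
substring = dates.str.replace("20201", "2021", regex=False)

print(whole_value.tolist())  # ['07/01/20201', '09/01/2022']
print(substring.tolist())    # ['07/01/2021', '09/01/2022']
```

Neither call uses inplace=True, so both return a new Series that can safely be assigned back to the column.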
# replace the substring and assign the result back (no inplace=True)
df['Date'] = df['Date'].str.replace("20201", "2021")
# or
df['Date'] = df['Date'].replace("20201", "2021", regex=True)
df['Date'] = pd.to_datetime(df['Date'])
print(df)
Date A B C
0 2021-07-01 a b 2.0
1 2021-08-01 b c 2.0
2 2022-09-01 c d 1.0
3 2022-10-01 d e 1.0
4 2022-01-13 e f 3.0
5 2022-01-14 f g 3.0
6 2022-01-17 g h 3.0
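Note that in the output above 07/01/2021 is read as 2021-07-01 (month first), while 13/01/2022 is read as 13 January. If the dates are meant to be day first, as the format string in the question suggests, passing an explicit format should avoid that ambiguity. A minimal sketch, assuming a small hypothetical frame with the same Date strings as the sample:

```python
import pandas as pd

# Hypothetical subset of the question's sample data
df = pd.DataFrame(
    {"Date": ["07/01/20201", "08/01/20201", "09/01/2022", "13/01/2022"]}
)

# Fix the typo on substrings and assign the result back (no inplace=True)
df["Date"] = df["Date"].str.replace("20201", "2021", regex=False)

# Parse with an explicit day-first format so 07/01/2021 means 7 January
df["Date"] = pd.to_datetime(df["Date"], format="%d/%m/%Y")

print(df["Date"].dt.strftime("%Y-%m-%d").tolist())
# ['2021-01-07', '2021-01-08', '2022-01-09', '2022-01-13']
```

With an explicit format, any value that does not match raises an error instead of being guessed at, which makes leftover typos easier to spot.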