Converting Twitter created_at to datetime not working

Time:05-08

I'm working on a project to analyze tweets and am first trying to convert the created_at column to datetimes.

format = "%Y-%m-%d %H:%M:%S"
df['created_at_datetime'] = pd.to_datetime(df['created_at'], format = format).dt.tz_localize(None)

I keep getting an error when I run this (the error screenshot is not included here).

I am in a very introductory class on analyzing Twitter, so I am not a coding expert at all. This line of code worked in earlier homework assignments, so I am unsure what is causing the error now.

I am working in Colab and here is the full thing: https://colab.research.google.com/drive/1XXJsoMQouzH-1t7eWRd1c-fsrI3vYFcf?usp=sharing.

Thank you!

CodePudding user response:

Try this:

format_y = "%Y-%m-%d %H:%M:%S"
pd.to_datetime(date, format = format_y).tz_localize(None)

CodePudding user response:

Check that all values in the 'created_at' column are timestamps formatted as you expect. It seems like some row could have the string "en" instead of a timestamp.
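One way to run that check is sketched below. It is only an illustration: the sample DataFrame is made up (the real data would be your own df['created_at']), and the stray "en" value is assumed from the guess above, not confirmed from the notebook.

```python
import pandas as pd

# Hypothetical sample data standing in for the real df['created_at'] column.
df = pd.DataFrame(
    {"created_at": ["2022-05-05 10:15:00", "en", "2022-05-02 08:00:30"]}
)

expected_format = "%Y-%m-%d %H:%M:%S"

# errors='coerce' turns values that do not match the format into NaT
# instead of raising an exception.
parsed = pd.to_datetime(df["created_at"], format=expected_format, errors="coerce")

# Keep the raw values that were present but could not be parsed.
bad_rows = df.loc[parsed.isna() & df["created_at"].notna(), "created_at"]
print(bad_rows)
```

Anything that shows up in bad_rows is a value that does not match the format string, so it is a candidate for whatever is triggering the error.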

CodePudding user response:

You need to find the culprit value that doesn't fit. Here's the workflow:

import pandas as pd
raw_dt_series = pd.Series(['2022-05-05', 'foobar','2022-05-02', '202', None])
raw_dt_series_notna = raw_dt_series.dropna()
dt_series = pd.to_datetime(raw_dt_series_notna, errors='coerce')

Output:

0   2022-05-05
1          NaT
2   2022-05-02
3          NaT     <- NaT is treated as a missing value (like np.nan) in pandas
dtype: datetime64[ns]

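The NaT entries mark the values that could not be parsed, so these are the rows that raised the error. Use them to select the original raw values: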

raw_dt_series_notna.loc[dt_series.isna()]

Time to investigate why the given values don't meet the format. After you've found out, adjust the format parameter:

pd.to_datetime(raw_dt_series, format='%YOUR%NEW%FORMAT')
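
If the offending values turn out to be junk rather than timestamps in a different format, another option is to drop them instead of changing the format. This is a rough sketch under that assumption; the sample data and the name clean are made up for illustration.

```python
import pandas as pd

# Hypothetical sample: one stray non-timestamp value mixed into the column.
df = pd.DataFrame(
    {"created_at": ["2022-05-05 10:15:00", "en", "2022-05-02 08:00:30"]}
)

# Parse with the original format; values that do not match become NaT.
df["created_at_datetime"] = pd.to_datetime(
    df["created_at"], format="%Y-%m-%d %H:%M:%S", errors="coerce"
)

# Drop the rows whose timestamp could not be parsed and renumber the index.
clean = df.dropna(subset=["created_at_datetime"]).reset_index(drop=True)
print(clean)
```

Dropping rows silently can hide a real data problem, so it is worth looking at the bad values first before deciding to discard them.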