I have a DataFrame with one column of timestamps and another column in which I have stored a time lag. The data looks like this:
2020-04-18 14:00:00 0 days 03:00:00
2020-04-19 02:00:00 1 days 13:00:00
2020-04-28 14:00:00 1 days 17:00:00
2020-04-29 20:00:00 2 days 09:00:00
2020-04-30 19:00:00 2 days 11:00:00
Time, Length: 282, dtype: datetime64[ns] Average time lag, Length: 116, dtype: object
I want to plot the Time on the x-axis against the time lag on the y-axis. However, I keep getting errors when plotting the second column. Any tips on how to handle this data for the plot?
CodePudding user response:
To plot the time lag on the y-axis, it is easiest to convert it to a plain numerical value first. Your time lag column has dtype object, which suggests the values are stored as strings or Python timedelta objects rather than as a pandas timedelta column, so convert it with pd.to_timedelta first. You can then turn each lag into a number of seconds with the .dt.total_seconds() accessor method and plot the resulting values on the y-axis.
Here is an example of how you can do this:
import pandas as pd
import matplotlib.pyplot as plt
# Create a dataframe with the time and time lag data
data = [
    ['2020-04-18 14:00:00', '0 days 03:00:00'],
    ['2020-04-19 02:00:00', '1 days 13:00:00'],
    ['2020-04-28 14:00:00', '1 days 17:00:00'],
    ['2020-04-29 20:00:00', '2 days 09:00:00'],
    ['2020-04-30 19:00:00', '2 days 11:00:00'],
]
df = pd.DataFrame(data, columns=['time', 'time_lag'])
# Convert the time and time lag columns to datetime and timedelta objects
df['time'] = pd.to_datetime(df['time'])
df['time_lag'] = pd.to_timedelta(df['time_lag'])
# Convert the time lag to seconds
df['time_lag_seconds'] = df['time_lag'].dt.total_seconds()
# Create a scatter plot with the time on the x-axis and the time lag in seconds on the y-axis
plt.scatter(df['time'], df['time_lag_seconds'])
plt.show()
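If seconds are hard to read on the axis, a variation is to express the lag in hours and label the axes. The sketch below assumes the same column names as the example above ('time' and 'time_lag'); the helper names lag_to_hours and plot_lag and the 'time_lag_hours' column are made up for illustration. It also assumes that entries which cannot be parsed should become NaN rather than raise an error, which may or may not be what you want for your data.

```python
import pandas as pd


def lag_to_hours(lag: pd.Series) -> pd.Series:
    """Convert a column of time lags (strings or timedeltas) to float hours."""
    # errors="coerce" turns unparseable entries into NaT instead of raising
    as_timedelta = pd.to_timedelta(lag, errors="coerce")
    # Dividing by a one-hour Timedelta yields plain floats (NaT becomes NaN)
    return as_timedelta / pd.Timedelta(hours=1)


def plot_lag(df: pd.DataFrame):
    """Scatter plot of time (x-axis) against the time lag in hours (y-axis)."""
    # Imported here so the conversion helper works without matplotlib installed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.scatter(df["time"], df["time_lag_hours"])
    ax.set_xlabel("Time")
    ax.set_ylabel("Time lag (hours)")
    fig.autofmt_xdate()  # rotate the date labels so they do not overlap
    return fig


# Two rows taken from the sample data in the question
df = pd.DataFrame({
    "time": pd.to_datetime(["2020-04-18 14:00:00", "2020-04-19 02:00:00"]),
    "time_lag": ["0 days 03:00:00", "1 days 13:00:00"],
})
df["time_lag_hours"] = lag_to_hours(df["time_lag"])

# To draw the figure: fig = plot_lag(df), then call matplotlib.pyplot.show()
```

Dividing by pd.Timedelta(hours=1) is just another way of writing .dt.total_seconds() / 3600; use whichever unit suits the range of your lags.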