Unable to plot a pandas dataframe or calculate the median mean

Time:12-30

I have a similar problem to this question - How to retain datetime column in pandas grouper and group by?

My data is like this -

 Date       Time                 Data1            Data2
 2021-11-15 1:00:05              100                 20
 2021-11-15 1:00:10              200                 21

 df['Datetime'] = pd.to_datetime(df['Date'].apply(str) + ' ' + df['Time'].apply(str))
 df2 = df.groupby(pd.Grouper(freq='H', key='Datetime')).mean().reset_index()
 df2 = df2[(df2['Datetime'] > '2022-09-16') & (df2['Datetime'] < '2022-10-01')]

Then I try either of these constructs -

 df2.plot(x='Datetime',y='Data1')
 plt.show()

or

 a = pd.to_timedelta(df2['Datetime'] + ':00').mean()

In both these cases I get the following error

 KeyError: Datetime 

Where am I going wrong?

When I type

 df2.columns 

Datetime is not one of the columns.

Based on the linked question and answer the datetime column should be part of the dataframe 'df2' but it is not happening in my case. I am not sure how to proceed further.

Answer:

Not sure what is going on, but the following code works for me. Note that it passes numeric_only=True to mean(), so the string Date and Time columns are skipped, and that the filter on df2 compares against pd.Timestamp objects instead of plain strings.

import pandas as pd
import matplotlib.pyplot as plt

data = {"Date": ["2021-11-15", "2021-11-15", "2021-11-15", "2021-11-15"], 
        "Time": ["1:00:05", "1:00:10", "2:00:05", "2:00:10"],
        "Data1": [100,200,300,350],
        "Data2":[20,21,22,23]}
df = pd.DataFrame(data)

df['Datetime'] = pd.to_datetime(df['Date'].apply(str) + ' ' + df['Time'].apply(str))
df2 = df.groupby(pd.Grouper(freq='H', key='Datetime')).mean(numeric_only=True).reset_index()
df2 = df2[(df2['Datetime'] > pd.Timestamp('2020-03-31')) & (df2['Datetime'] < pd.Timestamp('2022-03-31'))]
 
df2.plot(x='Datetime',y='Data1')
plt.show()
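
For the second construct in the question (the mean or median of the times), here is a minimal sketch, not a definitive fix. It assumes the same sample data as above and that "mean of the times" means the average time of day. The names hourly, time_of_day, mean_time and median_time are made up for illustration, and resample("60min") is swapped in for the Grouper call. If Datetime ever seems to be missing from the columns, it may be sitting in the index, so the sketch checks for that first.

```python
import pandas as pd

# Hypothetical sample data, mirroring the shape of the data in the question.
df = pd.DataFrame({
    "Date": ["2021-11-15"] * 4,
    "Time": ["1:00:05", "1:00:10", "2:00:05", "2:00:10"],
    "Data1": [100, 200, 300, 350],
    "Data2": [20, 21, 22, 23],
})
df["Datetime"] = pd.to_datetime(df["Date"] + " " + df["Time"])

# Hourly means of the numeric columns; "Datetime" is the index at this point.
hourly = df.set_index("Datetime")[["Data1", "Data2"]].resample("60min").mean()

# If the column looks missing, check whether it is the index and move it back.
if "Datetime" not in hourly.columns and hourly.index.name == "Datetime":
    hourly = hourly.reset_index()

# Time of day as a timedelta (offset from midnight), then its mean and median.
time_of_day = hourly["Datetime"] - hourly["Datetime"].dt.normalize()
mean_time = time_of_day.mean()
median_time = time_of_day.median()
print(mean_time, median_time)
```

pd.to_timedelta expects durations (or strings that look like durations), so passing it a datetime column plus ':00' is unlikely to do what was intended. Subtracting the normalized date gives a timedelta column that mean() and median() can handle.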