How to retain datetime column in pandas grouper and group by?

Time:12-26

I have a pandas dataframe structured as shown in the question "Parsing JSON with number as key using pandas":

    Date       Time                 InverterVoltage        Invertercurrent
    2021-11-15 14:37:05              219.1                 20
    2021-11-15 14:38:05              210.2                 21

And so on. Data is available every minute.

I have code like this -

df['inverterConsumption'] = df['InverterVoltage'] * df['Invertercurrent']

Then I calculate the hourly mean by using this groupby construct

df['Datetime'] = pd.to_datetime(df['Date'].apply(str) + ' ' + df['Time'].apply(str))
davg_df2 = df.groupby(pd.Grouper(freq='H', key='Datetime')).mean()

What I want to do is the following - I want to filter the inverterConsumption for only the month of September

davg_df2 = davg_df2[davg_df2['Datetime'].dt.month_name() =='September']

But I get an error saying

KeyError: 'Datetime'

So clearly the davg_df2 dataframe does not include the Datetime column that is present in df (I assume because it is non-numeric). How can I keep it when using groupby with pd.Grouper?

Pandas version 1.5.2 and Python version 3.8

CodePudding user response:

df['Datetime'] = pd.to_datetime(df['Date'].apply(str) + ' ' + df['Time'].apply(str))
davg_df2 = df.groupby(pd.Grouper(freq='H', key='Datetime')).mean().reset_index()

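Add reset_index() to the result of the groupby. The grouping key Datetime is not dropped; it becomes the index of the result, and reset_index() turns it back into a regular column.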
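For illustration, here is a minimal sketch of the whole flow on made-up data. The values and the date range are assumptions chosen so the result spans two months, the column names follow the table in the question, and freq="60min" is used as a stand-in for 'H' only because that alias is accepted across pandas versions. It also shows that the index can be filtered directly if you prefer not to reset it.

```python
import pandas as pd

# Synthetic one-minute readings (made-up values) covering the last two
# hours of September and the first two hours of October 2021.
idx = pd.date_range("2021-09-30 22:00", periods=240, freq="min")
df = pd.DataFrame({
    "Date": idx.strftime("%Y-%m-%d"),
    "Time": idx.strftime("%H:%M:%S"),
    "InverterVoltage": 220.0,
    "Invertercurrent": 20.0,
})
df["inverterConsumption"] = df["InverterVoltage"] * df["Invertercurrent"]
df["Datetime"] = pd.to_datetime(df["Date"] + " " + df["Time"])

# Hourly mean; the grouping key ends up as the index of the result.
hourly_indexed = (
    df.groupby(pd.Grouper(freq="60min", key="Datetime"))
      .mean(numeric_only=True)
)

# Option 1: turn the index back into a column, then filter with .dt
hourly = hourly_indexed.reset_index()
september = hourly[hourly["Datetime"].dt.month_name() == "September"]

# Option 2: leave Datetime as the index and filter on the index itself
september_indexed = hourly_indexed[
    hourly_indexed.index.month_name() == "September"
]

print(september)
```

Both options select the same rows; the only difference is whether Datetime ends up as a column or stays as the index.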

CodePudding user response:

To keep Datetime (the grouping column) as a column instead of getting it as the index, pd.DataFrame.groupby accepts the as_index=False parameter, which is documented as follows:

as_index: bool, default True. For aggregated output, return object with group labels as the index. Only relevant for DataFrame input. as_index=False is effectively “SQL-style” grouped output.

Source: pandas.DataFrame.groupby
