I have a pandas dataframe with the structure shown in the question "Parsing JSON with number as key using pandas":
Date Time inverterVoltage inverterCurrent
2021-11-15 14:37:05 219.1 20
2021-11-15 14:38:05 210.2 21
And so on. Data is available every minute.
I have code like this -
df['inverterConsumption'] = df.inverterVoltage*df.inverterCurrent
Then I calculate the hourly mean by using this groupby construct
df['Datetime'] = pd.to_datetime(df['Date'].apply(str) + ' ' + df['Time'].apply(str))
davg_df2 = df.groupby(pd.Grouper(freq='H', key='Datetime')).mean()
What I want to do next is filter the hourly inverterConsumption values down to the month of September only:
davg_df2 = davg_df2[davg_df2['Datetime'].dt.month_name() =='September']
But I get an error saying
KeyError: Datetime
So clearly the davg_df2 dataframe does not include the Datetime column that is present in df (as it is non numeric). How can I include it in the groupby and Grouper clause?
Pandas version 1.5.2 and Python version 3.8
CodePudding user response:
df['Datetime'] = pd.to_datetime(df['Date'].apply(str) + ' ' + df['Time'].apply(str))
davg_df2 = df.groupby(pd.Grouper(freq='H', key='Datetime')).mean().reset_index()
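Add reset_index() to the result of the groupby. After grouping, Datetime is the index of davg_df2 rather than a column, which is why davg_df2['Datetime'] raises the KeyError; reset_index() turns it back into a regular column.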
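For reference, here is a minimal end-to-end sketch of that approach, including the September filter. The sample rows are made up for illustration and are not from the original data. Two choices here are assumptions on my part: freq="60min" is used as a spelling of the hourly frequency that should be accepted across pandas versions (the question uses 'H'), and numeric_only=True is passed to mean() so the string Date and Time columns are skipped explicitly.

```python
import pandas as pd

# Made-up sample rows (not the asker's data): three readings in
# September and two in November.
df = pd.DataFrame({
    "Date": ["2021-09-01", "2021-09-01", "2021-09-01", "2021-11-15", "2021-11-15"],
    "Time": ["10:00:05", "10:01:05", "11:00:05", "14:37:05", "14:38:05"],
    "inverterVoltage": [220.0, 230.0, 200.0, 219.1, 210.2],
    "inverterCurrent": [10, 10, 5, 20, 21],
})

df["inverterConsumption"] = df["inverterVoltage"] * df["inverterCurrent"]
df["Datetime"] = pd.to_datetime(df["Date"] + " " + df["Time"])

# Hourly mean; reset_index() moves Datetime from the index back to a column.
hourly = (
    df.groupby(pd.Grouper(freq="60min", key="Datetime"))
      .mean(numeric_only=True)
      .reset_index()
)

# Keep only September, then drop the hourly bins that had no readings
# (the Grouper emits every hour between the first and last timestamp).
september = hourly[hourly["Datetime"].dt.month_name() == "September"]
september = september.dropna(subset=["inverterConsumption"])

print(september[["Datetime", "inverterConsumption"]])
```

If you would rather keep Datetime as the index, skipping reset_index() and filtering on davg_df2.index.month_name() == 'September' should give the same rows.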
CodePudding user response:
To keep Datetime (the grouping column) as a column instead of having it become the index, you can pass the as_index=False parameter to pd.DataFrame.groupby.
as_index: bool, default True For aggregated output, return object with group labels as the index. Only relevant for DataFrame input. as_index=False is effectively “SQL-style” grouped output.
Source: pandas.DataFrame.groupby