Printing an output using pandas .groupby to include keys that equal 0?

I'm trying to get an output that includes every key, even if its count is 0.

import pandas as pd

df = pd.read_csv('climate_data_Dec2017.csv')

wind_direction = df['Direction of maximum wind gust']
is_on_a_specific_day = df['Date'].str.contains("12-26")
specific_day = df[is_on_a_specific_day]

grouped_by_date = specific_day.groupby('Direction of maximum wind gust')
number_record_by_date = grouped_by_date.size()

print(number_record_by_date)

The current output looks like this:

E      4
ENE    2
ESE    1
NE     1
NNE    1
NNW    1
SE     3
SSE    3
SW     1

But I'm trying to get it to include the other directions too, i.e.

E      4
ENE    2
ESE    1
N      0
NE     1
NNE    1
NNW    1
NW     0
S      0
SE     3
SSE    3
SW     1
...

Is there any way to get my code to include it? I tried to group it by the wind direction dataframe rather than the specific_day dataframe, but going down that route, I'm stuck on what to do next. Any pointers would be great! Thanks

CodePudding user response：

Probably, you need something like this:

df['is_on_a_specific_day'] = df['Date'].str.contains("12-26")
df.groupby('Direction of maximum wind gust')[['is_on_a_specific_day']].sum()
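
As a rough illustration of why this works: grouping the full DataFrame keeps every direction as a key, and summing a boolean column counts its True rows, so directions with no rows on that day come out as 0. The sample data below is made up for the sketch and only stands in for the CSV from the question.

```python
import pandas as pd

# Made-up sample standing in for climate_data_Dec2017.csv (an assumption)
df = pd.DataFrame({
    "Date": ["2017-12-25", "2017-12-26", "2017-12-26", "2017-12-26", "2017-12-27"],
    "Direction of maximum wind gust": ["N", "E", "E", "SE", "NW"],
})

# True for rows on the day of interest, False otherwise
df["is_on_a_specific_day"] = df["Date"].str.contains("12-26")

# Group the unfiltered frame so every direction keeps its key;
# summing booleans counts the True rows in each group
counts = df.groupby("Direction of maximum wind gust")["is_on_a_specific_day"].sum()
print(counts)
```

Selecting the boolean column before calling sum() avoids trying to sum the string columns as well.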

CodePudding user response：

What you can do is:

1. Compute a list of all the unique values of the column 'Direction of maximum wind gust' in the original dataset (list_all_dirs = df['Direction of maximum wind gust'].unique()).
2. Filter the dataset and compute the groupby as you did.
3. Append to the result one row for each value in the list that is not already there. You can build a series like this: series_to_append = pd.Series({d: 0 for d in list_all_dirs if d not in number_record_by_date.index}, name='Direction of maximum wind gust') and then combine it with the series you computed in the previous step using pd.concat([number_record_by_date, series_to_append]) (Series.append was removed in pandas 2.0).

Eleonora
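
The steps in the answer above can be sketched as follows. This is a minimal example on made-up data, assuming the direction column has no missing values; the names col, missing, result and result_reindexed are introduced here for illustration. The last line shows Series.reindex with fill_value=0 as a shorter way to reach the same result.

```python
import pandas as pd

# Made-up sample standing in for climate_data_Dec2017.csv (an assumption)
df = pd.DataFrame({
    "Date": ["2017-12-25", "2017-12-26", "2017-12-26", "2017-12-26", "2017-12-27"],
    "Direction of maximum wind gust": ["N", "E", "E", "SE", "NW"],
})
col = "Direction of maximum wind gust"

# Step 1: every direction that occurs anywhere in the data
list_all_dirs = df[col].unique()

# Step 2: filter to the day of interest and count rows per direction
specific_day = df[df["Date"].str.contains("12-26")]
number_record_by_date = specific_day.groupby(col).size()

# Step 3: add a zero row for each direction missing from the counts
missing = [d for d in list_all_dirs if d not in number_record_by_date.index]
series_to_append = pd.Series(0, index=missing, dtype="int64")
result = pd.concat([number_record_by_date, series_to_append]).sort_index()
print(result)

# Shorter alternative: reindex over all directions, filling absent keys with 0
result_reindexed = number_record_by_date.reindex(sorted(list_all_dirs), fill_value=0)
```

Note that unique() only returns directions that appear somewhere in the file; a direction that never occurs in the whole dataset would have to be listed by hand.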