I'm trying to get an output that includes every key, even when its corresponding value is 0.
import pandas as pd
df = pd.read_csv('climate_data_Dec2017.csv')
wind_direction = df['Direction of maximum wind gust']
is_on_a_specific_day = df['Date'].str.contains("12-26")
specific_day = df[is_on_a_specific_day]
grouped_by_date = specific_day.groupby('Direction of maximum wind gust')
number_record_by_date = grouped_by_date.size()
print(number_record_by_date)
The current output looks like this:
E 4
ENE 2
ESE 1
NE 1
NNE 1
NNW 1
SE 3
SSE 3
SW 1
But I'm trying to get it to include the other directions too, i.e.:
E 4
ENE 2
ESE 1
N 0
NE 1
NNE 1
NNW 1
NW 0
S 0
SE 3
SSE 3
SW 1
...
Is there any way to get my code to include them? I tried grouping by the wind_direction series rather than the specific_day dataframe, but going down that route, I'm stuck on what to do next. Any pointers would be great! Thanks
CodePudding user response:
Probably, you need something like this:
df['is_on_a_specific_day'] = df['Date'].str.contains("12-26")
df.groupby('Direction of maximum wind gust')[['is_on_a_specific_day']].sum()
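To illustrate why this keeps the zero rows, here is a minimal sketch on made-up data (the small dataframe below is hypothetical and only stands in for climate_data_Dec2017.csv). Because the grouping runs over the whole dataframe and the boolean flag is summed, directions with no match on that day show up with 0:

```python
import pandas as pd

# Hypothetical sample data standing in for the real CSV file
df = pd.DataFrame({
    "Date": ["2017-12-25", "2017-12-26", "2017-12-26", "2017-12-27", "2017-12-26"],
    "Direction of maximum wind gust": ["N", "E", "E", "S", "SE"],
})

# True for rows on the day of interest, False otherwise
df["is_on_a_specific_day"] = df["Date"].str.contains("12-26")

# Summing booleans counts the True values per direction;
# directions that never match the day still appear, with a sum of 0
counts = df.groupby("Direction of maximum wind gust")["is_on_a_specific_day"].sum()
print(counts)
```

Note that rows with a missing direction are dropped by groupby by default, so they would not appear in the result.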
CodePudding user response:
What you can do is:
- Compute a list of all the unique values in the column 'Direction of maximum wind gust' in the original dataset:
list_all_dirs = df['Direction of maximum wind gust'].unique()
- Filter the dataset and compute the groupby as you already do.
- Append to the result one row for each value in the list that is not already there. You can build a series like this (the loop variable is named d to avoid shadowing the built-in dir):
series_to_append = pd.Series({d: 0 for d in list_all_dirs if d not in number_record_by_date.index}, name='Direction of maximum wind gust')
and then append it to the series you computed in the previous step, e.g. with pd.concat([number_record_by_date, series_to_append]), since Series.append was removed in pandas 2.0.
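Put together, the steps above could look roughly like the sketch below. The sample dataframe is made up for illustration, and the dropna() call is an added assumption so that missing directions do not end up in the list. The last line shows reindex with fill_value=0, an alternative that is not part of the steps above but gives the same result in one call:

```python
import pandas as pd

# Hypothetical sample data standing in for the real CSV file
df = pd.DataFrame({
    "Date": ["2017-12-25", "2017-12-26", "2017-12-26", "2017-12-27", "2017-12-26"],
    "Direction of maximum wind gust": ["N", "E", "E", "S", "SE"],
})

# Step 1: every direction that occurs anywhere in the dataset
list_all_dirs = df["Direction of maximum wind gust"].dropna().unique()

# Step 2: filter to the day of interest and count per direction
specific_day = df[df["Date"].str.contains("12-26")]
number_record_by_date = specific_day.groupby("Direction of maximum wind gust").size()

# Step 3: add a 0 entry for each direction missing from the counts
missing = [d for d in list_all_dirs if d not in number_record_by_date.index]
series_to_append = pd.Series(0, index=missing, dtype="int64")
full_counts = pd.concat([number_record_by_date, series_to_append]).sort_index()
print(full_counts)

# Alternative: reindex against the full list, filling the gaps with 0
reindexed = number_record_by_date.reindex(sorted(list_all_dirs), fill_value=0)
```

A direction that never occurs in the file at all will still be absent; to cover that case, pass a fixed list of all compass directions to reindex instead of list_all_dirs.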
Eleonora