The possible entries for age_group
are
and for sex
are Male or Female.
The data will be resampled monthly. How can I ensure that, for each month and each district, every age_group is present? Sometimes "Unidentified" is missing; in that case I want to add "Unidentified" rows for both Female and Male and set total_number_individuals_vaccinated to zero.
CodePudding user response:
Have you tried
df["age_group"] = df["age_group"].fillna("Unidentified")
and
df["total_number_individuals_vaccinated"] = df["total_number_individuals_vaccinated"].fillna(0)
CodePudding user response:
Each time you receive new data, you can use:
df_new.loc[:, 'age_group'].nunique()
or
df_new.loc[:, 'age_group'].value_counts()
to verify the categories. You can check the official pandas API documentation for both methods, .nunique and .value_counts.
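As a rough sketch of that check, the categories present in df_new can be compared against the expected set. The age-group labels and the sample rows below are hypothetical placeholders, since the real list is not shown in the question.

```python
import pandas as pd

# Hypothetical labels: the real age_group list is not given in the question.
EXPECTED_AGE_GROUPS = {"0-17", "18-64", "65+", "Unidentified"}

# Made-up sample standing in for a newly received batch.
df_new = pd.DataFrame({
    "age_group": ["0-17", "18-64", "18-64", "65+"],
    "sex": ["Male", "Female", "Male", "Female"],
})

# Categories that actually occur (NaN excluded) versus the expected ones.
present = set(df_new["age_group"].dropna().unique())
missing = EXPECTED_AGE_GROUPS - present
print(sorted(missing))
```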
When you receive a new set of data (let's call it df_new), you can fill its NaN values after reading it using:
df_new.loc[:, 'age_group'] = df_new.loc[:, 'age_group'].fillna("Unidentified")
You can also do the same for the total_number_individuals_vaccinated column, filling with 0 instead:
df_new.loc[:, 'total_number_individuals_vaccinated'] = df_new.loc[:, 'total_number_individuals_vaccinated'].fillna(0)
Then you can concatenate the new data with the past data, stacking the rows:
df = pd.concat([df_past, df_new], axis=0, ignore_index=True)
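Note that fillna only touches rows that already exist; it does not create rows for a combination that is absent altogether. One possible way to get every age_group and sex for each month and district is to aggregate monthly and then reindex against the full product of keys. This is only a sketch under assumptions: the date column name, the age-group labels, and the sample rows are hypothetical, and the product covers every month paired with every district seen in the data.

```python
import pandas as pd

# Hypothetical labels and column names; adjust to the real data.
AGE_GROUPS = ["0-17", "18+", "Unidentified"]
SEXES = ["Female", "Male"]
VALUE = "total_number_individuals_vaccinated"

df = pd.DataFrame({
    "date": pd.to_datetime(["2021-01-05", "2021-01-20", "2021-02-10"]),
    "district": ["A", "A", "A"],
    "age_group": ["0-17", "0-17", "18+"],
    "sex": ["Male", "Male", "Female"],
    VALUE: [3, 4, 5],
})

# Monthly totals per district, age group and sex (observed combinations only).
keys = ["month", "district", "age_group", "sex"]
monthly = (
    df.assign(month=df["date"].dt.to_period("M"))
      .groupby(keys)[VALUE]
      .sum()
)

# Every month x district x age group x sex combination.
full_index = pd.MultiIndex.from_product(
    [
        monthly.index.get_level_values("month").unique(),
        monthly.index.get_level_values("district").unique(),
        AGE_GROUPS,
        SEXES,
    ],
    names=keys,
)

# Combinations missing from the data are added with a count of zero.
complete = monthly.reindex(full_index, fill_value=0).reset_index()
print(complete)
```

Reindexing after the aggregation keeps the existing totals unchanged and only adds the absent combinations.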