Pandas groupby method to aggregate based on string contained in column

Time:03-14

New to Pandas/Python (student). I have what should be a simple problem but every approach I try fails.

The dataset has a "country" column and an "indicator" column. Each country appears more than once. The indicator column tells us who is pro-vaccine ("Vac_plan" and "Vac_done") and who is not (as well as other info). I simply want a total for each country, based on the count of pro-vaccine rows for that country, e.g.,

Ethiopia  7
Nigeria   5

My latest failed attempts are below:

vaccines_by_country=df.groupby('country')['indicator'=='Vac_plan|Vac_done'].count()

and...

df.groupby(['country']).str.contains('Vac_plan|Vac_done').count() 

TIA for your merciful help.

CodePudding user response:

You're quite close in your second attempt; you just need to reverse the order of actions. First find the strings, then group. Since str.contains returns True/False for each row, summing per country counts the matching rows:

df['indicator'].str.contains('Vac_plan|Vac_done').groupby(df['country']).sum()
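
To see the one-liner above in action, here is a minimal sketch on a small made-up frame. The sample rows (including the "Vac_refuse" and "Other" indicator values) are assumptions for illustration only; the real dataset isn't shown in the question.

```python
import pandas as pd

# Hypothetical sample data, not taken from the asker's dataset.
df = pd.DataFrame({
    "country": ["Ethiopia", "Ethiopia", "Ethiopia", "Nigeria", "Nigeria"],
    "indicator": ["Vac_plan", "Vac_done", "Vac_refuse", "Vac_done", "Other"],
})

# Boolean Series: True where the indicator contains either substring.
is_pro_vaccine = df["indicator"].str.contains("Vac_plan|Vac_done")

# Group the booleans by country; summing counts the True values per group.
vaccines_by_country = is_pro_vaccine.groupby(df["country"]).sum()
print(vaccines_by_country)
```

With this sample, Ethiopia comes out as 2 and Nigeria as 1. If the indicator column has missing values, passing na=False to str.contains should treat them as non-matches rather than NaN.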