I want to find the percentage of rows in which a column has a certain value, computed separately for each combination of two other columns.
Here is an example df:
import pandas as pd
data = [['North Shields','UK','Y'],['North Shields','Foreign','N']]
df = pd.DataFrame(data, columns=['Port','Type','Shellfish Licence licence (Y/N)'])
df
I have tried the following but get a KeyError, probably because I can't group by two columns in this way.
port_shel_df = landing_fish_merge['Shellfish Licence licence (Y/N)'].eq('Y').groupby(port_merge_lic_df['Port','Type']).mean().reset_index(name='Shellfish license percentage')
port_shel_df = port_shel_df.set_index('Port')
port_shel_df[:1]
Answer:
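Pass the grouping columns to groupby as a list of names. Indexing a DataFrame with ['Port','Type'] (no inner brackets) looks up a single column keyed by the tuple ('Port', 'Type'), which is what raises the KeyError. Use:
df = (landing_fish_merge
      .assign(new=landing_fish_merge['Shellfish Licence licence (Y/N)'].eq('Y'))
      .groupby(['Port','Type'])['new'].mean()
      .reset_index(name='Shellfish license percentage'))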
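For reference, here is a minimal self-contained sketch of the same approach run on the example df from the question. The two extra rows and the helper column name has_licence are made up for illustration so that the groups are not all 0 or 1. Note that mean() on a boolean column returns a fraction between 0 and 1, so the sketch multiplies by 100 on the assumption that a 0 to 100 percentage is wanted; drop that step if a fraction is fine.

```python
import pandas as pd

col = 'Shellfish Licence licence (Y/N)'

# First and third rows come from the question; the other two are
# hypothetical additions so the groups have varied results.
data = [
    ['North Shields', 'UK', 'Y'],
    ['North Shields', 'UK', 'N'],
    ['North Shields', 'Foreign', 'N'],
    ['Blyth', 'UK', 'Y'],
]
df = pd.DataFrame(data, columns=['Port', 'Type', col])

result = (
    df.assign(has_licence=df[col].eq('Y'))        # True where the value is 'Y'
      .groupby(['Port', 'Type'])['has_licence']   # list of names: one group per Port/Type pair
      .mean()                                     # share of True values per group (0 to 1)
      .mul(100)                                   # assumption: convert the share to a percentage
      .reset_index(name='Shellfish license percentage')
)
print(result)
```

If Port should be the index afterwards, as in the question, result.set_index('Port') can be applied to this output.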