I have a column whose values are either 'Y' or 'N' (yes or no). I want to calculate the percentage of occurrences of 'Y' and include this as the value of a new column called "Percentage".
This is what I have come up with so far. Although it contains the information I need, I don't know how to get it into the form I describe:
port_merge_lic_df.groupby(['Port'])['Shellfish Licence licence (Y/N)'].value_counts(normalize=True) * 100
Port Shellfish Licence licence (Y/N)
ABERDEEN Y 80.731789
N 19.268211
AYR N 94.736842
Y 5.263158
BELFAST N 81.654676
...
STORNOWAY N 23.362692
0.383857
ULLAPOOL N 56.936826
Y 43.063174
WICK N 100.000000
Name: Shellfish Licence licence (Y/N), Length: 87, dtype: float64
The dataframe is in the form:
import pandas as pd

df1 = pd.DataFrame({
    'Port': {0: 'NORTH SHIELDS', 1: 'NORTH SHIELDS', 2: 'NORTH SHIELDS', 3: 'NORTH SHIELDS', 4: 'NORTH SHIELDS'},
    'Shellfish Licence licence (Y/N)': {0: 'N', 1: 'N', 2: 'N', 3: 'N', 4: 'N'},
    'Scallop Licence (Y/N)': {0: 'N', 1: 'N', 2: 'N', 3: 'N', 4: 'N'},
    'Length Group': {0: 'Over10m', 1: 'Over10m', 2: 'Over10m', 3: 'Over10m', 4: 'Over10m'},
})
df1
CodePudding user response:
Use:
df1['Shellfish Licence licence (Y/N)'].eq('Y').groupby(df1['Port']).mean().reset_index(name='meanY')
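Note that the mean of the boolean mask is a fraction between 0 and 1, not a percentage. If you want a per-port table with a column literally named "Percentage", a minimal sketch could look like the following. The sample rows are made up for illustration (they are not the asker's data); only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data, invented for this example.
df = pd.DataFrame({
    "Port": ["ABERDEEN", "ABERDEEN", "ABERDEEN", "ABERDEEN", "WICK", "WICK"],
    "Shellfish Licence licence (Y/N)": ["Y", "Y", "Y", "N", "N", "N"],
})

col = "Shellfish Licence licence (Y/N)"

# eq("Y") yields a boolean Series; its mean per port is the share of 'Y' rows.
# Multiplying by 100 turns that share into a percentage.
percentage = (
    df[col].eq("Y")
    .groupby(df["Port"])
    .mean()
    .mul(100)
    .reset_index(name="Percentage")
)

print(percentage)
```

This gives one row per port. Ports with no 'Y' at all still appear, with a value of 0.0, which the value_counts approach does not give you directly.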
CodePudding user response:
IIUC, you can use:
df1['Shellfish Licence licence (Y/N)'].eq('Y').groupby(df1['Port']).mean()
output:
Port
NORTH SHIELDS 0.2
Name: Shellfish Licence licence (Y/N), dtype: float64
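If the goal is instead to keep every original row and attach the port's percentage to each of them, one possible approach (a sketch, assuming that is what "a new column" means here) is groupby with transform, which broadcasts the group result back to the original index. The data below is invented for the example; only the column names are taken from the question.

```python
import pandas as pd

# Hypothetical sample data, invented for this example.
frame = pd.DataFrame({
    "Port": ["ABERDEEN", "ABERDEEN", "ABERDEEN", "ABERDEEN", "WICK", "WICK"],
    "Shellfish Licence licence (Y/N)": ["Y", "Y", "Y", "N", "N", "N"],
})

# Boolean mask: True where the licence column is 'Y'.
is_yes = frame["Shellfish Licence licence (Y/N)"].eq("Y")

# transform("mean") returns a value for every row (the mean of its group),
# so the result lines up with the original rows and can be assigned directly.
frame["Percentage"] = is_yes.groupby(frame["Port"]).transform("mean") * 100

print(frame)
```

Every row of a given port then carries the same "Percentage" value, so the frame keeps its original length.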