How to create a new column with the percentage of the occurrence of a particular value in another col

I have a column whose values are either 'Y' or 'N', for yes or no. I want to calculate the percentage of occurrences of 'Y' and include it as the value of a new column called "Percentage".

This is what I have come up with so far. It gives the numbers I need, but I don't know how to get the information into the form I describe:

port_merge_lic_df.groupby(['Port'])['Shellfish Licence licence (Y/N)'].value_counts(normalize=True) * 100

Port       Shellfish Licence licence (Y/N)
ABERDEEN   Y                                   80.731789
           N                                   19.268211
AYR        N                                   94.736842
           Y                                    5.263158
BELFAST    N                                   81.654676
                                         ...    
STORNOWAY  N                                   23.362692
                                        0.383857
ULLAPOOL   N                                   56.936826
           Y                                   43.063174
WICK       N                                  100.000000
Name: Shellfish Licence licence (Y/N), Length: 87, dtype: float64

The dataframe is in the form:

import pandas as pd

df1 = pd.DataFrame({
    'Port': {0: 'NORTH SHIELDS', 1: 'NORTH SHIELDS', 2: 'NORTH SHIELDS', 3: 'NORTH SHIELDS', 4: 'NORTH SHIELDS'},
    'Shellfish Licence licence (Y/N)': {0: 'N', 1: 'N', 2: 'N', 3: 'N', 4: 'N'},
    'Scallop Licence (Y/N)': {0: 'N', 1: 'N', 2: 'N', 3: 'N', 4: 'N'},
    'Length Group': {0: 'Over10m', 1: 'Over10m', 2: 'Over10m', 3: 'Over10m', 4: 'Over10m'},
})

df1

CodePudding user response：

Use:

df1['Shellfish Licence licence (Y/N)'].eq('Y').groupby(df1['Port']).mean().reset_index(name='meanY')
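
Note that the mean of a boolean column is a fraction between 0 and 1, not a percentage. A minimal sketch of how this could be turned into a per-port table with a "Percentage" column, assuming a small made-up DataFrame (the `df` data and the `percentage_by_port` name below are illustrative, not from the question):

```python
import pandas as pd

# Hypothetical sample data (not the asker's data), small enough to check by hand.
df = pd.DataFrame({
    'Port': ['ABERDEEN', 'ABERDEEN', 'ABERDEEN', 'ABERDEEN', 'AYR', 'AYR'],
    'Shellfish Licence licence (Y/N)': ['Y', 'Y', 'Y', 'N', 'N', 'N'],
})

col = 'Shellfish Licence licence (Y/N)'

percentage_by_port = (
    df[col].eq('Y')            # True where the licence flag is 'Y'
    .groupby(df['Port'])       # group the boolean flags by port
    .mean()                    # fraction of True values per port
    .mul(100)                  # scale the fraction to a percentage
    .reset_index(name='Percentage')
)

print(percentage_by_port)
```

With this sample data, ABERDEEN has three 'Y' rows out of four and AYR has none, so the table should hold one row per port with those shares expressed as percentages.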

CodePudding user response：

IIUC, you can use:

df1['Shellfish Licence licence (Y/N)'].eq('Y').groupby(df1['Port']).mean()

Output (fraction of 'Y' rows per port):

Port
NORTH SHIELDS    0.2
Name: Shellfish Licence licence (Y/N), dtype: float64
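
If the goal is a "Percentage" column on the original rows rather than a separate per-port summary, one possible approach is `groupby(...).transform('mean')`, which broadcasts each group's mean back to every row of that group. This is only a sketch under assumed data: the `df` contents and the `is_yes` name are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data (not the asker's data).
df = pd.DataFrame({
    'Port': ['ABERDEEN', 'ABERDEEN', 'ABERDEEN', 'ABERDEEN', 'AYR', 'AYR'],
    'Shellfish Licence licence (Y/N)': ['Y', 'Y', 'Y', 'N', 'N', 'N'],
})

# Boolean Series: True where the licence flag is 'Y'.
is_yes = df['Shellfish Licence licence (Y/N)'].eq('Y')

# transform('mean') returns one value per original row (the mean of that
# row's port group), so it can be assigned directly as a new column.
df['Percentage'] = is_yes.groupby(df['Port']).transform('mean') * 100

print(df)
```

Every row then carries the percentage for its own port, so the value repeats within a port. Whether that or the one-row-per-port summary is preferable depends on how the column will be used afterwards.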