I'm sorry, I'm very new to Python. I have a dataset, "olympics games", with the following columns and null counts:
olympics.isnull().sum()
ID 0
Name 0
Sex 0
Age 9315
Height 58814
Weight 61527
Team 0
NOC 0
Games 0
Year 0
Season 0
City 0
Sport 0
Event 0
Medal 229959
dtype: int64
I have created a dataframe that shows the number of athletes grouped by 'Sex' for the USA team:
sex_counts_usa = pd.DataFrame(team_usa.groupby('Sex').count()['ID']).sort_values(by = 'Sex', ascending = False)
How can I add a new column to this dataframe that shows the same results as percentages?
Many thanks in advance.
CodePudding user response:
Try this
# count athletes by sex
sex_counts_usa = team_usa['Sex'].value_counts().to_frame('Count')
# percentage of athletes by sex
sex_counts_usa['Percentage'] = (sex_counts_usa['Count'] / sex_counts_usa['Count'].sum() * 100).astype('string') + '%'
If the aim is only to count by some column such as Sex, it's better to use .value_counts() rather than .groupby('Sex').count()['ID'], in my opinion.
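For reference, here is a minimal, self-contained sketch of the same idea. The small `team_usa` frame below is made-up stand-in data (the real dataset is not shown here), and the `Percentage (text)` column name is just an illustrative choice. It keeps the percentage as a number and adds the '%' suffix in a separate display column, which may be easier to sort or plot later.

```python
import pandas as pd

# Hypothetical stand-in for team_usa; only the 'Sex' column matters here.
team_usa = pd.DataFrame({"ID": [1, 2, 3, 4], "Sex": ["M", "M", "M", "F"]})

# Absolute counts per category, as a one-column DataFrame.
sex_counts_usa = team_usa["Sex"].value_counts().to_frame("Count")

# Share of the total as a numeric percentage (kept as float).
total = sex_counts_usa["Count"].sum()
sex_counts_usa["Percentage"] = sex_counts_usa["Count"] / total * 100

# Optional text column with a '%' suffix, for display only.
sex_counts_usa["Percentage (text)"] = (
    sex_counts_usa["Percentage"].round(1).astype(str) + "%"
)

# Cross-check: value_counts(normalize=True) returns the fractions directly.
shares = team_usa["Sex"].value_counts(normalize=True) * 100

print(sex_counts_usa)
```

If only the percentages are needed, `value_counts(normalize=True)` on its own is probably the shortest route.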