I have a DataFrame in which each ID can appear with several ages:

| ID  | AGE | SEX |
|-----|-----|-----|
| 25  | 11  | 1   |
| 25  | 12  | 1   |
| 18  | 11  | 1   |
| 18  | 12  | 2   |
| 18  | 13  | 2   |
| 199 | 11  | 1   |
| 409 | 11  | 1   |

I would like to plot the number of profiles (unique IDs) at each age, using hue to separate them by gender. For example, there are 4 IDs (profiles) at age 11 and 2 profiles at age 12.
I tried the following approach. The problem is that unique() returns an array of ages for every ID that has more than one age:

    for an_id in df.ID.unique():
        if len(df[df['ID'] == an_id]['AGE'].unique()) == 1:
            print(an_id, df[df['ID'] == an_id]['AGE'].unique()[0])
        else:
            print(an_id, df[df['ID'] == an_id]['AGE'].unique())

How can I plot the number of unique IDs for each age?
CodePudding user response:
You can use the .nunique() method:

    grouped = df.groupby('AGE')['ID'].nunique().reset_index(name='counts')
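
To also separate the counts by gender, one option is to group by both AGE and SEX before calling .nunique(), which gives a long table that a hue-based bar plot can use directly. This is only a sketch: the DataFrame rebuilds the sample from the question, and plot_counts is a hypothetical helper name that assumes seaborn and matplotlib are installed.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "ID":  [25, 25, 18, 18, 18, 199, 409],
    "AGE": [11, 12, 11, 12, 13, 11, 11],
    "SEX": [1, 1, 1, 2, 2, 1, 1],
})

# One row per (AGE, SEX) pair with the number of distinct IDs in it.
counts = (
    df.groupby(["AGE", "SEX"])["ID"]
      .nunique()
      .reset_index(name="counts")
)

# Total number of distinct IDs per age, ignoring SEX.
per_age = df.groupby("AGE")["ID"].nunique()


def plot_counts(counts):
    """Hypothetical helper: grouped bar chart with one bar colour per SEX."""
    # Imported here so the counting code above runs without plotting libraries.
    import matplotlib.pyplot as plt
    import seaborn as sns

    ax = sns.barplot(data=counts, x="AGE", y="counts", hue="SEX")
    ax.set_ylabel("number of unique IDs")
    plt.show()
```

Calling plot_counts(counts) should draw the chart. Note that an ID whose SEX differs between rows (like 18 in the sample) is counted once under each SEX value it appears with.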
CodePudding user response:
Does this work?
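
    import pandas as pd

    df = pd.DataFrame(data={
        "AGE": [11, 12, 13, 13, 13, 13, 12, 11, 11, 11, 11],
        "ID": [1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 1],
        "SEX": ["MALE", "FEMALE", "FEMALE", "FEMALE", "FEMALE", "MALE", "MALE", "MALE", "MALE", "MALE", "MALE"],
    })
    # Count the unique IDs for each AGE/SEX combination (0 where a combination is absent)
    pivot = df.pivot_table(index="AGE", columns="SEX", values="ID", aggfunc="nunique", fill_value=0).reset_index()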
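
For comparison, here is a rough sketch of the wide-table route applied to the sample data from the question, with one column per SEX value so that pandas' own bar plot can draw it. Passing values="ID" and aggfunc="nunique" is my assumption about what is wanted (distinct IDs per cell), and plot_wide is a hypothetical helper name that assumes matplotlib is installed.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "ID":  [25, 25, 18, 18, 18, 199, 409],
    "AGE": [11, 12, 11, 12, 13, 11, 11],
    "SEX": [1, 1, 1, 2, 2, 1, 1],
})

# Rows are ages, columns are SEX values, cells are counts of distinct IDs.
# fill_value=0 replaces the missing AGE/SEX combinations with zero.
wide = df.pivot_table(
    index="AGE",
    columns="SEX",
    values="ID",
    aggfunc="nunique",
    fill_value=0,
)


def plot_wide(wide):
    """Hypothetical helper: one group of bars per age, one bar per SEX column."""
    # Imported here so the pivot above runs without matplotlib.
    import matplotlib.pyplot as plt

    ax = wide.plot(kind="bar")
    ax.set_ylabel("number of unique IDs")
    plt.show()
```

Calling plot_wide(wide) should give a grouped bar chart with the SEX values in the legend, which plays the same role as hue in seaborn.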