I have a sample dataframe like this:
import pandas as pd

df = pd.DataFrame({
    'degree_awarded': ['yes', 'no', 'yes', 'yes', 'yes', 'yes', 'yes', 'no'],
    'avg_score': [78, 87, 94, 55, 68, 76, 78, 8],
})
degree_awarded | avg_score |
---|---|
yes | 78 |
no | 87 |
yes | 94 |
yes | 55 |
etc. | etc. |
I'd like to split the 'avg_score' values into two arrays, 'degree_awarded' and 'no_degree_awarded', according to the value in the 'degree_awarded' column, for example:
degree_awarded: [78, 94, 55, etc.]
no_degree_awarded: [87, etc.]
but I'm not sure how to do this.
Any help would be appreciated, thanks for your time.
CodePudding user response:
listScoreAwarded=list(df[df['degree_awarded']=='yes']['avg_score'])
listScoreNotAwarded=list(df[df['degree_awarded']=='no']['avg_score'])
These two lists should give you the scores for each group.
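Since the question asks for arrays, here is a minimal sketch of the same boolean-mask idea that returns NumPy arrays instead of lists. The names awarded, scores_awarded and scores_not_awarded are made up for illustration, and it assumes that every value other than 'yes' counts as "no degree awarded".

```python
import pandas as pd

df = pd.DataFrame({
    'degree_awarded': ['yes', 'no', 'yes', 'yes', 'yes', 'yes', 'yes', 'no'],
    'avg_score': [78, 87, 94, 55, 68, 76, 78, 8],
})

# Build the boolean mask once and reuse it for both groups.
awarded = df['degree_awarded'].eq('yes')

# .loc[mask, column] selects the rows and the column in a single step.
# Assumption: anything that is not 'yes' lands in the second array.
scores_awarded = df.loc[awarded, 'avg_score'].to_numpy()
scores_not_awarded = df.loc[~awarded, 'avg_score'].to_numpy()

print(scores_awarded.tolist())      # [78, 94, 55, 68, 76, 78]
print(scores_not_awarded.tolist())  # [87, 8]
```

Using .loc with a mask avoids the chained indexing of df[...][...], which matters mostly if you later want to assign to the selection.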
CodePudding user response:
You can assign the labels you want, then use groupby.agg(list).
As Series:
(df
 .assign(group=df['degree_awarded'].map({'yes': 'degree_awarded',
                                         'no': 'no_degree_awarded'}))
 .groupby('group')['avg_score'].agg(list)
)
output:
group
degree_awarded       [78, 94, 55, 68, 76, 78]
no_degree_awarded                     [87, 8]
Name: avg_score, dtype: object
As dictionary:
(df
 .assign(group=df['degree_awarded'].map({'yes': 'degree_awarded',
                                         'no': 'no_degree_awarded'}))
 .groupby('group')['avg_score'].agg(list)
 .to_dict()
)
output: {'degree_awarded': [78, 94, 55, 68, 76, 78], 'no_degree_awarded': [87, 8]}
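As a variation on the same idea, the mapped labels can be passed straight to groupby without adding a helper column. This is only a sketch: the names labels and result are illustrative, and it assumes the column contains only 'yes' and 'no' (any other value maps to NaN, which groupby drops by default).

```python
import pandas as pd

df = pd.DataFrame({
    'degree_awarded': ['yes', 'no', 'yes', 'yes', 'yes', 'yes', 'yes', 'no'],
    'avg_score': [78, 87, 94, 55, 68, 76, 78, 8],
})

# Map the raw yes/no values to the desired group names.
# Assumption: values missing from this dict become NaN and are left out.
labels = df['degree_awarded'].map({'yes': 'degree_awarded',
                                   'no': 'no_degree_awarded'})

# groupby accepts a Series aligned on the index, so no extra column is needed.
# Iterating over the grouped Series yields (group name, sub-Series) pairs.
result = {name: scores.tolist()
          for name, scores in df['avg_score'].groupby(labels)}

print(result)
# {'degree_awarded': [78, 94, 55, 68, 76, 78], 'no_degree_awarded': [87, 8]}
```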