Create an array from a data frame, based off of conditions

Time:04-10

I have a sample DataFrame like this:

import pandas as pd

df = pd.DataFrame({
    'degree_awarded': ['yes', 'no', 'yes', 'yes', 'yes', 'yes', 'yes', 'no'],
    'avg_score': [78, 87, 94, 55, 68, 76, 78, 8],
})

degree_awarded  avg_score
yes             78
no              87
yes             94
yes             55
etc.            etc.

I'd like to split the 'avg_score' values into two arrays, 'degree_awarded' and 'no_degree_awarded', according to the value in the 'degree_awarded' column. For example:

degree_awarded: [78, 94, 55, etc.]
no_degree_awarded: [87, etc.]

but I'm not sure how to do this.

Any help would be appreciated, thanks for your time.

CodePudding user response:

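# Scores for rows where a degree was awarded, as a plain Python list
listScoreAwarded = df.loc[df['degree_awarded'] == 'yes', 'avg_score'].tolist()

# Scores for rows where no degree was awarded
listScoreNotAwarded = df.loc[df['degree_awarded'] == 'no', 'avg_score'].tolist()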

Both of these lists should give you what you need.
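Since the question asks for arrays rather than lists, here is a minimal sketch of the same boolean-mask idea that returns NumPy arrays instead. The names `mask`, `degree_awarded`, and `no_degree_awarded` are illustrative, and using `~mask` for the second array assumes the column only ever contains 'yes' or 'no'.

```python
import pandas as pd

df = pd.DataFrame({
    'degree_awarded': ['yes', 'no', 'yes', 'yes', 'yes', 'yes', 'yes', 'no'],
    'avg_score': [78, 87, 94, 55, 68, 76, 78, 8],
})

# Boolean mask: True for rows where a degree was awarded
mask = df['degree_awarded'] == 'yes'

# Select the scores for each group and convert them to NumPy arrays.
# Assumption: every row is either 'yes' or 'no', so ~mask means 'no'.
degree_awarded = df.loc[mask, 'avg_score'].to_numpy()
no_degree_awarded = df.loc[~mask, 'avg_score'].to_numpy()
```

If the column could hold other values (or missing data), comparing explicitly against 'no' would be safer than negating the mask.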

CodePudding user response:

You can assign the labels you want, then use groupby.agg(list).

As a Series:

(df
 .assign(group=df['degree_awarded'].map({'yes': 'degree_awarded',
                                         'no': 'no_degree_awarded'}))
 .groupby('group')['avg_score'].agg(list)
)

output:

group
degree_awarded       [78, 94, 55, 68, 76, 78]
no_degree_awarded                     [87, 8]
Name: avg_score, dtype: object

As a dictionary:

(df
 .assign(group=df['degree_awarded'].map({'yes': 'degree_awarded',
                                         'no': 'no_degree_awarded'}))
 .groupby('group')['avg_score'].agg(list)
 .to_dict()
)

output: {'degree_awarded': [78, 94, 55, 68, 76, 78], 'no_degree_awarded': [87, 8]}
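If NumPy arrays are wanted instead of lists, one possible variation is to iterate over the groups directly and convert each one. This is only a sketch: the names `labels` and `arrays` are hypothetical, and the mapping assumes the column contains only 'yes' and 'no'.

```python
import pandas as pd

df = pd.DataFrame({
    'degree_awarded': ['yes', 'no', 'yes', 'yes', 'yes', 'yes', 'yes', 'no'],
    'avg_score': [78, 87, 94, 55, 68, 76, 78, 8],
})

# Map the raw column values to the desired output names
labels = {'yes': 'degree_awarded', 'no': 'no_degree_awarded'}

# Iterating over a groupby yields (group key, sub-Series) pairs;
# each sub-Series of scores is converted to a NumPy array.
arrays = {
    labels[key]: scores.to_numpy()
    for key, scores in df.groupby('degree_awarded')['avg_score']
}
```

The individual arrays can then be looked up by name, for example `arrays['degree_awarded']`.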
