I have four columns of values consisting of the looking times to a right/left/upper/lower positioned image. Another column shows the position of the image which was chosen (decision_resp). I created a new column showing the looking time of the chosen image. Now I want to create 3 more columns showing the looking times of the not chosen images sorted by highest looking time (top1), second highest looking time (toop2) and third highest looking time (top 3). The looking time of the chosen image has to be excluded.
These are the columns I have:
lookRight_t lookLeft_t lookUp_t lookDown_t decision_resp chosen_img_et
0 1.291667 1.325000 3.025000 1.141667 up 3.025000
1 0.000000 0.000000 1.125000 3.150000 down 3.150000
2 0.000000 0.000000 3.508333 2.275000 up 3.508333
3 3.700000 1.950000 0.000000 0.000000 right 3.700000
4 2.633333 1.316667 1.341667 0.000000 right 2.633333
5 1.766667 1.333333 0.825000 2.208333 down 2.208333
6 0.000000 0.000000 1.108333 5.283333 down 5.283333
My approach was:
# create new column for looking time of chosen image
trials.loc[trials['decision_resp']=='right','chosen_img_et'] = trials['lookRight_t']
trials.loc[trials['decision_resp']=='left','chosen_img_et'] = trials['lookLeft_t']
trials.loc[trials['decision_resp']=='down','chosen_img_et'] = trials['lookDown_t']
trials.loc[trials['decision_resp']=='up','chosen_img_et'] = trials['lookUp_t']
# here I got stuck
trials.loc[trials['decision_resp']=='right', 3 new columns (top1/2/3)] = trials[['lookLeft_t', 'lookDown_t', 'lookUp_t']].find max values and put it in order
trials.loc[trials['decision_resp']=='left', 3 new columns (top1/2/3)] = trials[['lookRight_t', 'lookDown_t', 'lookUp_t']].find max values and put it in order
trials.loc[trials['decision_resp']=='down', 3 new columns (top1/2/3)] = trials[['lookLeft_t', 'lookRight_t', 'lookUp_t']].find max values and put it in order
trials.loc[trials['decision_resp']=='up', 3 new columns (top1/2/3)] = trials[['lookLeft_t', 'lookDown_t', 'lookRight_t']].find max values and put it in order
Thank you for any help!
CodePudding user response:
First you can use Looking up solution for new column chosen_img_et
You can sorting selected columns by numpy.sort
and then use indexing for top3 rows without selected value, so is compared columns names by column decision_resp
with broadcasting and set missing values by mask by DataFrame.mask
:
cols = ['lookRight_t','lookLeft_t','lookUp_t','lookDown_t']
#replace substrings for match column decision_resp
look = [x.replace('look','').replace('_t','').lower() for x in cols]
new = [f'top{x 1}' for x in range(3)]
#lookup
idx1, cols1 = pd.factorize(trials['decision_resp'])
trials['chosen_img_et'] = (trials[cols].set_axis(look, axis=1)
.reindex(cols1, axis=1)
.to_numpy()[np.arange(len(trials)), idx1])
mask = np.array(look) == trials['decision_resp'].to_numpy()[:, None]
#np.sort sorting by default descending,
#so for ascending order use -1 in indexing and 2 is for remove first only NaN column
trials[new] = np.sort(trials[cols].mask(mask), axis=1)[:, 2::-1]
print (trials)
lookRight_t lookLeft_t lookUp_t lookDown_t decision_resp chosen_img_et \
0 1.291667 1.325000 3.025000 1.141667 up 3.025000
1 0.000000 0.000000 1.125000 3.150000 down 3.150000
2 0.000000 0.000000 3.508333 2.275000 up 3.508333
3 3.700000 1.950000 0.000000 0.000000 right 3.700000
4 2.633333 1.316667 1.341667 0.000000 right 2.633333
5 1.766667 1.333333 0.825000 2.208333 down 2.208333
6 0.000000 0.000000 1.108333 5.283333 down 5.283333
top1 top2 top3
0 1.325000 1.291667 1.141667
1 1.125000 0.000000 0.000000
2 2.275000 0.000000 0.000000
3 1.950000 0.000000 0.000000
4 1.341667 1.316667 0.000000
5 1.766667 1.333333 0.825000
6 1.108333 0.000000 0.000000
Details:
print (trials[cols].mask(mask))
lookRight_t lookLeft_t lookUp_t lookDown_t
0 1.291667 1.325000 NaN 1.141667
1 0.000000 0.000000 1.125000 NaN
2 0.000000 0.000000 NaN 2.275000
3 NaN 1.950000 0.000000 0.000000
4 NaN 1.316667 1.341667 0.000000
5 1.766667 1.333333 0.825000 NaN
6 0.000000 0.000000 1.108333 NaN
CodePudding user response:
- Find the max column name and extract the direction from it.
- Find the max for each row.
- Find the remaining 3 values and sort them as desired.
- Concat the results together.
- Renameing columns as I go.
decision_resp = df.idxmax(axis=1).str.extract('look(\w*)_t', expand=False)
decision_resp.rename('decision_resp', inplace=True)
chosen_img_et = df.max(axis=1, numeric_only=True)
chosen_img_et.rename('chosen_img_et', inplace=True)
top3 = df.apply(lambda x: x.nlargest(4).sort_values(ascending=False, ignore_index=True)[1:], axis=1)
top3.columns = ['top1', 'top2', 'top3']
df = pd.concat([df, decision_resp, chosen_img_et, top3], axis=1)
print(df)
Output:
lookRight_t lookLeft_t lookUp_t lookDown_t decision_resp chosen_img_et \
0 1.291667 1.325000 3.025000 1.141667 Up 3.025000
1 0.000000 0.000000 1.125000 3.150000 Down 3.150000
2 0.000000 0.000000 3.508333 2.275000 Up 3.508333
3 3.700000 1.950000 0.000000 0.000000 Right 3.700000
4 2.633333 1.316667 1.341667 0.000000 Right 2.633333
5 1.766667 1.333333 0.825000 2.208333 Down 2.208333
6 0.000000 0.000000 1.108333 5.283333 Down 5.283333
top1 top2 top3
0 1.325000 1.291667 1.141667
1 1.125000 0.000000 0.000000
2 2.275000 0.000000 0.000000
3 1.950000 0.000000 0.000000
4 1.341667 1.316667 0.000000
5 1.766667 1.333333 0.825000
6 1.108333 0.000000 0.000000
Other way, addressing jezrael's concerns:
col_list = ['lookRight_t', 'lookLeft_t', 'lookUp_t', 'lookDown_t']
idx, cols = pd.factorize('look' df['decision_resp'].str.title() '_t')
df['chosen_img_et'] = df.reindex(cols, axis=1).to_numpy()[np.arange(len(df)), idx]
mask = np.array(col_list) == df['decision_resp'].to_numpy()[:, None]
df[[f'top{x 1}' for x in range(3)]] = np.sort(df[col_list].mask(mask), axis=1)[:, 2::-1]
Output:
lookRight_t lookLeft_t lookUp_t lookDown_t decision_resp chosen_img_et \
0 1.291667 1.325000 3.025000 1.141667 up 3.025000
1 0.000000 0.000000 1.125000 3.150000 down 3.150000
2 0.000000 0.000000 3.508333 2.275000 up 3.508333
3 3.700000 1.950000 0.000000 0.000000 right 3.700000
4 2.633333 1.316667 1.341667 0.000000 right 2.633333
5 1.766667 1.333333 0.825000 2.208333 down 2.208333
6 0.000000 0.000000 1.108333 5.283333 down 5.283333
top1 top2 top3
0 1.325000 1.291667 1.141667
1 1.125000 0.000000 0.000000
2 2.275000 0.000000 0.000000
3 1.950000 0.000000 0.000000
4 1.341667 1.316667 0.000000
5 1.766667 1.333333 0.825000
6 1.108333 0.000000 0.000000