I have a pandas data frame from this CSV file.
I have several of these files that I'm combining into one large file. I need to iterate over the player column, and if one cell partially matches the corresponding CPTN cell, then I need to update the Pos value for the player cell containing CPTN to be CPTN '(POS)'.
The final results would look something like this:
Bryce Mitchell WR
Bryant Koback RB
Bryant Koback CPTN CPTN (RB)
Bryce Mitchell CPTN CPTN (WR)
Here is a spreadsheet to test: Test Sheet
CodePudding user response:
an idea will be to update over a mask:
cptn_mask = df['Player'].str.contains('CPTN')
df.loc[cptn_mask , 'Player'] = "CPTN" df.loc[cptn_mask , 'Position']
CodePudding user response:
I slightly adjusted the data so we can see one who didn't play CPTN
df
###
Player Pos Salary
0 Bryce Mitchell WR 6400
1 Bryant Koback RB 10200
2 Bryant Koback CPTN CPTN 15300
3 Bryce Mitchell CPTN CPTN 9600
4 Jordan Legendre QB 23450
temp = df.copy()
temp['Player'] = temp['Player'].str.replace(' CPTN', '')
temp_g = temp.groupby('Player')['Pos'].apply(lambda x: x.str.cat(sep=' ')).reset_index()
temp_g['Player'] = np.where(temp_g['Pos'].str.contains('CPTN'), temp_g['Player'] ' CPTN', temp_g['Player'])
temp_g['Pos'] = np.where(temp_g['Pos'].str.contains('CPTN'), 'CPTN (' temp_g['Pos'].str.replace('CPTN', '').str.strip() ')', temp_g['Pos'])
temp_g = temp_g[temp_g['Player'].str.contains('CPTN')]
df['Pos'] = np.where(df['Player'].str.contains('CPTN'), df['Player'].map(temp_g.set_index('Player')['Pos']), df['Pos'])
df
###
Player Pos Salary
0 Bryce Mitchell WR 6400
1 Bryant Koback RB 10200
2 Bryant Koback CPTN CPTN (RB) 15300
3 Bryce Mitchell CPTN CPTN (WR) 9600
4 Jordan Legendre QB 23450