I have the following DataFrame in pandas:
         Player  Pos  G   IP  H  R  ER  BB  K   ERA  W  L
6  Andre Duffty  RHP  1  0.2  0  0   0   1  1  0.00  0  0
4     Chad Ring  RHP  1  1.0  0  0   0   2  0  0.00  0  0
5  Jake Johnson  LHP  1  1.0  0  0   0   0  2  0.00  0  0
I need to sort it by the last name. So far I have been able to sort it by the first name with:
batting_df = batting_df.sort_values(by=['Player'])
pitching_df = pitching_df.sort_values(by=['Player'])
How do I sort it by the last name, not the first?
CodePudding user response:
You can use sort_values and pass, as the key argument, a function that extracts the last word of each name (the key parameter requires pandas 1.1.0 or later):
df.sort_values(by='Player', key=lambda x: x.str.split(r'\s+').str[-1])
Alternatively, reverse the order of the words so that the first name acts as a secondary sort key when last names are identical:
df.sort_values(by='Player', key=lambda x: x.str.split(r'\s+').str[::-1])
output:
         Player  Pos  G   IP  H  R  ER  BB  K  ERA  W  L
6  Andre Duffty  RHP  1  0.2  0  0   0   1  1  0.0  0  0
5  Jake Johnson  LHP  1  1.0  0  0   0   0  2  0.0  0  0
4     Chad Ring  RHP  1  1.0  0  0   0   2  0  0.0  0  0
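For anyone who wants to try this end to end, here is a minimal, self-contained sketch. It rebuilds a cut-down version of the question's data (only three columns, for brevity) and sorts by last name with the first name as a tie-breaker. The helper name last_name_first is hypothetical, and the sketch assumes every name has the form "First Last" and that pandas 1.1.0 or later is installed. Instead of sorting on lists of words, it builds a "Last First" string so the key is an ordinary string column.

```python
import pandas as pd

# Sample data taken from the question; only a few columns are kept for brevity.
pitching_df = pd.DataFrame(
    {
        "Player": ["Andre Duffty", "Chad Ring", "Jake Johnson"],
        "Pos": ["RHP", "RHP", "LHP"],
        "IP": [0.2, 1.0, 1.0],
    },
    index=[6, 4, 5],
)


def last_name_first(players: pd.Series) -> pd.Series:
    """Hypothetical helper: map "First Last" to "Last First".

    Plain string ordering on the result sorts by last name,
    then by the remaining words (the first name).
    """
    parts = players.str.strip().str.split()  # split on any run of whitespace
    return parts.str[-1] + " " + parts.str[:-1].str.join(" ")


# The key function receives the whole "Player" column and must return
# a Series of the same length; sort_values returns a new DataFrame.
sorted_df = pitching_df.sort_values(by="Player", key=last_name_first)
print(sorted_df)
```

Names with suffixes or multi-word surnames (for example "Jr." or "de la Cruz") would need their own rule, since this sketch simply treats the final word as the last name.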