Sort a pandas dataframe by the second string in a column

Time:09-27

I have the following dataframe in pandas:

         Player  Pos  G   IP  H  R  ER  BB  K   ERA  W  L
6  Andre Duffty  RHP  1  0.2  0  0   0   1  1  0.00  0  0
4     Chad Ring  RHP  1  1.0  0  0   0   2  0  0.00  0  0
5  Jake Johnson  LHP  1  1.0  0  0   0   0  2  0.00  0  0

I need to sort it by the last name. So far I have been able to sort it by the first name with:

batting_df = batting_df.sort_values(by=['Player'])
pitching_df = pitching_df.sort_values(by=['Player'])

How do I sort it by the last name, not the first?

CodePudding user response:

You can use sort_values with a key function that extracts the last word of each name:

df.sort_values(by='Player', key=lambda x: x.str.split(r'\s+').str[-1])

Alternatively, you can reverse the order of the words, so that players with identical last names are then sorted by first name:

df.sort_values(by='Player', key=lambda x: x.str.split(r'\s+').str[::-1])

Output:

         Player  Pos  G   IP  H  R  ER  BB  K  ERA  W  L
6  Andre Duffty  RHP  1  0.2  0  0   0   1  1  0.0  0  0
5  Jake Johnson  LHP  1  1.0  0  0   0   0  2  0.0  0  0
4     Chad Ring  RHP  1  1.0  0  0   0   2  0  0.0  0  0
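
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the reversed-words approach. It assumes pandas 1.1.0 or later (the version that added the key argument to sort_values). The helper name last_name_first is made up for this example, and only three of the question's columns are rebuilt. Unlike the one-liner above, it joins the reversed words back into a string, so the sort key is a plain "Last First" string rather than a list:

```python
import pandas as pd

# Sample rows copied from the question (only a few columns, for brevity)
pitching_df = pd.DataFrame(
    {
        "Player": ["Andre Duffty", "Chad Ring", "Jake Johnson"],
        "Pos": ["RHP", "RHP", "LHP"],
        "IP": [0.2, 1.0, 1.0],
    },
    index=[6, 4, 5],
)


def last_name_first(names: pd.Series) -> pd.Series:
    """Hypothetical helper: turn 'First Last' into 'Last First'."""
    words = names.str.split()              # split on any run of whitespace
    return words.str[::-1].str.join(" ")   # reverse the words, rejoin as text


# The key function receives the whole 'Player' column as a Series
sorted_df = pitching_df.sort_values(by="Player", key=last_name_first)
print(sorted_df)
```

With the three sample players this should give the order Duffty, Johnson, Ring. Names with more than two words (for example a middle name or a suffix) are reversed word by word, which may or may not be the ordering you want.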