Reverse order of substring in pandas column

Time:07-19

I have a pandas dataframe that records publications and their authors.

The dataframe is like this:

Title Author
A     A Ala, D Pamucar, EB Tirkolaee
B     A Heydari, S Niroomand
C     F Marisa, SS Syed Ahmad, N Kausar, S Kousar
...

I would like to reverse the order of the authors' last names and the first names, so the last name will be listed first:

Title Author
A     Ala A, Pamucar D, Tirkolaee EB
B     Heydari A, Niroomand S
C     Marisa F, Syed Ahmad SS, Kausar N, Kousar S
...

I'm thinking of using str.split to split the authors and then join and reversed, but that also changes the order of the authors themselves. Is there a better way to do this?

CodePudding user response:

You can use a regex. This assumes the first name (initials) has at most two letters, but you can adapt it if needed (for example, use \w+ in place of \w{,2}):

df['Author'] = df['Author'].str.replace(r'\b(\w{,2})\b\s+\b([^,]+)\b',
                                        r'\2 \1', regex=True)

Output (as a new column "Author2" for clarity):

  Title                                       Author                                      Author2
0     A               A Ala, D Pamucar, EB Tirkolaee               Ala A, Pamucar D, Tirkolaee EB
1     B                       A Heydari, S Niroomand                       Heydari A, Niroomand S
2     C  F Marisa, SS Syed Ahmad, N Kausar, S Kousar  Marisa F, Syed Ahmad SS, Kausar N, Kousar S

regex:

\b(\w{,2})\b   # match first name (up to 2 letters)
\s+            # one or more whitespace characters
\b([^,]+)\b    # one or more non "," characters
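To check the pattern without a dataframe, it can be run through Python's standard re module on the sample rows from the question. This is only a sketch: the names PATTERN, authors and swapped are made up for the illustration, and str.replace with regex=True is expected to apply the same substitution per row.

```python
import re

# Pattern from this answer: group 1 = initials (up to 2 word characters),
# group 2 = everything up to the next comma.
PATTERN = re.compile(r'\b(\w{,2})\b\s+\b([^,]+)\b')

# Sample rows copied from the question
authors = [
    'A Ala, D Pamucar, EB Tirkolaee',
    'A Heydari, S Niroomand',
    'F Marisa, SS Syed Ahmad, N Kausar, S Kousar',
]

# Swap the two groups: last name first, initials second
swapped = [PATTERN.sub(r'\2 \1', s) for s in authors]
for s in swapped:
    print(s)
```

Because group 2 takes everything up to the comma, a multi-word last name such as "Syed Ahmad" stays in its original order.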

CodePudding user response:

    
df.Author.apply(lambda x: ', '.join([' '.join(i.split()[::-1]) for i in x.split(',')]) )

Output:

0                Ala A, Pamucar D, Tirkolaee EB
1                       Heydari A, Niroomand S
2    Marisa F, Ahmad Syed SS, Kausar N, Kousar S
Name: Author, dtype: object
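
Note that reversing every token also flips multi-word last names: row 2 gives "Ahmad Syed SS" rather than the "Syed Ahmad SS" asked for in the question. A possible variant, assuming the initials are always the first whitespace-separated token of each name, moves only that token to the end. The helper name move_initials_last is hypothetical.

```python
def move_initials_last(authors: str) -> str:
    """Move the first token of each comma-separated name to the end."""
    flipped = []
    for name in authors.split(','):
        # Split once only: [initials, rest of the name]
        parts = name.split(maxsplit=1)
        flipped.append(' '.join(parts[::-1]))
    return ', '.join(flipped)

print(move_initials_last('F Marisa, SS Syed Ahmad, N Kausar, S Kousar'))
```

With the dataframe from the question, it could be applied as df['Author'].apply(move_initials_last).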