How to get given and family from a pandas column

Time:01-30

I have a pandas DataFrame with a column named fullname:

    fullname
Andres Kirk Polsky
Anna Lorensen Polsman Uirch

I want to create two new columns like:

fullname                        given       rest_name
Andres Kirk Polsky              Andres      Kirk Polsky
Anna Lorensen Polsman Uirch     Anna        Lorensen Polsman Uirch

I am trying to use str.split(' ', expand=True), but this is not giving the result I want.

CodePudding user response:

A possible solution:

(df.assign(**df['fullname'].str.split(' ', n=1, expand=True)
           .rename({0: 'given', 1: 'rest_name'}, axis=1)))

Output:

                      fullname   given               rest_name
0           Andres Kirk Polsky  Andres             Kirk Polsky
1  Anna Lorensen Polsman Uirch    Anna  Lorensen Polsman Uirch
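
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. The sample frame is rebuilt from the question, and the intermediate names parts and result are just illustrative. The key point is n=1, which limits the split to the first space, so everything after the given name stays together in one column.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame(
    {"fullname": ["Andres Kirk Polsky", "Anna Lorensen Polsman Uirch"]}
)

# n=1 splits only at the first space, so at most two columns come back
parts = df["fullname"].str.split(" ", n=1, expand=True)
parts.columns = ["given", "rest_name"]

# join aligns on the index, so each row keeps its own pieces
result = df.join(parts)
print(result)
```

Note that this sketch assumes at least one name contains a space. If every value were a single word, the split would return only one column and the column renaming would fail.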

CodePudding user response:

A regex approach would be to use pandas.Series.str.extractall:

df[["given", "rest_name"]] = (df["fullname"].str.extractall(r"^(\S+)\s(.*)")
                                              .reset_index(drop=True))

Output :

print(df)

                      fullname   given               rest_name
0           Andres Kirk Polsky  Andres             Kirk Polsky
1  Anna Lorensen Polsman Uirch    Anna  Lorensen Polsman Uirch
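
As a possible variation (not what the answer above uses), pandas.Series.str.extract may be a slightly simpler fit here. It returns the first match per row and keeps the original index, so no reset_index is needed, and rows that do not match come back as NaN instead of being dropped. The named groups below are my own choice and become the new column names. A minimal sketch, assuming the same sample data as the question:

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame(
    {"fullname": ["Andres Kirk Polsky", "Anna Lorensen Polsman Uirch"]}
)

# Named groups become the column names of the extracted frame:
#   given     -> first run of non-whitespace characters
#   rest_name -> everything after the following whitespace
extracted = df["fullname"].str.extract(r"^(?P<given>\S+)\s+(?P<rest_name>.*)$")

# The extracted frame shares df's index, so join lines the rows up
df = df.join(extracted)
print(df)
```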