I have a pandas DataFrame with a column named fullname:
fullname
Andres Kirk Polsky
Anna Lorensen Polsman Uirch
I want to create two new columns like:
fullname                     given   rest_name
Andres Kirk Polsky           Andres  Kirk Polsky
Anna Lorensen Polsman Uirch  Anna    Lorensen Polsman Uirch
I am trying to use str.split(' ', expand=True), but this is not giving the result I want.
CodePudding user response:
A possible solution:
(df.assign(**df['fullname'].str.split(' ', n=1, expand=True)
.rename({0: 'given', 1: 'rest_name'}, axis=1)))
Output:
                      fullname   given               rest_name
0           Andres Kirk Polsky  Andres             Kirk Polsky
1  Anna Lorensen Polsman Uirch    Anna  Lorensen Polsman Uirch
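For anyone wondering why the attempt in the question fell short, here is a minimal sketch that rebuilds the example frame from the question and compares a plain split with one limited by n=1. The variable names all_parts and parts are only illustrative.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame(
    {"fullname": ["Andres Kirk Polsky", "Anna Lorensen Polsman Uirch"]}
)

# Without n, every space is a split point: one column per word,
# with shorter names padded by missing values.
all_parts = df["fullname"].str.split(" ", expand=True)

# With n=1 only the first space is a split point, so the result
# has at most two columns: the first word and everything after it.
parts = df["fullname"].str.split(" ", n=1, expand=True)

# Assign the two resulting columns to new column names
df[["given", "rest_name"]] = parts
print(df)
```

Assigning to df[["given", "rest_name"]] modifies df in place, whereas the assign version above returns a new DataFrame; which one fits better depends on whether you want to keep the original frame untouched.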
CodePudding user response:
A regex approach would be to use pandas.Series.str.extractall:
df[["given", "rest_name"]] = (df["fullname"].str.extractall(r"^(\S+)\s(.*)")
.reset_index(drop=True))
Output:
print(df)
                      fullname   given               rest_name
0           Andres Kirk Polsky  Andres             Kirk Polsky
1  Anna Lorensen Polsman Uirch    Anna  Lorensen Polsman Uirch
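One caveat with extractall plus reset_index(drop=True): rows that do not match the pattern are left out of the result, so the remaining rows may no longer line up with the original index. As a hedged alternative, not part of the answer above, str.extract returns exactly one row per input row. The sketch below assumes a hypothetical third row, "Cher", that has no space, and uses named groups so the new columns get their names from the pattern.

```python
import pandas as pd

# First two rows are from the question; "Cher" is a hypothetical
# single-word name added to show the no-space case.
df = pd.DataFrame(
    {"fullname": ["Andres Kirk Polsky", "Anna Lorensen Polsman Uirch", "Cher"]}
)

# Named groups become column names. The second group is optional,
# so a name without any whitespace still matches.
pattern = r"^(?P<given>\S+)(?:\s+(?P<rest_name>.*))?$"

# str.extract keeps the original index: one output row per input row,
# with a missing value where the optional group did not participate.
extracted = df["fullname"].str.extract(pattern)

df = df.join(extracted)
print(df)
```

Here the single-word row keeps its given name and gets a missing value in rest_name instead of being dropped.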