Iterating over rows and editing strings in Pandas

Time:10-12

Good morning,

I have exhaustively searched for how best to do two things in Python/Pandas, and have not yet found the answer.

I have a df such as:

User Role
Roger Dodger (rogerdodger) user
Edwin Cullen (edwincullen) user
Hunter Andrews (hunterandrews) user

I would like to iterate over the User column and keep only the text inside the parentheses, with a result such as:

User Role
rogerdodger user
edwincullen user
hunterandrews user

I've found many successful ways for iterating. I've not found a way to do the string edits cleanly. I've seen some regex suggestions but am not all that familiar with how to implement them based on the other examples given.

CodePudding user response:

There are various ways to do that, and none of them requires iterating over the rows explicitly.

One way is to use pandas.Series.apply with a custom lambda function that slices out the text between the parentheses:

df['User'] = df['User'].apply(lambda x: x[x.find('(') + 1:x.find(')')])

[Out]:

            User  Role
0    rogerdodger  user
1    edwincullen  user
2  hunterandrews  user
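
One caveat with the slicing approach: str.find returns -1 when the character is absent, so a row without parentheses would be sliced to an unexpected substring. Below is a minimal sketch of a more defensive variant, assuming the sample data from the question. The helper name inside_parens, the extra row without parentheses, and the choice to return the original string when nothing is found are illustrative assumptions, not part of the original answer.

```python
import pandas as pd

# Sample data from the question, plus one hypothetical row without parentheses
df = pd.DataFrame({
    "User": [
        "Roger Dodger (rogerdodger)",
        "Edwin Cullen (edwincullen)",
        "Hunter Andrews (hunterandrews)",
        "no_parens_here",
    ],
    "Role": ["user", "user", "user", "user"],
})


def inside_parens(text):
    """Return the text between the first '(' and the next ')'.

    If either parenthesis is missing, return the input unchanged.
    """
    start = text.find("(")
    end = text.find(")", start + 1)
    if start == -1 or end == -1:
        return text
    return text[start + 1:end]


df["User"] = df["User"].apply(inside_parens)
print(df)
```

Whether to keep the original value, return an empty string, or raise an error for rows without parentheses depends on the data; the fallback here is only one reasonable choice.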

Another way is pandas.Series.str.extract with a regular expression that captures the text between the parentheses:

df['User'] = df['User'].str.extract(r'\((.*?)\)', expand=False)

[Out]:

            User  Role
0    rogerdodger  user
1    edwincullen  user
2  hunterandrews  user
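
Since the question mentions being unfamiliar with regex, here is a small sketch that annotates the pattern piece by piece and shows what happens to a row that does not match. The Series name users, the second sample value, and the fillna fallback are assumptions added for illustration.

```python
import pandas as pd

users = pd.Series([
    "Roger Dodger (rogerdodger)",
    "plainname",  # hypothetical row with no parentheses
])

# Pattern breakdown:
#   \(      a literal opening parenthesis
#   (.*?)   capture group: any characters, as few as possible
#   \)      a literal closing parenthesis
# expand=False returns a Series (not a DataFrame) for a single capture group.
extracted = users.str.extract(r"\((.*?)\)", expand=False)

# Rows that do not match the pattern become NaN;
# here they fall back to the original value.
cleaned = extracted.fillna(users)
print(cleaned.tolist())
```

Without the fillna step, non-matching rows stay NaN, which can be useful for spotting malformed entries.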

Notes:

  • If needed, the username can be stored in a separate column (here called username), leaving the original User column unchanged:

    df['username'] = df['User'].str.extract(r'\((.*?)\)', expand=False)
    
    [Out]:
                                 User  Role       username
    0      Roger Dodger (rogerdodger)  user    rogerdodger
    1      Edwin Cullen (edwincullen)  user    edwincullen
    2  Hunter Andrews (hunterandrews)  user  hunterandrews
    