Add new column to existing dataframe from substring of existing column

Time:10-25

I have a df that looks similar to this:

|email|first_name|last_name|id|group_email|
|-|-|-|-|-|
|[email protected]|drew|barry|05|[email protected]|
|[email protected]|nate|lewis|03|[email protected]|
|[email protected]|chris|ryan|04|[email protected]|

I parse out the group_code, the substring after the third hyphen in group_email. I now want to add this substring back into the dataframe as a new column for each row, so the df will look like this:

|email|first_name|last_name|id|group_email|group_code|
|-|-|-|-|-|-|
|[email protected]|drew|barry|05|[email protected]|rate|
|[email protected]|nate|lewis|03|[email protected]|factor|
|[email protected]|chris|ryan|04|[email protected]|drive|

How can I go about doing this?

CodePudding user response:

Let's try

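# Match three "-..." segments before the "@"; the group keeps the last one,
# then the leading hyphen is stripped.
df['group_code'] = (df['group_email'].str.extract(r'(-[^@]*){3}')[0]
                    .str.lstrip('-'))
print(df)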

            email first_name last_name  id                    group_email group_code
0   [email protected]       drew     barry   5     [email protected]       rate
1   [email protected]       nate     lewis   3  [email protected]     factor
2  [email protected]      chris      ryan   4  [email protected]      drive
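
An alternative that avoids the regex is to split the string instead. The sketch below is only an illustration: the addresses in the post are redacted, so the `group_email` values here are made up, on the assumption that the code sits after the third hyphen of the part before the `@`.

```python
import pandas as pd

# Hypothetical addresses: the originals are redacted in the post.
df = pd.DataFrame({
    "first_name": ["drew", "nate", "chris"],
    "group_email": [
        "team-a-01-rate@example.com",
        "team-b-02-factor@example.com",
        "team-c-03-drive@example.com",
    ],
})

# Keep the part before "@", split it on "-" at most three times,
# and take the fourth piece, i.e. everything after the third hyphen.
local_part = df["group_email"].str.split("@", n=1).str[0]
df["group_code"] = local_part.str.split("-", n=3).str[3]
print(df)
```

Rows whose address has fewer than three hyphens before the `@` get NaN in `group_code` rather than raising an error, so it may be worth checking for missing values afterwards.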