I have a df that looks similar to this:
|email|first_name|last_name|id|group_email|
|-|-|-|-|-|
|[email protected]|drew|barry|05|[email protected]|
|[email protected]|nate|lewis|03|[email protected]|
|[email protected]|chris|ryan|04|[email protected]|
I parse out the group_code, which is the substring after the 3rd hyphen in group_email. I now want to add this substring back into the dataframe as a new column for each entry, so the df will look like this:
|email|first_name|last_name|id|group_email|group_code|
|-|-|-|-|-|-|
|[email protected]|drew|barry|05|[email protected]|rate|
|[email protected]|nate|lewis|03|[email protected]|factor|
|[email protected]|chris|ryan|04|[email protected]|drive|
How can I go about doing this?
Answer:
Let's try
```python
# Capture the last of three "-..." chunks before the "@", then drop its leading hyphen.
df['group_code'] = (df['group_email'].str.extract(r'(-[^@]*){3}')[0]
                    .str.lstrip('-'))
print(df)
```

```
             email first_name last_name  id     group_email group_code
0  [email protected]       drew     barry   5  [email protected]       rate
1  [email protected]       nate     lewis   3  [email protected]     factor
2  [email protected]      chris      ryan   4  [email protected]      drive
```
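If you want something you can run end to end, here is a minimal sketch. The addresses in the question are not visible, so the `group_email` values below are made up (hypothetical `dept-team-unit-<code>@example.com` strings); adjust them to your real data. It also uses a slightly different pattern from the one above: it skips the first three hyphen-terminated chunks and captures everything up to the `@`, so no `lstrip` is needed.

```python
import pandas as pd

# Hypothetical data: the real addresses are not shown in the question.
df = pd.DataFrame({
    "first_name": ["drew", "nate", "chris"],
    "group_email": [
        "dept-team-unit-rate@example.com",
        "dept-team-unit-factor@example.com",
        "dept-team-unit-drive@example.com",
    ],
})

# Skip three "text-" chunks from the start, then capture up to the "@".
pattern = r"^(?:[^-@]*-){3}([^@]*)"

# expand=False returns a Series when the pattern has a single capture group.
df["group_code"] = df["group_email"].str.extract(pattern, expand=False)

print(df)
```

Rows whose `group_email` has fewer than three hyphens before the `@` get `NaN` in `group_code`. One difference to be aware of: if the code itself can contain hyphens, this pattern keeps everything after the 3rd hyphen, whereas the pattern above would, as far as I can tell, return only the last hyphen-separated piece.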