I have df
as
acc_id acc_status username TeamMemRole email
A1 Inactive A Serv Primary [email protected]
A1 Inactive B Sales - Primary [email protected]
A1 Inactive C Ent Specialist [email protected]
A2 Inactive D Sales - Primary [email protected]
B1 Inactive E Tech Sales - Primary [email protected]
B1 Inactive F Ent Specialist [email protected]
B1 Inactive G Sales - Primary [email protected]
C1 Inactive H Sales - Primary [email protected]
C1 Inactive I Sales - Secondary [email protected]
C1 Inactive J Tech Sales - Primary [email protected]
I want all the the emails expect Sales - Primary
for that particular acc_id
to be concatenated and be in a separate column and the emails of only Sales - Primary
in a separate column as below
acc_id acc_status username TeamMemRole email To CC
A1 Inactive A Serv Primary [email protected] [email protected] [email protected],[email protected]
A1 Inactive B Sales - Primary [email protected]
A1 Inactive C Ent Specialist [email protected]
A2 Inactive D Sales - Primary [email protected] [email protected]
B1 Inactive E Tech Sales - Primary [email protected] [email protected] [email protected],[email protected]
B1 Inactive F Ent Specialist [email protected]
B1 Inactive G Sales - Primary [email protected]
C1 Inactive H Sales - Primary [email protected] [email protected] [email protected],[email protected]
C1 Inactive I Sales - Secondary [email protected]
C1 Inactive J Tech Sales - Primary [email protected]
I tried using
df[df['TeamMemRole'].str.contains('Sales - Primary')==False]
but this is removing Tech Sales - Primary
as well.
Please help with any suggestions and approaches
CodePudding user response:
Let's try:
# first row for each group
first_rows = ~df['acc_id'].duplicated()
# primary sale rows
prim_sale_rows = df['TeamMemRole'].eq('Sales - Primary')
# To column
prim_sales = df.loc[prim_sale_rows].groupby('acc_id')['email'].apply(', '.join)
df.loc[first_rows, 'To'] = df['acc_id'].map(prim_sales)
# CC column
CC_list = df.loc[~prim_sale_rows].groupby('acc_id')['email'].apply(', '.join)
df.loc[first_rows, 'CC'] = df['acc_id'].map(CC_list)
Output:
acc_id acc_status username TeamMemRole email To CC
0 A1 Inactive A Serv Primary [email protected] [email protected] [email protected], [email protected]
1 A1 Inactive B Sales - Primary [email protected] NaN NaN
2 A1 Inactive C Ent Specialist [email protected] NaN NaN
3 A2 Inactive D Sales - Primary [email protected] [email protected] NaN
4 B1 Inactive E Tech Sales - Primary [email protected] [email protected] [email protected], [email protected]
5 B1 Inactive F Ent Specialist [email protected] NaN NaN
6 B1 Inactive G Sales - Primary [email protected] NaN NaN
7 C1 Inactive H Sales - Primary [email protected] [email protected] [email protected], [email protected]
8 C1 Inactive I Sales - Secondary [email protected] NaN NaN
9 C1 Inactive J Tech Sales - Primary [email protected] NaN NaN