I am trying to consolidate a one-to-many relationship in my dataset while creating a link for each, and sometimes multiple items. For a given Ticket, there can be 1 or many Work Orders, so I have used the following to consolidate and list Work Orders in a single column:
df = df.groupby(['Date', 'Ticket ID', 'Score', 'many other Columns...'])['Work Order'].apply(', '.join).reset_index()
Which gives me a nice output where Work Orders are either listed individually or comma separted like so:
Date Ticket ID Work Order
2018-08-30 22:52:25 1444008 119846184
2021-09-29 13:33:49 1724734 122445397, 122441551
Now, I want to create links for the Work Orders as well that are also comma delimited. For the single Work Orders the following works just fine:
woLink = r'www.google.com/woNum='
df['WO Link'] = woLink df['Work Order'].astype(str)
However for the multiple it essentially undoes the groupby and re-adds the row that I grouped by. From the sample data above like:
# Wrong
Date Ticket ID Work Order Link
2018-08-30 22:52:25 1444008 119846184 google.com/woNum=119846184
2021-09-29 13:33:49 1724734 122445397 google.com/woNum=122445397
2021-09-29 13:33:49 1724734 122441551 google.com/woNum=122441551
What I want is:
Date Ticket ID Work Order Link (s)
2018-08-30 22:52:25 1444008 119846184 google.com/woNum=119846184
2021-09-29 13:33:49 1724734 122445397, 122441551 google.com/woNum=122445397, google.com/woNum=122441551
I have tried several different variations of Groupby/apply like regrouping without success:
df = df.groupby(['Date', 'Ticket ID', 'Score', 'many other Columns...'])['Link'].apply(', '.join).reset_index()
It always seems to duplicate the row again - I know this is because WO and WOLink both have a 1 to Many, but I cannot seem to figure how to handle both at once.
How can I groupby and consolidate this dataframe?
CodePudding user response:
You can use a regex to add the URL part:
woLink = r'example.org/woNum='
df['Link'] = df['Work Order'].str.replace('(\d )', rf'{woLink}\1')
output:
Date Ticket ID Work Order Link
0 2018-08-30 22:52:25 1444008 119846184 example.org/woNum=119846184
1 2021-09-29 13:33:49 1724734 122445397, 122441551 example.org/woNum=122445397, example.org/woNum=122441551