I have two Data Frames
jedis = {jedi_id': ["2", "4", "6", "1"],
'name':["Kylo", "Bastila", "Revan", "Steve from Minecraft"],
'Looted Items':}
inventory = {jedi_number': ["9", "4" , "6", "1", "1", "0", "2", "6", "1" , "55", "4",
"4", "0", "9"], 'Loot':["Holocron", "Bantha Fodder", "Blaster", "Bantha Fodder", "Credits", "Bantha Fodder", "Blaster", "Bantha Fodder", "Holocron", "Blaster", "Holocron", "bread loaf", "Credits", "Holocron"]}
jedis_df = pd.DataFrame(jedis)
inventory_df = pd.DataFrame(inventory)
So in this problem, I need take all the jedis from the jedis_df via jedi_df and confirm that they're they exist within the inventory_df. (ex.jedi_id 2 exists in inventory)
Once The ID is identified and is in the jedis_df, then the associated looted item will attach add itself to the jedis_df with the associated ID
- Example: Revan will have 'blaster, holocron' within the 'Looted Items' column and same cell next to the corresponding jedi ID
x = jedis_df.merge( inventory_df.groupby("jedi_number").sum(), left_on="jedi_id", right_index=True, how="left", )
I tried merging the two tables but couldnt figure out a way to figure out the solution.
I only really know how to add one specific corresponding matching id to one cell so this is new to me.
If anyone could help with this problem that would be much appreciated
CodePudding user response:
groupby.sum
automatically excludes non-numeric columns, so it doesn't concatenate strings within groups the way you might expect.
The solution is to run ', ',join
on the Loot
values of each group.
Option 1: groupby.agg
inventory_df.groupby("jedi_number")['Loot'].agg(', '.join)
Option 2: groupby.apply
inventory_df.groupby("jedi_number")['Loot'].apply(lambda x: ', '.join(x))
Both options produce the same output:
jedi_number
0 Bantha Fodder, Credits
1 Bantha Fodder, Credits, Holocron
2 Blaster
4 Bantha Fodder, Holocron, bread loaf
55 Blaster
6 Blaster, Bantha Fodder
9 Holocron, Holocron
Name: Loot, dtype: object
Replacing inventory_df.groupby("jedi_number").sum()
with either of these options in your expression for x
should produce the desired result.