How to merge dataframe to each groupby group?-CodePudding

Given df1 and df2:

df1:

id	item	sub_item
1	52	AAA
2	52	AAA
3	55	BAA
4	77	CAA
5	77	CAA
6	77	CAA
7	77	CAB
8	77	CAC

df2:

id	color
1	#11
2	#12
3	#13
4	#14
5	#17
6	#18
7	#19
8	#20
9	#21
10	#25

produce df_result:

id	item	sub_item	color
1	52	AAA	#11
2	52	AAA	#11
3	55	BAA	#11
4	77	CAA	#11
5	77	CAA	#11
6	77	CAA	#11
7	77	CAB	#12
8	77	CAC	#13

id is unimportant. Each new sub-item, within an item group, gets new serial color number. Colors start re-assignment back to #11 for each new item. df2 will always be long enough to cover unique sub-items within an item group.

My intuition says to use groupby on df1, then merge each group with df2, but I need help with the code to make this happen.

CodePudding user response：

I'm not sure, but I think that's what you want

merged_df = pd.merge(
        df1,df2, how="left", on="Id"
    )

CodePudding user response：

Try:

df1["color"] = (df2.merge(df1["sub-item"].ne(df1.groupby("item")["sub-item"].transform("first"))
                                         .cumsum()
                                         .add(1)
                                         .rename("id"), 
                          how="right")["color"]
                )

>>> df1
   id  item sub-item color
0   1    52      AAA   #11
1   2    52      AAA   #11
2   3    55      BAA   #11
3   4    77      CAA   #11
4   5    77      CAA   #11
5   6    77      CAA   #11
6   7    77      CAB   #12
7   8    77      CAC   #13

id	item	sub_item	color
1	52	AAA	#11
2	52	AAA	#11
3	55	BAA	#11
4	77	CAA	#11
5	77	CAA	#11
6	77	CAA	#11
7	77	CAB	#12
8	77	CAC	#13

id	item	sub_item	color
1	52	AAA	#11
2	52	AAA	#11
3	55	BAA	#11
4	77	CAA	#11
5	77	CAA	#11
6	77	CAA	#11
7	77	CAB	#12
8	77	CAC	#13

id	item	sub_item	color
1	52	AAA	#11
2	52	AAA	#11
3	55	BAA	#11
4	77	CAA	#11
5	77	CAA	#11
6	77	CAA	#11
7	77	CAB	#12
8	77	CAC	#13