Home > Blockchain >  How to merge dataframe to each groupby group?
How to merge dataframe to each groupby group?

Time:12-24

Given df1 and df2:

df1:

id item sub_item
1 52 AAA
2 52 AAA
3 55 BAA
4 77 CAA
5 77 CAA
6 77 CAA
7 77 CAB
8 77 CAC

df2:

id color
1 #11
2 #12
3 #13
4 #14
5 #17
6 #18
7 #19
8 #20
9 #21
10 #25

produce df_result:

id item sub_item color
1 52 AAA #11
2 52 AAA #11
3 55 BAA #11
4 77 CAA #11
5 77 CAA #11
6 77 CAA #11
7 77 CAB #12
8 77 CAC #13

id is unimportant. Each new sub-item, within an item group, gets new serial color number. Colors start re-assignment back to #11 for each new item. df2 will always be long enough to cover unique sub-items within an item group.

My intuition says to use groupby on df1, then merge each group with df2, but I need help with the code to make this happen.

CodePudding user response:

I'm not sure, but I think that's what you want

merged_df = pd.merge(
        df1,df2, how="left", on="Id"
    )

CodePudding user response:

Try:

df1["color"] = (df2.merge(df1["sub-item"].ne(df1.groupby("item")["sub-item"].transform("first"))
                                         .cumsum()
                                         .add(1)
                                         .rename("id"), 
                          how="right")["color"]
                )

>>> df1
   id  item sub-item color
0   1    52      AAA   #11
1   2    52      AAA   #11
2   3    55      BAA   #11
3   4    77      CAA   #11
4   5    77      CAA   #11
5   6    77      CAA   #11
6   7    77      CAB   #12
7   8    77      CAC   #13
  • Related