Perform merge for specific duplicate rows in pandas DataFrame

Consider the following two DataFrames in Python:

df:

code_1	other
19001	white
19009	blue
19008	red

df_1:

code_1	code_2
19001	00001
19001	00002
19009	00003
19008	00004

I want to merge df with df_1:

    df_merge = pd.merge(df, df_1, how="left", on=['code_1'])

df_merge:

code_1	other	code_2
19001	white	00001
19001	white	00002
19009	blue	00003
19008	red	00004

I want the merge to keep only one row per code_1, using the first matching row in df_1. I could call drop_duplicates on [other, code_1] after the merge, but I would like to know whether merge has a parameter that does this directly.

Expected result:

code_1	other	code_2
19001	white	00001
19009	blue	00003
19008	red	00004

CodePudding user response:

In my opinion, there is no specific parameter of pandas.merge() that fits your needs, but you can get the same result by dropping the duplicates before merging, assuming the duplicates are only in df_1:

    df_merge = df.merge(df_1.drop_duplicates('code_1'), how="left", on=['code_1'])
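
For reference, here is a minimal, self-contained sketch of that approach on the sample data from the question. It makes a few assumptions that are not stated in the question: both code columns are stored as strings (so the leading zeros in code_2 survive), and the code_2 value for 19008 is taken as 00004 to match the expected result. The names first_per_code and df_merge_alt are only illustrative.

```python
import pandas as pd

# Sample data from the question; codes are kept as strings (an assumption)
# so the leading zeros in code_2 are preserved.
df = pd.DataFrame({
    "code_1": ["19001", "19009", "19008"],
    "other": ["white", "blue", "red"],
})
df_1 = pd.DataFrame({
    "code_1": ["19001", "19001", "19009", "19008"],
    "code_2": ["00001", "00002", "00003", "00004"],
})

# Keep only the first row per code_1 in df_1 (keep="first" is the default),
# then left-merge so each row of df matches at most one row of df_1.
first_per_code = df_1.drop_duplicates(subset="code_1", keep="first")
df_merge = df.merge(first_per_code, how="left", on="code_1")

# Alternative mentioned in the question: merge first, then drop duplicates.
df_merge_alt = (
    pd.merge(df, df_1, how="left", on="code_1")
    .drop_duplicates(subset=["code_1", "other"])
    .reset_index(drop=True)
)

print(df_merge)
```

On this data both variants should give the same three rows. Dropping the duplicates before the merge avoids building the larger intermediate result, which may matter when df_1 has many repeated keys.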