Home > OS >  Merge two dataframe based on the unique (first) value
Merge two dataframe based on the unique (first) value

Time:07-29

I have two dataframes. I want to merge these dataframe based on the first values. Here is a simple example: I have this df:

df = pd.DataFrame()
df['id'] = [1, 2, 3, 4, 5]
df['x'] = [3, 4, 7, 8, 9]

   id   x
0   1   3
1   2   4
2   3   7
3   4   8
4   5   9

I want to get the value of 'y' from the following df1 based on the first values.

df1 = pd.DataFrame()
df1['id'] = [1, 1, 2, 3,3]
df1['y'] = [ 10, 100, 200, 500, 100]
   id   y
0   1   10
1   1   100
2   2   200
3   3   500
4   3   100

For id=1, we have two values in df1, but we only want the first one which is 10.

The desired output is:

   id   y
0   1   10
1   2   200
2   3   500

Thank you so much for your help

CodePudding user response:

here is one way to do it

df1.groupby('id').first().merge(df, on='id', how='left').drop(columns='x')
    id    y
0   1    10
1   2   200
2   3   500

CodePudding user response:

Although this is not an ideal solution, as I am new to programming (2 months old), I could only manage this.

merged_df = pd.merge(df1, df, on='id', how='left')
merged_df.drop(columns='x', index=[1,4], inplace= True)
merged_df.reset_index()

Found another way (the common way)

df1.groupby("id").first().reset_index()
  • Related