Home > Enterprise >  Merge and get only a subset of columns in pandas
Merge and get only a subset of columns in pandas

Time:10-16

I have 2 pandas dataframes as below:

 df1:
 id    name1
 123   AAA
 345   BBB

 df2:
 id_1   city address_1  address_2 province country mail
 123    SFO  -          -         -        -       -
 234    TO   -          -         -        -       -
 345    NY   -          -         -        -       -

I am merging df1 with df2 but in the result i only need the "city" column from df2. One way is I could do the merge and drop them as below

df_merged = pd.merge(df1,df2,left_on='id',right_on='id_1',how='left').drop(columns=['address_1',..])

But if the number of columns is more, it gets tedious to specify all those unnecessary columns in drop function. Just wondering is there an efficient way to specify only the columns that are needed from the merge.

Any suggestions would be appreciated.

CodePudding user response:

If all you need is the city column, you could just do:

df_merged = pd.merge(df1,df2,left_on='id',right_on='id_1',how='left')['City']

Of course, if you need more than that, you could add them. Just make sure you add a second second of brackets, as for >1 column you need to pass a list.

CodePudding user response:

You can select the relevant column from df2 during the merge step itself.

df_merged = pd.merge(df1,
                    df2[[ 'id_1', 'City']],
                    left_on='id',right_on='id_1',how='left')
  • Related