Home > Software engineering >  Join two tables on ID where table one has duplicates and table two does not
Join two tables on ID where table one has duplicates and table two does not

Time:12-15

I have two tables, Table A['id', 'col1'] and Table B['id', 'col3']. Table A has duplicate values in 'id' column and its important that those duplicate row stay as they are but Table B has single pair for 'id' and 'col3'. I want to use python to join the two tables.

Table A

id col1
1 Apple
1 Mango
2 Banana

Table B

id col3
1 Eat
2 Drink

Final output

id col1 col3
1 Apple Eat
1 Mango Eat
2 Banana Drink

I tried to use merge and concat but did not get the desired result.

CodePudding user response:

Here's a way to use merge to do it:

A.merge(B, how='left') )

Explanation:

  • In the docs for merge(), when the how argument is 'left':
    • use only keys from left frame, similar to a SQL left outer join; preserve key order.
  • Also from the docs, when the on argument is not specified, it takes the default value of None, and on is interpreted as follows:
    • Column or index level names to join on. These must be found in both DataFrames. If on is None and not merging on indexes then this defaults to the intersection of the columns in both DataFrames.

Output:

   id    col1   col3
0   1   Apple    Eat
1   1   Mango    Eat
2   2  Banana  Drink

CodePudding user response:

like this:

tableA = tableA.merge(right=tableB, how='left', on='id')
  • Related