Python: Join dataframes issues

Time:11-23

With these two dataframes:

import pandas as pd

df1 = pd.DataFrame(
    {
        "ID": ["ID0", "ID1", "ID2", "ID3"],
        "A": ["A0", "A1", "A2", "A3"],
        "B": ["B0", "B1", "B2", "B3"],
    },
)


df2 = pd.DataFrame(
    {
        "ID": ["ID0", "ID1", "ID2", "ID4"],
        "C": ["C0", "C1", "C2", "C4"],
        "D": ["D0", "D1", "D2", "D4"],
    },
)

My goal is to join them without repeating IDs, and to have None where there is no information:

ID   A     B     C     D
ID0  A0    B0    C0    D0
ID1  A1    B1    C1    D1
ID2  A2    B2    C2    D2
ID3  A3    B3    None  None
ID4  None  None  C4    D4

What are the pd.concat parameters to do this? I have tried several, but without getting the result I want.

CodePudding user response:

Use pd.merge instead:

df1.merge(df2, on='ID', how='outer')

Output:

    ID    A    B    C    D
0  ID0   A0   B0   C0   D0
1  ID1   A1   B1   C1   D1
2  ID2   A2   B2   C2   D2
3  ID3   A3   B3  NaN  NaN
4  ID4  NaN  NaN   C4   D4

The result has NaN wherever a value is missing.
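If you need literal None instead of NaN, or want to stay with pd.concat as in the question, here is a minimal sketch. It assumes ID is unique within each dataframe; the names merged, with_none and combined are only illustrative. Casting to object first keeps pandas from coercing None back to NaN.

```python
import pandas as pd

df1 = pd.DataFrame(
    {
        "ID": ["ID0", "ID1", "ID2", "ID3"],
        "A": ["A0", "A1", "A2", "A3"],
        "B": ["B0", "B1", "B2", "B3"],
    },
)
df2 = pd.DataFrame(
    {
        "ID": ["ID0", "ID1", "ID2", "ID4"],
        "C": ["C0", "C1", "C2", "C4"],
        "D": ["D0", "D1", "D2", "D4"],
    },
)

# Outer merge keeps every ID from both frames, one row per ID.
merged = df1.merge(df2, on="ID", how="outer")

# Swap NaN for None; object dtype is needed so None is stored as-is.
with_none = merged.astype(object).where(merged.notna(), None)

# Equivalent with pd.concat: align on ID as the index and concatenate
# column-wise (axis=1). The default join="outer" keeps all IDs.
combined = (
    pd.concat([df1.set_index("ID"), df2.set_index("ID")], axis=1)
    .rename_axis("ID")
    .reset_index()
)

print(with_none)
print(combined)
```

The concat version only lines rows up correctly because ID is moved into the index first; concatenating the original frames with axis=1 would align on the default 0..3 row numbers instead.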
