The shape of df is (315360, 4) and df2's shape is (214704, 4).
df:
Country | Partner Country | Year | Variable1 |
---|---|---|---|
Turkey | Spain | 1993 | 183188 |
Spain | Turkey | 1993 | 3918281 |
US | UK | 1993 | 495949282 |
US | UK | 1994 | 495949282 |
UK | US | 1994 | 495949282 |
df2:
Country | Partner Country | Year | Variable2 |
---|---|---|---|
Syria | Spain | 1993 | 183188 |
Turkey | Spain | 1993 | 3918281 |
US | UK | 1993 | 495949282 |
Germany | UK | 1994 | 495949282 |
UK | US | 1994 | 495949282 |
df.merge(df2, how="outer", on=["Country", "Partner Country", "Year"])
The shape I get after the merge is (351351, 5). Why is it not (530064, 5), and how can I get that without losing any information?
CodePudding user response:
You can use concat with drop_duplicates:
df = pd.concat([df, df2]).drop_duplicates()
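As for why the merge gives fewer rows: an outer merge writes one row per key combination, so a (Country, Partner Country, Year) that appears in both frames becomes a single row holding both Variable1 and Variable2. Nothing is lost. If the keys are unique within each frame (an assumption, since the question does not say), then 530064 - 351351 = 178713 key combinations were present in both frames. Here is a rough sketch using only the sample rows from the question; `keys`, `merged`, `n_both` and `stacked` are names made up for the illustration, and `indicator=True` is added just to show where each row came from.

```python
import pandas as pd

keys = ["Country", "Partner Country", "Year"]

# Sample rows copied from the question
df = pd.DataFrame({
    "Country": ["Turkey", "Spain", "US", "US", "UK"],
    "Partner Country": ["Spain", "Turkey", "UK", "UK", "US"],
    "Year": [1993, 1993, 1993, 1994, 1994],
    "Variable1": [183188, 3918281, 495949282, 495949282, 495949282],
})
df2 = pd.DataFrame({
    "Country": ["Syria", "Turkey", "US", "Germany", "UK"],
    "Partner Country": ["Spain", "Spain", "UK", "UK", "US"],
    "Year": [1993, 1993, 1993, 1994, 1994],
    "Variable2": [183188, 3918281, 495949282, 495949282, 495949282],
})

# indicator=True adds a "_merge" column: left_only, right_only or both
merged = df.merge(df2, how="outer", on=keys, indicator=True)
n_both = int((merged["_merge"] == "both").sum())

# Keys found in both frames collapse into one row each
# (this equality assumes the keys are unique within each frame)
print(len(merged), len(df) + len(df2) - n_both)

# Stacking instead keeps every input row and fills the missing variable with NaN
stacked = pd.concat([df, df2], ignore_index=True)
print(stacked.shape)
```

If you really need len(df) + len(df2) rows, stacking with concat is the way to get them, but the rows that matched will then appear twice, once with Variable1 filled and once with Variable2 filled.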