Why do I have fewer rows when I merge using the "outer" argument?

Time:10-26

The shape of df is (315360, 4) and the shape of df2 is (214704, 4).

df:

Country  Partner Country  Year  Variable1
Turkey   Spain            1993  183188
Spain    Turkey           1993  3918281
US       UK               1993  495949282
US       UK               1994  495949282
UK       US               1994  495949282

df2:

Country  Partner Country  Year  Variable2
Syria    Spain            1993  183188
Turkey   Spain            1993  3918281
US       UK               1993  495949282
Germany  UK               1994  495949282
UK       US               1994  495949282

df.merge(df2, how="outer", on=["Country", "Partner Country", "Year"])

The shape I get after the merge is (351351, 5). Why is it not (530064, 5), and how can I get that without losing any information?

CodePudding user response:

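An outer merge does not stack the two frames. Rows whose keys ("Country", "Partner Country", "Year") appear in both frames are combined into a single row holding both Variable1 and Variable2, so the result has fewer rows than the two inputs added together, and no information is lost. If you want every row from both frames kept as a separate row, you can use concat with drop_duplicates: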

df = pd.concat([df, df2]).drop_duplicates()
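
To see the difference on the sample rows from the question, here is a minimal sketch. The two small frames are rebuilt by hand from the tables above, and the names keys, merged and stacked are only illustrative. It passes indicator=True to merge so that each row is labelled with where its key was found, then compares the result with the concat approach.

```python
import pandas as pd

# Join keys taken from the question's merge call.
keys = ["Country", "Partner Country", "Year"]

# Sample rows copied from the question (not the full data).
df = pd.DataFrame({
    "Country": ["Turkey", "Spain", "US", "US", "UK"],
    "Partner Country": ["Spain", "Turkey", "UK", "UK", "US"],
    "Year": [1993, 1993, 1993, 1994, 1994],
    "Variable1": [183188, 3918281, 495949282, 495949282, 495949282],
})
df2 = pd.DataFrame({
    "Country": ["Syria", "Turkey", "US", "Germany", "UK"],
    "Partner Country": ["Spain", "Spain", "UK", "UK", "US"],
    "Year": [1993, 1993, 1993, 1994, 1994],
    "Variable2": [183188, 3918281, 495949282, 495949282, 495949282],
})

# Outer merge: keys present in both frames share one row.
# indicator=True adds a "_merge" column: left_only, right_only or both.
merged = df.merge(df2, how="outer", on=keys, indicator=True)
print(merged.shape)                     # (7, 6): 7 rows, 5 columns + "_merge"
print(merged["_merge"].value_counts())  # 3 both, 2 left_only, 2 right_only

# concat: rows are stacked, so a key present in both frames gives two rows,
# one with Variable2 missing and one with Variable1 missing.
stacked = pd.concat([df, df2]).drop_duplicates()
print(stacked.shape)                    # (10, 5)
```

Here 5 + 5 input rows give 7 merged rows because 3 keys match. If the keys are unique within each frame (an assumption about the real data), the same arithmetic applied to the shapes in the question suggests about 315360 + 214704 - 351351 = 178713 matching keys. Note that the stacked result keeps Variable1 and Variable2 for the same key on separate rows, with NaN in the other column.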