Pandas - Merge 2 dataframes, only with columns from the first dataframe as output

Time:10-02

Say I have one dataframe with columns as follows:

A B C X F

and the other dataframe has columns:

A J L B C

How can I append the rows from the second dataframe to the first, so that the result has only the first dataframe's 5 columns? I don't want columns J and L to be in the final dataframe.

Also assume there are many other columns I don't want, so is there a way to do this without specifying column names? The list would be massive...

CodePudding user response:

Use Index.intersection to filter the columns of df2 down to those that also appear in df1.columns:

# if all columns from df1 are in df2
df22 = df2[df1.columns]
# if NOT all columns from df1 are in df2
df22 = df2[df2.columns.intersection(df1.columns)]
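
To see why the second form matters for the columns in the question, here is a minimal sketch. The sample values are made up for illustration; only the column names come from the question. Since X and F are missing from df2, plain selection with df1.columns should raise a KeyError, while the intersection keeps just the shared columns.

```python
import pandas as pd

# Made-up sample data; only the column names come from the question.
df1 = pd.DataFrame({"A": [1], "B": [2], "C": [3], "X": [4], "F": [5]})
df2 = pd.DataFrame({"A": [10], "J": [20], "L": [30], "B": [40], "C": [50]})

# Direct selection fails because X and F do not exist in df2.
try:
    df2[df1.columns]
    direct_selection_ok = True
except KeyError:
    direct_selection_ok = False

# The intersection keeps only the columns present in both dataframes,
# so J and L are dropped without naming them.
shared = df2.columns.intersection(df1.columns)
df22 = df2[shared]
shared_cols = list(df22.columns)

print(direct_selection_ok)
print(shared_cols)
```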

And then use DataFrame.append (deprecated in pandas 1.4 and removed in pandas 2.0, so this only works on older versions):

df = df1.append(df22, ignore_index=True)

Or concat, which also works on current pandas versions:

df = pd.concat([df1, df22], ignore_index=True)

Another solution is to append all rows first and then keep only the columns of df1:

df = pd.concat([df1, df2], ignore_index=True)[df1.columns]
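
Both concat variants can be checked side by side with a small sketch like the one below. Again, the values are invented and only the column names match the question. Columns that df2 lacks (X and F) end up as NaN in the rows that came from df2.

```python
import pandas as pd

# Made-up sample data; only the column names come from the question.
df1 = pd.DataFrame({"A": [1, 2], "B": [3, 4], "C": [5, 6],
                    "X": [7, 8], "F": [9, 10]})
df2 = pd.DataFrame({"A": [11], "J": [12], "L": [13], "B": [14], "C": [15]})

# Variant 1: drop the unwanted columns from df2 first, then concatenate.
df22 = df2[df2.columns.intersection(df1.columns)]
out1 = pd.concat([df1, df22], ignore_index=True)

# Variant 2: concatenate everything, then keep only df1's columns.
out2 = pd.concat([df1, df2], ignore_index=True)[df1.columns]

# Both results have df1's five columns and three rows;
# X and F are NaN in the last row, which came from df2.
print(out1)
print(out2)
```

Variant 1 avoids building the wide intermediate frame, which may matter if df2 has many extra columns.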