Say I have one dataframe with columns as follows:
| A | B | C | X | F |
|---|---|---|---|---|
and the other dataframe has columns:
| A | J | L | B | C |
|---|---|---|---|---|
How can I append the rows from the second dataframe to the first, so that the result has only the first dataframe's 5 columns? I don't want columns J and L to be in the final dataframe.
Also assume there are many other columns I don't want, so is there a way to do this without specifying column names? The list would be massive...
CodePudding user response:
Use Index.intersection to filter the columns of df2 by df1.columns:
#if all columns from df1 are in df2
df22 = df2[df1.columns]
#if NOT all columns from df1 are in df2
df22 = df2[df2.columns.intersection(df1.columns)]
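To see why the second form matters for the question as asked, here is a minimal sketch with made-up values (the data is an assumption; only the column layout comes from the question). Since X and F do not exist in df2, the plain selection should raise a KeyError, while the intersection keeps just the shared columns:

```python
import pandas as pd

# Hypothetical data using the column layout from the question
df1 = pd.DataFrame({"A": [1], "B": [2], "C": [3], "X": [4], "F": [5]})
df2 = pd.DataFrame({"A": [10], "J": [20], "L": [30], "B": [40], "C": [50]})

# Plain selection fails because X and F are missing from df2
try:
    df2[df1.columns]
    selection_failed = False
except KeyError:
    selection_failed = True

# intersection keeps only the columns present in both dataframes
shared = df2.columns.intersection(df1.columns)
df22 = df2[shared]

print(selection_failed)
print(list(df22.columns))
```

With these columns, df22 ends up with only A, B and C, so J and L never have to be named.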
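And then use DataFrame.append (deprecated in pandas 1.4 and removed in pandas 2.0, so this works only on older versions):
df = df1.append(df22, ignore_index=True)
Or concat, which works in current pandas versions:
df = pd.concat([df1, df22], ignore_index=True)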
Another solution is to concatenate first and then filter the result by the columns of df1:
df = pd.concat([df1, df2], ignore_index=True)[df1.columns]
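As a rough end-to-end sketch of this last variant, again with invented values (only the column names are taken from the question), the concatenation followed by the column selection might look like this:

```python
import pandas as pd

# Hypothetical data using the column layout from the question
df1 = pd.DataFrame({"A": [1, 2], "B": [3, 4], "C": [5, 6], "X": [7, 8], "F": [9, 10]})
df2 = pd.DataFrame({"A": [11], "J": [12], "L": [13], "B": [14], "C": [15]})

# Stack all rows, then keep only df1's columns (this drops J and L)
df = pd.concat([df1, df2], ignore_index=True)[df1.columns]

print(df)
```

The rows coming from df2 have no values for X and F, so those cells are filled with NaN, which also turns the two columns into floats. No column names need to be typed out, however many unwanted columns df2 has.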