Creating columns in one dataframe only from non matching columns from another dataframe and set all

Time:12-21

I have two dataframes that look like this:

df1

A    B    C
5    1    5
4    2    8
2    5    3

df2

A    B    C    D
4    3    4    1
3    5    1    2
1    2    5    4

df1 and df2 share the same column names except "D", which is found only in df2. What I would like to do is add D to df1 but fill all of its rows with 0s.

In other words, if a column exists in df2 but not in df1, add that column to df1 with every value set to 0 (see below):

df1

A    B    C    D
5    1    5    0
4    2    8    0
2    5    3    0

I realize it would be very easy to add a single column called "D" to df1, but this is just a dummy example; in reality I am dealing with much larger and many more dataframes. So I am looking for a way to do this with code I could run in a loop or iteratively.

CodePudding user response:

You can find the missing columns with Index.difference.

Then there are many ways to assign multiple columns with a static value to a DataFrame. Here is one that unpacks a dictionary whose keys are the missing column names and whose values are the static value you want to assign:

df1 = df1.assign(**{x: 0 for x in df2.columns.difference(df1.columns)})

   A  B  C  D
0  5  1  5  0
1  4  2  8  0
2  2  5  3  0
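
For reference, here is a self-contained sketch of the same approach, assuming the two frames are built from the values shown in the question. The names missing and result are only illustrative.

```python
import pandas as pd

# Example frames rebuilt from the question.
df1 = pd.DataFrame({"A": [5, 4, 2], "B": [1, 2, 5], "C": [5, 8, 3]})
df2 = pd.DataFrame({"A": [4, 3, 1], "B": [3, 5, 2], "C": [4, 1, 5], "D": [1, 2, 4]})

# Column labels present in df2 but absent from df1.
missing = df2.columns.difference(df1.columns)

# assign returns a new DataFrame; the existing columns are left untouched
# and each missing column is added with the constant value 0.
result = df1.assign(**{col: 0 for col in missing})
print(result)
```

Because assign returns a new DataFrame rather than modifying df1 in place, the result has to be bound to a name (df1 itself, as in the one-liner above, or a new one).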

CodePudding user response:

You can use DataFrame.add with fill_value:

print(df1.add(df2, fill_value=0))

Output:

   A  B  C    D
0  9  4  9  1.0
1  7  7  9  2.0
2  3  7  8  4.0

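Note: This method sums the two dataframes element-wise, so the shared columns change and D carries over df2's values rather than 0. It also treats any existing NaN in either dataframe as 0.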
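
To make the note concrete, here is a small sketch with made-up frames (left and right are hypothetical, not the data from the question) in which one frame already contains a NaN:

```python
import numpy as np
import pandas as pd

# Hypothetical frames: "left" has a NaN and no "D" column.
left = pd.DataFrame({"A": [1.0, np.nan]})
right = pd.DataFrame({"A": [10.0, 20.0], "D": [1, 2]})

# fill_value=0 replaces a missing operand with 0 before adding, both for
# the existing NaN in "A" and for the column "D" that left lacks.
summed = left.add(right, fill_value=0)
print(summed)
```

Here the NaN in A is treated as 0 before the addition, and D ends up holding the values from right rather than zeros.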

CodePudding user response:

Try this:

df3 = df1.add(df2).fillna(0).astype(int)

Output:

>>> df3
   A  B  C  D
0  9  4  9  0
1  7  7  9  0
2  3  7  8  0

CodePudding user response:

You can reindex one dataframe using columns from another one:

df1.reindex(df2.columns, axis=1, fill_value=0)
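
Since the question asks for something that works in a loop over many dataframes, here is one possible sketch built on reindex. It assumes the frames are kept in a dictionary; frames, all_columns and aligned are illustrative names, not part of the original answer.

```python
import pandas as pd

# Hypothetical collection of frames, using the data from the question.
frames = {
    "df1": pd.DataFrame({"A": [5, 4, 2], "B": [1, 2, 5], "C": [5, 8, 3]}),
    "df2": pd.DataFrame({"A": [4, 3, 1], "B": [3, 5, 2], "C": [4, 1, 5], "D": [1, 2, 4]}),
}

# Collect every column label seen in any frame, keeping first-seen order.
all_columns = []
for frame in frames.values():
    for col in frame.columns:
        if col not in all_columns:
            all_columns.append(col)

# reindex returns a new DataFrame; columns a frame lacks are created
# and filled with 0, while existing columns keep their values.
aligned = {
    name: frame.reindex(columns=all_columns, fill_value=0)
    for name, frame in frames.items()
}
print(aligned["df1"])
```

Note that reindex does not modify the original frame, so the result has to be stored, as the dictionary comprehension does here.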