I have two dataframes that look like this:
df1
A B C
5 1 5
4 2 8
2 5 3
df2
A B C D
4 3 4 1
3 5 1 2
1 2 5 4
df1 and df2 share the same column names except "D", which is only found in df2. What I would like to do is add D to df1 and fill all of its rows with 0.
In other words, if a column exists in df2 but not in df1, add that column to df1 and set every value in it to 0 (below):
df1
A B C D
5 1 5 0
4 2 8 0
2 5 3 0
I realize it would be very easy to add a single column called "D" to df1, but this is just a dummy example. In reality I am dealing with many more, and much larger, dataframes, so I am looking for code that I could run in a loop or apply iteratively.
CodePudding user response:
You can find the missing columns with Index.difference.
There are many ways to assign several columns with a static value to a DataFrame. Here is one that unpacks a dictionary into DataFrame.assign, where the keys are the missing column names and every value is the static value you want to assign (0 here):
df1 = df1.assign(**{x: 0 for x in df2.columns.difference(df1.columns)})
A B C D
0 5 1 5 0
1 4 2 8 0
2 2 5 3 0
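Since the question asks for something that works in a loop over many dataframes, the same idea can be sketched as follows. This is only a minimal sketch: the list name frames and the variable names all_columns and aligned are made up for illustration, and it assumes every frame should end up with the union of all column names seen across the list.

```python
import pandas as pd

df1 = pd.DataFrame({"A": [5, 4, 2], "B": [1, 2, 5], "C": [5, 8, 3]})
df2 = pd.DataFrame({"A": [4, 3, 1], "B": [3, 5, 2], "C": [4, 1, 5], "D": [1, 2, 4]})

# Hypothetical list holding every dataframe that should share the same columns.
frames = [df1, df2]

# Collect the union of all column names across the list.
all_columns = frames[0].columns
for frame in frames[1:]:
    all_columns = all_columns.union(frame.columns)

# For each frame, add whichever columns it lacks, filled with 0.
# assign returns a new DataFrame, so the originals are left untouched.
aligned = [
    frame.assign(**{col: 0 for col in all_columns.difference(frame.columns)})
    for frame in frames
]

print(aligned[0])
```

Frames that already have every column pass through unchanged, because the dictionary of missing columns is empty for them.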
CodePudding user response:
You can use DataFrame.add with fill_value:
print(df1.add(df2, fill_value=0))
Output:
A B C D
0 9 4 9 1.0
1 7 7 9 2.0
2 3 7 8 4.0
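Note: This method also fills any existing NaN in either dataframe with 0 before adding. It also sums the values of the shared columns, so A, B and C in the output are df1 + df2 rather than df1's original values.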
CodePudding user response:
Try this:
df3 = df1.add(df2).fillna(0).astype(int)
Output:
>>> df3
A B C D
0 9 4 9 0
1 7 7 9 0
2 3 7 8 0
CodePudding user response:
You can reindex one dataframe using the columns from another one:
df1.reindex(df2.columns, axis=1, fill_value=0)
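A runnable sketch of the reindex approach, with one caveat: reindexing on df2.columns alone would drop any column that exists only in df1. Assuming you want to keep those as well, you can reindex on the union of both column sets instead. The names target_columns and result are made up for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({"A": [5, 4, 2], "B": [1, 2, 5], "C": [5, 8, 3]})
df2 = pd.DataFrame({"A": [4, 3, 1], "B": [3, 5, 2], "C": [4, 1, 5], "D": [1, 2, 4]})

# Union of both column sets, so nothing that exists only in df1 is lost.
target_columns = df1.columns.union(df2.columns)

# reindex returns a new DataFrame; columns missing from df1 are filled with 0.
result = df1.reindex(columns=target_columns, fill_value=0)

print(result)
```

Because reindex does not modify df1 in place, assign the result back (df1 = df1.reindex(...)) if you want to overwrite the original.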