Context:
I'd like to divide 2 dataframes with several rows and different column names, element-wise. For example:
df1 = pd.DataFrame({'A':[4,2,1],'B':[10,4,2]})
df2 = pd.DataFrame({'C':[8,4,1],'D':[20,2,4]})
would result in this dataframe:
df3 = pd.DataFrame({'A':[0.5,0.5,1],'B':[0.5,2,0.5]})
I was trying something like this:
df3 = pd.DataFrame(df1.div(df2,axis='columns'),columns=df1.columns)
but I keep getting NaNs as a result, and it seems that df1 and df2 are being concatenated by rows instead (a total of 6 rows instead of 3). Is there a way to do this without converting to NumPy arrays or Series?
Thank you in advance!
Answer:
When performing an operation on two DataFrames, pandas first aligns them on both axes (the row index and the column labels).
Here the two frames share no column names, so no column of df1 finds a partner in df2 and every cell of the result is NaN.
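The alignment described above can be seen with a minimal sketch built on the frames from the question. The name df2_shifted is made up for illustration: it is df2 with different row labels, which makes the rows get unioned as well. That would be consistent with the 6 rows mentioned in the question, but it is only a guess about the real data.

```python
import pandas as pd

df1 = pd.DataFrame({'A': [4, 2, 1], 'B': [10, 4, 2]})
df2 = pd.DataFrame({'C': [8, 4, 1], 'D': [20, 2, 4]})

# Columns are matched by name. A and B do not exist in df2, C and D do not
# exist in df1, so the result has the union of the columns and only NaN.
aligned = df1.div(df2)
print(aligned)

# Hypothetical variant: same values as df2 but with row labels 3, 4, 5.
# Now the row labels do not match either, so the rows are unioned too.
df2_shifted = df2.set_axis([3, 4, 5], axis=0)
shifted_result = df1.div(df2_shifted)
print(shifted_result.shape)
```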
One option is to divide by a NumPy array:
df1.div(df2.to_numpy())
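As a minimal sketch with the frames from the question: `to_numpy()` drops all labels, so the division is purely positional. This assumes both frames have the same shape and the same row order.

```python
import pandas as pd

df1 = pd.DataFrame({'A': [4, 2, 1], 'B': [10, 4, 2]})
df2 = pd.DataFrame({'C': [8, 4, 1], 'D': [20, 2, 4]})

# The array carries no labels, so cell (i, j) of df1 is divided by
# cell (i, j) of df2, regardless of index or column names.
df3 = df1.div(df2.to_numpy())

# The result keeps df1's index and column names.
print(df3)
```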
Alternatively, if you still want to align on the row index, you can set df2's column names to those of df1:
df1.div(df2.set_axis(df1.columns, axis=1))
Output:
     A    B
0  0.5  0.5
1  0.5  2.0
2  1.0  0.5
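The two approaches give the same result for the frames in the question. They only differ when the row labels of the two frames do not line up. The sketch below illustrates this with a reordered copy of df2 (the name df2_reordered is hypothetical, introduced only for this example).

```python
import pandas as pd

df1 = pd.DataFrame({'A': [4, 2, 1], 'B': [10, 4, 2]})
df2 = pd.DataFrame({'C': [8, 4, 1], 'D': [20, 2, 4]})

# Hypothetical variant of df2: same rows in reverse order (row labels 2, 1, 0).
df2_reordered = df2.iloc[::-1]

# Positional: the first row of df1 is divided by the first physical row
# of df2_reordered, whatever its label is.
by_position = df1.div(df2_reordered.to_numpy())

# Label-aligned: the row labelled 0 in df1 is divided by the row labelled 0
# in df2_reordered, because only the column names were overwritten.
by_label = df1.div(df2_reordered.set_axis(df1.columns, axis=1))

print(by_position)
print(by_label)
```

If the row order of the two frames is guaranteed to match, the NumPy version is the shorter choice. If rows should be matched by label, renaming the columns keeps pandas' index alignment intact.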