Multiplying 2 pandas dataframes generates nan

Time:09-19

I have 2 dataframes as below

import pandas as pd
dat = pd.DataFrame({'val1' : [1,2,1,2,4], 'val2' : [1,2,1,2,4]})
dat1 = pd.DataFrame({'val3' : [1,2,1,2,4]})

Now I want to multiply each column of dat by dat1, so I tried the following:

dat * dat1

However, this generates NaN for every element.

Could you please help me find the correct approach? I could run a for loop over each column of dat, but I wonder if there is a better method to do the same.

Thanks for your pointer.

CodePudding user response:

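When doing multiplication (or any other arithmetic operation), pandas first aligns the operands by label. For dataframes, this applies to both the index and the columns. Where a row/column label pair exists in both operands, the values are multiplied; otherwise the result is NaN. The result carries the union of the operands' indices and columns. Here dat and dat1 share no column name, so every cell ends up NaN.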
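As a rough illustration of that alignment, using the same data as in the question, the sketch below checks that the product has the union of the column labels and contains nothing but NaN:

```python
import pandas as pd

dat = pd.DataFrame({'val1': [1, 2, 1, 2, 4], 'val2': [1, 2, 1, 2, 4]})
dat1 = pd.DataFrame({'val3': [1, 2, 1, 2, 4]})

# No column label is shared, so after alignment every cell is
# missing on one side of the multiplication.
result = dat * dat1

print(result.shape)               # rows x (union of the column labels)
print(sorted(result.columns))     # labels from both operands
print(result.isna().all().all())  # whether every single cell is NaN
```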

So, to "avoid" this alignment, make dat1 a label-unaware data structure, e.g., a NumPy array:


In [116]: dat * dat1.to_numpy()
Out[116]:
   val1  val2
0     1     1
1     4     4
2     1     1
3     4     4
4    16    16


To see what's "really" being multiplied, you can do the alignment yourself:

In [117]: dat.align(dat1)
Out[117]:
(   val1  val2  val3
 0     1     1   NaN
 1     2     2   NaN
 2     1     1   NaN
 3     2     2   NaN
 4     4     4   NaN,
    val1  val2  val3
 0   NaN   NaN     1
 1   NaN   NaN     2
 2   NaN   NaN     1
 3   NaN   NaN     2
 4   NaN   NaN     4)

(Extra: dat and dat1 currently have the same index; change the index of one of them and align again to see the union behaviour.)
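One possible way to try that experiment is sketched below. The name shifted and the new row labels 3..7 are arbitrary choices for illustration, not part of the question:

```python
import pandas as pd

dat = pd.DataFrame({'val1': [1, 2, 1, 2, 4], 'val2': [1, 2, 1, 2, 4]})
dat1 = pd.DataFrame({'val3': [1, 2, 1, 2, 4]})

# Hypothetical relabelling: move dat1's row labels from 0..4 to 3..7,
# so that only rows 3 and 4 overlap with dat.
shifted = dat1.copy()
shifted.index = range(3, 8)

# align() defaults to an outer join on both axes.
left, right = dat.align(shifted)

print(left.index.tolist())  # union of the two row indices
print(left)                 # dat's values, NaN in the rows/columns it lacks
print(right)                # shifted's values, NaN in the rows/columns it lacks
```

Both aligned frames come out with the same shape, which is what the arithmetic operator then works on cell by cell.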

CodePudding user response:

You need to change two things:

  • use mul with axis=0
  • use a Series instead of dat1 (otherwise the multiplication aligns on the column labels, and your two dataframes have none in common)

out = dat.mul(dat1['val3'], axis=0)

output:

   val1  val2
0     1     1
1     4     4
2     1     1
3     4     4
4    16    16
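As a quick sanity check on the question's data, the following sketch compares the NumPy-array approach with the mul approach. It assumes the two frames have the same number of rows in the same order, which the array version relies on since it ignores labels:

```python
import pandas as pd

dat = pd.DataFrame({'val1': [1, 2, 1, 2, 4], 'val2': [1, 2, 1, 2, 4]})
dat1 = pd.DataFrame({'val3': [1, 2, 1, 2, 4]})

# Label-unaware: the (5, 1) array is broadcast across dat's columns by position.
via_numpy = dat * dat1.to_numpy()

# Label-aware: the Series is matched to dat's row index (axis=0).
via_mul = dat.mul(dat1['val3'], axis=0)

print(via_numpy.equals(via_mul))  # whether both approaches agree on this data
```

The practical difference is that mul still aligns on the row index, so it stays correct if the rows of dat1 are reordered, whereas the array version multiplies purely by position.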