Add numpy matrices treating SymPy nan entries as 0

Time:06-18

I have a numpy matrix and want the sum of each row. Some entries are nan, and I want to treat those as 0. For instance, I would expect the row sums of

[1, nan, 30]
[nan, 2, nan]
[8, nan, nan]
[nan, 1, 16]

to be

[31]
[2]
[8]
[17]

EDIT: each row contains entries which are either nan or a SymPy expression, so the sum of the row [x, nan, 30*x] should be 31*x.

However, these nan values are not np.NaN, but are created in calculations with SymPy and therefore are S.NaN.

np.isnan(S.NaN) returns False. I have tried using np.nansum but because of this, it just returns nan.

import numpy as np
from sympy import S

A = np.matrix([[1, np.nan, 30],
    [np.nan, 2, np.nan],
    [8, np.nan, np.nan],
    [np.nan, 1, 16]])

B = np.matrix([[1, S.NaN, 30],
        [S.NaN, 2, S.NaN],
        [8, S.NaN, S.NaN],
        [S.NaN, 1, 16]])

np.nansum(A, 1) returns

matrix([[31.],
        [ 2.],
        [ 8.],
        [17.]])

as expected.

But np.nansum(B, 1) returns

matrix([[nan],
        [nan],
        [nan],
        [nan]])

I also thought that I might be able to replace the S.NaN values with 0. There are quite a lot of answered questions about how to convert np.NaN values to 0 (e.g. "convert nan value to zero"), but those solutions rely on np.isnan(), and as far as I can tell there is no equivalent of it in SymPy.

Is there either an equivalent in SymPy for np.nansum which I could use, or a way to replace S.NaN values with 0 (or np.NaN)?

CodePudding user response:

I played with sympy.Matrix, but it turns out it isn't needed.

A numpy array with sympy.nan elements. Note the object dtype:

In [65]: arr
Out[65]: 
array([[1, nan, 30],
       [nan, 2, nan],
       [8, nan, nan],
       [nan, 1, 16]], dtype=object)

In [66]: type(arr[0,1])
Out[66]: sympy.core.numbers.NaN

It converts to a float array without problem:

In [67]: arr.astype(float)
Out[67]: 
array([[ 1., nan, 30.],
       [nan,  2., nan],
       [ 8., nan, nan],
       [nan,  1., 16.]])

In [68]: type(_[0,1])
Out[68]: numpy.float64

In [69]: np.nansum(__,1)
Out[69]: array([31.,  2.,  8., 17.])

sympy.nan converts to python float without problem:

In [71]: type(nan)
Out[71]: sympy.core.numbers.NaN

In [72]: type(float(nan))
Out[72]: float
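
The float-conversion route above can be wrapped up as a small helper. This is only a sketch: the name nan_row_sums is made up for illustration, and it assumes every entry that is not nan is a plain number. A symbolic entry such as x has no float value, so this route does not cover the case described in the question's EDIT.

```python
import numpy as np
from sympy import S


def nan_row_sums(obj_array):
    """Row sums of a purely numeric object array, counting SymPy NaN as 0.

    Hypothetical helper: it assumes every entry converts to a Python float.
    """
    # S.NaN converts to a float nan, so np.nansum can then skip it.
    as_float = np.asarray(obj_array, dtype=object).astype(float)
    return np.nansum(as_float, axis=1)


B = np.array([[1, S.NaN, 30],
              [S.NaN, 2, S.NaN],
              [8, S.NaN, S.NaN],
              [S.NaN, 1, 16]], dtype=object)

row_sums = nan_row_sums(B)
print(row_sums)
```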

sympy.Matrix

If I make a sympy.Matrix from the array:

In [83]: M= Matrix(arr)

In [84]: M
Out[84]: 
⎡ 1   nan  30 ⎤
⎢             ⎥
⎢nan   2   nan⎥
⎢             ⎥
⎢ 8   nan  nan⎥
⎢             ⎥
⎣nan   1   16 ⎦

In [85]: M[0,1]
Out[85]: 
nan

In [86]: type(M[0,1])
Out[86]: sympy.core.numbers.NaN

I can use subs to replace the nan:

In [87]: M1 = M.subs({nan: 0})

In [88]: M1
Out[88]: 
⎡1  0  30⎤
⎢        ⎥
⎢0  2  0 ⎥
⎢        ⎥
⎢8  0  0 ⎥
⎢        ⎥
⎣0  1  16⎦

np.sum works on this:

In [89]: np.sum(M1,1)
Out[89]: array([31, 2, 8, 17], dtype=object)

Presumably np.sum first turns the Matrix into an array:

In [90]: np.array(M1)
Out[90]: 
array([[1, 0, 30],
       [0, 2, 0],
       [8, 0, 0],
       [0, 1, 16]], dtype=object)
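
Since the question's EDIT mentions symbolic entries, here is a minimal sketch of the same subs approach on a matrix containing a symbol. The symbol x and the two-row matrix are made up for illustration, and the rows are summed inside SymPy rather than through np.sum.

```python
from sympy import Matrix, nan, symbols

x = symbols("x")

# Illustrative matrix mixing symbolic entries, numbers and SymPy nan.
M = Matrix([[x, nan, 30 * x],
            [nan, 2, nan]])

# Replace every nan entry with 0, as in the session above.
M1 = M.subs(nan, 0)

# Sum each row; iterating over a row yields its entries.
row_sums = [sum(M1.row(i)) for i in range(M1.rows)]
print(row_sums)
```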

==

We don't need isnan to test for sp.nan. While its float value is np.nan (or python nan), it's a distinct sympy object, and == works just fine (that's shown in the sympy.nan docs).

In [92]: arr==sp.nan
Out[92]: 
array([[False,  True, False],
       [ True, False,  True],
       [False,  True,  True],
       [ True, False, False]])

In [93]: arr1 = np.where(arr==sp.nan,0,arr)

In [94]: arr1
Out[94]: 
array([[1, 0, 30],
       [0, 2, 0],
       [8, 0, 0],
       [0, 1, 16]], dtype=object)

or if you insist on using nansum:

In [95]: np.nansum(np.where(arr==sp.nan,np.nan,arr),1)
Out[95]: array([31, 2, 8, 17], dtype=object)
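
The == mask does not require the entries to be numeric, so it should also cover rows like [x, nan, 30*x] from the question's EDIT. A minimal sketch, with an illustrative symbol x and a made-up two-row array:

```python
import numpy as np
from sympy import S, symbols

x = symbols("x")

# Object array mixing symbolic entries, numbers and SymPy nan.
arr = np.array([[x, S.NaN, 30 * x],
                [S.NaN, 2, S.NaN]], dtype=object)

# Elementwise == against S.NaN gives a plain boolean mask.
is_sympy_nan = arr == S.NaN

# Put 0 wherever the mask is True and keep the other entries as they are.
cleaned = np.where(is_sympy_nan, 0, arr)

# Summing an object array adds the entries with their own + operator.
row_sums = cleaned.sum(axis=1)
print(row_sums)
```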

CodePudding user response:

I found np.nditer, which lets you iterate over every entry of an array and write back to it. Using the fact that S.NaN raises a TypeError when it is used in an ordering comparison such as x > 0, the following code replaces every S.NaN entry with 0 (or anything else):

In[122]: B
Out[122]: 
matrix([[1, nan, 30],
        [nan, 2, nan],
        [8, nan, nan],
        [nan, 1, 16]], dtype=object)

In[123]: for x in np.nditer(B, flags=["refs_ok"], op_flags = ['readwrite']):
    try:
        x > 0
    except TypeError:
        x[...] = 0
        

In[124]: B
Out[124]: 
matrix([[1, 0, 30],
        [0, 2, 0],
        [8, 0, 0],
        [0, 1, 16]], dtype=object)
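
The loop above depends on how each entry reacts to a comparison. As a hedged alternative, the sketch below tests each entry for identity with S.NaN instead, which works because S.NaN is a singleton. The helper name zero_out_sympy_nan, the symbol x and the sample array are assumptions made for illustration, and a plain ndarray is used in place of np.matrix.

```python
import numpy as np
from sympy import S, symbols


def zero_out_sympy_nan(obj_array):
    """Return a copy of an object array with every S.NaN entry set to 0.

    Hypothetical helper: it checks identity with the S.NaN singleton
    rather than catching an exception from a comparison.
    """
    replace = np.frompyfunc(lambda e: 0 if e is S.NaN else e, 1, 1)
    return replace(obj_array)


x = symbols("x")
B = np.array([[x, S.NaN, 30 * x],
              [S.NaN, 1, 16]], dtype=object)

cleaned = zero_out_sympy_nan(B)
row_sums = cleaned.sum(axis=1)
print(row_sums)
```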