Replace null values in pandas data frame column with 2D np.zeros() array

Time:12-23

Assume the following data frame:

import pandas as pd
import numpy as np

vals = [1, 2, 3, 4, 5]

df = pd.DataFrame({'val': vals})
df['val'][[0, 3]] = np.nan

Gives:

    val
0   NaN
1   2.0
2   3.0
3   NaN
4   5.0

I need to be able to replace NaN values in the val column with a 2D numpy array of zeros. When I do the following:

z = np.zeros((10, 10))

df['val'][df['val'].isnull()] = z

The arrays are converted to scalars of value 0.0:

    val
0   0.0
1   2.0
2   3.0
3   0.0
4   5.0

I really need the array to be maintained (in this case, each NaN value, i.e. rows 0 and 3 of the original data frame, should be replaced with a 10x10 array of zeros). I've tried converting to object type first:

df = df.astype(object)
df['val'][df['val'].isnull()] = z

With no success. Why?

CodePudding user response:

This is caused by the object data type. One workaround is fillna with a dict that maps each null index label to its replacement:

# Assign back instead of using inplace=True on df.val, which may not update df
df['val'] = df['val'].fillna(dict(zip(df.index[df['val'].isnull()], z)))
df
                                                 val
0  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...
1                                                2.0
2                                                3.0
3  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...
4                                                5.0

CodePudding user response:

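Try this (note that .at is meant for scalar access, so passing a boolean mask to it may fail on some pandas versions):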

df = df.astype(object)
mask = df['val'].isnull()
df.at[mask, 'val'] = z[mask[mask].index].tolist()

Output:

>>> df
                                                 val
0  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...
1                                                2.0
2                                                3.0
3  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...
4                                                5.0
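
Note that both snippets above pair each null label with a single row of z (zip iterates over the rows of z, and z[mask[mask].index] selects rows 0 and 3), so the cells shown hold 1D rows of 10 zeros rather than the full 10x10 array. If the whole 2D array is needed in each cell, one possible approach is to build the column as a 1D object array and fill it one cell at a time. This is only a minimal sketch, not taken from either answer; the name cells is made up for illustration, and the data frame is recreated directly with NaN values.

```python
import numpy as np
import pandas as pd

# Same data as in the question, written with the NaN values already in place.
df = pd.DataFrame({'val': [np.nan, 2.0, 3.0, np.nan, 5.0]})
z = np.zeros((10, 10))

# Build the new column as a 1-D object array, one cell at a time.
# Assigning to a single integer position of an object array stores the
# object itself, so the 2-D array is neither broadcast nor flattened.
cells = np.empty(len(df), dtype=object)  # 'cells' is a hypothetical helper name
for i, v in enumerate(df['val']):
    # Copy z so the filled cells do not share one underlying buffer.
    cells[i] = z.copy() if pd.isna(v) else v

df['val'] = cells

print(df['val'].iloc[0].shape)  # (10, 10)
print(df['val'].iloc[1])        # 2.0
```

The explicit loop is slower than a vectorised call, but it sidesteps the question of how pandas interprets an array-valued right-hand side in a masked assignment.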