Is there an easy way to custom fill null values in Pandas?

I have the following DataFrame:

     A    B    C    D
0  0.0  0.0  0.0  0.0
1  1.0  1.0  1.0  1.0
2  NaN  2.0  2.0  2.0
3  NaN  3.0  3.0  3.0
4  NaN  4.0  4.0  NaN
5  NaN  NaN  5.0  NaN
6  NaN  NaN  6.0  NaN

I am working to generate visualizations with this data, and I need to fill the null values in a very specific way. I want to loop the existing values repeatedly for each column until the null values are all filled, so that the DataFrame looks like this:

     A    B    C    D
0  0.0  0.0  0.0  0.0
1  1.0  1.0  1.0  1.0
2  0.0  2.0  2.0  2.0
3  1.0  3.0  3.0  3.0
4  0.0  4.0  4.0  0.0
5  1.0  0.0  5.0  1.0
6  0.0  1.0  6.0  2.0

Is there any convenient way to do this in Pandas?

CodePudding user response:

You can apply a custom function to each column that takes the column's non-null values and repeats them cyclically up to the full length of the DataFrame. np.resize does this, since it repeats the input array when the requested size is larger than the array:

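import numpy as np

def f(x):
    # Keep only the non-null values of the column
    vals = x[x.notna()].values
    # np.resize repeats the values cyclically up to the column length
    vals = np.resize(vals, len(x))
    return vals

df = df.apply(f, axis=0)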

Result:

     A    B    C    D
0  0.0  0.0  0.0  0.0
1  1.0  1.0  1.0  1.0
2  0.0  2.0  2.0  2.0
3  1.0  3.0  3.0  3.0
4  0.0  4.0  4.0  0.0
5  1.0  0.0  5.0  1.0
6  0.0  1.0  6.0  2.0
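
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same np.resize idea. The name cycle_fill is hypothetical, the DataFrame is rebuilt from the question, and the guard for all-null columns is an added assumption (np.resize on an empty array would otherwise produce zeros). Like the answer above, it assumes the NaNs sit at the end of each column.

```python
import numpy as np
import pandas as pd

nan = np.nan
# Rebuild the DataFrame from the question
df = pd.DataFrame({
    "A": [0.0, 1.0, nan, nan, nan, nan, nan],
    "B": [0.0, 1.0, 2.0, 3.0, 4.0, nan, nan],
    "C": [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "D": [0.0, 1.0, 2.0, 3.0, nan, nan, nan],
})

def cycle_fill(col):
    """Repeat a column's non-null values cyclically to the column's length."""
    vals = col.dropna().to_numpy()
    if vals.size == 0:
        # Assumption: leave all-null columns untouched instead of zero-filling
        return col
    # Return a Series so the original index is kept explicitly
    return pd.Series(np.resize(vals, len(col)), index=col.index)

filled = df.apply(cycle_fill, axis=0)
print(filled)
```

Returning a Series rather than a bare array is a design choice here: it makes the index handling explicit instead of relying on how apply wraps array results.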

CodePudding user response:

One option is a for loop over the columns, using np.place to fill the nulls. The assumption is that the NaNs, if any, are at the end of each column:

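import numpy as np

for col in df:
    if df[col].hasnans:
        # Fill the NaN positions with the column's non-null values;
        # np.place repeats them if there are fewer values than NaNs
        np.place(df[col].to_numpy(),
                 df[col].isna(),
                 df[col].dropna().array)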

df
     A    B    C    D
0  0.0  0.0  0.0  0.0
1  1.0  1.0  1.0  1.0
2  0.0  2.0  2.0  2.0
3  1.0  3.0  3.0  3.0
4  0.0  4.0  4.0  0.0
5  1.0  0.0  5.0  1.0
6  0.0  1.0  6.0  2.0

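Note that np.place modifies the array in place, so no assignment is needed. This relies on df[col].to_numpy() returning a view of the column's data rather than a copy; with Copy-on-Write enabled in pandas, the returned array is read-only and the call may fail.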
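
If depending on in-place modification through to_numpy() is a concern, a variant that works on an explicit writable copy might look like the sketch below. The name place_fill is hypothetical, and skipping all-null columns is an added assumption. Because np.place writes only into the masked positions, this version keeps existing values where they are even when NaNs appear in the middle of a column.

```python
import numpy as np
import pandas as pd

def place_fill(col):
    """Fill NaN slots with the column's non-null values, cycling as needed."""
    # Explicit writable copy, so the original column is never touched
    arr = col.to_numpy(dtype=float, copy=True)
    mask = np.isnan(arr)
    # Assumption: nothing to do for columns with no NaNs or only NaNs
    if mask.any() and not mask.all():
        # np.place repeats the values if there are fewer of them than NaN slots
        np.place(arr, mask, arr[~mask])
    return pd.Series(arr, index=col.index, name=col.name)

# NaNs in the middle: the existing values 0.0 and 1.0 stay in place
s = pd.Series([0.0, np.nan, 1.0, np.nan, np.nan])
result = place_fill(s)
print(result.tolist())

# Applied column by column to a small frame
small = pd.DataFrame({"A": [0.0, 1.0, np.nan, np.nan],
                      "B": [5.0, 6.0, 7.0, np.nan]})
small_filled = small.apply(place_fill)
print(small_filled)
```

When the NaNs are all at the end of a column, this gives the same result as the np.resize approach. The two differ only for NaNs in the middle, where np.resize lays the non-null values out from the top and so shifts them.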