I want to do some operations on a matrix that contains a lot of NaNs. The goal is to go through the matrix column-wise and fill all NaNs in a column if that column otherwise holds a single constant value. I.e. if a column has (apart from the NaNs) always the same value, then all the NaNs in that column should be filled with that value.
For example, the input might look like the matrix below. The second column should end up filled entirely with 2.0, but the last column should stay as it is, because it contains two different values (2.0 and 3.0). I include code that does this, but I would love a more efficient way, as it is going to be applied to large matrices.
# INPUT
[[0.15266373 nan 0.06203641 0.45986034 nan]
[0.92699705 nan 0.76849622 0.26920507 nan]
[0.09337326 2. 0.58961375 0.34334054 nan]
[0.62647321 2. 0.55225681 0.26886006 2. ]
[0.2229281 nan 0.39064809 0.19316241 3. ]]
# OUTPUT
[[0.15266373 2. 0.06203641 0.45986034 nan]
[0.92699705 2. 0.76849622 0.26920507 nan]
[0.09337326 2. 0.58961375 0.34334054 nan]
[0.62647321 2. 0.55225681 0.26886006 2. ]
[0.2229281 2. 0.39064809 0.19316241 3. ]]
# CODE FOR MOCK MATRIX AND FILLING OF NANs
# -----------------------------------------
import numpy as np
# PREP OF MOCK MATRIX
np.random.seed(777)
a = np.random.rand(5, 5)
a[:,1] = np.nan
a[[2, 3],1] = 2.0
a[:,4] = np.nan
a[4,4] = 3.0
a[3,4] = 2.0
# FILL THE WANTED STRUCTURE
for c in range(a.shape[1]):
    values = np.unique(a[~np.isnan(a[:,c]),c])
    if values.size == 1:
        a[:,c] = values
Any help is appreciated. Best
CodePudding user response:
This is one way to do it:
colmin = np.nanmin(a, axis=0)
colmax = np.nanmax(a, axis=0)
b = (colmin == colmax)
a[:,b] = colmin[b]
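This works because a column whose non-NaN values are all equal has the same NaN-ignoring minimum and maximum. NumPy emits a RuntimeWarning ("All-NaN slice encountered") if there are all-NaN columns; those columns are left unchanged, since their min and max are both NaN and NaN never compares equal to itself. If you wish to hide the warning, it can be filtered with Python's warnings module.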
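For reference, the min/max approach can be wrapped in a small function. This is only a minimal sketch under a few assumptions of mine: the name fill_constant_columns is made up for illustration, it works on a float copy instead of modifying the input in place, and it silences the all-NaN RuntimeWarning with the standard warnings module.

```python
import warnings

import numpy as np


def fill_constant_columns(a):
    """Return a copy of `a` with NaNs filled in columns whose
    non-NaN values are all identical (hypothetical helper name)."""
    out = np.array(a, dtype=float)  # np.array copies by default

    # nanmin/nanmax warn on all-NaN columns; hide that warning here only.
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", category=RuntimeWarning)
        colmin = np.nanmin(out, axis=0)
        colmax = np.nanmax(out, axis=0)

    # True where a column has exactly one distinct non-NaN value.
    # All-NaN columns give nan == nan, which is False.
    constant = colmin == colmax

    # Broadcast the per-column constant over all rows of those columns.
    out[:, constant] = colmin[constant]
    return out


# Same mock matrix as in the question.
np.random.seed(777)
a = np.random.rand(5, 5)
a[:, 1] = np.nan
a[[2, 3], 1] = 2.0
a[:, 4] = np.nan
a[4, 4] = 3.0
a[3, 4] = 2.0

filled = fill_constant_columns(a)
print(filled)
```

Column 1 comes back filled with 2.0, while column 4 keeps its NaNs because it holds both 2.0 and 3.0. Returning a copy is a design choice for the sketch; assigning into a directly, as in the snippet above, avoids the extra memory on large matrices.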