sample = np.array([[-1, 1, -1, 1], [-1, 2, -1, 2], [-1, 3, 3, 3], [-1, 4, 4, 4], [-1, 5, 5, 5], [6, 6, 6, -1], [7, 7, 7, -1], [8, 8, -1, -1], [9, 9, -1, -1]])
float_sample = sample.astype(np.float64)
for row in float_sample:
    for ele in row:
        if (ele == -1.):
            float_sample[row][ele] = np.nan
I am trying to iterate through a 2D numpy array and, whenever I see -1, convert it to NaN (np.nan). But when I run the loop above, I get the following error message:
"Arrays used as indices must be of integer (or boolean) type"
How do I fix it so that I can iterate through a 2D numpy array of type float and convert every -1 it finds to NaN? I am doing this so that I can take the median of each column, but you can't do that using masked arrays, so I am stuck trying to do it with a normal numpy array.
CodePudding user response:
Use numpy's boolean indexing:
import numpy as np
sample = np.array(
    [[-1, 1, -1, 1], [-1, 2, -1, 2], [-1, 3, 3, 3], [-1, 4, 4, 4], [-1, 5, 5, 5], [6, 6, 6, -1], [7, 7, 7, -1],
     [8, 8, -1, -1], [9, 9, -1, -1]])
float_sample = sample.astype(np.float64)
float_sample[sample == -1] = np.nan
print(float_sample)
Output
[[nan  1. nan  1.]
 [nan  2. nan  2.]
 [nan  3.  3.  3.]
 [nan  4.  4.  4.]
 [nan  5.  5.  5.]
 [ 6.  6.  6. nan]
 [ 7.  7.  7. nan]
 [ 8.  8. nan nan]
 [ 9.  9. nan nan]]
As a side note, the problem is that you are using row to index in:
float_sample[row][ele] = np.nan
and row is an array of floats, not an integer position.
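To make the side note concrete, here is a minimal sketch of the failure and of one index-based way to keep an explicit loop. The smaller array and the names first_row and indexing_failed are only for illustration; np.ndenumerate is swapped in here as one possible way to get integer indices and is not part of the original code.

```python
import numpy as np

# A smaller array than in the question, just to keep the sketch short.
sample = np.array([[-1, 1, -1, 1], [6, 6, 6, -1]])
float_sample = sample.astype(np.float64)

# Iterating over a 2D array yields its rows (1D float arrays), not integer
# positions, so using a row as an index raises IndexError.
first_row = next(iter(float_sample))
try:
    float_sample[first_row]
    indexing_failed = False
except IndexError:
    indexing_failed = True

# np.ndenumerate yields ((i, j), value) pairs, so integer indices are
# available for the assignment.
for (i, j), value in np.ndenumerate(float_sample):
    if value == -1.0:
        float_sample[i, j] = np.nan
```

The boolean-indexing one-liner is still the shorter and faster option; the loop is only worth keeping if each element needs more involved handling.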
CodePudding user response:
Do not iterate (it is slow); use vectorized operations instead, such as np.where:
float_sample = np.where(sample == -1, np.nan, sample)
output:
array([[nan,  1., nan,  1.],
       [nan,  2., nan,  2.],
       [nan,  3.,  3.,  3.],
       [nan,  4.,  4.,  4.],
       [nan,  5.,  5.,  5.],
       [ 6.,  6.,  6., nan],
       [ 7.,  7.,  7., nan],
       [ 8.,  8., nan, nan],
       [ 9.,  9., nan, nan]])
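Assuming the end goal is still the per-column median mentioned in the question, a possible follow-up is np.nanmedian, which ignores NaN entries. This is a sketch on the question's data; the name column_medians is just illustrative.

```python
import numpy as np

sample = np.array(
    [[-1, 1, -1, 1], [-1, 2, -1, 2], [-1, 3, 3, 3], [-1, 4, 4, 4], [-1, 5, 5, 5],
     [6, 6, 6, -1], [7, 7, 7, -1], [8, 8, -1, -1], [9, 9, -1, -1]])

# Replace the -1 placeholders with NaN; mixing np.nan with integers
# makes the result a float array.
float_sample = np.where(sample == -1, np.nan, sample)

# axis=0 computes one median per column, skipping the NaN entries.
column_medians = np.nanmedian(float_sample, axis=0)
# per-column medians: 7.5, 5.0, 5.0, 3.0
```

Note that a column consisting only of NaN would yield NaN (with a RuntimeWarning), so that case may need separate handling.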
CodePudding user response:
Try this one:
sample = np.array([[-1, 1, -1, 1], [-1, 2, -1, 2], [-1, 3, 3, 3], [-1, 4, 4, 4], [-1, 5, 5, 5], [6, 6, 6, -1], [7, 7, 7, -1], [8, 8, -1, -1], [9, 9, -1, -1]])
float_sample = sample.astype(np.float64)
for row in float_sample:
    for i in range(len(row)):
        if (row[i] == -1.):
            row[i] = np.nan
Output:
[[nan  1. nan  1.]
 [nan  2. nan  2.]
 [nan  3.  3.  3.]
 [nan  4.  4.  4.]
 [nan  5.  5.  5.]
 [ 6.  6.  6. nan]
 [ 7.  7.  7. nan]
 [ 8.  8. nan nan]
 [ 9.  9. nan nan]]
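The loop above works because each row produced by iterating over a 2D array is a view onto the original data, so assigning to row[i] writes through to float_sample. A small sketch to check this, using a hypothetical 2x2 array and an illustrative rows_are_views flag:

```python
import numpy as np

float_sample = np.array([[-1.0, 1.0], [2.0, -1.0]])

# Each row obtained by iteration shares memory with float_sample
# (it is a view, not a copy).
rows_are_views = all(np.shares_memory(row, float_sample) for row in float_sample)

for row in float_sample:
    # enumerate supplies the integer position alongside the value.
    for i, value in enumerate(row):
        if value == -1.0:
            row[i] = np.nan  # writes into float_sample through the view
```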