I have a NumPy array of shape (30435615, 3) containing coordinates, expressed for example as (0.0 0.0 0.0 1), and I'm looking for a way to set to True the rows whose coordinates are also contained in another array. I tried numpy.where, but I'm running into problems. If I print the element at index 50 of my array I get:
>>> print(coordsRAS[50,:])
[-165.31173706 7.91322422 -271.87799072]
But if I search for this point:
>>> import numpy as np
>>> print(np.where((coordsRAS[:,0]==-165.31173706) & (coordsRAS[:,1] == 7.91322422) & (coordsRAS[:,2] == -256.87799072)))
(array([], dtype=int64),)
I can't figure out why it can't find the point.
EDIT 1: Sorry, I copied the wrong value above (-256.87799072 instead of -271.87799072). However, the real problem was the rounding done by print: the stored value has more significant digits than are displayed, which is why the exact comparison could not find it. Rounding first works:
np.where((np.round(coordsRAS[:,0],8)==-165.31173706) & (np.round(coordsRAS[:,1],8) == 7.91322422) & (np.round(coordsRAS[:,2],8) == -271.87799072))
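As a minimal illustration of this printing effect (the array below is made up for the example, not my real data, and np.isclose is just one possible tolerance-based check):

```python
import numpy as np

# Hypothetical stored value: it carries more digits than print() shows by default.
coords = np.array([[-165.311737060546875, 7.91322422, -271.87799072]])

print(coords[0])            # NumPy prints arrays with 8 digits of precision by default
print(float(coords[0, 0]))  # the Python float shows the extra digits

# Exact comparison against the printed (truncated) text does not match.
exact = coords[0, 0] == -165.31173706

# Rounding to the printed precision first, as in the line above.
rounded = np.round(coords[0, 0], 8) == -165.31173706

# Comparing with an absolute tolerance instead of rounding.
close = np.isclose(coords[0, 0], -165.31173706, rtol=0.0, atol=1e-8)
print(exact, rounded, close)
```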
But now I have another problem. The other array I want to compare coordsRAS against is smaller, so when I compare them with == I get a warning instead of a result:
>>> coordsRAS = np.where(coordsRAS[:,:]==points[:,:3],True,False)
C:/Users/silvi/AppData/Local/Temp/xpython_8292/987583353.py:11: DeprecationWarning: elementwise comparison failed; this will raise an error in the future.
coordsRAS = np.where(coordsRAS [:,:]==points[:,:3],True,False)
How can I set to True the entries of coordsRAS that are also present in points?
CodePudding user response:
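It seems you have a typo in your where condition: the value for the third column doesn't match the one you printed. Note that the three conditions have to be combined with &, not passed as separate arguments. Use the code below and it should work, provided the stored values are exactly these:
print(np.where((coordsRAS[:, 0] == -165.31173706) & (coordsRAS[:, 1] == 7.91322422) & (coordsRAS[:, 2] == -271.87799072)))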
CodePudding user response:
When you are working with floats, it is not a good idea to use equality tests to find numbers, because you are always dealing with numerical inaccuracies. The answer given by Majid can fail if, for example, you multiply coordsRAS by pi and then divide by pi again. In theory you get the same values back, but the exact comparison fails:
import numpy as np
coordsRAS = np.random.random((5, 3))
point = [-165.31173706, 7.91322422, -256.87799072]
coordsRAS[4, :] = point
coordsRAS *= np.pi
coordsRAS /= np.pi
result1 = np.where((coordsRAS[:, 0] == -165.31173706) & (coordsRAS[:, 1] == 7.91322422) & (coordsRAS[:, 2] == -256.87799072))
print(coordsRAS[result1])
We have multiplied and divided by the same number, but the exact comparison typically can no longer find the point because of numerical round-off error. The result in this case is:
[]
So the result is empty, because the floats have changed slightly due to round-off.
The solution is to compute the distance between each row of your array and the required point, and then look for the rows where that distance falls below a chosen tolerance. So you should do:
distance = np.linalg.norm(coordsRAS - point, axis=-1)
row = np.where(distance < 1e-10)
result2 = coordsRAS[row]
Now the correct point can still be found:
print(result2)
[[-165.31173706 7.91322422 -256.87799072]]
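As an alternative sketch, the same tolerance idea can be written per component with np.isclose instead of a Euclidean distance. The tolerance 1e-10 is carried over from above, rtol=0.0 is my own choice so that only the absolute tolerance applies, and the names coords and rows are made up for this example:

```python
import numpy as np

coords = np.random.random((5, 3))
point = np.array([-165.31173706, 7.91322422, -256.87799072])
coords[4, :] = point

# Introduce round-off error, as in the example above.
coords *= np.pi
coords /= np.pi

# Compare every component with an absolute tolerance; result has shape (5, 3).
close = np.isclose(coords, point, rtol=0.0, atol=1e-10)

# A row matches only if all three of its components are close.
rows = np.where(close.all(axis=1))[0]
print(rows)
print(coords[rows])
```

Which variant to prefer depends on whether the tolerance should apply to the overall distance or to each coordinate separately.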
EDIT1:
If you want to find all the locations stored in another, smaller array, you can iterate over its points. For example, suppose you have the following two arrays:
coordsRAS = np.random.random((10, 3))
points = np.random.random((3, 3))
coordsRAS[4:7, :] = points
Here the rows of points are also stored in coordsRAS. You can then recover their locations in coordsRAS as follows:
mask_total = None
for point in points:
    # distance from every row of coordsRAS to the current point
    distance = np.linalg.norm(coordsRAS - point, axis=-1)
    mask = distance < 1e-10
    if mask_total is None:
        mask_total = mask
    else:
        # accumulate the rows that match any of the points
        mask_total = mask_total | mask
result = coordsRAS[mask_total]
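For a very large coordsRAS, the Python loop can also be replaced by broadcasting, processed in chunks so the intermediate array stays small. This is only a sketch: the helper name mask_near_points and the chunk size are my own assumptions, not something from the question, and the memory used per chunk grows with chunk * len(points).

```python
import numpy as np

def mask_near_points(coords, points, tol=1e-10, chunk=100_000):
    """Return a boolean mask: True where a row of coords lies within tol of any row of points."""
    mask = np.zeros(len(coords), dtype=bool)
    for start in range(0, len(coords), chunk):
        block = coords[start:start + chunk]            # shape (c, 3)
        diff = block[:, None, :] - points[None, :, :]  # shape (c, k, 3) via broadcasting
        dist = np.linalg.norm(diff, axis=-1)           # shape (c, k)
        mask[start:start + chunk] = (dist < tol).any(axis=1)
    return mask

# Same setup as above, with a seeded generator so the run is repeatable.
rng = np.random.default_rng(0)
coords = rng.random((10, 3))
points = rng.random((3, 3))
coords[4:7, :] = points

mask = mask_near_points(coords, points)
print(np.flatnonzero(mask))
print(coords[mask])
```

If points is itself large, a spatial index such as a KD-tree is probably a better fit than comparing against every point.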