GOAL
I have values v given at specific 3D coordinates x, y, z. The data is stored as a pandas dataframe:
          x       y       z         v
0     -68.5   68.50  -10.00  0.297845
1     -68.5  -23.29   61.10  0.148683
2     -68.5  -23.29   63.47  0.142325
3     -68.5  -23.29   65.84  0.135908
4     -68.5  -23.29   68.21  0.129365
...     ...     ...     ...       ...
91804  68.5   23.29  151.16  0.118460
91805  68.5   23.29  153.53  0.119462
91806  68.5   23.29  155.90  0.120386
91807  68.5   23.29  139.31  0.112257
91808  68.5  -68.50  227.00  0.127948
I would like to find the values at new coordinates that are not part of the dataframe, hence I am looking into how to efficiently interpolate the data.
What I have done:
Since the coordinates are on a grid, I can use
I am wondering what I am doing wrong.
Other:
Dataset
Please find the complete dataset here: https://filebin.net/u10lrw956enqhg5i
Visualization
from mayavi import mlab
# Create figure
fig = mlab.figure(1, fgcolor=(0, 0, 0), bgcolor=(0, 0, 0))
mlab.points3d(xs,ys,zs,output)
mlab.view(azimuth=270, elevation=90, roll=180, figure=fig)
# View plot
mlab.show()
CodePudding user response:
I strongly suspect that your data, while on a grid, is not ordered so as to allow a simple reshape of the values. You have two solutions available, both involving reordering the data in different ways.
Solution 1
Since you're already using np.unique to extract the grid, you can get the correct ordering of vs using the return_inverse parameter:
px, ix = np.unique(xs, return_inverse=True)
py, iy = np.unique(ys, return_inverse=True)
pz, iz = np.unique(zs, return_inverse=True)
points = (px, py, pz)
values = np.empty_like(vs, shape=(px.size, py.size, pz.size))
values[ix, iy, iz] = vs
return_inverse is sort of magical, largely because it's so counterintuitive. In this case, for each element of vs, it tells you which unique, sorted grid location it corresponds to.
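To make the role of return_inverse concrete, here is a minimal sketch on a made-up 2 x 3 grid listed in scrambled order. The coordinates and values are hypothetical, not taken from the question's dataset, and the example is 2D only to keep it short.

```python
import numpy as np

# Hypothetical 2 x 3 grid, listed in scrambled order (not the question's data).
xs = np.array([1.0, 0.0, 1.0, 0.0, 0.0, 1.0])
ys = np.array([5.0, 7.0, 6.0, 5.0, 6.0, 7.0])
vs = np.array([10.0, 2.0, 11.0, 0.0, 1.0, 12.0])

px, ix = np.unique(xs, return_inverse=True)
py, iy = np.unique(ys, return_inverse=True)

# ix[k] is the position of xs[k] within the sorted unique array px,
# so indexing px with ix reproduces xs exactly (likewise for ys).
assert np.array_equal(px[ix], xs)
assert np.array_equal(py[iy], ys)

# Scatter each sample into its grid cell.
values = np.empty_like(vs, shape=(px.size, py.size))
values[ix, iy] = vs
print(values)
```

Because every (ix[k], iy[k]) pair is distinct on a complete grid, the scatter assignment fills each cell exactly once, whatever the row order of the input.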
By the way, if you are missing grid elements, you may want to replace np.empty_like(vs, shape=(px.size, py.size, pz.size)) with either np.zeros_like(vs, shape=(px.size, py.size, pz.size)) or np.full_like(vs, np.nan, shape=(px.size, py.size, pz.size)). In the latter case, you could interpolate the NaNs in the grid first.
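As a rough illustration of the missing-element case, the sketch below pre-fills the grid with NaN via np.full_like and then locates the holes. The 2 x 2 grid with one absent sample is invented for the example, and it assumes vs has a floating-point dtype so that NaN can be stored.

```python
import numpy as np

# Hypothetical 2 x 2 grid in which the sample at (x=1, y=1) is missing.
xs = np.array([1.0, 0.0, 0.0])
ys = np.array([0.0, 1.0, 0.0])
vs = np.array([3.0, 2.0, 1.0])

px, ix = np.unique(xs, return_inverse=True)
py, iy = np.unique(ys, return_inverse=True)

# Start from an all-NaN grid so that cells without a sample stay marked.
values = np.full_like(vs, np.nan, shape=(px.size, py.size))
values[ix, iy] = vs

# Boolean mask of grid cells that received no sample.
missing = np.isnan(values)
print(missing)
```

With np.empty_like the unfilled cell would hold arbitrary memory contents, so an explicit fill value is what makes the gap detectable afterwards.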
Solution 2
The more obvious solution would be to rearrange the indices so you can reshape vs as you tried to do. That only works if you're sure that there are no missing grid elements. The easiest way would be to sort the whole dataframe, since the pandas methods are less annoying than np.lexsort (IMO):
df.sort_values(['x', 'y', 'z'], inplace=True, ignore_index=True)
When you extract, do it efficiently:
xs, ys, zs, vs = df.to_numpy().T
Since everything is sorted, you don't need np.unique to identify the grid any more. The number of unique x values is:
nx = np.count_nonzero(np.diff(xs)) + 1
And the unique values are:
bx = xs.size // nx
ux = xs[::bx]
y values go through a full cycle every bx elements, so
ny = np.count_nonzero(np.diff(ys[:bx])) + 1
by = bx // ny
uy = ys[:bx:by]
And for z (bz == 1):
nz = np.count_nonzero(np.diff(zs[:by])) + 1
uz = zs[:by]
Now you can construct your original arrays:
points = (ux, uy, uz)
values = vs.reshape(nx, ny, nz)
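Putting the steps of Solution 2 together, here is a hedged end-to-end sketch on a small synthetic grid. The grid, the linear test function v = x + 2*y + 3*z, and the shuffle are all made up for illustration. It also assumes the interpolator is scipy.interpolate.RegularGridInterpolator, which the surviving text of the question does not actually name.

```python
import numpy as np
import pandas as pd
from scipy.interpolate import RegularGridInterpolator

# Hypothetical complete 2 x 3 x 4 grid with v = x + 2*y + 3*z (not the real data).
gx = np.array([0.0, 1.0])
gy = np.array([0.0, 1.0, 2.0])
gz = np.array([0.0, 1.0, 2.0, 3.0])
X, Y, Z = np.meshgrid(gx, gy, gz, indexing="ij")
df = pd.DataFrame({"x": X.ravel(), "y": Y.ravel(), "z": Z.ravel()})
df["v"] = df["x"] + 2 * df["y"] + 3 * df["z"]
df = df.sample(frac=1, random_state=0)  # scramble the row order

# Sort so that x varies slowest and z fastest, then pull out the columns.
df = df.sort_values(["x", "y", "z"], ignore_index=True)
xs, ys, zs, vs = df.to_numpy().T

nx = np.count_nonzero(np.diff(xs)) + 1
bx = xs.size // nx            # rows per distinct x value
ux = xs[::bx]
ny = np.count_nonzero(np.diff(ys[:bx])) + 1
by = bx // ny                 # rows per distinct (x, y) pair
uy = ys[:bx:by]
nz = np.count_nonzero(np.diff(zs[:by])) + 1
uz = zs[:by]

# Contiguous copies, as a precaution since the slices above are strided views.
points = tuple(np.ascontiguousarray(u) for u in (ux, uy, uz))
values = vs.reshape(nx, ny, nz)

# Linear interpolation at a point that is not on the grid.
interp = RegularGridInterpolator(points, values)
result = interp([[0.5, 1.5, 2.5]])
print(result)
```

Since the test function is linear, the default linear interpolation should reproduce it exactly, which gives a quick sanity check that the sorted reshape put every value in the right cell.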