Understanding xarray.apply_ufunc-CodePudding

I have an xarray with multiple time dimensions slow_time, fast_time a dimension representing different objects object and a dimension reflecting the position of each object at each point in time coords.

The goal is now to apply a rotation using scipy.spatial.transform.Rotation to each position in this array, for every point in time.

I'm struggling to figure out how to use xarray.apply_ufunc to do what I want, mainly because the concept of input_core_dimensions isn't really clear to me.

The code below shows what I'm trying to do:

import numpy as np
import xarray as xr
from scipy.spatial.transform import Rotation

# dummy initial positions
initial_position = xr.DataArray(np.arange(6).reshape((-1,3)), dims=["object", "coords"])
# dummy velocities
velocity = xr.DataArray(np.array([[1, 0, 0], [0, 0.5, 0]]), dims=["object", "coords"])

slow_time = xr.DataArray(np.linspace(0, 1, 10, endpoint=False), dims=["slow_time"])
fast_time = xr.DataArray(np.linspace(0, 0.1, 100, endpoint=False), dims=["fast_time"])

# times where to evaluate my function
times = slow_time   fast_time

# this is the data I want to transform
positions = times * velocity   initial_position

# these are the rotation angles
theta = np.pi/20 * times
phi = np.pi/100 * times

def apply_rotation(vectors, theta, phi):
    R = Rotation.from_euler("xz", (theta, phi))
    result = R.apply(vectors)
    return result

rotated_positions = xr.apply_ufunc(apply_rotation,
    positions, theta, phi, ...??)

The behaviour I'm looking for is essentially like four nested for loops that apply the rotations to each point like this pseudocode

for pos, t, p in zip(positions, theta, phi):
    R = Rotation.from_euler("xz", (t, p))
    R.apply(pos)

But I'm unsure how to proceed.

Using this

rotated_positions = xr.apply_ufunc(apply_rotation,
    positions, theta, phi,
    input_core_dims=[["object"],[],[]], output_core_dims=[["object"]])

I thought the function would be applied along subarrays of the object dimension, but now I get entire arrays passed into my function which doesn't work.

The information on apply_ufunc in the xarray documentation doesn't really makes things very clear.

Any help is appreciated!

CodePudding user response：

Reference

First off, a helpful reference would be this documentation page on unvectorized ufunc

Solution

As I understand your question you want to apply a rotation to the position vector of each object at every time. The way you have set up your data already puts the coordinates as the final dimension of the array.

Translating your pseudocode to generate a reference dataset rotatedPositions yields:

rotatedPositions = positions.copy()
for slowTimeIdx in range( len(slow_time)):
    for fastTimeIdx in range( len(fast_time) ):
        for obj in range(2):
            pos = rotatedPositions.data[slowTimeIdx, fastTimeIdx, obj].copy()
            rotatedPositions.data[slowTimeIdx, fastTimeIdx, obj] =  apply_rotation(pos, theta.data[slowTimeIdx, fastTimeIdx], phi.data[slowTimeIdx, fastTimeIdx])

where I hard-coded the object dimension size.

In essence the apply_rotation function takes 1 3-vector (1D array of size 3) and 2 scalars and returns a 1D array of size 3 (3 vector).

Following the documentation mentioned above I arrive at the following call to apply_ufunc:

rotated = xr.apply_ufunc(apply_rotation, 
               positions,
                theta,
                phi,
               input_core_dims=[['coords'], [], []],
               output_core_dims=[['coords']],
               vectorize=True
              )

Testing via

np.allclose(rotatedPositions.data, rotated.data)

indicates success.

Explanation

As I understand the documentation cited above apply_ufunc will take the function to be applied as first argument, then all positional arguments in order.

Next one has to provide the dimension labels of each dataset that will correspond to the data that will be core to apply_rotation working. Here this is coords, as we manipulate coordinates. Since neither theta nor phi have this dimension we do not specify anything for them.

Next we have to specify the dimensions the output data will have, since we just transform the output data we keep output_core_dims=[['coords']]. Leaving this out would lead apply_ufunc to assume output data to be 0-dimensional (a scalar).

Finally,vectorize=True ensures that the function is executed over all dimensions not specified in input_core_dims.