Partitioning np.array into sub-arrays with no np.nan values

Time:09-04

Say I have a np.array, e.g. a = np.array([np.nan, 2., 3., 4., 5., np.nan, np.nan, np.nan, 8., 9., 10., np.nan, 14., np.nan, 16.]). I want to obtain all sub-arrays with no np.nan value, i.e. my desired output is:

sub_arrays_list = [array([2., 3., 4., 5.]), array([8., 9., 10.]), array([14.]), array([16.])]

I kind of managed to solve this with the following but it is quite inefficient:

sub_arrays_list = []
start, end = 0, 0
while end < len(a) - 1:
    if np.isnan(a[end]).any():
        end += 1
        start = end
    else:
        while not np.isnan(a[end]).any():
            if end < len(a) - 1:
                end += 1
            else:
                sub_arrays_list.append(a[start:])
                break
        else:
            sub_arrays_list.append(a[start:end])
            start = end

Would anyone please suggest a faster and better alternative to achieve this? Many thanks!

CodePudding user response:

You can use:

# identify NaN values
m = np.isnan(a)
# array([ True, False, False, False, False,  True,  True,  True, False,
#        False, False,  True, False,  True, False])

# compute groups
idx = np.cumsum(m)
# array([1, 1, 1, 1, 1, 2, 3, 4, 4, 4, 4, 5, 5, 6, 6])

# remove NaNs, get indices of first non-NaN per group and split
out = np.split(a[~m], np.unique(idx[~m], return_index=True)[1][1:])

output:

[array([2., 3., 4., 5.]), array([ 8.,  9., 10.]), array([14.]), array([16.])]
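For reuse, the same mask / cumsum / split steps can be wrapped in a small function. This is only a sketch: the name split_on_nan is hypothetical, and the early return for empty or all-NaN input is an added assumption about the desired behaviour (without it, np.split would return a single empty array).

```python
import numpy as np

def split_on_nan(a):
    """Return the runs of consecutive non-NaN values of a 1-D array."""
    a = np.asarray(a, dtype=float)
    m = np.isnan(a)                  # True where the value is NaN
    if m.all():                      # empty or all-NaN input: nothing to return
        return []
    idx = np.cumsum(m)               # group label, bumped at every NaN
    # position (within the NaN-free array) of the first element of each group
    starts = np.unique(idx[~m], return_index=True)[1]
    return np.split(a[~m], starts[1:])

a = np.array([np.nan, 2., 3., 4., 5., np.nan, np.nan, np.nan,
              8., 9., 10., np.nan, 14., np.nan, 16.])
sub_arrays_list = split_on_nan(a)
print(sub_arrays_list)
```

Because every step is a vectorized NumPy call, there is no Python-level loop over the elements, which is where the while-loop version in the question spends its time.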