ValueError: all the input arrays must have same number of dims, but the arr at index 0 has 1 dimensi

I have an array as follows:

samples_data = [array([0., 0., 0., ..., 0., 0., 0.], dtype=float32)
 array([ 0.        ,  0.        ,  0.        , ..., -0.00020519,
        -0.00019427, -0.00107348], dtype=float32)
 array([ 0.0000000e+00,  0.0000000e+00,  0.0000000e+00, ...,
        -8.9004419e-07,  7.3998461e-07, -6.9706215e-07], dtype=float32)
 array([0., 0., 0., ..., 0., 0., 0.], dtype=float32)]

And I have a function like this:

def generate_segmented_data_1(
    samples_data: np.ndarray, sampling_rate: int = 16000
) -> np.ndarray:

    new_data = []

    for data in samples_data:
        segments = segment_audio(data, sampling_rate=sampling_rate)
        new_data.append(segments)

    new_data = np.array(new_data)

    return np.concatenate(new_data)

It raises an error like this:

ValueError: all the input arrays must have same number of dimensions, but the array at index 0 has 1 dimension(s) and the array at index 11 has 2 dimension(s)

The array at index 0 looks like this:

[array([ 0.        ,  0.        ,  0.        , ..., -0.00022057,
         0.00013752, -0.00114789], dtype=float32)
 array([-4.3174211e-04, -5.4488028e-04, -1.1238289e-03, ...,
         8.4724619e-05,  3.0450989e-05, -3.9514929e-05], dtype=float32)]

And the array at index 11 looks like this:

[[3.0856067e-05 3.0295929e-05 3.0955063e-05 ... 8.5010566e-03
  1.3315652e-02 1.5698154e-02]]

What should I do so that all of the segments I produce are concatenated into a single array of segments?

CodePudding user response：

I'm not quite sure I understand what you are trying to do.

b = np.array([[2]])
b.shape
# (1,1)

b = np.array([2])
b.shape
# (1,)

For the segment part of the question, it is unclear what your data structure is, but the code example is broken, as you are appending to a list that hasn't been created.

What is samples_data? Is it a list? A list of what? NumPy arrays? If so, what are the dimensions of those arrays?

CodePudding user response：

How can I get the shape of the array below to be 1D instead of 2D?
b = np.array([[2]])
b_shape = b.shape
This results in (1, 1), but I want it to be (1,) without flattening it.

I suspect the confusion stems from the fact that you chose an example which can also be seen as a scalar, so I'll use a different example instead:

b = np.array([[1,2]])

Now b.shape is (1, 2). Removing the leading length-1 dimension, whether with b.flatten(), b.squeeze() or b[0], gives the same values in every case:

assert (b.flatten() == b.squeeze()).all()
assert (b.flatten() == b[0]).all()
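
To make the comparison concrete, here is a small sketch (my own addition, not taken from the question) that checks the resulting shapes and also shows one difference the asserts above don't cover: flatten() always returns a copy, while squeeze() and b[0] return views of the original array.

```python
import numpy as np

b = np.array([[1, 2]])          # shape (1, 2)

flat = b.flatten()              # always a copy
squeezed = b.squeeze()          # a view; drops every length-1 axis
first_row = b[0]                # a view; drops only the first axis

# All three are 1D and hold the same values.
print(flat.shape, squeezed.shape, first_row.shape)

# Writing through a view changes b as well; the copy is unaffected.
first_row[0] = 99
print(b[0, 0], squeezed[0], flat[0])
```

For the original np.array([[2]]), note that squeeze() with no arguments would drop both axes and return a 0-dimensional array, so b[0] or b.squeeze(axis=0) is the safer way to end up with shape (1,).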

Now, for the real problem: it appears you're trying to concatenate "rows" from "segments", but the "segments" (which, judging from your sample data, are lists of np.arrays?) are inconsistently formed.

Your sample data is very chaotic: segments 0-10 seem to be lists of 1D arrays; segments 11, 18 and 19 are either 2D arrays or lists of lists of floats. This, plus the error message, suggests you have an issue in the data processing of the segments.
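
One way to confirm which segments are the odd ones out is to print the container type and shape of each result before concatenating anything. The sketch below is only an illustration: describe() is a hypothetical helper, and the examples list is made-up data standing in for whatever your segmenting function returns.

```python
import numpy as np

def describe(item):
    """Return a short description of an item's container type and shape."""
    if isinstance(item, np.ndarray):
        return f"ndarray ndim={item.ndim} shape={item.shape} dtype={item.dtype}"
    if isinstance(item, (list, tuple)):
        inner = [np.shape(x) for x in item]
        return f"{type(item).__name__} of {len(item)} items, inner shapes={inner}"
    return type(item).__name__

# Made-up stand-ins for what the segmenting function might return.
examples = [
    [np.zeros(3, dtype=np.float32), np.zeros(5, dtype=np.float32)],  # list of 1D arrays
    np.zeros((1, 4), dtype=np.float32),                               # 2D array
]

for index, item in enumerate(examples):
    print(index, describe(item))
```

Running the same loop over your real segments should show exactly where the 1D/2D mismatch reported by np.concatenate comes from.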

Now, to actually concatenate both types of data:

new_data = []
for data in samples_data:
    segments = function_a(data)     # it appears this doesn't return consistent data
    segments = np.asarray(segments) # force it to always be an array...
    if segments.ndim > 1:           # ...and append each row
        for row in segments:
            new_data.append(row)
    elif segments.ndim == 1:        # if just one row, append it directly
        new_data.append(segments)
    else:
        # function_a returned an empty list, do nothing
        pass

Given the shown data and code, this should work (but it's neither efficient, nor tested).
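
One caveat about the loop above: if a segment is a list of 1D arrays with different lengths, np.asarray either builds a 1D object array (older NumPy) or raises a ValueError (NumPy 1.24 and later), so those rows would not be split correctly. Below is a sketch of a variant that avoids converting ragged lists. It assumes each result is a 1D numeric array (a single segment), a 2D array, or a list of 1D arrays; collect_rows is a hypothetical name, and the groups list is made-up test data, not your audio.

```python
import numpy as np

def collect_rows(segment_groups):
    """Gather every 1D segment from a mix of 1D arrays, 2D arrays and lists of arrays."""
    rows = []
    for group in segment_groups:
        is_numeric_array = isinstance(group, np.ndarray) and group.dtype != object
        if is_numeric_array and group.ndim == 1:
            # A single 1D segment: keep it as one row (skip it if empty).
            if group.size:
                rows.append(group)
            continue
        # A 2D array or a list of arrays: iterate over it row by row.
        # An empty list contributes nothing.
        for row in group:
            rows.append(np.asarray(row).ravel())
    return rows

# Made-up data mimicking the inconsistent shapes in the question.
groups = [
    [np.zeros(3, dtype=np.float32), np.ones(5, dtype=np.float32)],  # ragged list
    np.full((1, 4), 2.0, dtype=np.float32),                         # 2D array
    np.arange(2, dtype=np.float32),                                 # single 1D segment
    [],                                                             # nothing returned
]

rows = collect_rows(groups)
print([row.shape for row in rows])
```

The result is a plain Python list of 1D arrays. If every segment has the same length you can turn it into a 2D array with np.stack(rows); if the lengths differ, as they appear to in your data, keeping the list is the simplest option.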