Insert new column from 1D array to NumPy 3D array

I have two arrays from the MNIST dataset. The first array has shape (60000,28,28) and the second has shape (60000,).

Is it possible to combine these and make a new array that is (60000,28,28,1)? I've tried reshaping, resizing, inserting, concatenating and a bunch of other methods to no avail!

Would really appreciate some help! TIA!

CodePudding user response：

I don't think this is possible with a plain concatenation. To concatenate two arrays, they must have the same number of dimensions, and every dimension except the one you concatenate along must have the same size.

You can picture the (60000, 28, 28) array as a box. The face looking at you is 28 x 28, and there are 60,000 such faces stacked behind one another. Anything you attach to it must also be 3-D, and the two dimensions other than the one you attach along must match those of the first box. Otherwise the arrays cannot be concatenated.

To concatenate (60000, 28, 28) with another array, the second array must match it on two of its three axes and may differ only on the concatenation axis. Suppose the second array has shape (60000, 28, 14). Then you can concatenate along the last axis and get:

z = np.concatenate((array1, array2), axis=2)
z.shape

Output:

(60000, 28, 42)

Alternatively, if the second array is (30,000, 28, 28):

z = np.concatenate((array1, array2), axis=0)
z.shape

Output:

(90000, 28, 28)
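
As a quick check of the rule above, here is a minimal sketch that uses small stand-in arrays (6 images instead of 60,000). The names images, extra_cols, extra_imgs and labels are made up for illustration. It also shows that concatenating the 1-D label array directly with the 3-D image array raises an error.

```python
import numpy as np

# Small stand-ins for the arrays in the question (6 images instead of 60000).
images = np.zeros((6, 28, 28), dtype=np.uint8)
extra_cols = np.ones((6, 28, 14), dtype=np.uint8)  # differs only on axis 2
extra_imgs = np.ones((3, 28, 28), dtype=np.uint8)  # differs only on axis 0
labels = np.arange(6)                              # shape (6,), like the label array

wide = np.concatenate((images, extra_cols), axis=2)
tall = np.concatenate((images, extra_imgs), axis=0)
print(wide.shape)  # (6, 28, 42)
print(tall.shape)  # (9, 28, 28)

# A 1-D array cannot be concatenated with a 3-D one: the dimension counts differ.
try:
    np.concatenate((images, labels), axis=0)
    concat_failed = False
except ValueError as err:
    concat_failed = True
    print("concatenate failed:", err)
```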

CodePudding user response：

It seems like you might have misunderstood how numpy arrays work or how they should be used.

Each dimension of an array (except the innermost one) can be thought of as an array of arrays. So in your example with shape (60000, 28, 28), you have an array of 60000 arrays, each of which holds 28 arrays. Each of those innermost arrays then holds 28 values of some sort (integers in the MNIST dataset, I think).

You can convert this into a (60000, 28, 28, 1) array by using NumPy's expand_dims function like so:

new_array = numpy.expand_dims(original_array, axis=-1)

However, this only turns each innermost value into a one-element array; it does not include the other array in any way.
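
To illustrate, a small sketch with a stand-in array (6 images instead of 60,000; the name images is an assumption) shows that expand_dims only adds a trailing axis of length 1. Indexing with np.newaxis or calling reshape gives the same shape.

```python
import numpy as np

# Stand-in for the (60000, 28, 28) image array from the question.
images = np.zeros((6, 28, 28), dtype=np.uint8)

a = np.expand_dims(images, axis=-1)      # add a new last axis
b = images[..., np.newaxis]              # same result via indexing
c = images.reshape(images.shape + (1,))  # same result via reshape

print(a.shape, b.shape, c.shape)
```

All three results hold exactly the same values as the original array; none of them contains the labels.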

From what I can read in your question, it seems you want to pair each MNIST image with its label. You could do this by making each element of the outermost dimension a tuple of (image: 28x28 NumPy array, label: int), but you would lose NumPy's array functionality. The best course of action is probably to keep the two arrays as they are and use an image's index to look up its label.
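
A minimal sketch of that index-based approach, using made-up stand-in data (random pixel values and arbitrary labels, not the real MNIST arrays):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins: 6 random "images" and 6 arbitrary labels kept as parallel arrays.
images = rng.integers(0, 256, size=(6, 28, 28), dtype=np.uint8)
labels = np.array([5, 0, 4, 1, 9, 2])

# The same index selects an image and its label.
i = 3
image, label = images[i], labels[i]
print(image.shape, label)

# Applying one permutation to both arrays keeps the pairs aligned, e.g. when shuffling.
perm = rng.permutation(len(images))
shuffled_images, shuffled_labels = images[perm], labels[perm]
```

Because both arrays are indexed along axis 0, any selection applied to both with the same indices (slicing, shuffling, boolean masks) keeps each image matched to its label.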