How to convert a single dimension array to a multi dimension array with one column in Python

Time:05-31

I have an algorithm that predicts values. When the labels are multi-label, it returns a multi-dimensional array.

For example:

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=TestSize, random_state=42)

>>> y_test
array([[1, 0, 0, ..., 0, 0, 0],
       [0, 1, 0, ..., 0, 0, 0],
       [0, 0, 1, ..., 1, 1, 0],
       ...,
       [1, 0, 1, ..., 0, 1, 1],
       [0, 1, 0, ..., 0, 1, 1],
       [1, 0, 0, ..., 0, 0, 1]], dtype=uint8)

>>> y_test.shape
(100, 20)

mdl.fit(X_train, y_train)
y_hat = mdl.predict(X_test)

In this case the outcome is a multi-dimensional array:

>>> y_hat
array([[0, 1, 1, ..., 0, 0, 0],
       [0, 0, 1, ..., 0, 0, 0],
       [0, 1, 0, ..., 0, 1, 0],
       ...,
       [0, 1, 1, ..., 0, 1, 1],
       [0, 0, 0, ..., 0, 1, 1],
       [0, 0, 1, ..., 0, 0, 1]], dtype=uint8)

>>> y_hat.shape
(100, 20)

This is good, and there are no issues here.

But when I work with a single label, as in this example:

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=TestSize, random_state=42)

>>> y_test
array([[1],
       [0],
       [0],
       ...,
       [1],
       [0],
       [1]], dtype=uint8)


>>> y_test.shape
(100, 1)

mdl.fit(X_train, y_train)
y_hat = mdl.predict(X_test)

In this case y_test is a 2-D array with one column, shape (100, 1), but y_hat is a one-dimensional array, shape (100,):

>>> y_hat
array([0, 1, 1, 0, 0, 0, 1, 0,
       1, 0, 1, 0, 1, 0, 1, 1,
       ...
       0, 1, 1, 0, 0, 0, 1, 0], dtype=uint8)

>>> y_hat.shape
(100,)

How can I convert y_hat to a 2-D array with one column, shape (100, 1), only when y_hat does not have the same number of dimensions as y_test?

CodePudding user response:

To automatically convert to 2D only if needed, you can use numpy.atleast_2d:

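import numpy as np

a = np.random.randint(0, 2, 100)
a.shape
# (100,)

# atleast_2d turns the 1-D array into shape (1, 100); the final .T makes it a column.
# An array that is already 2-D passes through with its shape unchanged.
a = np.atleast_2d(a.T).T
a.shape
# (100, 1)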
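The question asks for the conversion to happen only when y_hat and y_test differ in dimensions. A minimal sketch of an explicit check is shown here; the helper name match_ndim and the zero-filled stand-in arrays are hypothetical and only illustrate the shapes from the question.

```python
import numpy as np

def match_ndim(y_hat, y_ref):
    """Return y_hat as a column vector only if it is 1-D and y_ref is 2-D.

    Hypothetical helper for illustration; it is not part of NumPy or scikit-learn.
    """
    y_hat = np.asarray(y_hat)
    y_ref = np.asarray(y_ref)
    if y_hat.ndim == 1 and y_ref.ndim == 2:
        return y_hat.reshape(-1, 1)
    return y_hat  # dimensions already agree, so leave it alone

# Single-label case: (100,) predictions against (100, 1) targets.
y_test_single = np.zeros((100, 1), dtype=np.uint8)
y_hat_single = np.zeros(100, dtype=np.uint8)
fixed_single = match_ndim(y_hat_single, y_test_single)
print(fixed_single.shape)  # (100, 1)

# Multi-label case: both are already (100, 20), so nothing changes.
y_test_multi = np.zeros((100, 20), dtype=np.uint8)
y_hat_multi = np.zeros((100, 20), dtype=np.uint8)
fixed_multi = match_ndim(y_hat_multi, y_test_multi)
print(fixed_multi.shape)  # (100, 20)
```

Compared with the atleast_2d one-liner, the explicit check makes the condition visible at the call site, at the cost of a few more lines.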

CodePudding user response:

Two commonly used methods are indexing with np.newaxis and reshaping:

>>> ar
array([0, 1, 2])
>>> ar[:, np.newaxis]    # ar[:, None]
array([[0],
       [1],
       [2]])
>>> ar.reshape(-1, 1)    # ar.reshape(*ar.shape, 1)
array([[0],
       [1],
       [2]])

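However, both of these only return a new view of the array and do not modify the original array. If you want to modify the original array in place, you can assign directly to its shape attribute (note that the NumPy documentation discourages setting shape this way and recommends reshape instead):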

>>> ar
array([0, 1, 2])
>>> ar.shape = -1, 1
>>> ar
array([[0],
       [1],
       [2]])

Here, -1 means that NumPy calculates the size of that dimension itself: it divides the total number of elements in the array by the product of the other dimensions. Only one dimension can be given as -1.
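As a rough illustration of the two points above, the sketch below shows how -1 is resolved for a six-element array and that the reshaped result is a view sharing memory with the original. The variable names col and pairs are only for this example.

```python
import numpy as np

ar = np.arange(6)             # shape (6,)

# -1 is resolved from the total size: 6 elements / 1 column = 6 rows,
# and 6 elements / 2 columns = 3 rows.
col = ar.reshape(-1, 1)
pairs = ar.reshape(-1, 2)
print(col.shape, pairs.shape)  # (6, 1) (3, 2)

# For a contiguous array like this one, reshape returns a view,
# so writing through the view also changes the original data...
col[0, 0] = 99
print(ar[0])                       # 99
print(np.shares_memory(ar, col))   # True

# ...but the original array keeps its own shape.
print(ar.shape)                    # (6,)
```

If an independent copy is needed, calling .copy() on the reshaped result avoids the shared memory.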
