How to make an index column in a NumPy array?

I know this seems like an easy question, but I am stuck on whether there is a way to do it or not.

I had a DataFrame (with an index), and I inserted a new column into it that groups every 10 rows and numbers the groups from 1 upward. I used this very basic code and it worked!

df1.insert(0, 'Data', (df.index // 10) + 1)

The issue is that I now have a NumPy array (uint8) that has no index, which is why the code above does not work in this case. I would like to do the same thing: count every 10 rows, group them, and put each group's number in a newly added column.

CodePudding user response：

I'm not sure I understood your question (could you please give an example of the code you are working with?). Anyway, I think a possible solution is to turn your array into a DataFrame with just one column (so that you now have an index) and then apply your formula:

import pandas as pd
import numpy as np
arr = np.random.normal(size = 100) # just a random array
df = pd.DataFrame(arr, columns = ['arr'])
print(df)

You will obtain:

        arr
0  -0.834342
1   2.156343
2  -0.527963
3  -0.311767
4   1.029866
..       ...
95  0.047856
96 -1.009195
97 -0.239678
98  0.393085
99 -1.277784
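
To finish the idea, the formula from the question can then be applied to the new DataFrame's index. This is a minimal sketch, assuming the column name 'Data' from the question and the default 0..99 index that pandas creates:

```python
import numpy as np
import pandas as pd

arr = np.random.normal(size=100)          # just a random array
df = pd.DataFrame(arr, columns=['arr'])   # default index 0..99

# Group number for each row: rows 0-9 -> 1, rows 10-19 -> 2, and so on
df.insert(0, 'Data', (df.index // 10) + 1)
print(df.head(12))
```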

CodePudding user response：

Use np.repeat:

m = np.arange(1, 24)

n = np.repeat(np.arange(1, np.ceil(len(m) / 10) + 1), 10)[:len(m)]

Output:

>>> np.vstack([n, m]).T
array([[ 1.,  1.],
       [ 1.,  2.],
       [ 1.,  3.],
       [ 1.,  4.],
       [ 1.,  5.],
       [ 1.,  6.],
       [ 1.,  7.],
       [ 1.,  8.],
       [ 1.,  9.],
       [ 1., 10.],
       [ 2., 11.],
       [ 2., 12.],
       [ 2., 13.],
       [ 2., 14.],
       [ 2., 15.],
       [ 2., 16.],
       [ 2., 17.],
       [ 2., 18.],
       [ 2., 19.],
       [ 2., 20.],
       [ 3., 21.],
       [ 3., 22.],
       [ 3., 23.]])
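
The labels above come out as floats because np.ceil returns a float. If integer labels are preferred, one alternative (a sketch, not the answerer's code) is integer division on an array of row positions, with the same sample data m:

```python
import numpy as np

m = np.arange(1, 24)                 # same sample data as above
n = np.arange(len(m)) // 10 + 1      # positions 0..22 -> groups 1, 1, ..., 3
out = np.column_stack([n, m])        # group label first, data second
print(out)
```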

CodePudding user response：

So if I understood your question correctly, you want to add a column to your (presumably) 1D array.

import numpy as np
import pandas as pd
array = np.random.randint(0, 100, size=100) # random NumPy array (1D)
index = np.arange(array.shape[0]) # create index array for indexing
array_with_indices = np.c_[array, index]
array_with_indices[:, 1] // 10 + 1 # taking second column as it contains the indices
# or we can convert it to a dataframe if you prefer
df = pd.DataFrame(array, index=index)
# then it should work perfectly
df.index // 10 + 1

Then you can insert it into df1.
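
If the original data is a 2D uint8 array rather than a DataFrame, the same integer-division idea can be applied without pandas. This is a minimal sketch, assuming a hypothetical 25x3 array named data as a stand-in for the asker's array:

```python
import numpy as np

# Hypothetical stand-in for the asker's data: 25 rows, 3 columns of uint8
data = np.random.randint(0, 256, size=(25, 3), dtype=np.uint8)

# Row positions 0..24 -> group 1 for rows 0-9, 2 for rows 10-19, 3 for rows 20-24
group = np.arange(data.shape[0]) // 10 + 1

# Put the group number in a new first column
result = np.column_stack([group, data])
print(result.dtype, result.shape)
```

Note that stacking the integer label column onto uint8 data promotes the result to a wider integer type, so the result is no longer uint8.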