Convert a vector into a NumPy array and discretize it into bins

Time:04-07

I have a vector, say v=[0.001, 0.13, 0.2, ..., .9], of length 365. All values are between 0 and 1. I want to turn it into a 2D numpy array of size 365-by-100, i.e. create bins of width 0.01 and record which bin each element of v falls into on each day 1-365.

Let me call the 2D array M. I want to have 1 in M[0, 0] because v[0], the value on the first day, belongs to the first bin.

It seems the following gives the indices (i, j) in M that must be set to 1.

matrix_indecies = pd.cut(x=v, bins=np.arange(0, 1, 0.01), labels=False).to_frame().reset_index().to_numpy()

But I do not know how to set the corresponding M[i, j] entries to 1 without a for loop.

CodePudding user response:

Instead of building a large matrix in which 99% of the entries are 0, you may find it more useful to simply round to the nearest 1/100th, i.e. round to 2 decimal places:

np.round(arr, 2)
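As a quick illustration of the rounding idea, here is a minimal sketch. The sample values are made up, and the `bin_index` line is an added comparison rather than part of the answer above: rounding goes to the nearest 1/100th, while a bin of width 0.01 corresponds to rounding down, so the two can disagree.

```python
import numpy as np

# Hypothetical sample values standing in for the 365-element vector.
arr = np.array([0.001, 0.136, 0.2, 0.9])

# Round each value to the nearest 1/100th.
rounded = np.round(arr, 2)

# For comparison: the index of the 0.01-wide bin each value falls into.
# This rounds down, so 0.136 lands in bin 13 even though it rounds to 0.14.
bin_index = np.floor(arr * 100).astype(int)

print(rounded)
print(bin_index)
```

If the goal is only to know which 0.01-wide interval each day falls in, the integer `bin_index` array may already be enough without building the full matrix.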

CodePudding user response:

I don't understand why you don't want to use a for loop in this situation. Seems like a simple solution.

However, here is a version not using a for loop. I'm not sure if a for loop would be faster or slower.

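import numpy as np
n_bins = 100  # bins of width 0.01
ones_indices = np.floor(np.asarray(v) * n_bins).astype(int)  # integer bin index for each day
M = np.zeros((len(v), n_bins), dtype=bool)  # the np.bool alias was removed in NumPy 1.24
M[np.arange(len(v)), ones_indices] = True  # one True per row, in that day's bin column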
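A self-contained sketch of this indexing approach on a tiny vector may make it easier to check. The helper name `one_hot_bins` and the sample values are assumptions for illustration. The sketch also clips the bin index so that a value of exactly 1.0 goes into the last bin, which the question does not specify.

```python
import numpy as np

def one_hot_bins(v, n_bins=100):
    """Return a (len(v), n_bins) boolean array with one True per row.

    Hypothetical helper: row i has True in the column of the bin that
    v[i] falls into, assuming all values lie in [0, 1].
    """
    v = np.asarray(v, dtype=float)
    # Integer bin index per element; clip so v == 1.0 maps to the last bin.
    cols = np.minimum(np.floor(v * n_bins).astype(int), n_bins - 1)
    M = np.zeros((len(v), n_bins), dtype=bool)
    # Pair row i with column cols[i] and set those entries in one step.
    M[np.arange(len(v)), cols] = True
    return M

# Made-up sample values in place of the real 365-day vector.
v = [0.001, 0.13, 0.2, 0.9]
M = one_hot_bins(v)
print(M.shape)           # (4, 100)
print(np.argwhere(M))    # (row, column) pairs of the True entries
```

Note that values sitting exactly on a bin edge may land in the neighboring bin because of floating-point rounding in `v * n_bins`.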

If this is in a performance-critical part of your code, you might want to preallocate the M array and the arange array used for indexing.
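One way preallocation could look, as a rough sketch: the buffers are created once and reused on every call. The function name `fill_one_hot` and the random test data are assumptions, not part of the original answer, and whether this is measurably faster would need to be timed on your own data.

```python
import numpy as np

n_days, n_bins = 365, 100

# Allocated once and reused across calls.
rows = np.arange(n_days)
M = np.zeros((n_days, n_bins), dtype=bool)

def fill_one_hot(v, out=M, rows=rows):
    """Overwrite `out` in place with the one-hot bin matrix for `v`."""
    out.fill(False)  # clear the result of the previous call
    # Clip so that a value of exactly 1.0 goes into the last bin.
    cols = np.minimum((np.asarray(v) * n_bins).astype(int), n_bins - 1)
    out[rows, cols] = True
    return out

# Made-up data: 365 uniform random values in [0, 1).
rng = np.random.default_rng(0)
first = fill_one_hot(rng.random(n_days))
first_sum = int(first.sum())
second = fill_one_hot(rng.random(n_days))
```

Because the same array is returned each time, copy the result if you need to keep it past the next call.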

Or rewrite the function in numba if this functionality really is the bottleneck in your code.

Best of luck!
