I have a vector, say v=[0.001, 0.13, 0.2, ..., .9], of length 365. All values are between 0 and 1. I want to turn it into a 2D numpy array of size 365-by-100, i.e. create bins of size 0.01 and see which bin a given element of v belongs to on a given day in 1-365.
Let me call the 2D array M. I want to have 1 in M[0, 0] because v[0] on the first day belongs to the first bin.
It seems the following gives the locations/indices (i, j) in M that must turn into 1.
matrix_indices = pd.cut(x=v, bins=np.arange(0, 1, 0.01), labels=False).to_frame().reset_index().to_numpy()
But I do not know how to set the proper M[i, j] entries to 1 without a for loop.
CodePudding user response:
Instead of building a large matrix in which 99% of the entries are 0, you may find it useful to simply round to the nearest 1/100th, i.e., round to 2 decimal places:
np.round(v, 2)
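As a rough sketch of how rounding relates to the bins asked about, assuming a small made-up sample in place of the real 365-element vector: np.round maps each value to the nearest hundredth, while flooring the value times 100 gives the index of the 0.01-wide bin it falls into. The two can disagree (0.456 rounds up to 0.46 but sits in bin 45), so which one fits depends on whether you need bin membership or just a coarser value.

```python
import numpy as np

# Hypothetical sample standing in for the 365-element vector from the question.
v = np.array([0.001, 0.13, 0.2, 0.456, 0.9])

rounded = np.round(v, 2)                    # nearest hundredth
bin_index = np.floor(v * 100).astype(int)   # index of the 0.01-wide bin

print(rounded)
print(bin_index)
```

Either result is a length-365 array rather than a 365-by-100 matrix. Note that floating-point error can push a value sitting exactly on a bin edge into the neighbouring bin, so edge values deserve a quick check.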
CodePudding user response:
I don't understand why you don't want to use a for loop in this situation. Seems like a simple solution.
However, here is a version not using a for loop. I'm not sure if a for loop would be faster or slower.
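import numpy as np
n_bins = 100  # bins of width 0.01 covering [0, 1)
# Bin index for each day; cast to int because float arrays cannot be used as indices
ones_indices = np.floor(np.asarray(v) * n_bins).astype(int)
M = np.zeros((len(v), n_bins), dtype=bool)  # the builtin bool works across NumPy versions
M[np.arange(len(v)), ones_indices] = True  # sets exactly one entry per row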
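For a quick sanity check, the same idea can be wrapped in a function and run on random data. This is only a sketch: the name one_hot_bins and the random test vector are made up for illustration, and the clipping step is an added assumption so that a value of exactly 1.0 lands in the last bin instead of raising an IndexError.

```python
import numpy as np

def one_hot_bins(v, n_bins=100):
    """Return a (len(v), n_bins) boolean array with one True per row."""
    v = np.asarray(v, dtype=float)
    # Bin index per element; clip so that v == 1.0 falls into the last bin.
    idx = np.clip(np.floor(v * n_bins).astype(int), 0, n_bins - 1)
    M = np.zeros((v.size, n_bins), dtype=bool)
    # Integer-array indexing pairs row i with column idx[i].
    M[np.arange(v.size), idx] = True
    return M

rng = np.random.default_rng(0)
v = rng.random(365)          # hypothetical data in [0, 1)
M = one_hot_bins(v)

print(M.shape)                 # (365, 100)
print(M.sum(axis=1).min(), M.sum(axis=1).max())  # 1 1
```

Every row containing exactly one True confirms that each day was assigned to a single bin.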
If this is in a performance-critical part of your code, you might want to preallocate the M array and the arange array used for indexing. Or, if this functionality really is the bottleneck in your code, rewrite the function in numba.
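The preallocation idea could look roughly like this. It is a sketch under assumptions, not a measured optimization: fill_bins, rows and the fixed sizes are illustrative names and values, and whether it is actually faster should be checked by timing it on your data.

```python
import numpy as np

n_days, n_bins = 365, 100

# Buffers allocated once and reused on every call.
rows = np.arange(n_days)
M = np.zeros((n_days, n_bins), dtype=bool)

def fill_bins(v, out=M):
    """Overwrite `out` in place with the one-hot bin matrix for v."""
    out[:] = False  # clear the previous result
    # astype(int) truncates, which equals floor for non-negative values;
    # np.minimum keeps v == 1.0 inside the last bin.
    idx = np.minimum((np.asarray(v) * n_bins).astype(int), n_bins - 1)
    out[rows, idx] = True
    return out

rng = np.random.default_rng(1)
first = fill_bins(rng.random(n_days))
second = fill_bins(rng.random(n_days))
print(first is second)   # True: the same buffer is reused
```

Because the buffer is shared, a caller that needs to keep an earlier result has to copy it before the next call.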
Best of luck!