I have two NumPy arrays with the same shape: weights and percents. percents holds the 'real' data values, and weights holds how many times each of those values occurs in the histogram.
For example:
weights = [[0, 1, 1, 4, 2],
           [0, 1, 0, 3, 5]]
percents = [[1, 2, 3, 4, 5],
            [1, 2, 3, 4, 5]]
(every row of percents is the same)
I would like to "multiply" these together in such a way that I produce weights[x] * [percents[x]]:
results = [0 * [1] + 1 * [2] + 1 * [3] + 4 * [4] + 2 * [5],
           0 * [1] + 1 * [2] + 0 * [3] + 3 * [4] + 5 * [5]]
        = [[2, 3, 4, 4, 4, 4, 5, 5],
           [2, 4, 4, 4, 5, 5, 5, 5, 5]]
Notice that the rows can have different lengths. Ideally this would be done in NumPy, but because of the ragged rows it may have to end up as a list of lists.
Edit: I've been able to cobble together these nested for loops but obviously it's not ideal:
list_of_hists = []
for index in df.index:
    hist = []
    # Create a list of lists, later to be flattened to 'results'
    for i, percent in enumerate(percents):
        hist.append(
            # For each percent, create a list of [percent] * weight
            [percent]
            * int(
                df.iloc[index].values[i]
            )
        )
    # flatten the list of lists in hist
    results = [val for list_ in hist for val in list_]
    list_of_hists.append(results)
CodePudding user response:
np.repeat is designed for this kind of operation, but it doesn't handle the 2D case directly, so you need to work with flattened views of the arrays instead.
import numpy as np
weights = np.array([[0, 1, 1, 4, 2], [0, 1, 0, 3, 5]])
percents = np.array([[1, 2, 3, 4, 5], [1, 2, 3, 4, 5]])
>>> np.repeat(percents.ravel(), weights.ravel())
array([2, 3, 4, 4, 4, 4, 5, 5, 2, 4, 4, 4, 5, 5, 5, 5, 5])
After that, you need the index locations at which to split it. These are the cumulative row totals of weights, excluding the last one:
>>> np.split(np.repeat(percents.ravel(), weights.ravel()), np.cumsum(np.sum(weights, axis=1))[:-1])
[array([2, 3, 4, 4, 4, 4, 5, 5]), array([2, 4, 4, 4, 5, 5, 5, 5, 5])]
Note that np.split is a fairly inefficient operation, and so is building an array out of rows of unequal lengths.
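To show how the split points generalize beyond two rows, here is a minimal sketch of the same np.repeat plus np.split approach wrapped in a function. The name repeat_rows and the third row of weights are made up for illustration; the sketch assumes the weights are non-negative integers and that both arrays have the same 2D shape.

```python
import numpy as np

def repeat_rows(percents, weights):
    """Repeat each value by its weight and return one 1-D array per row.

    Hypothetical helper: assumes non-negative integer weights and
    two arrays of identical 2D shape.
    """
    percents = np.asarray(percents)
    weights = np.asarray(weights)
    # Repeat over the flattened arrays, since the result is ragged.
    flat = np.repeat(percents.ravel(), weights.ravel())
    # Row i ends at the cumulative total of the weights in rows 0..i,
    # so the split points are the cumulative row sums without the last one.
    boundaries = np.cumsum(weights.sum(axis=1))[:-1]
    return np.split(flat, boundaries)

# Illustrative data: the two rows from the question plus an invented third row.
weights = np.array([[0, 1, 1, 4, 2],
                    [0, 1, 0, 3, 5],
                    [2, 0, 0, 0, 1]])
percents = np.tile([1, 2, 3, 4, 5], (3, 1))

rows = repeat_rows(percents, weights)
for row in rows:
    print(row.tolist())
```

With three or more rows, the plain row sums would no longer be valid split points, which is why the sketch takes the cumulative sum.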
CodePudding user response:
You can use a list comprehension and reduce from functools:
import functools
res = [functools.reduce(lambda x, y: x + y,
                        [x * [y] for x, y in zip(w, p)])
       for w, p in zip(weights, percents)]
OUTPUT:
[[2, 3, 4, 4, 4, 4, 5, 5],
[2, 4, 4, 4, 5, 5, 5, 5, 5]]
Or, using a list comprehension only:
res = [[j for i in [x * [y]
                    for x, y in zip(w, p)]
        for j in i]
       for w, p in zip(weights, percents)]
OUTPUT:
[[2, 3, 4, 4, 4, 4, 5, 5],
[2, 4, 4, 4, 5, 5, 5, 5, 5]]
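As a possible variant of the same idea (not part of the answer above), itertools.chain.from_iterable can do the flattening step. Concatenating lists with + builds a new list at every step, whereas chaining only iterates over the pieces once. The sketch below assumes weights and percents are plain Python lists of integers; for NumPy arrays, calling .tolist() on them first should give the same input.

```python
from itertools import chain

# Example data from the question, as plain Python lists (an assumption).
weights = [[0, 1, 1, 4, 2], [0, 1, 0, 3, 5]]
percents = [[1, 2, 3, 4, 5], [1, 2, 3, 4, 5]]

# For each row, build [p] * w for every (weight, percent) pair
# and chain the pieces together into one flat list.
res = [list(chain.from_iterable([p] * w for w, p in zip(ws, ps)))
       for ws, ps in zip(weights, percents)]

print(res)
```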