Summation of indices with same element in an array in Python

Time:07-31

I have two lists, Ii01 and Iv01. Ii01 contains a NumPy array of [i, j] index pairs, and Iv01 contains an array with the value corresponding to each pair. For example, [2.1] in Iv01 corresponds to [3, 1] in Ii01.

I want to combine the values whose index pairs share the same j (the second element of each pair). For example, the values corresponding to [0, 3], [2, 3], [4, 3] are to be summed, since there are 3 elements with j=3. Similarly, the values corresponding to [0, 4], [2, 4] are to be summed, since there are 2 elements with j=4. The expected output is shown after the data.

import numpy as np
Ii01 = [np.array([[3, 1],
       [0, 2],
       [0, 3],
       [2, 3],
       [4, 3],
       [0, 4],
       [2, 4]])]

Iv01 = [np.array([[2.1],
       [3.4],
       [1.5],
       [9.7],
       [6.5],
       [4.2],
       [1.7]])]

The expected output is

[np.array([[2.1],
       [3.4],
       [1.5 9.7 6.5],
       [4.2 1.7]])]

CodePudding user response:

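One method is to use a property of least squares, although it might seem a little cryptic: regressing y on one-hot indicator columns (one column per distinct j) gives, as coefficients, the mean of y within each group. Here y is the first column, so the result is the mean of i for each distinct j.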

import numpy as np

a = np.array([[3, 1], [0, 2], [0, 3], [2, 3], [4, 3], [0, 4], [2, 4]])

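# Distinct j values (second column), sorted
uniq = np.unique(a[:, 1])

y = a[:, 0]
# Indicator matrix: x[r, c] is True when row r has j == uniq[c]
x = a[:, 1][:, None] == uniq

# Least-squares coefficients are the per-group means of y; pair each with its j
output = np.vstack((np.linalg.lstsq(x, y, rcond=None)[0], uniq)).T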

print(output)
# [[3. 1.]
#  [0. 2.]
#  [2. 3.]
#  [1. 4.]]
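
As a cross-check on the result above, the same per-group means can be computed directly, without least squares. This is only a minimal sketch: it assumes the same array a, and the names inverse, sums, counts and group_means are my own.

```python
import numpy as np

a = np.array([[3, 1], [0, 2], [0, 3], [2, 3], [4, 3], [0, 4], [2, 4]])

# inverse[r] is the position of row r's j value within uniq
uniq, inverse = np.unique(a[:, 1], return_inverse=True)
inverse = inverse.ravel()  # make sure the inverse is 1-D

# Sum of the first column per group, and the number of rows per group
sums = np.bincount(inverse, weights=a[:, 0])
counts = np.bincount(inverse)

group_means = sums / counts
print(group_means)  # [3. 0. 2. 1.]
```

Dropping the division and keeping sums gives per-group totals instead of means.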

Another option is to skip NumPy and use itertools.groupby. This assumes that each group is formed by adjacent rows sharing the same second element.

from itertools import groupby
from operator import itemgetter

def mean(xs):
    return sum(xs) / len(xs)

output = [[mean([x[0] for x in g]), k] for k, g in groupby(a, key=itemgetter(1))]

print(output) # [[3.0, 1], [0.0, 2], [2.0, 3], [1.0, 4]]
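
If rows with the same second element are not adjacent, groupby would split them into separate groups. One possible workaround, sketched here under the assumption that sorting the rows is acceptable, is to sort by the second element first. The shuffled list rows is a hypothetical input made up for illustration.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical input: the same pairs as above, shuffled so equal j are not adjacent
rows = [[0, 3], [3, 1], [2, 4], [0, 2], [4, 3], [0, 4], [2, 3]]

def mean(xs):
    return sum(xs) / len(xs)

# sorted() is stable, so rows keep their relative order within each j
rows_sorted = sorted(rows, key=itemgetter(1))

result = [[mean([x[0] for x in g]), k]
          for k, g in groupby(rows_sorted, key=itemgetter(1))]

print(result)  # [[3.0, 1], [0.0, 2], [2.0, 3], [1.0, 4]]
```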

By the way, note that Ii01 is a singleton list of an array. In the above pieces of code, I assumed you are dealing with just one array, i.e., a = Ii01[0].
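
For the grouping shown in the question (values from Iv01 collected per j), something along the following lines might work. It is only a sketch: it assumes Ii01[0] and Iv01[0] have matching row order, and the names idx, vals, order, js, starts, groups and totals are my own.

```python
import numpy as np

Ii01 = [np.array([[3, 1], [0, 2], [0, 3], [2, 3], [4, 3], [0, 4], [2, 4]])]
Iv01 = [np.array([[2.1], [3.4], [1.5], [9.7], [6.5], [4.2], [1.7]])]

idx = Ii01[0]
vals = Iv01[0].ravel()

# Stable sort by j so values keep their original order within each group
order = np.argsort(idx[:, 1], kind="stable")

# starts[c] is the first position of the c-th distinct j in the sorted rows
js, starts = np.unique(idx[order, 1], return_index=True)

# One array of values per distinct j, and the sum of each group
groups = np.split(vals[order], starts[1:])
totals = np.add.reduceat(vals[order], starts)

for j, g, t in zip(js, groups, totals):
    print(j, g, t)
```

Here groups is a plain Python list of arrays, since the groups have different lengths and do not fit into a regular 2-D array.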
