Given an array, what's an efficient way to create an array of arrays where each of these subarr-CodePudding

Suppose I have an array of size n that has some float values. I want to create a new array that has subarrays in it, where each subarray would have the indices of all elements in the original array that have equal values. So, for example, given an array givenArray=[50,20,50,20,40], the answer would be resultArray=[[0,2],[1,3],[4]].

The brute force way is to iterate on the original array, and in each iteration, iterate on the result array, compare the value to the first value in each subarray; if equal to it, add its index there. If not equal to the first value of any of the subarrays, create a new subarray and put its index there. Code for this in python would be something like:

resultArray=[]
for i in range(0,len(givenArray)):
    flag=0
    for j in range(0,len(resultArray)):
        if(givenArray[i]==givenArray[resultArray[j][0]]):
            resultArray[j].append(i)
            flag=1
    if(flag==0):
        resultArray.append([i])

This solution has a complexity of O(n^2). Can this be done at a better complexity? How? Ideas and python code would be highly appreciated! Thanks a lot in advance!

Aly

CodePudding user response：

You could use a defaultdict and enumerate to do this in linear time:

from collections import defaultdict

result = defaultdict(list)

for i, n in enumerate(givenArray):
    result[n].append(i)
# {50: [0, 2], 20: [1, 3], 40: [4]}

result = [*result.values()]
# [[0, 2], [1, 3], [4]]

Note however, that your example has int values and not float. floats are less well-behaved as dictionary keys as they might be subject to rounding or precision errors, especially if they are results of some sort of calculations.

CodePudding user response：

@schwobaseggl's answer with a dict is probably the best, but for completeness, here is a solution using groupby.

This solution returns the groups in increasing order of the values.

import operator
import itertools

def group_indices(array):
  sorted_with_indices = sorted(enumerate(array), key=operator.itemgetter(1))
  groups = itertools.groupby(sorted_with_indices, key=operator.itemgetter(1))
  return [[i for i,v in g] for k,g in groups]

print(group_indices([50,20,50,20,40]))
# [[1, 3], [4], [0, 2]]

Relevant documentation: