Suppose I have an array of size n that has some float values. I want to create a new array that has subarrays in it, where each subarray would have the indices of all elements in the original array that have equal values. So, for example, given an array givenArray=[50,20,50,20,40], the answer would be resultArray=[[0,2],[1,3],[4]].
The brute force way is to iterate on the original array, and in each iteration, iterate on the result array, compare the value to the first value in each subarray; if equal to it, add its index there. If not equal to the first value of any of the subarrays, create a new subarray and put its index there. Code for this in python would be something like:
resultArray=[]
for i in range(0,len(givenArray)):
flag=0
for j in range(0,len(resultArray)):
if(givenArray[i]==givenArray[resultArray[j][0]]):
resultArray[j].append(i)
flag=1
if(flag==0):
resultArray.append([i])
This solution has a complexity of O(n^2). Can this be done at a better complexity? How? Ideas and python code would be highly appreciated! Thanks a lot in advance!
Aly
CodePudding user response:
You could use a defaultdict
and enumerate
to do this in linear time:
from collections import defaultdict
result = defaultdict(list)
for i, n in enumerate(givenArray):
result[n].append(i)
# {50: [0, 2], 20: [1, 3], 40: [4]}
result = [*result.values()]
# [[0, 2], [1, 3], [4]]
Note however, that your example has int
values and not float
. float
s are less well-behaved as dictionary keys as they might be subject to rounding or precision errors, especially if they are results of some sort of calculations.
CodePudding user response:
@schwobaseggl's answer with a dict is probably the best, but for completeness, here is a solution using groupby.
This solution returns the groups in increasing order of the values.
import operator
import itertools
def group_indices(array):
sorted_with_indices = sorted(enumerate(array), key=operator.itemgetter(1))
groups = itertools.groupby(sorted_with_indices, key=operator.itemgetter(1))
return [[i for i,v in g] for k,g in groups]
print(group_indices([50,20,50,20,40]))
# [[1, 3], [4], [0, 2]]
Relevant documentation: