I would like to create a function that loops through an array and combines the third element of each sub-list whenever the first two elements are the same. However, the only approaches I could think of have very high complexity. Is there a recommended algorithm? (Python preferred, but any pseudo-code or algorithm will do.)
example input -> Delta = [ [0, 0, '1'], [0, 1, '1'], [1, 2, '0'], [1, 2, '1'], [2, 2, '0'], [2, 2, '1'] ]
expected output -> Delta = [ [0, 0, '1' ], [0, 1, '1' ], [1, 2, '0, 1' ], [2, 2, '0, 1' ] ]
Thanks for your time
CodePudding user response:
You could sort the lists and then use a loop:
from typing import List, Union


def merge_lists(lists: List[List[Union[int, str]]]) -> List[List[Union[int, str]]]:
    """Merges lists based on first two elements."""
    if not lists:
        return lists
    sorted_lists = sorted(lists)
    # Copy each sub-list before extending it, so the caller's data is not mutated.
    result = [list(sorted_lists[0])]
    for sub_list in sorted_lists[1:]:
        curr_first, curr_second, key = sub_list
        prev_first, prev_second, *keys = result[-1]
        if curr_first == prev_first and curr_second == prev_second:
            # Same first two elements: collect the third, skipping exact duplicates.
            if key not in keys:
                result[-1].append(key)
        else:
            result.append(list(sub_list))
    return [[first, second, ', '.join(keys)] for first, second, *keys in result]


lists = [[0, 0, '1'], [0, 1, '1'], [1, 2, '0'], [1, 2, '1'], [2, 2, '0'], [2, 2, '1']]
print(f'{lists = }')
print(f'{merge_lists(lists) = }')
Output:
lists = [[0, 0, '1'], [0, 1, '1'], [1, 2, '0'], [1, 2, '1'], [2, 2, '0'], [2, 2, '1']]
merge_lists(lists) = [[0, 0, '1'], [0, 1, '1'], [1, 2, '0, 1'], [2, 2, '0, 1']]
If the first two elements are strings instead of numbers, use something like natsort to sort them in natural order.
CodePudding user response:
You could use itertools.groupby for this (and operator.itemgetter):
from itertools import groupby
from operator import itemgetter
delta = [ [0, 0, '1'], [0, 1, '1'], [1, 2, '0'], [1, 2, '1'], [2, 2, '0'], [2, 2, '1'] ]
result = [
    [*key, ", ".join(map(itemgetter(2), group))]
    for key, group in groupby(sorted(delta), key=itemgetter(0, 1))
]
NB: sorted is only needed when the input is not already sorted, because groupby only groups consecutive items. Your example was already sorted.
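To illustrate the NB above, here is a small sketch of what happens when the sort is skipped on unsorted input. The helper name merge_consecutive and the shuffled sample data are made up for this illustration; they are not part of the question.

```python
from itertools import groupby
from operator import itemgetter


def merge_consecutive(rows):
    """Merge the third elements of *consecutive* rows sharing the first two."""
    return [
        [*key, ", ".join(map(itemgetter(2), group))]
        for key, group in groupby(rows, key=itemgetter(0, 1))
    ]


# Hypothetical unsorted input: the two [1, 2, ...] rows are not adjacent.
shuffled = [[1, 2, '0'], [0, 0, '1'], [1, 2, '1']]

unsorted_result = merge_consecutive(shuffled)        # no sort: (1, 2) appears twice
sorted_result = merge_consecutive(sorted(shuffled))  # sorted first: one row per key

print(unsorted_result)  # [[1, 2, '0'], [0, 0, '1'], [1, 2, '1']]
print(sorted_result)    # [[0, 0, '1'], [1, 2, '0, 1']]
```

Without the sort, groupby starts a new group every time the key changes, so a key that reappears later produces a second row.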
CodePudding user response:
Grouping of sorted data can be done with itertools.groupby: see trincot's answer.
Grouping of unsorted data can be done with a dict of lists:
def combine_third_on_first_two(delta):
    d = {}
    for a, b, c in delta:
        # Group every third element under its (first, second) key.
        d.setdefault((a, b), []).append(c)
    return [(a, b, ', '.join(values)) for (a, b), values in d.items()]

delta = [ [0, 0, '1'], [0, 1, '1'], [1, 2, '0'], [1, 2, '1'], [2, 2, '0'], [2, 2, '1'] ]
print(combine_third_on_first_two(delta))
# [(0, 0, '1'), (0, 1, '1'), (1, 2, '0, 1'), (2, 2, '0, 1')]
This dict-based grouping is very standard and is implemented in the external module more_itertools as more_itertools.map_reduce:
from more_itertools import map_reduce
from operator import itemgetter
delta = [ [0, 0, '1'], [0, 1, '1'], [1, 2, '0'], [1, 2, '1'], [2, 2, '0'], [2, 2, '1'] ]
d = map_reduce(
    delta,
    keyfunc=itemgetter(0, 1),
    valuefunc=itemgetter(2),
    reducefunc=', '.join,
)
result = [(a, b, s) for (a, b), s in d.items()]
print(result)
# [(0, 0, '1'), (0, 1, '1'), (1, 2, '0, 1'), (2, 2, '0, 1')]
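If you would rather not install an external module, roughly the same grouping can be sketched with collections.defaultdict from the standard library. The function name combine_with_defaultdict and the shuffled input are made up for this illustration. Assuming the first two elements are hashable, this is a single pass over the data with average constant-time dict operations, so it avoids the cost of sorting, which matters given the complexity concern in the question.

```python
from collections import defaultdict


def combine_with_defaultdict(delta):
    """Group third elements by (first, second) in a single pass."""
    groups = defaultdict(list)
    for a, b, c in delta:
        groups[(a, b)].append(c)
    # dicts keep insertion order (Python 3.7+), so rows come out in
    # order of each key's first appearance, not in sorted order.
    return [[a, b, ', '.join(values)] for (a, b), values in groups.items()]


# Hypothetical unsorted input for illustration.
unsorted_delta = [[2, 2, '0'], [1, 2, '0'], [0, 0, '1'],
                  [2, 2, '1'], [1, 2, '1'], [0, 1, '1']]
combined = combine_with_defaultdict(unsorted_delta)
print(combined)
# [[2, 2, '0, 1'], [1, 2, '0, 1'], [0, 0, '1'], [0, 1, '1']]
```

If sorted output is needed, calling sorted on the result works here because each (first, second) pair is unique after grouping.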