Keep duplicate items in a list of tuples only if the first index matches between the tuples

Input [(1,3), (3,1), (1,5), (2,3), (2,4), (44,33), (33,22), (44,22), (22,33)]

Expected Output [(1,3), (1,5), (2,3), (2,4), (44,33), (44,22)]

I am trying to figure out the above and have tried lots of stuff. So far my only success has been:

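# list1 is the input list shown above
for x in range(len(list1)):
    # Only compares against the first tuple, so only the "1" group is found
    if list1[0][0] == list1[x][0]:
        print(list1[x])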

Output: (1, 3) \n (1, 5)

Any sort of advice or help would be appreciated.

CodePudding user response：

Use a collections.defaultdict(list) keyed by the first value, and keep only the values that are ultimately duplicated:

from collections import defaultdict  # At top of file, for collecting values by first element
from itertools import chain          # At top of file, for flattening result

dct = defaultdict(list)
inp = [(1,3), (3,1), (1,5), (2,3), (2,4), (44,33), (33,22), (44,22), (22,33)]
# For each tuple
for tup in inp:
   first, _ = tup  # Extract first element (and verify it's actually a pair)
   dct[first].append(tup)  # Collect with other tuples sharing the same first element

# Extract all lists which have two or more elements (first element duplicated at least once)
# Would be list of lists, with each inner list sharing the same first element
onlydups = [lst for firstelem, lst in dct.items() if len(lst) > 1]

# Flattens to make single list of all results (if desired)
flattened_output = list(chain.from_iterable(onlydups))

Importantly, this doesn't require ordered input, and it scales well, doing O(n) work (scaling your solution naively, by comparing every tuple against every other tuple, would produce an O(n²) solution, considerably slower for larger inputs).
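
As a rough sketch of the same linear-time idea packaged as a function: count the first elements in one pass, then filter in a second pass, which keeps the tuples in their original input order. The name keep_shared_first is hypothetical, and collections.Counter is swapped in for defaultdict(list) here, so treat it as an alternative illustration rather than the code above. It assumes the input is a list (or other re-iterable sequence) of pairs.

```python
from collections import Counter


def keep_shared_first(pairs):
    """Return the tuples whose first element occurs in at least two tuples.

    Hypothetical helper for illustration; input order is preserved.
    """
    # First pass: count how often each first element appears.
    counts = Counter(first for first, _ in pairs)
    # Second pass: keep tuples whose first element appears more than once.
    return [tup for tup in pairs if counts[tup[0]] > 1]


inp = [(1, 3), (3, 1), (1, 5), (2, 3), (2, 4), (44, 33), (33, 22), (44, 22), (22, 33)]
print(keep_shared_first(inp))
# [(1, 3), (1, 5), (2, 3), (2, 4), (44, 33), (44, 22)]
```

Both passes are O(n), so the total work stays linear in the number of tuples.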

CodePudding user response：

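Another approach is the following, which treats each tuple as an unordered pair and drops any later tuple containing the same two values. Note that this is not the same as matching on the first element: (33, 22) is kept, so the result differs from the expected output in the question, and converting a set back to a tuple does not guarantee the original element order: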

def sort(L:list):
    K = []
    for i in L :
        if set(i) not in K :
            K.append(set(i))
    output = [tuple(m) for m in K]
    return output

Output:

[(1, 3), (1, 5), (2, 3), (2, 4), (33, 44), (33, 22), (44, 22)]
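
For comparison, here is a minimal sketch of the same reversed-pair deduplication that keeps each tuple exactly as it appeared in the input and does the membership check in constant time, using frozenset keys in a seen set. The function name drop_reversed_pairs is hypothetical. It still removes reversed duplicates rather than filtering by shared first element, so (33, 22) remains in the result.

```python
def drop_reversed_pairs(pairs):
    """Drop tuples whose values were already seen in any order.

    Hypothetical helper for illustration; the first occurrence is kept as-is.
    """
    seen = set()    # frozensets of the pairs already emitted
    result = []
    for tup in pairs:
        key = frozenset(tup)  # order-insensitive, hashable key
        if key not in seen:
            seen.add(key)
            result.append(tup)
    return result


inp = [(1, 3), (3, 1), (1, 5), (2, 3), (2, 4), (44, 33), (33, 22), (44, 22), (22, 33)]
print(drop_reversed_pairs(inp))
# [(1, 3), (1, 5), (2, 3), (2, 4), (44, 33), (33, 22), (44, 22)]
```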