Remove Duplicate 3 value Tuples from list of tuples with similar elements

Time:11-12

I have a list of tuples where each tuple has 3 elements within it:

slices = [('location', 'region', 'sub_region'),
 ('location', 'sub_region', ' job_level'),
 ('sub_region', 'region', 'location')]

In the above example, the first tuple and the last tuple would be considered duplicates, because the elements within are the same (location, region, sub_region). I'd want to keep only one of them so that my desired output would become:

[('location', 'region', 'sub_region'),
 ('location', 'sub_region', ' job_level')]

I tried to do this with a list comprehension, but my output ends up being an empty list:

new_slices = [(x, y, z) for x, y, z in slices if (z, x, y) not in slices]

Current Output:

new_slices = []

Any thoughts on how I might be able to accomplish this?

CodePudding user response:

slices = [('location', 'region', 'sub_region'),
 ('location', 'sub_region', ' job_level'),
 ('sub_region', 'region', 'location')]

set(tuple(sorted(s)) for s in slices)

Output: {(' job_level', 'location', 'sub_region'), ('location', 'region', 'sub_region')}

You can convert this back to a list with list() if you need a list.

Note: you said the first and last tuples are the same. That holds only if the strings match exactly; "location" and " location" (with a leading space) are not the same string.
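The set-of-sorted-tuples approach above returns the tuples in sorted form and in no particular order, whereas the desired output in the question keeps the first tuple as originally written. A minimal sketch of an order-preserving variant follows; the helper name dedupe_unordered is made up for illustration, and it assumes the elements of each tuple can be sorted (true for strings).

```python
slices = [('location', 'region', 'sub_region'),
          ('location', 'sub_region', ' job_level'),
          ('sub_region', 'region', 'location')]

def dedupe_unordered(tuples):
    """Keep the first tuple seen for each combination of elements.

    Hypothetical helper, not part of any library.
    """
    seen = set()
    result = []
    for t in tuples:
        key = tuple(sorted(t))  # canonical form, so element order is ignored
        if key not in seen:
            seen.add(key)
            result.append(t)    # keep the tuple exactly as it was written
    return result

new_slices = dedupe_unordered(slices)
print(new_slices)
# [('location', 'region', 'sub_region'), ('location', 'sub_region', ' job_level')]
```

Because the sorted tuple is used only as a lookup key, the original tuples and their relative order survive in the result.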

CodePudding user response:

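If the order of elements within each tuple doesn't matter to you, you can do this without sorting.

(For n tuples of length k, sorting each one costs O(n·k·log k) overall, while building a set from each one costs O(n·k) on average. With k fixed at 3, both approaches are linear in n.)

You can convert each tuple to a set, back to a tuple, and then collect the results in a set, as below. The element order inside each resulting tuple is arbitrary: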

>>> set(tuple(set(slc)) for slc in slices)
{('location', 'sub_region', ' job_level'),
 ('region', 'location', 'sub_region')}

>>> list(set(tuple(set(slc)) for slc in slices))
[('region', 'location', 'sub_region'),
 ('location', 'sub_region', ' job_level')]
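One caveat with tuple(set(slc)): Python does not guarantee that two sets with the same elements iterate in the same order, so two tuples built this way can occasionally differ and slip through as "distinct". A sketch that avoids this uses frozenset as the key instead. It assumes a tuple never repeats an element, since a frozenset would treat ('a', 'a', 'b') and ('a', 'b', 'b') as equal. The variable names are illustrative only.

```python
slices = [('location', 'region', 'sub_region'),
          ('location', 'sub_region', ' job_level'),
          ('sub_region', 'region', 'location')]

# Map each unordered combination to the first tuple that produced it.
# frozenset is hashable and compares equal regardless of element order.
first_seen = {}
for slc in slices:
    first_seen.setdefault(frozenset(slc), slc)

# Dicts preserve insertion order (Python 3.7+), so the original order is kept.
new_slices = list(first_seen.values())
print(new_slices)
# [('location', 'region', 'sub_region'), ('location', 'sub_region', ' job_level')]
```

This stays linear in the number of tuples and needs no sorting, so it also works when the elements cannot be compared with each other.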