I have a list of tuples where each tuple has 3 elements within it:
slices = [('location', 'region', 'sub_region'),
('location', 'sub_region', ' job_level'),
('sub_region', 'region', 'location')]
In the above example, the first tuple and the last tuple would be considered duplicates, because the elements within are the same (location, region, sub_region). I'd want to keep only one of them so that my desired output would become:
[('location', 'region', 'sub_region'),
('location', 'sub_region', ' job_level')]
I tried to do this with a list comprehension, but my output ends up being an empty list:
new_slices = [(x, y, z) for x, y, z in slices if (z, x, y) not in slices]
Current Output:
new_slices = []
Any thoughts on how I might be able to accomplish this?
CodePudding user response:
slices = [('location', 'region', 'sub_region'),
('location', 'sub_region', ' job_level'),
('sub_region', 'region', 'location')]
set(tuple(sorted(s)) for s in slices)
Output- {(' job_level', 'location', 'sub_region'), ('location', 'region', 'sub_region')}
You can convert this back to a list if you need a list.
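Note that the comparison is exact, so whitespace matters: "location" and " location" (with a leading space) are not the same string, and tuples that differ only in that way would not be treated as duplicates. The ' job_level' in your data has such a leading space.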
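If you also want to keep the original tuples (in their original element order) and the order of the list, the same sorted-key idea can be combined with a "seen" set. This is only a sketch; the helper name dedupe_unordered is made up for illustration and it assumes the elements are sortable and hashable (plain strings here):

```python
def dedupe_unordered(slices):
    """Keep the first tuple seen for each distinct combination of elements."""
    seen = set()
    result = []
    for s in slices:
        key = tuple(sorted(s))  # order-independent key for this tuple
        if key not in seen:
            seen.add(key)
            result.append(s)    # keep the tuple exactly as it was written
    return result


slices = [('location', 'region', 'sub_region'),
          ('location', 'sub_region', ' job_level'),
          ('sub_region', 'region', 'location')]

new_slices = dedupe_unordered(slices)
print(new_slices)
# [('location', 'region', 'sub_region'), ('location', 'sub_region', ' job_level')]
```

Unlike the plain set version, this returns a list whose order matches the input, with the first occurrence of each combination kept.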
CodePudding user response:
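If the order of the elements within each tuple doesn't matter to you, you can do this without sorting.
(Sorting a tuple of length k costs O(k log k), while building a set from it costs O(k), so for n tuples that is O(n k log k) versus O(n k). For 3-element tuples both are effectively O(n).)
Convert each tuple to a set, convert that back to a tuple, and collect the results in a set, like below: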
>>> set(tuple(set(slc)) for slc in slices)
{('location', 'sub_region', ' job_level'),
('region', 'location', 'sub_region')}
>>> list(set(tuple(set(slc)) for slc in slices))
[('region', 'location', 'sub_region'),
('location', 'sub_region', ' job_level')]
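One caveat with tuple(set(slc)): Python does not guarantee that two equal sets iterate in the same order, so the resulting tuples are not a reliable canonical form. A variant that avoids depending on iteration order is to use frozenset as the key, since frozensets are hashable and compare equal regardless of order. This is a sketch under the assumption that no tuple repeats an element; a frozenset would collapse ('a', 'a', 'b') and ('a', 'b', 'b') into the same key:

```python
slices = [('location', 'region', 'sub_region'),
          ('location', 'sub_region', ' job_level'),
          ('sub_region', 'region', 'location')]

# frozenset is hashable and ignores element order, so equal combinations collapse
unique = {frozenset(s) for s in slices}
print(len(unique))  # 2

# Convert back to tuples; sorting here is only to get a predictable element order
as_tuples = [tuple(sorted(fs)) for fs in unique]
```

The order of as_tuples itself is still arbitrary, because unique is a set.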