I am trying to write a program that takes a list of lists as input and outputs only those lists whose elements do not appear in any other list. For example, if I had this list:
[[1,2,3,4],[1,3,6,7],[5,8,9]]
my output should just be
[5,8,9]
because [5,8,9] is the only list whose elements are not found in any other list.
I have created a program which seems to work, but I was wondering if there is a more reliable way to get unique values.
viablepath=[[1,2,3,4],[1,3,6,7],[5,8,9]]
unique=[]
flattenedpath=[]
for element in viablepath:
    if element[0] not in flattenedpath:
        unique.append(element)
    if element[0] in flattenedpath:
        for list in unique:
            if element[0] in list:
                unique.remove(list)
    for item in element:
        flattenedpath.append(item)
print(flattenedpath)
print(unique)
This code works by flattening the input list of lists into flattenedpath as it goes. A list is appended to unique if its first element has not been seen yet; otherwise, any list in unique that contains that first element is removed.
I have no idea whether that is a reliable strategy for larger data sets, for example around 50 lists within a single list.
CodePudding user response:
Using collections.Counter and itertools.chain.from_iterable:
from collections import Counter
from itertools import chain
lists = [[1, 2, 3, 4], [1, 3, 6, 7], [5, 8, 9]]
counts = Counter(chain.from_iterable(lists))
unique = [
    element
    for element in lists
    if all(counts[e] == 1 for e in element)
]
print(unique)
# [[5, 8, 9]]
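One caveat, as a hedged aside: the Counter above counts every occurrence, so a value repeated inside a single list (say [5, 5, 8]) would reach a count of 2 and disqualify its own list. The question does not say whether that can happen. If it can, and if such a list should still count as unique, a minimal sketch is to count each value once per list by passing each sublist through set(). The function name lists_with_unshared_elements is made up for illustration.

```python
from collections import Counter
from itertools import chain


def lists_with_unshared_elements(lists):
    # Hypothetical helper: count how many *lists* each value occurs in.
    # set(sub) collapses repeats inside one list, so only values shared
    # between different lists reach a count above 1.
    counts = Counter(chain.from_iterable(set(sub) for sub in lists))
    return [sub for sub in lists if all(counts[e] == 1 for e in sub)]


print(lists_with_unshared_elements([[5, 5, 8], [1, 2], [2, 3]]))
# [[5, 5, 8]]
```

Like the Counter version, this assumes the elements are hashable.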
CodePudding user response:
Use a Python dictionary to count the frequency of each number, then check each list to see whether all of its numbers occur exactly once. List operations such as .remove or a membership test (x in some_list) take O(n) time, while a lookup in a hash map (a Python dictionary) takes O(1) on average, which is much faster.
def is_unique(list_of_numbers, numbers_frequency):
    # A list is unique if none of its numbers occurs more than once overall.
    for number in list_of_numbers:
        if numbers_frequency[number] > 1:
            return False
    return True

def find_unique_lists(list_of_lists):
    # First pass: count how often each number occurs across all lists.
    numbers_frequency = {}
    for list_of_numbers in list_of_lists:
        for number in list_of_numbers:
            if number not in numbers_frequency:
                numbers_frequency[number] = 0
            numbers_frequency[number] += 1
    # Second pass: keep the lists whose numbers all occur exactly once.
    result = []
    for list_of_numbers in list_of_lists:
        if is_unique(list_of_numbers, numbers_frequency):
            result.append(list_of_numbers)
    return result

input_1 = [[1,2,3,4],
           [1,3,6,7],
           [5,8,9]]
expected_output1 = [[5,8,9]]
input_2 = [[1,2,3,4],
           [5,6,7,8],
           [9,10,11,12]]
expected_output2 = [[1,2,3,4],
                    [5,6,7,8],
                    [9,10,11,12]]
input_3 = [[10,13,14],
           [10,11,12],
           [8,9,10]]
expected_output3 = []
input_4 = [[1,2,3,i] for i in range(100)]
input_4.append([100,200,300])
input_4.append([101,110,111])
expected_output4 = [[100,200,300],
                    [101,110,111]]
print(find_unique_lists(input_1) == expected_output1 )
print(find_unique_lists(input_2) == expected_output2 )
print(find_unique_lists(input_3) == expected_output3 )
print(find_unique_lists(input_4) == expected_output4 )
#output
#True
#True
#True
#True
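On the "is it reliable?" part of the question: as far as I can tell, the original loop only ever tests element[0], so an overlap that does not involve the first element of a list goes unnoticed. Here is a rough sketch comparing the two approaches on such an input. The function names first_element_filter and count_filter are made up for this comparison, and first_element_filter is my reading of the question's loop once its indentation is restored, so treat it as an assumption.

```python
from collections import Counter
from itertools import chain


def first_element_filter(list_of_lists):
    # Assumed reconstruction of the question's loop: only element[0] is checked.
    unique = []
    seen = []
    for element in list_of_lists:
        if element[0] not in seen:
            unique.append(element)
        else:
            # Same pattern as the question: removing while iterating over
            # the same list, which can itself skip entries.
            for candidate in unique:
                if element[0] in candidate:
                    unique.remove(candidate)
        seen.extend(element)
    return unique


def count_filter(list_of_lists):
    # Frequency-based check: keep lists whose values all occur exactly once.
    counts = Counter(chain.from_iterable(list_of_lists))
    return [sub for sub in list_of_lists if all(counts[e] == 1 for e in sub)]


sample = [[1, 2, 3], [4, 2, 5], [6, 7]]  # 2 is shared, but never in first position
print(first_element_filter(sample))
# [[1, 2, 3], [4, 2, 5], [6, 7]]
print(count_filter(sample))
# [[6, 7]]
```

So the first-element check keeps two lists that share the value 2, while counting every element catches it. That matters more than speed at around 50 lists.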