I have the below nested list:
sample = [['Ban', 'App'], ['Ban', 'Ora'], ['Gra', 'App'], ['Gra', 'Ora'], ['Kiw','App'], ['Kiw', 'Ora'], ['Man', 'Blu'], ['Pin', 'App']]
I need to keep only the sub-lists of sample whose items don't appear in any sub-list that has already been kept.
For example, my output list needs to contain the first element of the nested list, ['Ban', 'App']. I then compare ['Ban', 'App'] with the rest of the list. Because "Ban" in element 2 and "App" in element 3 are already present in ['Ban', 'App'], I skip those elements. My next output element is ['Gra', 'Ora'], as neither of its items is in ['Ban', 'App'].
Now my output is [['Ban', 'App'], ['Gra', 'Ora']], and I have to compare the rest of the nested list with these two elements. The next elements are ['Kiw', 'App'] and ['Kiw', 'Ora']. Because 'App' is in ['Ban', 'App'] and 'Ora' is in ['Gra', 'Ora'], neither of them is added to the output list.
My output list is still [['Ban', 'App'], ['Gra', 'Ora']]. The next element is ['Man', 'Blu']; both of its items are brand new, so it is added to the output list.
My new output list is [['Ban', 'App'], ['Gra', 'Ora'], ['Man', 'Blu']]. The last element is ['Pin', 'App']; because "App" is in ['Ban', 'App'], I skip it even though "Pin" is a new item.
My final output should be [['Ban', 'App'], ['Gra', 'Ora'], ['Man', 'Blu']].
final_output = [['Ban', 'App'], ['Gra', 'Ora'], ['Man', 'Blu']]
I started with the below code but this doesn't do exactly what I need it to do:
j = 0
for i in range(len(sample)):
    #print ("I:", str(i))
    #print ("J" ,str(j))
    i = j
    for j in range(1, len(sample)):
        if sample[i][0] == sample[j][0] or sample[i][0] == sample[j][1] or sample[i][1] == sample[j][0] or sample[i][1] == sample[j][1]:
            pass
        else:
            print (sample[i], sample[j])
            #print (j)
            i = j
            break
CodePudding user response:
I would keep a set that keeps track of items already seen and only add the pair to the final list if there is no intersection with that set.
st = set()
final_output = []
for pair in sample:
    # Keep the pair only if none of its items has been seen before
    if not st.intersection(pair):
        final_output.append(pair)
        st.update(pair)
print(final_output)
# [['Ban', 'App'], ['Gra', 'Ora'], ['Man', 'Blu']]
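The same idea can be wrapped in a reusable function. The sketch below is only an illustration: the name filter_disjoint is made up for this example, and it assumes the items are hashable. It uses set.isdisjoint, so it also works when sub-lists have more than two items.

```python
def filter_disjoint(sublists):
    """Keep each sublist that shares no item with any previously kept sublist.

    `filter_disjoint` is a hypothetical helper name; items must be hashable.
    """
    seen = set()   # items belonging to sublists that were kept
    kept = []
    for sublist in sublists:
        # isdisjoint is True when no item of sublist is in seen
        if seen.isdisjoint(sublist):
            kept.append(sublist)
            seen.update(sublist)
    return kept


sample = [['Ban', 'App'], ['Ban', 'Ora'], ['Gra', 'App'], ['Gra', 'Ora'],
          ['Kiw', 'App'], ['Kiw', 'Ora'], ['Man', 'Blu'], ['Pin', 'App']]

result = filter_disjoint(sample)
print(result)
# [['Ban', 'App'], ['Gra', 'Ora'], ['Man', 'Blu']]
```

Returning a new list instead of printing inside the loop makes it easier to reuse the result or to test it against the expected final_output.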
CodePudding user response:
You should use a set to hold the values you've already looked at. You can then iterate over each item in each sub-list and check if they're in the set:
seen = set()
filtered = []
for sublist in sample:
    # Skip the sublist if either of its items has already been seen
    if sublist[0] in seen or sublist[1] in seen:
        continue
    filtered.append(sublist)
    seen.add(sublist[0])
    seen.add(sublist[1])
This code works by iterating over sample and checking whether either item of each sublist is already in the set. If it is, we ignore that sublist and continue. Otherwise, we add sublist to the filtered list and add its items to the set. Because set lookups take constant time on average, this runs much faster than what you have (O(n) vs. O(n^2)).
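One thing to be aware of is the case where your sublist has one item that has been seen and one that hasn't. This code skips the whole sublist and does not record the unseen item, which matches your ['Pin', 'App'] example. If you need different behavior in that case, you will have to modify the code.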
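To make the partially seen case concrete, here is a small sketch. The function name filter_sublists, the record_skipped flag, and the extra pair ['Pin', 'Che'] are all assumptions added for illustration; they are not part of the question. With the default behavior, "Pin" from the skipped ['Pin', 'App'] is never recorded, so a later ['Pin', 'Che'] is kept. With record_skipped=True, items from skipped sublists are recorded too, which changes the result considerably.

```python
def filter_sublists(sublists, record_skipped=False):
    """Filter sublists against a set of seen items.

    Hypothetical helper: `record_skipped` is an illustrative flag that also
    records the items of sublists that were skipped.
    """
    seen = set()
    kept = []
    for sublist in sublists:
        if seen.isdisjoint(sublist):
            kept.append(sublist)
            seen.update(sublist)
        elif record_skipped:
            # The sublist is skipped, but its items still count as seen
            seen.update(sublist)
    return kept


sample = [['Ban', 'App'], ['Ban', 'Ora'], ['Gra', 'App'], ['Gra', 'Ora'],
          ['Kiw', 'App'], ['Kiw', 'Ora'], ['Man', 'Blu'], ['Pin', 'App']]
extended = sample + [['Pin', 'Che']]  # extra pair, added only for this demo

default_result = filter_sublists(extended)
strict_result = filter_sublists(extended, record_skipped=True)

print(default_result)
# [['Ban', 'App'], ['Gra', 'Ora'], ['Man', 'Blu'], ['Pin', 'Che']]
print(strict_result)
# [['Ban', 'App'], ['Man', 'Blu']]
```

The default behavior is the one that reproduces the expected output in the question; the stricter variant is shown only to illustrate how the choice affects the result.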