I have a list that can contain hundreds of thousands of sublists of ints. For example:
list_ref = [ [0,5,9], [1,2,4], [1,2,7,4], [3,100,42] ... ]
I need to create a new list in which entry i gathers the elements of every sublist that contains i.
For example, new_list[0] will be a flat list of all the sublists in which element 0 appears.
A naive nested for loop would look like this:
# list_ref <- my list of lists
gr_cl = []
for i in range(len(list_ref)):
    clust = []
    for j in list_ref:
        if i in j:
            clust.append(j)
    gr_cl.append([item for sublist in clust for item in sublist])  # flatten
# remove duplicates
gr_cl_set = [list(set(item)) for item in gr_cl]
I tried to implement it as a list comprehension, but it is still too slow.
Any ideas?
CodePudding user response:
Maybe, but the question is missing a constraint that limits the length of the output list.
The code below builds an entry for every value up to the maximum element found in the sublists:
from collections import defaultdict
inputlist = [[0,5,9], [1,2,4], [1,2,7,4], [3,100,42]]
# create a dictionary in which:
# keys   : values of the elements of the sublists
# values : indices of the sublists of inputlist which contain the key
elt_refs = defaultdict(list)
max_value = 0
for i, sublist in enumerate(inputlist):
    for elt in sublist:
        if elt > max_value:
            max_value = elt
        elt_refs[elt].append(i)
# build the result by iterating over the sorted items of the dictionary
# and filling the gaps
result = []
result_i = -1  # last index already written to result (-1: none yet)
for k, refs in sorted(elt_refs.items()):
    # fill the gaps with empty lists for values that never occur
    gap = k - result_i - 1
    for _ in range(gap):
        result.append([])
    result_i = k
    # flatten refs
    flat = []
    for ref in refs:
        flat.extend(inputlist[ref])
    result.append(list(set(flat)))
print(result)
CodePudding user response:
Why not put the lists in a dict? That way, my_dict[#] is the list of indices of the sublists in the original list that contain #. It is likely faster, and you can still get the output you want (a list of indices).
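A minimal sketch of this dict idea, assuming the data is named list_ref as in the question. The names build_index and my_dict are illustrative only and do not come from the original posts:

```python
from collections import defaultdict

def build_index(list_ref):
    """Map each value to the indices of the sublists that contain it."""
    index = defaultdict(list)
    for i, sublist in enumerate(list_ref):
        # set() avoids recording the same index twice if a value repeats
        for value in set(sublist):
            index[value].append(i)
    return index

list_ref = [[0, 5, 9], [1, 2, 4], [1, 2, 7, 4], [3, 100, 42]]
my_dict = build_index(list_ref)
print(my_dict[1])  # [1, 2]
print(my_dict[0])  # [0]
```

Each sublist is visited once, so building the index avoids rescanning the whole list for every value.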
CodePudding user response:
You were pretty close to implementing it as a list comprehension, actually. This is the modified version:
# sample numbers
list_of_nums = [[0,5,9], [1,2,4], [1,2,7,4], [3,100,42], [0,1,3], [1,2,3,4], [10,0,9]]
contains_zero = [sublist for sublist in list_of_nums if 0 in sublist]
print(contains_zero) # outputs: `[[0, 5, 9], [0, 1, 3], [10, 0, 9]]`
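To get the flat, de-duplicated list the question asks for, the matching sublists can be merged into a set. This is only a sketch: elements_sharing is a hypothetical helper name, and sorting the result is an assumption made here so that the output order is predictable:

```python
def elements_sharing(list_of_nums, target):
    """Return the sorted unique elements of every sublist containing target."""
    merged = set()
    for sublist in list_of_nums:
        if target in sublist:
            merged.update(sublist)  # flatten and de-duplicate in one step
    return sorted(merged)

list_of_nums = [[0,5,9], [1,2,4], [1,2,7,4], [3,100,42], [0,1,3], [1,2,3,4], [10,0,9]]
print(elements_sharing(list_of_nums, 0))  # [0, 1, 3, 5, 9, 10]
```

Note that this still scans every sublist once per target value, so calling it for many values repeats the cost of the original nested loop.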