Traverse a list with another list

I have two lists where the elements of list A are contained in elements of list B. Note the order in this example is fairly important.

A = ['pent', 'tri', 'rec', 'oct', 'hex']
B = ['triangle', 'rectangle', 'pentangle', 'hexagon', 'octagon']

I would like to traverse A and B and, wherever an element of A is found in an element of B, put the pair in a dictionary and then append that dictionary to a list.

d = {'prefix': a, 'shape':b}

l = [{'prefix': 'pent', 'shape':'pentangle'}, {'prefix':'tri' , 'shape':'triangle'}, {'prefix': 'rec', 'shape':'rectangle'},...]

I tried using the zip function, but I think that because B is not in the same order as A, it doesn't work:

dict_list = []
for i,j in zip(A,B):
    if i in j:
        d = {'prefix': i, 'shape':j}
        dict_list.append(d)

I know that I could just do something like "for i in A if i in B", but then I don't know the syntax to get the matching value into my dictionary.

I think this is a pretty basic question; I just haven't been able to get it to work. Should this work with zip? I suppose it's also possible to pre-populate prefix and then somehow use that to find shape, but again, I'm not sure of the syntax. The lists I'm using have 1000 records in some instances, so I can't do this manually.

CodePudding user response：

You can use a list comprehension. This might not be the most efficient method, since it compares every element of A with every element of B, but at least the syntax is easy to understand.

A = ['pent', 'tri', 'rec', 'oct', 'hex']
B = ['triangle', 'rectangle', 'pentangle', 'hexagon', 'octagon']

dict_list = [{'prefix': a, 'shape': b} for a in A for b in B if b.startswith(a)]

print(dict_list) # [{'prefix': 'pent', 'shape': 'pentangle'}, {'prefix': 'tri', 'shape': 'triangle'}, {'prefix': 'rec', 'shape': 'rectangle'}, {'prefix': 'oct', 'shape': 'octagon'}, {'prefix': 'hex', 'shape': 'hexagon'}]
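As for why the zip attempt in the question finds nothing: zip only pairs elements that sit at the same position, so 'pent' is only ever compared with 'triangle'. A small sketch of the difference, where the names positional_pairs and zip_matches are just illustrative:

```python
A = ['pent', 'tri', 'rec', 'oct', 'hex']
B = ['triangle', 'rectangle', 'pentangle', 'hexagon', 'octagon']

# zip pairs elements by position only: ('pent', 'triangle'), ('tri', 'rectangle'), ...
positional_pairs = list(zip(A, B))

# None of the positional pairs match, so a zip-based loop collects nothing
zip_matches = [(a, b) for a, b in positional_pairs if a in b]
print(zip_matches)  # []

# The nested comprehension compares every a with every b instead
dict_list = [{'prefix': a, 'shape': b} for a in A for b in B if b.startswith(a)]
print(len(dict_list))  # 5
```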

CodePudding user response：

You could try a list comprehension with a generator:

[{'prefix': x, 'shape': next((y for y in B if y.startswith(x)))} for x in A]

Output:

[{'prefix': 'pent', 'shape': 'pentangle'},
 {'prefix': 'tri', 'shape': 'triangle'},
 {'prefix': 'rec', 'shape': 'rectangle'},
 {'prefix': 'oct', 'shape': 'octagon'},
 {'prefix': 'hex', 'shape': 'hexagon'}]

Or you could first sort B to be the same order as A:

B = sorted(B, key=lambda x: next((i for i, v in enumerate(A) if x.startswith(v))))

Then just zip:

[{'prefix': x, 'shape': y} for x, y in zip(A, B)]
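One caveat with the first snippet in this answer: next() without a default raises StopIteration as soon as some prefix has no matching shape. If that can happen in your data, a possible variation is to pass a default and filter afterwards. This is only a sketch; 'dec' is a made-up prefix added to show the unmatched case, and the names pairs and matched are illustrative:

```python
A = ['pent', 'tri', 'rec', 'oct', 'hex', 'dec']  # 'dec' is a made-up prefix with no match in B
B = ['triangle', 'rectangle', 'pentangle', 'hexagon', 'octagon']

# next(iterator, None) returns None instead of raising StopIteration
pairs = [{'prefix': x, 'shape': next((y for y in B if y.startswith(x)), None)} for x in A]

# Keep only the prefixes that actually matched a shape
matched = [p for p in pairs if p['shape'] is not None]
print(matched)
```

Note that next() still returns only the first matching shape per prefix, so this assumes each prefix matches at most one shape you care about.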

CodePudding user response：

Not as concise as j1-lee's version, but with much better complexity (O(#A log #A + #B log #B) instead of O(#A * #B)):

from collections import deque  # heap would be more efficient, but verbose

prefixes = sorted(A)  # sorting is the most expensive part
shapes = deque(sorted(B))

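l = []  # now it's just linear scan
for prefix in prefixes:
    # drop shapes that sort before this prefix: no remaining prefix can match them
    while shapes and shapes[0] < prefix:
        shapes.popleft()
    # consume every shape that starts with this prefix
    while shapes and shapes[0].startswith(prefix):
        l.append({'prefix': prefix, 'shape': shapes.popleft()})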

Note: this does not preserve the order of A. That can be achieved by sorting indices instead of the values themselves, but it makes the code a bit more obscure. After execution, l will be:

[{'prefix': 'hex', 'shape': 'hexagon'},
 {'prefix': 'oct', 'shape': 'octagon'},
 {'prefix': 'pent', 'shape': 'pentangle'},
 {'prefix': 'rec', 'shape': 'rectangle'},
 {'prefix': 'tri', 'shape': 'triangle'}]
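If the order of A matters, one possible alternative (a different technique from the deque scan above, not the same algorithm) is to sort only B and binary-search it for each prefix with the standard bisect module. All shapes that start with a given prefix are contiguous in the sorted list, beginning at the position bisect_left returns. A minimal sketch, with result as an illustrative name:

```python
from bisect import bisect_left

A = ['pent', 'tri', 'rec', 'oct', 'hex']
B = ['triangle', 'rectangle', 'pentangle', 'hexagon', 'octagon']

shapes = sorted(B)  # O(#B log #B), done once

result = []
for prefix in A:  # iterate A directly, so its order is preserved
    # First position whose shape is >= prefix; any matches start here
    i = bisect_left(shapes, prefix)
    while i < len(shapes) and shapes[i].startswith(prefix):
        result.append({'prefix': prefix, 'shape': shapes[i]})
        i += 1

print(result)
```

Each lookup costs O(log #B) plus the number of matches, and A never needs to be sorted.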