Home > Software engineering >  Match strings of different length in two lists of different length
Match strings of different length in two lists of different length

Time:11-17

Say I have two flat lists of strings:

a = ["today", "I", "want", "to", "eat", "some", "cake."]
b = ["to", "da", "y", "I", "wa", "nt", "to", "ea", "t", "some", "ca", "ke", "."]

Where in list b some strings (not all) of list a are split into multiple substrings. Note that the substrings in b that correspond to the strings in a are adjacent and in the same order, as in the example above.

I want to obtain a list c where the substrings in b that correspond to a single string in a are put together in a sublist:

c = [["to", "da", "y"], ["I"], ["wa", "nt"], ["to"], ["ea", "t"], ["some"], ["ca", "ke", "."]]

Unfortunately I don't have any code to share since I don't know how to approach this problem.

Thanks!

CodePudding user response:

a = ["today", "I", "want", "to", "eat", "some", "cake."]
b = ["to", "da", "y", "I", "wa", "nt", "to", "ea", "t", "some", "ca", "ke", "."]
c = []

for element in a:
    temp_list = []
    while "".join(temp_list) != element:
        temp_list.append(b.pop(0))
    c.append(temp_list)

Value of c:

[['to', 'da', 'y'],
 ['I'],
 ['wa', 'nt'],
 ['to'],
 ['ea', 't'],
 ['some'],
 ['ca', 'ke', '.']]

I don't know if there is any other clever way to do it. Just use .pop(0) to save u a bit of code

CodePudding user response:

@Ben.S.'s answer works but costs O(m x n x k) in time complexity, where m and n are the lengths of the two lists and k is the average length of the words.

A more efficient approach that solves the problem in a time complexity of O(n) is to keep appending word fragments from b to the last sub-list of a new list, but keep track of the total length of fragments appended to the last sub-list and append a new sub-list when the total length equals the length of the corresponding word in a:

c = []
for i in b:
    if not c or length == target:
        length = 0
        target = len(a[len(c)])
        c.append([])
    length  = len(i)
    c[-1].append(i)

Demo: https://replit.com/@blhsing/ExoticConsiderateSearchservice

  • Related