Joining two strings in a list of strings if two specific strings occur one after another

Time:02-25

I'm trying to combine two strings in a list when the combination of the terms has a different meaning than the terms do when tokenized individually.

An example of this would be:

['I', 'want','to','join','a','football','team','in','2022']

The goal is to join the strings 'football' and 'team' with a _ if the two terms occur one after another, resulting in the string football_team.

The final list would look like this:

['I', 'want','to','join','a','football_team','in','2022']

Any help is appreciated as I only get to the point where I can join the whole list.

EDIT:

I have been trying to join terms using this: Is there a way to combine two tokens from a list of tokens?

Also I have tried this, but it joins every element in the list: How to concatenate items in a list to a single string?

EDIT 2:

In answer to a question in the comments, "How would one understand which words would hold some meaning?"

I have a pre-defined list of string combinations that need to be merged.

CodePudding user response:

" ".join(['I', 'want','to','join','a','football','team','in','2022']).replace("football team","football_team").split(" ")

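A caveat on the one-liner above: str.replace matches substrings, so 'football teams' would also turn into 'football_teams'. Below is a rough sketch of the same join/replace/split idea with whole-token matching. It assumes no token contains a space, and merge_by_replace and its pairs argument are made-up names for illustration, not something from the question:

```python
import re

def merge_by_replace(tokens, pairs, sep="_"):
    """Join tokens into one string, merge each listed pair, then split again.

    Assumes no token contains a space. `pairs` is a hypothetical list of
    (first, second) tuples standing in for the pre-defined combinations.
    """
    text = " ".join(tokens)
    for first, second in pairs:
        # (?<!\S) and (?!\S) demand whitespace or the string boundary on both
        # sides, so "football teams" or "myfootball team" are left alone.
        pattern = r"(?<!\S)" + re.escape(first) + " " + re.escape(second) + r"(?!\S)"
        # A function replacement avoids backslash handling in the merged text.
        text = re.sub(pattern, lambda m: first + sep + second, text)
    return text.split(" ")

tokens = ['I', 'want', 'to', 'join', 'a', 'football', 'team', 'in', '2022']
print(merge_by_replace(tokens, [("football", "team")]))
```
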
CodePudding user response:

This way you can define an arbitrary set of pairs.

pairs = {"football" : "team"}

a_list = ['I', 'want','to','join','a','football','team','in','2022']
list_copy = a_list.copy()

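# Walk over adjacent pairs of the untouched copy while editing a_list in place.
for one, another in zip(list_copy[:-1], list_copy[1:]):
    if one in pairs and another == pairs[one]:
        # Note: index() returns the first occurrence of `one`, so this assumes
        # the first word of a pair does not appear earlier in the list by itself.
        i = a_list.index(one)
        del a_list[i:i + 2]
        a_list.insert(i, one + "_" + another)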
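
If the same word can appear more than once, or one first word can pair with several second words, a single pass over the list avoids relying on list.index and on a one-to-one dict. A minimal sketch, assuming the pre-defined combinations are given as (first, second) tuples; merge_pairs and its parameters are made-up names for illustration:

```python
def merge_pairs(tokens, pairs, sep="_"):
    """Return a new list in which every adjacent pair listed in `pairs` is merged."""
    pair_set = set(pairs)  # set lookup keeps each membership check cheap
    merged = []
    i = 0
    while i < len(tokens):
        # Look at the current token together with its right-hand neighbour.
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) in pair_set:
            merged.append(tokens[i] + sep + tokens[i + 1])
            i += 2  # skip both tokens of the merged pair
        else:
            merged.append(tokens[i])
            i += 1
    return merged

tokens = ['I', 'want', 'to', 'join', 'a', 'football', 'team', 'in', '2022']
print(merge_pairs(tokens, [("football", "team")]))
# ['I', 'want', 'to', 'join', 'a', 'football_team', 'in', '2022']
```

The input list is left unchanged, and a stray 'football' that is not followed by 'team' is copied through as-is.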