Group substrings after two words

Time:10-15

I have a string:

s = """YES string1 string2 YES string3 string4 string5 YES string6 NO String7 NO string8 string9 YES string10 string11"""

I need output like this:

wanted_output = {
    "YES": [
        "string1 string2",
        "string3 string4 string5",
        "string6",
        "string10 string11",
    ],
    "NO": ["String7", "string8 string9"],
}

I have a working function for this, but it doesn't look elegant to me. Do you know a more elegant way to solve it?

def convert(text):
    words = text.split()
    yes = "YES"
    no = "NO"
    yes_list = []
    no_list = []
    current = ""
    for word in words:
        if word == yes:
            current = yes
            yes_list.append("|")
            continue
        if word == no:
            current = no
            no_list.append("|")
            continue
        if current == yes:
            yes_list.append(word)
        elif current == no:
            no_list.append(word)
    yes_str = " ".join(yes_list)
    no_str = " ".join(no_list)
    yes_list = yes_str.split("|")
    no_list = no_str.split("|")
    yes_list = [yes_str.strip() for yes_str in yes_list if yes_str]
    no_list = [no_str.strip() for no_str in no_list if no_str]

    return {"YES": yes_list, "NO": no_list}

CodePudding user response:

Wrap each YES and NO in a delimiter character (make sure that character does not otherwise appear in the text), then split on that character.

s = """YES string1 string2 YES string3 string4 string5 YES string6 NO String7 NO string8 string9 YES string10 string11"""


def convert(text):
    # Wrap each keyword in '*' so that splitting on '*' isolates the keywords.
    data = text.replace('YES', '*YES*').replace('NO', '*NO*').split('*')
    data_strip = [i.strip() for i in data if i.strip()]
    yes_list = []
    no_list = []
    for ind, val in enumerate(data_strip):
        # The chunk right after a keyword is the group that belongs to it.
        if val == 'YES':
            yes_list.append(data_strip[ind + 1])
        if val == 'NO':
            no_list.append(data_strip[ind + 1])
    return {"YES": yes_list, "NO": no_list}


print(convert(s))
>>> {'YES': ['string1 string2', 'string3 string4 string5', 'string6', 'string10 string11'], 'NO': ['String7', 'string8 string9']}
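One caveat with the replace-and-split approach: it also splits inside longer words that contain a keyword (for example "NOTE" contains "NO"), and it needs a delimiter that never occurs in the text. A possible variant, sketched below, uses re.split with a capturing group and word boundaries instead. The function name group_after_keywords and its keywords parameter are made up for this illustration, and the sketch assumes the keywords only count when they appear as whole words.

```python
import re


def group_after_keywords(text, keywords=("YES", "NO")):
    # \b restricts matches to whole words, so "NOTE" is not split on "NO".
    pattern = r"\b(" + "|".join(map(re.escape, keywords)) + r")\b"
    # Because the pattern has a capturing group, re.split keeps the keywords:
    # [prefix, keyword1, chunk1, keyword2, chunk2, ...]
    parts = re.split(pattern, text)
    groups = {kw: [] for kw in keywords}
    # Pair every keyword (odd indexes) with the chunk that follows it.
    for kw, chunk in zip(parts[1::2], parts[2::2]):
        chunk = " ".join(chunk.split())  # normalize surrounding whitespace
        if chunk:  # skip keywords that are followed by nothing
            groups[kw].append(chunk)
    return groups


s = """YES string1 string2 YES string3 string4 string5 YES string6 NO String7 NO string8 string9 YES string10 string11"""
print(group_after_keywords(s))
```

Any text before the first keyword ends up in parts[0] and is ignored, and a keyword with nothing after it contributes no entry. Whether that is the right behavior depends on your data.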