Extract space separated words from a sentence in Python

Time:06-23

I have a list of strings, say x1 = ['esk','wild man','eskimo', 'sta','( )-6-[amina(4-chlora)(1-metha-1h-imidol-5-yl)mhyl]-4-(3-chlora)-1-methyl-2(1h)-quinoa']. I need to extract the entries of x1 that appear in a few sentences.

My sentence is "eskimo lives as a wild man in wild jungle and he stands as a guard". From this sentence, I need to extract the first word, eskimo, and the fifth and sixth words, wild man, because they appear as separate words exactly as in x1. I should not extract sta, even though sta is contained in "stands".

def get_name(input_str):
    prod_name = []
    for row in x1:
        if (row.strip().lower() in input_str.lower().strip()) or (len([x for x in input_str.split() if "\b" + x in row]) > 0):
            prod_name.append(row)
    return list(set(prod_name))

The function get_name("eskimo lives as a wild man in wild jungle and he stands as a guard") returns

[esk, eskimo, wild man, sta]

But the expected output is

[eskimo,wild man]

May I know what has to be changed in the code?

CodePudding user response:

You can use regular expressions with word boundaries (\b), so that an entry only matches when it appears as a whole word:

import re

x1 = ['esk','wild man','eskimo', 'sta']

my_str = "eskimo lives as a wild man in wild jungle and he stands as a guard"
my_list = []

for words in x1:
    if re.search(r'\b' + words + r'\b', my_str):
        my_list.append(words)
print(my_list)
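
One caveat with the snippet above: it leaves out the long chemical name from the question's x1, and that entry contains characters such as ( ) [ ] that a regular expression would treat as pattern syntax. A minimal sketch of one way to handle it, assuming that "separate word" means "delimited by whitespace or the start/end of the string": escape each entry with re.escape and replace \b with the lookarounds (?<!\S) and (?!\S). The function name find_terms is made up for this illustration.

```python
import re

x1 = ['esk', 'wild man', 'eskimo', 'sta',
      '( )-6-[amina(4-chlora)(1-metha-1h-imidol-5-yl)mhyl]-4-(3-chlora)-1-methyl-2(1h)-quinoa']

def find_terms(sentence, terms):
    """Return the terms that occur in sentence as whole, whitespace-delimited units."""
    found = []
    for term in terms:
        # re.escape makes characters such as ( ) [ ] match literally.
        # (?<!\S) and (?!\S) require whitespace or a string edge on each side.
        pattern = r'(?<!\S)' + re.escape(term) + r'(?!\S)'
        if re.search(pattern, sentence, flags=re.IGNORECASE):
            found.append(term)
    return found

my_str = "eskimo lives as a wild man in wild jungle and he stands as a guard"
print(find_terms(my_str, x1))  # ['wild man', 'eskimo']
```

Because the lookarounds only accept whitespace as a delimiter, an entry followed directly by punctuation (for example "wild man,") would not match; if that matters, the sentence would need to be cleaned of punctuation first.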

CodePudding user response:

You could simply use str.split(" ") to get a list of all the words in the sentence, and then do the following:

s = "eskimo lives as a wild man in wild jungle and he stands as a guard"

l = s.split(" ")

x1 = ['esk','wild man','eskimo', 'sta','( )-6-[amina(4-chlora)(1-metha-1h-imidol-5-yl)mhyl]-4-(3-chlora)-1-methyl-2(1h)-quinoa']
new_x1 = [word.split(" ") for word in x1 if " " in word] + [word for word in x1 if " " not in word]

ans = []

for x in new_x1:
    if type(x) == str:
        if x in l:
            ans.append(x)
    else:
        temp = ""
        for i in x:
            temp += i + " "
        temp = temp[:-1]
        if all(sub_x in l for sub_x in x) and temp in s:
            ans.append(temp)

print(ans)
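
The same word-level idea can be sketched more compactly by comparing each entry's words against every window of the same length in the sentence's word list. This is only an illustration, and the helper names contains_phrase and extract are hypothetical. It also avoids the substring check (temp in s), which could accept "wild man" inside a phrase like "wild mango" when "man" happens to appear elsewhere in the sentence.

```python
def contains_phrase(words, phrase_words):
    """True if phrase_words occurs as a contiguous run inside words."""
    n = len(phrase_words)
    if n == 0:
        return False  # an empty entry never matches
    return any(words[i:i + n] == phrase_words
               for i in range(len(words) - n + 1))

def extract(sentence, terms):
    """Return the terms whose words appear, in order, as whole words."""
    words = sentence.lower().split()
    return [t for t in terms if contains_phrase(words, t.lower().split())]

s = "eskimo lives as a wild man in wild jungle and he stands as a guard"
x1 = ['esk', 'wild man', 'eskimo', 'sta',
      '( )-6-[amina(4-chlora)(1-metha-1h-imidol-5-yl)mhyl]-4-(3-chlora)-1-methyl-2(1h)-quinoa']

print(extract(s, x1))  # ['wild man', 'eskimo']
```

Results come back in the order of x1, and matching is case-insensitive because both sides are lowercased before comparison.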