features = ['ar','ar urticaria','urticaria','including','allergic']
rubbish_words = ['including','ar']
words = []
for line in features:
    new_words = ' '.join([word for word in line.split() if not any([zeyda in word for zeyda in rubbish_words])])
    words.append(new_words)
print(words)
['', '', '', '', 'allergic']
Expected result: ['',' urticaria','urticaria','','allergic']
CodePudding user response:
import re
features = ['ar', 'ar urticaria', 'urticaria', 'including', 'allergic']
rubbish_words = ['including', 'ar']
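# Match whole words only; a bare 'including|ar' would also strip the 'ar' inside 'urticaria'
pattern = r'\b(?:' + '|'.join(map(re.escape, rubbish_words)) + r')\b'
new_features = [re.sub(pattern, '', word).strip() for word in features]
print(new_features)
['', 'urticaria', 'urticaria', '', 'allergic']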
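With a regex, whether the pattern matches substrings or whole words decides the result. Here is a minimal sketch of the difference, assuming whole-word removal is the goal; the helper name `strip_words` and its `whole_word` flag are hypothetical and used only for illustration.

```python
import re

def strip_words(features, rubbish_words, whole_word=True):
    # Escape each word so regex metacharacters are treated literally.
    alternation = '|'.join(re.escape(w) for w in rubbish_words)
    # \b anchors restrict matches to whole words; without them, substrings match too.
    pattern = rf'\b(?:{alternation})\b' if whole_word else alternation
    return [re.sub(pattern, '', f).strip() for f in features]

features = ['ar', 'ar urticaria', 'urticaria', 'including', 'allergic']
rubbish_words = ['including', 'ar']

print(strip_words(features, rubbish_words, whole_word=False))
# → ['', 'urticia', 'urticia', '', 'allergic']
print(strip_words(features, rubbish_words, whole_word=True))
# → ['', 'urticaria', 'urticaria', '', 'allergic']
```

Without the anchors, the 'ar' inside 'urticaria' is removed as well, which is usually not what is wanted.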
CodePudding user response:
Iterate over the features and split each one into tokens. Build a list from those tokens, excluding any that are found in rubbish_words, then join the remaining tokens and append the result to a new output list.
features = ['ar','ar urticaria','urticaria','including','allergic']
rubbish_words = ['including','ar']
new_words = []
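for w in features:
    # keep only the tokens that are not rubbish words
    tl = [w_ for w_ in w.split() if w_ not in rubbish_words]
    # join with a space so multi-word features stay separated
    new_words.append(' '.join(tl))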
print(new_words)
Output:
['', 'urticaria', 'urticaria', '', 'allergic']
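The token-filtering approach can also be wrapped in a reusable function. This is a minimal sketch, assuming whitespace-separated tokens and exact, case-sensitive matches; the name `remove_rubbish` is hypothetical.

```python
def remove_rubbish(features, rubbish_words):
    # A set gives constant-time membership checks for larger word lists.
    rubbish = set(rubbish_words)
    # Compare whole tokens, then rejoin the survivors with single spaces.
    return [' '.join(t for t in f.split() if t not in rubbish) for f in features]

print(remove_rubbish(['ar', 'ar urticaria', 'urticaria ar allergic'], ['including', 'ar']))
# → ['', 'urticaria', 'urticaria allergic']
```

Because whole tokens are compared rather than substrings, 'urticaria' is left intact even though it contains 'ar'.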