I have two lists:
keywords = ['critic', 'argu', 'dog', 'cat']
splitSentences = ['Add', 'critical', 'argument', 'birds']
I need to find how many words in splitSentences begin with one of the words in keywords. In my example, that would be 2 (for critical matching "critic" and argument matching "argu").
The problem is that doing set(keywords).intersection(splitSentences) returns 0. I tried prefixing every word in keywords with ^, but it still returns 0.
Apologies, I'm quite new to Python. I'm working in a Jupyter notebook.
CodePudding user response:
With regex:
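import re
for i in keywords:
    count = 0
    # '^' anchors the pattern at the start of the word
    pref = '^' + i
    for word in splitSentences:
        if re.match(pref, word):
            count += 1
    # number of words that start with this keyword
    print(count)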
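If a single total is wanted rather than one count per keyword, the same regex idea can be sketched with one combined pattern. This is only a sketch: the names pattern and total are mine, and re.escape is added on the assumption that a keyword might contain regex metacharacters.

```python
import re

keywords = ['critic', 'argu', 'dog', 'cat']
splitSentences = ['Add', 'critical', 'argument', 'birds']

# Build one alternation such as "critic|argu|dog|cat".
# re.escape guards against keywords containing regex metacharacters.
pattern = re.compile('|'.join(re.escape(k) for k in keywords))

# re.match only matches at the start of the string, so no '^' is needed.
total = sum(1 for word in splitSentences if pattern.match(word))
print(total)  # 2
```

Each word is counted at most once here, even if several keywords are prefixes of it.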
The semi one liner:
for i in keywords:
    print(sum([1 for word in splitSentences if word.startswith(i)]))
The one liner:
print({el:sum([1 for word in splitSentences if word.startswith(el)]) for el in keywords})
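As for why the set approach finds nothing: set intersection compares whole strings for equality, so 'critic' and 'critical' never match. To get the single number asked for in the question, one option is to pass all keywords to str.startswith at once, since it accepts a tuple of prefixes. A minimal sketch (the names prefixes and matching are just for illustration):

```python
keywords = ['critic', 'argu', 'dog', 'cat']
splitSentences = ['Add', 'critical', 'argument', 'birds']

# str.startswith accepts a tuple of prefixes and returns True if any matches.
prefixes = tuple(keywords)

# Keep each word that begins with at least one keyword.
matching = [word for word in splitSentences if word.startswith(prefixes)]

print(matching)       # ['critical', 'argument']
print(len(matching))  # 2
```

Note that the argument must be a tuple; passing the list directly raises a TypeError.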
CodePudding user response:
keywords = ['critic', 'argu', 'dog', 'cat']
splitSentences = ['Add', 'critical', 'argument', 'birds']
for s in splitSentences:
    for k in keywords:
        if s.startswith(k):
            print(s)
Pretty much self-explanatory: iterate over splitSentences, and for each word iterate over keywords and check whether the word starts with the keyword.
One-liner:
[s for k in keywords for s in splitSentences if s.startswith(k)]
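Time complexity: O(s * k), where s is the number of words and k is the number of keywords. A trie data structure would be more efficient: O(s + k), if word lengths are treated as bounded.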
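For reference, here is a rough sketch of the trie idea using nested dicts. The helper names build_trie and has_keyword_prefix and the _END marker are my own, not from any library. The keywords are inserted once, and each word is then checked by walking the trie character by character, so the work per word depends on its length rather than on the number of keywords.

```python
_END = object()  # sentinel key marking the end of a keyword; never equals a character


def build_trie(keywords):
    """Build a nested-dict trie from the keywords."""
    root = {}
    for keyword in keywords:
        node = root
        for ch in keyword:
            node = node.setdefault(ch, {})
        node[_END] = True
    return root


def has_keyword_prefix(trie, word):
    """Return True if some keyword is a prefix of word."""
    node = trie
    for ch in word:
        if _END in node:      # a complete keyword has already been matched
            return True
        if ch not in node:    # no keyword continues with this character
            return False
        node = node[ch]
    return _END in node       # the word itself is exactly a keyword


keywords = ['critic', 'argu', 'dog', 'cat']
splitSentences = ['Add', 'critical', 'argument', 'birds']

trie = build_trie(keywords)
count = sum(1 for word in splitSentences if has_keyword_prefix(trie, word))
print(count)  # 2
```

For four keywords and four words this is overkill; it only starts to pay off when the keyword list is large.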