I have tried researching this but can't find anything like what I am looking for.
I am trying to find a specific phrase in a list. Here is a test list:
data = {"text":["Map:","Internet",
"Subscriptions","","Map:",
"Adult","Literacy","and",
"Numeracy","|","8"]}
I want to get the index position of the first word of a phrase I am looking for, such as: Map: Adult Literacy and Numeracy. The answer for this would be 4, since the first word of the phrase is Map:. However, there are two Map: entries in the list, and I only need the one that is part of the whole phrase Map: Adult Literacy and Numeracy.
Here is what I tried:
teststring = "Map: Adult Literacy and Numeracy"
teststring_split = teststring.split(" ")
data = {"text":["Map:","Internet",
"Subscriptions","","Map:",
"Adult","Literacy","and",
"Numeracy","|","8"]}
if teststring in " ".join(data["text"]):
    idx = data["text"].index(teststring.split(' ')[0])
    print(idx)
However, it prints 0, which makes sense because index() returns the first Map: in the list, and I am not sure how to get the specific Map: that is part of the phrase.
EDIT: I am close thanks to @Alexander's answer. I would have accepted it as correct, but it only checks the first two values of the phrase's split string. I need to check every value, because the list and the phrases are dynamic and some phrases are very similar in wording.
Here is the code I have so far now:
for i in range(len(data['text'])):
    if data['text'][i] == teststring_split[0]:
        for m in range(len(teststring_split)):
            if data['text'][i + m] == teststring_split[m]:
                print(teststring_split[m])
This will output:
Map:
Map:
Adult
Literacy
and
Numeracy
So I can confirm the phrase as it prints out, but I am not sure how to get the index 4 after confirming the last word, Numeracy.
CodePudding user response:
A list comprehension will work. Just iterate through the data looking for indexes where the value equals Map: and the slice starting at that index equals the whole split teststring.
teststring = "Map: Adult Literacy and Numeracy"
teststring_split = teststring.split(" ")
data = {"text":["Map:","Internet",
"Subscriptions","","Map:",
"Adult","Literacy","and",
"Numeracy","|","8"]}
idxs = [i for i in range(len(data['text']))
        if data['text'][i] == teststring_split[0]
        and data['text'][i:i + len(teststring_split)] == teststring_split]
print(idxs)
Output:
[4]
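If only the first match is needed, the same slice comparison can be wrapped in a small helper. This is a minimal sketch, not part of the original answer: find_phrase is a hypothetical name, and it assumes the phrase is split on single spaces in the same way the list was built.

```python
def find_phrase(words, phrase):
    """Return the index in `words` where `phrase` starts, or -1 if absent."""
    target = phrase.split(" ")
    n = len(target)
    # Stop early enough that a full-length slice always fits.
    for i in range(len(words) - n + 1):
        if words[i:i + n] == target:
            return i
    return -1


data = {"text": ["Map:", "Internet", "Subscriptions", "", "Map:",
                 "Adult", "Literacy", "and", "Numeracy", "|", "8"]}
print(find_phrase(data["text"], "Map: Adult Literacy and Numeracy"))  # 4
```

Returning -1 for "not found" mirrors str.find; raising ValueError like list.index would be an equally reasonable choice.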
CodePudding user response:
You should probably start by, instead of converting teststring to a list, using " ".join(data["text"]) to make data one string. This makes it much easier to scan through.
Then, use regular expressions to search for your phrase:
import re
teststring = "Map: Adult Literacy and Numeracy"
data = {"text":["Map:","Internet",
"Subscriptions","","Map:",
"Adult","Literacy","and",
"Numeracy","|","8"]}
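# Join the list of words (not the dict itself) with spaces so the phrase can match.
data = " ".join(data["text"])
# re.escape keeps any special characters in the phrase literal.
match = re.search(re.escape(teststring), data)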
print(match)
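A regex match gives a character offset into the joined string rather than a list index, so one more step is needed to get back to 4. The following is a rough sketch under two assumptions: no list element contains a space, and the helper name phrase_index_via_regex is made up for illustration. Counting the spaces before the match start then gives the number of elements that precede it.

```python
import re


def phrase_index_via_regex(words, phrase):
    """Return the list index where `phrase` starts, or -1 if it is not found.

    Assumes no element of `words` contains a space.
    """
    joined = " ".join(words)
    # The lookarounds require the match to begin and end on whole words.
    pattern = r"(?<!\S)" + re.escape(phrase) + r"(?!\S)"
    match = re.search(pattern, joined)
    if match is None:
        return -1
    # Each element before the match contributes exactly one separator space.
    return joined.count(" ", 0, match.start())


words = ["Map:", "Internet", "Subscriptions", "", "Map:",
         "Adult", "Literacy", "and", "Numeracy", "|", "8"]
print(phrase_index_via_regex(words, "Map: Adult Literacy and Numeracy"))  # 4
```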
CodePudding user response:
I came up with an answer at the same time @alexander fixed his. His is better since it's less code, but here is the version I came up with before I saw his answer:
for i in range(len(data['text'])):
    if data['text'][i] == teststring_split[0]:
        # Count how many consecutive words match from this start position.
        testindexchecker = 0
        for m in range(len(teststring_split)):
            if data['text'][i + m] == teststring_split[m]:
                print(teststring_split[m])
                testindexchecker = testindexchecker + 1
        # Every word of the phrase matched, so i is the start index.
        if testindexchecker == len(teststring_split):
            idxs = i
            print(i)
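One thing to watch with the explicit loop is that data['text'][i + m] can raise IndexError when the first word appears near the end of the list. Here is a sketch of the same idea with a bounds guard and an early exit on the first mismatch; first_phrase_index is a hypothetical name, and it returns -1 when nothing matches.

```python
def first_phrase_index(words, target):
    """Explicit-loop search: start index of `target` in `words`, or -1."""
    for i in range(len(words)):
        # Stop once the phrase would run off the end of the list.
        if i + len(target) > len(words):
            break
        for m in range(len(target)):
            if words[i + m] != target[m]:
                break  # mismatch: give up on this start position
        else:
            # The inner loop never hit `break`, so every word matched.
            return i
    return -1


words = ["Map:", "Internet", "Subscriptions", "", "Map:",
         "Adult", "Literacy", "and", "Numeracy", "|", "8"]
print(first_phrase_index(words, "Map: Adult Literacy and Numeracy".split(" ")))  # 4
```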