I built dictionaries using the .groupdict() method, but I am having trouble filtering out certain output dictionaries. For example, my code looks like this (tweet is a string that contains 5 elements separated by ||):
def somefuntion(pattern,tweet):
    pattern = r"^(?P<username>.*?)(?:\|{2}[^|]+){2}\|{2}(?P<botprob>.*?)(?:\|{2}|$)"
    for paper in tweet:
        for item in re.finditer(pattern,paper):
            item.groupdict()
This produces an output in the form:
{'username': 'yashrgupta ', 'botprob': ' 0.30794588629999997 '}
{'username': 'sterector ', 'botprob': ' 0.39391528649999996 '}
{'username': 'MalcolmXon ', 'botprob': ' 0.05630123819 '}
{'username': 'ryechuuuuu ', 'botprob': ' 0.08492567222000001 '}
{'username': 'dpsisi ', 'botprob': ' 0.8300337045 '}
But I would like it to only return dictionaries whose botprob is above 0.7. How do I do this?
CodePudding user response:
As @WiktorStribizew notes, just skip the iterations you don't want:
pattern = r"^(?P<username>.*?)(?:\|{2}[^|]+){2}\|{2}(?P<botprob>.*?)(?:\|{2}|$)"
for paper in tweet:
    for item in re.finditer(pattern, paper):
        item = item.groupdict()
        # botprob is captured as a string, so convert it before comparing
        if float(item["botprob"]) <= 0.7:
            continue
        print(item)
This could be wrapped in a generator expression to save the explicit continue, but there's enough going on as it is without making it harder to read (in this case).
UPDATE: since you are apparently in a function, collect the matching dictionaries in a list and return it:
def somefuntion(pattern, tweet):
    pattern = r"^(?P<username>.*?)(?:\|{2}[^|]+){2}\|{2}(?P<botprob>.*?)(?:\|{2}|$)"
    items = []
    for paper in tweet:
        for item in re.finditer(pattern, paper):
            item = item.groupdict()
            if float(item["botprob"]) > 0.7:
                items.append(item)
    return items
Or using comprehensions:
groupdicts = (item.groupdict() for paper in tweet for item in re.finditer(pattern, paper))
return [item for item in groupdicts if float(item["botprob"]) > 0.7]
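For reference, here is a minimal self-contained sketch of the comprehension version that can be run as-is. The function name high_probability_entries, the threshold parameter, and the sample lines are hypothetical; the input format (five fields separated by " || ", with botprob as the fourth) is an assumption inferred from the output shown in the question, and the pattern assumes the middle fields are skipped with [^|]+.

```python
import re

# Assumed pattern: "[^|]+" skips the two middle fields between username and botprob.
PATTERN = re.compile(
    r"^(?P<username>.*?)(?:\|{2}[^|]+){2}\|{2}(?P<botprob>.*?)(?:\|{2}|$)"
)


def high_probability_entries(lines, threshold=0.7):
    """Return the groupdicts whose botprob is strictly above threshold."""
    groupdicts = (
        match.groupdict()
        for line in lines
        for match in PATTERN.finditer(line)
    )
    # float() tolerates the surrounding spaces captured by the pattern.
    return [d for d in groupdicts if float(d["botprob"]) > threshold]


# Hypothetical input lines; the real tweet data may look different.
sample_lines = [
    "yashrgupta || en || 12 || 0.30794588629999997 || x",
    "dpsisi || en || 7 || 0.8300337045 || y",
]

print(high_probability_entries(sample_lines))
```

Compiling the pattern once and passing the threshold as a parameter keeps the function reusable if the cutoff changes later.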
CodePudding user response:
"I would like it to only return dictionaries whose botprob is above 0.7."
If the dictionaries are already collected in a list, a list comprehension can filter them:
entries = [{'username': 'yashrgupta ', 'botprob': ' 0.30794588629999997 '},
           {'username': 'sterector ', 'botprob': ' 0.39391528649999996 '},
           {'username': 'MalcolmXon ', 'botprob': ' 0.05630123819 '},
           {'username': 'ryechuuuuu ', 'botprob': ' 0.08492567222000001 '},
           {'username': 'dpsisi ', 'botprob': ' 0.8300337045 '}]
filtered_entries = [e for e in entries if float(e['botprob'].strip()) > 0.7]
print(filtered_entries)
Output:
[{'username': 'dpsisi ', 'botprob': ' 0.8300337045 '}]
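Note that the filtered dictionary still carries the padded strings captured by the regex. If cleaner values are wanted, one option (a sketch, not part of the original answer) is to normalize each entry before filtering. The helper name clean_entry is hypothetical.

```python
def clean_entry(entry):
    """Strip the username and convert botprob to a float (hypothetical helper)."""
    return {
        "username": entry["username"].strip(),
        "botprob": float(entry["botprob"]),  # float() ignores surrounding whitespace
    }


entries = [
    {'username': 'yashrgupta ', 'botprob': ' 0.30794588629999997 '},
    {'username': 'dpsisi ', 'botprob': ' 0.8300337045 '},
]

cleaned = [clean_entry(e) for e in entries]
flagged = [e for e in cleaned if e["botprob"] > 0.7]
print(flagged)  # [{'username': 'dpsisi', 'botprob': 0.8300337045}]
```

Converting once up front means later comparisons or sorting can use the numeric value directly instead of calling float() each time.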