Limiting the output

Time:09-28

I built dictionaries with the .groupdict() method, but I'm having trouble filtering out some of the resulting dictionaries. For example, my code looks like this (tweet is a string that contains 5 elements separated by ||):

def somefuntion(pattern, tweet):
    pattern = r"^(?P<username>.*?)(?:\|{2}[^|]+){2}\|{2}(?P<botprob>.*?)(?:\|{2}|$)"
    for paper in tweet:
        for item in re.finditer(pattern, paper):
            item.groupdict()

This produces an output in the form:

{'username': 'yashrgupta ', 'botprob': ' 0.30794588629999997 '}
{'username': 'sterector ', 'botprob': ' 0.39391528649999996 '}
{'username': 'MalcolmXon ', 'botprob': ' 0.05630123819 '}
{'username': 'ryechuuuuu ', 'botprob': ' 0.08492567222000001 '}
{'username': 'dpsisi ', 'botprob': ' 0.8300337045 '}

But I would like it to only return dictionaries whose botprob is above 0.7. How do I do this?

CodePudding user response:

Specifically, as @WiktorStribizew notes, just skip iterations you don't want:

import re

pattern = r"^(?P<username>.*?)(?:\|{2}[^|]+){2}\|{2}(?P<botprob>.*?)(?:\|{2}|$)"
for paper in tweet:
    for item in re.finditer(pattern, paper):
        item = item.groupdict()
        # botprob is captured as a string, so convert it before comparing
        if float(item["botprob"]) <= 0.7:
            continue
        print(item)

This could be wrapped in a generator expression to save the explicit continue, but there's enough going on as it is without making it harder to read (in this case).
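For reference, here is a minimal sketch of that generator idea, written as a generator function rather than a one-line expression. The name iter_likely_bots, the threshold parameter, and the sample lines are made up for illustration; only the usernames and botprob values come from the question. It also assumes the skipped fields are matched with [^|]+.

```python
import re

# Assumes the two skipped fields are matched with [^|]+ (one or more non-pipe characters).
PATTERN = re.compile(
    r"^(?P<username>.*?)(?:\|{2}[^|]+){2}\|{2}(?P<botprob>.*?)(?:\|{2}|$)"
)


def iter_likely_bots(lines, threshold=0.7):
    """Yield the groupdict of every line whose botprob is above threshold."""
    for line in lines:
        for match in PATTERN.finditer(line):
            fields = match.groupdict()
            # The captured value is a string such as ' 0.83 ', so convert it first.
            if float(fields["botprob"]) > threshold:
                yield fields


# Hypothetical input: the middle and last fields are placeholders.
sample_lines = [
    "yashrgupta || placeholder text || en || 0.30794588629999997 || 3",
    "dpsisi || placeholder text || en || 0.8300337045 || 7",
]

likely_bots = list(iter_likely_bots(sample_lines))
print(likely_bots)  # [{'username': 'dpsisi ', 'botprob': ' 0.8300337045 '}]
```

Because it yields lazily, the caller decides whether to print each dictionary, collect them in a list, or stop early.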

UPDATE: since you are apparently inside a function, collect the matching dictionaries in a list and return it:

import re

def somefuntion(pattern, tweet):
    pattern = r"^(?P<username>.*?)(?:\|{2}[^|]+){2}\|{2}(?P<botprob>.*?)(?:\|{2}|$)"
    items = []
    for paper in tweet:
        for item in re.finditer(pattern, paper):
            item = item.groupdict()
            # Keep only the entries whose bot probability is above 0.7
            if float(item["botprob"]) > 0.7:
                items.append(item)
    return items

Or, inside the same function, using a generator expression and a list comprehension:

groupdicts = (item.groupdict() for paper in tweet for item in re.finditer(pattern, paper))
return [item for item in groupdicts if float(item["botprob"]) > 0.7]

CodePudding user response:

I would like it to only return dictionaries whose botprob is above 0.7.

entries = [{'username': 'yashrgupta ', 'botprob': ' 0.30794588629999997 '},
           {'username': 'sterector ', 'botprob': ' 0.39391528649999996 '},
           {'username': 'MalcolmXon ', 'botprob': ' 0.05630123819 '},
           {'username': 'ryechuuuuu ', 'botprob': ' 0.08492567222000001 '},
           {'username': 'dpsisi ', 'botprob': ' 0.8300337045 '}]

filtered_entries = [e for e in entries if float(e['botprob'].strip()) > 0.7]
print(filtered_entries)

Output:

[{'username': 'dpsisi ', 'botprob': ' 0.8300337045 '}]
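A short sketch of why the conversion matters: the regex captures botprob as a string padded with spaces, and Python 3 refuses to compare a string with a number. float() already ignores surrounding whitespace, so the .strip() call is optional. The variable names below are illustrative only.

```python
raw = " 0.8300337045 "  # value as captured by the regex, padded with spaces

# float() accepts leading and trailing whitespace, so .strip() is not required.
value = float(raw)
print(value > 0.7)  # True

# Comparing the raw string with a number raises TypeError in Python 3.
comparison_failed = False
try:
    raw > 0.7
except TypeError as exc:
    comparison_failed = True
    print("TypeError:", exc)
```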