Check if dictionary items occur "similarly"

Time:10-24

I am trying to implement a function that checks whether a Counter contains a "similar" percentage of each item. For example:

from collections import Counter

c = Counter(["Dog", "Cat", "Dog", "Horse", "Dog"])
size = 5
lst = list(c.values())
percentages = [x / size * 100 for x in lst]  # [60.0, 20.0, 20.0]

How can I check whether those percentages are all "similar"? I would like to use math.isclose with abs_tol=2, but it compares two values, not an entire list.

In this example, the items do not occur similarly.

This method will be used for checking whether a training set of labels is balanced or not.

CodePudding user response:

One way is to take the minimum and maximum of the percentages list and pass those to isclose(). If the two extremes are within abs_tol of each other, every other pair is as well:

from math import isclose
from collections import Counter


def is_balanced(lst, abs_tol):
    c = Counter(lst)
    total = c.total()  # Counter.total() needs Python 3.10+; use sum(c.values()) on older versions
    percentages = [(v / total) * 100 for v in c.values()]
    return isclose(min(percentages), max(percentages), abs_tol=abs_tol)


lst1 = ["Dog", "Cat", "Dog", "Horse", "Dog"]
lst2 = ["Dog", "Cat", "Horse"]

print(is_balanced(lst1, 2))  # False
print(is_balanced(lst2, 2))  # True
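
A variation on the same idea, offered only as a minimal sketch: instead of comparing the two extremes, compare every percentage against the share each label would have if the set were perfectly uniform (100 / number of distinct labels). The function name is_balanced_vs_uniform, the default tolerance of 2 percentage points, and the choice to treat an empty input as balanced are assumptions made for illustration, not part of the answer above.

```python
from collections import Counter
from math import isclose


def is_balanced_vs_uniform(labels, abs_tol=2.0):
    """Return True if every label's percentage is within abs_tol
    percentage points of the uniform share (hypothetical helper)."""
    counts = Counter(labels)
    if not counts:
        return True  # assumption: an empty label set counts as balanced
    total = sum(counts.values())      # works on Python versions before 3.10 too
    expected = 100 / len(counts)      # share of each label under perfect balance
    return all(
        isclose(count / total * 100, expected, abs_tol=abs_tol)
        for count in counts.values()
    )


print(is_balanced_vs_uniform(["Dog", "Cat", "Dog", "Horse", "Dog"]))  # False
print(is_balanced_vs_uniform(["Dog", "Cat", "Horse"]))                # True
```

The two checks are not equivalent: the min/max version bounds the spread between the rarest and the most common label, while this one bounds each label's distance from the ideal share, so the same abs_tol is a looser criterion here.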

CodePudding user response:

Using np.isclose():

from collections import Counter
import numpy as np

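def is_balanced(lst) -> bool:
    c = Counter(lst)
    # Fraction of the list taken up by each distinct item
    fractions = np.asarray(list(c.values())) / len(lst)
    # Compare every fraction with the uniform share 1 / (number of distinct items);
    # bool() converts the NumPy boolean into a plain Python bool
    return bool(np.isclose(fractions, 1 / len(c)).all())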

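See the documentation of np.isclose() for the rtol and atol arguments. With the defaults (rtol=1e-05, atol=1e-08), only an almost exactly uniform distribution passes.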
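
To get the 2-percentage-point tolerance from the question, the tolerance can be passed explicitly. The following is a rough sketch, assuming a non-empty list of labels; the name is_balanced_np and the default atol=0.02 are chosen here for illustration. Because the values are fractions rather than percentages, 2 percentage points becomes 0.02, and rtol is set to 0 so that only the absolute tolerance applies.

```python
from collections import Counter

import numpy as np


def is_balanced_np(labels, atol=0.02):
    """Return True if every label's fraction is within atol of the
    uniform share (hypothetical helper; assumes labels is non-empty)."""
    counts = Counter(labels)
    fractions = np.array(list(counts.values())) / len(labels)
    uniform = 1 / len(counts)  # share of each label under perfect balance
    # np.isclose tests |a - b| <= atol + rtol * |b|; rtol=0 leaves only atol
    return bool(np.isclose(fractions, uniform, rtol=0.0, atol=atol).all())


print(is_balanced_np(["Dog", "Cat", "Dog", "Horse", "Dog"]))  # False
print(is_balanced_np(["Dog", "Cat", "Horse"]))                # True
```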
