Home > Software engineering >  In Python, how can I check if an array contains all elements of another array/list, including duplic
In Python, how can I check if an array contains all elements of another array/list, including duplic

Time:08-19

There appear to be several ways to determine if a set is a subset of another set, but I haven't been able to find something concise that determines whether all elements of a list (or array), including duplicate values, appear in another list (or array). For example, for the hypothetical function contains_all(A, B) which checks whether all elements of B are contained in A, these are some expected outputs:

contains_all([11, 4, 11, 6], [6, 11]) returns True (order doesn't matter)

contains_all([11, 4, 11, 6], [11, 9]) returns False (no 9)

contains_all([11, 4, 11, 6], [11, 11]) returns True

contains_all([11, 4, 11, 6], [11, 11, 11]) returns False (only two 11's)

contains_all([11, 4, 11, 6], [6, 6]) returns False (only one 6)

contains_all([11, 4, 11, 6], [11, 4, 6, 11]) returns True

contains_all([11, 4, 11, 6], [11, 4, 11, 6, 5]) returns False (no 5)

The fourth and fifth examples above, specifically, are what I'm having trouble implementing. set(B).issubset(A) or list comprehensions cover the other cases, but not these, since sets don't have duplicate elements. Is there a concise way to do this? If not, what would be the best way to approach writing a function that does this? It seems something like this may be possible with collections.Counter objects or multisets, but I'm not sure how to go about it.

CodePudding user response:

In Python 3.10, collections.Counter has this functionality built-in:

def contains_all(A, B):
    return Counter(A) >= Counter(B)

However, due to supporting negative counts, >= needs to iterate over all keys in both counters, which is inefficient when you know all counts are nonnegative. Manually checking just the keys from the B counter (as in Guy's answer) should be faster. (I don't have a Python 3.10 handy, so I can't provide timings.)

CodePudding user response:

You are correct, collections.Counter is a good way to go. You just need to go over the B counter and check if the the value is smaller or equal. Do it in all() to check all the key-value pairs

def contains_all(a, b):
    counter_a = Counter(a)
    counter_b = Counter(b)
    return all(v <= counter_a[k] for k, v in counter_b.items())

Edit

user2357112 answer is much nicer, but apply to Pyton3.10 or newer. For older versions you can use this answer.

CodePudding user response:

Try this

def contains(A, B):
contains_all = []
data_missing = []
for i in B:
    if i in A and B.count(i) <= A.count(i):
        contains_all.append(i)
    else:
        data_missing.append(i)
if data_missing:
    return False
else:
    return True

>>> print(contains([11, 4, 11, 6], [11, 4, 11, 6, 5]))
>>> False
  • Related