Home > Blockchain >  How to find the smallest threshold from one list to fit another list
How to find the smallest threshold from one list to fit another list

Time:12-16

I have two lists of marks for the same set of students. For example:

A = [22, 2, 88, 3, 93, 84]
B = [66, 0, 6, 33, 99, 45]

If I accept only students above a threshold according to list A then I can look up their marks in list B. For example, if I only accept students with at least a mark of 80 from list A then their marks in list B are [6, 99, 45].

I would like to compute the smallest threshold for A which gives at least 90% of students in the derived set in B getting at least 50. In this example the threshold will have to be 93 which gives the list [99] for B.

Another example:

A = [3, 36, 66, 88, 99, 52, 55, 42, 10, 70]
B = [5, 30, 60, 80, 80, 60, 45, 45, 15, 60]

In this case we have to set the threshold to 66 which then gives 100% of [60, 80, 80, 60] getting at least 50.

CodePudding user response:

This is an O(nlogn m) approach (due to sorting) where n is the length of A and m is the length of B:

from operator import itemgetter
from itertools import accumulate


def find_threshold(lst_a, lst_b):
    # find the order of the indices of lst_a according to the marks
    indices, values = zip(*sorted(enumerate(lst_a), key=itemgetter(1)))

    # find the cumulative sum of the elements of lst_b above 50 sorted by indices
    cumulative = list(accumulate(int(lst_b[j] > 50) for j in indices))

    for i, value in enumerate(values):
        # find the index where the elements above 50 is greater than 90%
        if cumulative[-1] - cumulative[i - 1] > 0.9 * (len(values) - i):
            return value

    return None


print(find_threshold([22, 2, 88, 3, 93, 84], [66, 0, 6, 33, 99, 45]))
print(find_threshold([3, 36, 66, 88, 99, 52, 55, 42, 10, 70], [5, 30, 60, 80, 80, 60, 45, 45, 15, 60]))

Output

93
66

CodePudding user response:

First, define a function that will tell you if 90% of students in a set scored more than 50:

def setb_90pc_pass(b):
    return sum(score >= 50 for score in b) >= len(b) * 0.9

Next, loop over scores in a in ascending order, setting each of them as the threshold. Filter out your lists according that threshold, and check if they fulfill your condition:

for threshold in sorted(A):
    filtered_a, filtered_b = [], []
    for ai, bi in zip(A, B):
        if ai >= threshold:
            filtered_a.append(ai)
            filtered_b.append(bi)
    if setb_90pc_pass(filtered_b):
        break

print(threshold)    
  • Related