I'm looking for a percentile function that accepts an array and an element and returns the closest percentile of that element.
Some examples:
percentile([1,2,3,4,5], 2) => 40%
percentile([1,2,3,4,5], 2.5) => 40%
percentile([1,2,3,4,5], 6) => 100%
Does anything like this exist in Python or NumPy?
NumPy does this: np.percentile(a=[1,2,3,4,5], q=3) => 1.12
which is not what I want.
CodePudding user response:
np.percentile(a, q) tells you the q-th percentile of the array a. This is the inverse of what you want. I don't think NumPy has a function that does what you want, but it's easy enough to write your own.
The percentile you're after is the fraction of elements in the array that are less than or equal to the given element (multiply by 100 for a percentage), so just compute that:
def percentile(lst: list, val) -> float:
    return sum(i <= val for i in lst) / len(lst)
If you have a NumPy array, you don't need to iterate over it, since <= will broadcast over the array:
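import numpy as np
def percentile(arr: np.ndarray, val) -> float:
    # np.asarray lets plain lists work too; <= broadcasts over the array
    return (np.asarray(arr) <= val).sum() / len(arr)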
>>> percentile([1,2,3,4,5], 2)
# 0.4
>>> percentile([1,2,3,4,5], 2.5)
# 0.4
>>> percentile([1,2,3,4,5], 6)
# 1.0
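If you need to look up several values at once, the same <= idea can be sketched with np.searchsorted on a sorted copy of the array. This is only a sketch: the name percentile_rank is made up for this example and is not a NumPy function.

```python
import numpy as np

def percentile_rank(arr, vals):
    """Fraction of entries in arr that are <= each value in vals."""
    sorted_arr = np.sort(np.asarray(arr))
    # side="right" returns, for each query value, the number of
    # sorted entries that are less than or equal to it.
    counts = np.searchsorted(sorted_arr, vals, side="right")
    return counts / sorted_arr.size

ranks = percentile_rank([1, 2, 3, 4, 5], [2, 2.5, 6])
print(ranks.tolist())  # [0.4, 0.4, 1.0]
```

Sorting once and then searching should scale better than comparing the whole array against each value when there are many lookups.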
CodePudding user response:
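You can also calculate it like this: sort the array, find the index of the element in the sorted array, add 1 to that index, divide by the length of the array, and multiply by 100. Note that this only works when the element is actually present in the array (arr.index raises ValueError otherwise), and with duplicates it uses the first occurrence.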
def percentile(arr, element):
    arr = sorted(arr)
    # position of the first occurrence; raises ValueError if element is absent
    index = arr.index(element)
    percentile = (index + 1) / len(arr) * 100
    return percentile
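The sorted-array approach above can probably be extended to values that are not in the array (such as 2.5 or 6 from the question) by swapping arr.index for bisect_right from the standard library. A minimal sketch, assuming "percentile" means the percentage of values less than or equal to the element:

```python
from bisect import bisect_right

def percentile(arr, element):
    """Percentage of values in arr that are <= element."""
    ordered = sorted(arr)
    # bisect_right gives the insertion point to the right of any equal
    # values, i.e. the count of sorted values <= element. It works for
    # absent elements and counts every duplicate.
    count = bisect_right(ordered, element)
    return 100 * count / len(ordered)

print(percentile([1, 2, 3, 4, 5], 2))    # 40.0
print(percentile([1, 2, 3, 4, 5], 2.5))  # 40.0
print(percentile([1, 2, 3, 4, 5], 6))    # 100.0
```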