Finding percentile for each data point in a numpy array

Time:04-21

I have the following line of code:

threshold_value = numpy.percentile(a, q)

where a is my data and q is set to, say, 95.

If I changed q to 90, I would get a different threshold value.

For each data point in a, I would like to calculate the value of q for which threshold_value would equal that data point. For example, a data point may be below threshold_value, but I want a percentile value that shows exactly where it sits. When I have a test dataset, I compare each value to the threshold value to see whether it exceeds it. So I don't want to supply a q value; I want to be told what the q value is for a given data point.

So I want a function, maybe a_percentile = function(a), where a_percentile is the original data transformed to percentiles.

Answer:

Use scipy.stats.percentileofscore (https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.percentileofscore.html):

import numpy as np
from scipy.stats import percentileofscore

a = np.linspace(0, 10, 10)
# kind='strict' gives the percentage of values in a strictly below each point
[percentileofscore(a, i, kind='strict') for i in a]

Output:

[0.0, 10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0, 90.0]
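
The list comprehension above calls percentileofscore once per element, which can get slow for large arrays. Below is a minimal vectorized sketch of the same idea using np.sort and np.searchsorted. The function name to_percentile and its parameters (reference, values, kind) are made up for illustration and are not part of NumPy or SciPy. It only covers the 'strict' and 'weak' conventions, and because np.percentile interpolates between data points by default, it should not be expected to be an exact inverse of np.percentile.

```python
import numpy as np

def to_percentile(reference, values=None, kind="strict"):
    """Hypothetical helper: percentage of `reference` below each value.

    kind="strict": counts reference entries strictly less than the value.
    kind="weak":   counts reference entries less than or equal to the value.
    If `values` is None, the reference data itself is scored.
    """
    if kind not in ("strict", "weak"):
        raise ValueError("kind must be 'strict' or 'weak'")
    reference = np.asarray(reference, dtype=float)
    values = reference if values is None else np.asarray(values, dtype=float)
    sorted_ref = np.sort(reference)
    # side='left' counts entries < value, side='right' counts entries <= value
    side = "left" if kind == "strict" else "right"
    counts = np.searchsorted(sorted_ref, values, side=side)
    return 100.0 * counts / reference.size

a = np.linspace(0, 10, 10)

# Percentile of every point in a, in the original order of a
a_percentile = to_percentile(a)

# Percentile of new (test) values relative to a
test_percentile = to_percentile(a, values=[-1.0, 5.0, 20.0])
```

Passing the test dataset as values gives its percentiles relative to the original data, so you can see where each test point falls instead of only checking whether it exceeds a single threshold.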