I have this graph with a line that follows y = x
function and shows points in there. Like in the following image:
My question is: Given n points in the graph (columns x and y), how can I get the percent of points below and above the line?
What I tried is this:
def function(df, y):
total = len(df)
count = (df[y] > df.index).sum()
percent = (count*100)/total
return percent
Where total
is the total of points of a dataframe and count
is the sum of all values of the column y greater than the index. That point of view is wrong.
What I want is, for example, given 10 points, says 70% of the points are below of the line and can count 7 points below the line.
CodePudding user response:
To get the percentage of points below the line, you can use
(df[y] <= df.index).mean() * 100
CodePudding user response:
Points below the line satisfy the equation x > y
. So, the percentage is:
df[df.x > df.y].size / df[[x, y]].size * 100
CodePudding user response:
For a point to be below the line, its x coordinate must be greater than its y coordinate:
(df['x'] > df['y']).sum() / len(df) * 100
CodePudding user response: