Find average of a column in a dataframe given conditions on another column

Say I have a dataframe with x and y columns, and I wish to write a function

    def ave(pd,minx,maxx):

which calculates the average of the y values whose corresponding x values lie between minx and maxx, i.e. in the following example:

    ave(file, 2, 3) #where file is wherever I import these x and y values from

it would return 3.3857...

I have tried the following:

    def ave(pd,minx,maxx):
        x = list(data.iloc[:, 0].values)
        y = list(data.iloc[:, 1].values)
        lst=[]
        for i in x:
            if x[i]>xmin and x[i]<xmax:
                lst =y[i]
        return (sum(lst)/len(list))

but this gives the error: list indices must be integers or slices, not numpy.float64

CodePudding user response:

Why not just select rows where those conditions are true? You really should avoid looping as much as possible when working with dataframes.

    def y_average(df, min_x, max_x):
        # Keep rows where x lies strictly between min_x and max_x, then average y
        return df.loc[(df["x"] > min_x) & (df["x"] < max_x), "y"].mean()

Usage:

    In [3]: y_average(df, 2, 3)
    Out[3]: 3.3857142857142857
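
As for why the original attempt fails: `for i in x` yields the x values themselves (of type `numpy.float64`), not positions, so `x[i]` tries to index a list with a float. The attempt also assigns `lst = y[i]` instead of appending, and uses `xmin`, `xmax` and `list` where `minx`, `maxx` and `lst` were presumably intended. Below is a minimal sketch of a corrected loop next to the boolean-mask version. The sample data and the names `ave_loop` and `ave_mask` are made up for illustration (the original table is not reproduced here), and the column names `x` and `y` are assumed, as in the answer.

```python
import pandas as pd

# Hypothetical sample data, NOT the data from the question.
df = pd.DataFrame({
    "x": [1.0, 2.0, 2.5, 2.8, 3.0, 4.0],
    "y": [10.0, 20.0, 3.0, 5.0, 30.0, 40.0],
})


def ave_loop(data, minx, maxx):
    """Loop version of the original attempt with the bugs fixed."""
    x = data.iloc[:, 0].tolist()
    y = data.iloc[:, 1].tolist()
    selected = []
    # zip pairs each x value with its y value, so no index lookup is needed
    for xi, yi in zip(x, y):
        if minx < xi < maxx:
            selected.append(yi)
    # Return NaN instead of dividing by zero when no row matches
    return sum(selected) / len(selected) if selected else float("nan")


def ave_mask(data, minx, maxx):
    """Vectorized version: build a boolean mask, then average the matching y values."""
    mask = (data["x"] > minx) & (data["x"] < maxx)
    return data.loc[mask, "y"].mean()


print(ave_loop(df, 2, 3))  # x values 2.5 and 2.8 match, so this averages 3.0 and 5.0
print(ave_mask(df, 2, 3))
```

Both functions use strict inequalities, matching the `>` and `<` in the question, so rows where x equals exactly `minx` or `maxx` are excluded. Swap in `>=` and `<=` if the bounds should be included.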