Find average of a column in a dataframe given conditions on another column

Time:04-23

     x      y
    1.2    3.1
    1.4    3.5
    1.5    3.2
    2.2    3.6
    2.2    2.8
    2.3    3.3
    2.4    3.5
    2.5    3.8
    2.7    3.4
    2.8    3.3

Say I have the DataFrame above, and I wish to write a function

    def ave(pd,minx,maxx):

which calculates the average of the y values whose corresponding x values lie between minx and maxx, i.e. in the following example:

    ave(file, 2, 3) #where file is wherever I import these x and y values from

it would return 3.3857...

I have tried the following:

    def ave(pd,minx,maxx):
        x = list(data.iloc[:, 0].values)
        y = list(data.iloc[:, 1].values)
        lst=[]
        for i in x:
            if x[i]>xmin and x[i]<xmax:
                lst =y[i]
        return (sum(lst)/len(list))

but this gives the error: list indices must be integers or slices, not numpy.float64

CodePudding user response:

Why not just select rows where those conditions are true? You really should avoid looping as much as possible when working with dataframes.

    def y_average(df, min_x, max_x):
        # Keep rows where min_x < x < max_x (both bounds exclusive), then average y.
        return df[(df["x"] > min_x) & (df["x"] < max_x)]["y"].mean()

Usage:

In [3]: y_average(df, 2, 3)
Out[3]: 3.3857142857142857
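
As for the error in the question: `for i in x` iterates over the values in `x` (floats such as 1.2), not over positions, so `x[i]` tries to index a list with a float. The attempt also mixes names (`pd`/`data`, `minx`/`xmin`, `lst`/`list`) and reassigns `lst` instead of appending to it. If a loop is wanted anyway, a minimal pure-Python sketch could look like the following. The name `ave_loop` and its parameters are illustrative and not from the original post, and it assumes `xs` and `ys` are sequences of equal length:

```python
def ave_loop(xs, ys, min_x, max_x):
    # Pair each x with its y and keep the y values whose x lies
    # strictly between the bounds (same exclusive bounds as the answer).
    selected = [y for x, y in zip(xs, ys) if min_x < x < max_x]
    if not selected:
        # Nothing matched: return NaN instead of dividing by zero.
        return float("nan")
    return sum(selected) / len(selected)


# Sample data copied from the question.
xs = [1.2, 1.4, 1.5, 2.2, 2.2, 2.3, 2.4, 2.5, 2.7, 2.8]
ys = [3.1, 3.5, 3.2, 3.6, 2.8, 3.3, 3.5, 3.8, 3.4, 3.3]

print(ave_loop(xs, ys, 2, 3))
```

With a DataFrame, the same function could be called as `ave_loop(df["x"], df["y"], 2, 3)`, although the boolean-mask version above is the more idiomatic choice.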