Count Number of Times the Sale of a Toy is greater than Average for that Column-CodePudding

I have a dataset where in I have to identify if the Sale value of a Toy is greater than the average in the column and count in how many different sale areas, value is greater than the average.

For Ex: find the average of column "Sale B" - 2.5, check in how many rows the value is greater than 2.5 and then perform the same exercise for "SaleA" and "SaleC" and then add all of it

input_data = pd.DataFrame({'Toy': ['A','B','C','D'],
                      'Color': ['Red','Green','Orange','Blue'],
                      'SaleA': [1,2,0,1],
                      'SaleB': [1,3,4,2],
                      'SaleC': [5,2,3,5]})

New column "Count_Sale_Average" is created, For ex: for toy "A" only at one position the sale was greater than average.

output_data = pd.DataFrame({'Toy': ['A','B','C','D'],
                      'Color': ['Red','Green','Orange','Blue'],
                      'SaleA': [1,2,0,1],
                      'SaleB': [1,3,4,2],
                      'SaleC': [5,2,3,5],
                       'Count_Sale_Average':[1,2,1,1]})

My code is working and giving the desired output. Any suggestions on other ways of doing it, may be more efficient and in less number of lines.

list_var = ['SaleA','SaleB','SaleC']
df = input_data[list_var]

for i in range(0,len(list_var)):
    var = list_var[i]
    mean_var = df[var].mean()
    df[var] = df[var].apply(lambda x: 1 if x > mean_var else 0)
    
df['Count_Sale_Average'] = df[list_var].sum(axis=1)
output_data = pd.concat([input_data, df[['Count_Sale_Average']]], axis=1)
output_data

CodePudding user response：

You could filter, find mean and sum on axis:

filtered = input_data.filter(like='Sale')
input_data['Count_Sale_Average'] = filtered.gt(filtered.mean()).sum(axis=1)

Output:

  Toy   Color  SaleA  SaleB  SaleC  Count_Sale_Average
0   A     Red      1      1      5                   1
1   B   Green      2      3      2                   2
2   C  Orange      0      4      3                   1
3   D    Blue      1      2      5                   1