I am using a csv file with a table of all of the different pokemon types, with all different pokemon names in the rows and pieces of information such as their typing, and my goal is to find the average of the pokemon Total(sum of all of their stats) which are of a certain typing. I have attempted that with this coding
pokemons = pd.read_csv('pokemon.csv')
print('Bug pokemon average:', pokemons['Total'][(pokemons['Type 1'] == "Bug") | (pokemons['Type 2'] == "Bug")].mean())
And I obtain the following result
Bug pokemon average: 379.52777777777777
When I was expecting the average to be 365.3611111 which I had worked out in an excel document earlier.
I am wondering if I am doing something blatantly wrong? The csv file is this one https://gist.github.com/armgilles/194bcff35001e7eb53a2a8b441e8b2c6.
CodePudding user response:
The AVERAGEIF
function in Excel does not consider the range C2:D801
as an OR
function. That selection is not filtering the lines that you want. A functioning solution that gives the same results as you expect is =(SUMIF(C2:C801,"Bug",E1:E801) SUMIF(D2:D801,"Bug",E1:E801))/(COUNTIF(C2:C801,"Bug") COUNTIF(D2:D802,"Bug"))
.
Basically you have to compute the sums separately, then average. This gives the expected result of 379.5278
.
I don't think the AVERAGEIF
function works readily with OR
conditions. I hope it helps :)
I found this related answer: Excel averageifs with or function