Home > Software design >  How to get sum and average of values from a pandas dataframe using values in multiple lists?
How to get sum and average of values from a pandas dataframe using values in multiple lists?

Time:09-03

I've the following dataframe -

df1
Location Office ROP
Barcelona Head Office 4.3%
Bengaluru Corporate Office 9.6%
Chicago Head Office 12.5%
Luxembourg Corporate Office 14.1%
Paris Head Office 12.7%
Toronto Head Office 11.5%
Berlin Corporate Office 14.3%
Bengaluru Head Office 4.6%
Luxembourg Head Office 7.1%
Berlin Head Office 5.3%
Luxembourg Virtual Center 10.1%
Berlin Virtual Center 12.3%
Paris Virtual Center 9.7%
: : :
: : :
: : :
a = ['Berlin','Paris','Luxembourg',...]
b = ['Head Office','Corporate Office',..]

Say there are multiple values in lists a and b, how do I find the sum and average of ROP based on the values in the lists and the given dataframe?

Example:

Say we have data from above mentioned dataframe in 'df2'.

df2 has just the visible 13 rows from dataframe 'df1'.

a = ['Berlin','Paris','Luxembourg']

b = ['Head Office','Corporate Office']

Expected output:

Sum: 14.3% 5.3% 12.7% 7.1% 14.1% = 53.5%

Average: (14.3% 5.3% 12.7% 7.1% 14.1%)/5 = 10.7%

CodePudding user response:

Try:

# convert ROP column to float:
df["ROP_int"] = df["ROP"].str.strip("%").astype(float)

a = ["Berlin", "Paris", "Luxembourg"]
b = ["Head Office", "Corporate Office"]

# create a mask
m = df["Location"].isin(a) & df["Office"].isin(b)

# compute sum and average from the mask and ROP_int column:
s = df.loc[m, "ROP_int"].sum()
avg = df.loc[m, "ROP_int"].mean()

print(s)
print(avg)

Prints:

53.49999999999999
10.7
  • Related