How to get sum and average of values from a pandas dataframe using values in multiple lists?-CodePudding

I've the following dataframe -

df1

Location	Office	ROP
Barcelona	Head Office	4.3%
Bengaluru	Corporate Office	9.6%
Chicago	Head Office	12.5%
Luxembourg	Corporate Office	14.1%
Paris	Head Office	12.7%
Toronto	Head Office	11.5%
Berlin	Corporate Office	14.3%
Bengaluru	Head Office	4.6%
Luxembourg	Head Office	7.1%
Berlin	Head Office	5.3%
Luxembourg	Virtual Center	10.1%
Berlin	Virtual Center	12.3%
Paris	Virtual Center	9.7%
:	:	:
:	:	:
:	:	:

a = ['Berlin','Paris','Luxembourg',...]
b = ['Head Office','Corporate Office',..]

Say there are multiple values in lists a and b, how do I find the sum and average of ROP based on the values in the lists and the given dataframe?

Example:

Say we have data from above mentioned dataframe in 'df2'.

df2 has just the visible 13 rows from dataframe 'df1'.

a = ['Berlin','Paris','Luxembourg']

b = ['Head Office','Corporate Office']

Expected output:

Sum: 14.3% 5.3% 12.7% 7.1% 14.1% = 53.5%

Average: (14.3% 5.3% 12.7% 7.1% 14.1%)/5 = 10.7%

CodePudding user response：

Try:

# convert ROP column to float:
df["ROP_int"] = df["ROP"].str.strip("%").astype(float)

a = ["Berlin", "Paris", "Luxembourg"]
b = ["Head Office", "Corporate Office"]

# create a mask
m = df["Location"].isin(a) & df["Office"].isin(b)

# compute sum and average from the mask and ROP_int column:
s = df.loc[m, "ROP_int"].sum()
avg = df.loc[m, "ROP_int"].mean()

print(s)
print(avg)

Prints:

53.49999999999999
10.7