Multiplying a pandas data frame by a dictionary

import numpy as np
import pandas as pd

# TV viewing figures sampled from each stratum
TownA = [35, 43, 36, 39, 28, 28, 29, 25, 38, 27, 26, 32, 29, 40, 35, 41, 37, 31, 45, 34]
TownB = [27, 15, 4, 41, 49, 25, 10, 30]
Rural = [8, 14, 12, 15, 30, 32, 21, 20, 34, 7, 11, 24]
tv_viewing = TownA + TownB + Rural

# Label each observation with the stratum it came from
labels = np.array(["TownA", "TownB", "Rural"])
stratum = np.repeat(labels, [len(TownA), len(TownB), len(Rural)], axis=0)
sample = pd.DataFrame()
sample["viewership"] = tv_viewing
sample["stratum"] = stratum
sample.groupby("stratum").mean()

The above code generates a data frame that looks like this:

stratum	viewership
Rural	19.000
TownA	33.900
TownB	25.125

And my goal is to multiply this data frame with the dictionary below:

Population = {"TownA": 155, "TownB": 62, "Rural": 93}

To get the desired result of:

stratum	viewership
Rural	1767
TownA	5254.5
TownB	1557.75

I am not too picky about the final result; the product can be a new column.

I was able to get a solution with the code below:

a = sample.groupby("stratum").mean().reset_index()
b = pd.DataFrame.from_dict(Population, orient='index').reset_index()
ab = pd.merge(a, b, left_on='stratum', right_on='index')
ab["product"] = ab["viewership"]*ab[0]

The code produces this table:

	stratum	viewership	index	0	product
0	Rural	19.000	Rural	93	1767.00
1	TownA	33.900	TownA	155	5254.50
2	TownB	25.125	TownB	62	1557.75

I am wondering if there is a more elegant way to solve this without resetting the indexes, perhaps using something like apply.

I have tried this code:

a = sample.groupby("stratum").mean().apply(lambda x: x.viewership * Population[x.stratum])

Only to get this error: 'Series' object has no attribute 'viewership'

CodePudding user response：

Use the mul method with axis=0:

out = sample.groupby("stratum").mean()
out.mul(Population, axis=0)

         viewership
stratum
Rural       1767.00
TownA       5254.50
TownB       1557.75
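
For reference, here is a self-contained sketch of the same idea. It rebuilds the stratum means from the question by hand instead of regrouping the raw sample, and it converts the dictionary to a Series first. That conversion is my own assumption, not part of the answer above; it is there only to make the label-based alignment explicit. The names `means`, `weights`, and `totals` are illustrative.

```python
import pandas as pd

# Stratum means as computed in the question (stratum is the index).
means = pd.DataFrame(
    {"viewership": [19.0, 33.9, 25.125]},
    index=pd.Index(["Rural", "TownA", "TownB"], name="stratum"),
)
Population = {"TownA": 155, "TownB": 62, "Rural": 93}

# A Series built from the dict is indexed by its keys, so the
# multiplications below match rows by stratum label, not by position.
weights = pd.Series(Population)

# Option 1: scale every column of the frame, row by row.
totals = means.mul(weights, axis=0)

# Option 2: keep the means and store the product in a new column.
means["product"] = means["viewership"] * weights
print(means)
```

Option 2 may be the better fit if you want the means and the weighted totals side by side, as in the merged table from the question.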

CodePudding user response：

You can convert the index to a Series and then use .map() on it. This assumes the frame holds the grouped means, with stratum as the index:

means = sample.groupby("stratum").mean()
means["result"] = means["viewership"] * means.index.to_series().map(Population)
print(means)

This outputs:

         viewership   result
stratum
Rural        19.000  1767.00
TownA        33.900  5254.50
TownB        25.125  1557.75
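
As for the apply attempt in the question: DataFrame.apply defaults to axis=0, so the lambda receives each column as a Series rather than each row, which is why the viewership attribute is missing. After the groupby, stratum is also the index, not a column. A minimal sketch of a row-wise version, assuming the grouped means are rebuilt by hand as below (the names `means` and `product` are illustrative):

```python
import pandas as pd

# Stratum means as computed in the question (stratum is the index).
means = pd.DataFrame(
    {"viewership": [19.0, 33.9, 25.125]},
    index=pd.Index(["Rural", "TownA", "TownB"], name="stratum"),
)
Population = {"TownA": 155, "TownB": 62, "Rural": 93}

# axis=1 hands each row to the lambda as a Series; the row's index
# label (the stratum) is available as row.name.
product = means.apply(
    lambda row: row["viewership"] * Population[row.name], axis=1
)
print(product)
```

This works, but it calls a Python function once per row, so the vectorized mul or map approaches are likely the more idiomatic choice.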