Multiplying a pandas data frame by a dictionary

Time:06-24

import numpy as np
import pandas as pd

TownA = [35, 43, 36, 39, 28, 28, 29, 25, 38, 27, 26, 32, 29, 40, 35, 41, 37, 31, 45, 34]
TownB = [27, 15, 4, 41, 49, 25, 10, 30]
Rural = [8, 14, 12, 15, 30, 32, 21, 20, 34, 7, 11, 24]
tv_viewing = TownA + TownB + Rural

labels = np.array(["TownA", "TownB", "Rural"])
stratum = np.repeat(labels, [len(TownA), len(TownB), len(Rural)], axis=0)
sample = pd.DataFrame()
sample["viewership"] = tv_viewing
sample["stratum"] = stratum
sample.groupby("stratum").mean()

The above code generates a data frame that looks like this:

         viewership
stratum
Rural        19.000
TownA        33.900
TownB        25.125

And my goal is to multiply this data frame with the dictionary below:

Population = {"TownA": 155, "TownB": 62, "Rural": 93}

To get the desired result of:

         viewership
stratum
Rural       1767.00
TownA       5254.50
TownB       1557.75

I am not too picky about the final result; the product can be a new column.

I was able to get a solution with the code below:

a = sample.groupby("stratum").mean().reset_index()
b = pd.DataFrame.from_dict(Population, orient='index').reset_index()
ab = pd.merge(a, b, left_on='stratum', right_on='index')
ab["product"] = ab["viewership"]*ab[0]

The code produces this table:

  stratum  viewership  index    0  product
0   Rural      19.000  Rural   93  1767.00
1   TownA      33.900  TownA  155  5254.50
2   TownB      25.125  TownB   62  1557.75

I am wondering if there is a more elegant way to solve this without resetting the indexes, perhaps using something like apply.

I have tried this code:

a = sample.groupby("stratum").mean().apply(lambda x: x.viewership * Population[x.stratum])

Only to get this error: 'Series' object has no attribute 'viewership'

CodePudding user response:

Use the mul method with axis=0:

out = sample.groupby("stratum").mean()
out.mul(Population, axis=0)

         viewership
stratum
Rural       1767.00
TownA       5254.50
TownB       1557.75
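
For reference, here is a self-contained sketch of the same idea. It wraps the dictionary in a pd.Series before multiplying, an extra step not in the answer above, so that the alignment on the stratum labels is explicit. The names means and totals are illustrative only.

```python
import numpy as np
import pandas as pd

TownA = [35, 43, 36, 39, 28, 28, 29, 25, 38, 27, 26, 32, 29, 40, 35, 41, 37, 31, 45, 34]
TownB = [27, 15, 4, 41, 49, 25, 10, 30]
Rural = [8, 14, 12, 15, 30, 32, 21, 20, 34, 7, 11, 24]
Population = {"TownA": 155, "TownB": 62, "Rural": 93}

# One row per observation, labelled with the stratum it came from
sample = pd.DataFrame({
    "viewership": TownA + TownB + Rural,
    "stratum": np.repeat(["TownA", "TownB", "Rural"],
                         [len(TownA), len(TownB), len(Rural)]),
})

# Per-stratum means, indexed by stratum
means = sample.groupby("stratum").mean()

# axis=0 matches the Series labels against the row index (not the columns),
# so each stratum mean is scaled by its own population
totals = means.mul(pd.Series(Population), axis=0)
print(totals)
```

Because the multiplication aligns on labels, the order of the keys in Population does not need to match the order of the grouped index.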

CodePudding user response:

You can transform the index to a Series, and then use .map() on it (here sample holds the grouped means, indexed by stratum):

sample = sample.groupby("stratum").mean()
sample["result"] = sample["viewership"] * sample.index.to_series().map(Population)
print(sample)

This outputs:

         viewership   result
stratum
Rural        19.000  1767.00
TownA        33.900  5254.50
TownB        25.125  1557.75
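
As for the apply attempt in the question: DataFrame.apply works column by column by default (axis=0), so the lambda receives the whole viewership column as a Series, which is why x.viewership fails. Below is a minimal sketch of a row-wise version. It assumes the grouped means live in a frame called means (a name chosen here for illustration, rebuilt by hand so the snippet runs on its own). Since stratum is the index after groupby, the label is read from row.name rather than row.stratum.

```python
import pandas as pd

Population = {"TownA": 155, "TownB": 62, "Rural": 93}

# Grouped means from the question, rebuilt by hand for illustration
means = pd.DataFrame(
    {"viewership": [19.0, 33.9, 25.125]},
    index=pd.Index(["Rural", "TownA", "TownB"], name="stratum"),
)

# axis=1 passes one row at a time; row.name is that row's index label
product = means.apply(
    lambda row: row["viewership"] * Population[row.name], axis=1
)
print(product)
```

This runs a Python function once per row, so on large frames the vectorised mul or map approaches are likely to be faster.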