Home > Software design >  column multiplication based on a mapping
column multiplication based on a mapping

Time:12-10

I have the following two dataframes. The first one, maps some nodes to area number and the maximum electric load of that node.

bus = pd.DataFrame(data={'Node':[101, 102, 103, 104, 105], 'Area':[1, 1, 2, 2, 3], 'Load':[10, 15, 12, 20, 25]})

which gives us:

   Bus  Area  Load
0  101     1    10
1  102     1    15
2  103     2    12
3  104     2    20
4  105     3    25

The second dataframe, shows the total electric load of each area over a time period (from hour 0 to 5). The column names are the areas (matching the column Area in dataframe bus.

load = pd.DataFrame(data={1:[20, 18, 17, 19, 22, 25], 2:[23, 25,24, 27, 30, 32], 3:[10, 14, 19, 25, 22, 20]})

which gives us:

    1   2   3
0  20  23  10
1  18  25  14
2  17  24  19
3  19  27  25
4  22  30  22
5  25  32  20

I would like to have a dataframe that shows the electric load of each bus over the 6 hours. Assumption: The percentage of the load over time is the same as the percentage of the maximum load shown in bus; e.g., bus 101 has 10/(10 15)=0.4 percent of the electric load of area 1, therefore, to calculate its hourly load, 10/(10 15) should be multiplied by the column corresponding to area 1 in load.

The desired output should be of the following format:

    101     102     103     104     105
0   8       12      8.625   14.375  10
1   7.2     10.8    9.375   15.625  14
2   6.8     10.2    9       15      19
3   7.6     11.4    10.125  16.875  25
4   8.8     13.2    11.25   18.75   22
5   10      15      12      20      20

For column 101, we have 0.4 multiplied by column 1 of load.

Any help is greatly appreaciated.

CodePudding user response:

You can calculate the ratio in bus, transpose load, merge the two and multiply the ratio by the load, here goes:

bus['area_sum'] = bus.groupby('Area')['Load'].transform('sum')
bus['node_ratio'] = bus['Load'] / bus['area_sum']

full_data = bus.merge(load.T.reset_index(), left_on='Area', right_on='index')

result = pd.DataFrame([full_data['node_ratio'] * full_data[x] for x in range(6)])
result.columns = full_data['Node'].values

result:

101 102 103 104 105
0 8 12 8.625 14.375 10
1 7.2 10.8 9.375 15.625 14
2 6.8 10.2 9 15 19
3 7.6 11.4 10.125 16.875 25
4 8.8 13.2 11.25 18.75 22
5 10 15 12 20 20

CodePudding user response:

One option is to get the Load divided by the sum, then pivot, get the index matching for both load and bus, before multiplying on the matching levels:

(bus.assign(Load = bus.Load.div(bus.groupby('Area').Load.transform('sum')))
    .pivot(None, ['Area', 'Node'], 'Load')
    .reindex(load.index)
    .ffill() # get the data spread into all rows
    .bfill()
    .mul(load, level=0)
    .droplevel(0,1)
    .rename_axis(columns=None)
) 
    101   102     103     104   105
0   8.0  12.0   8.625  14.375  10.0
1   7.2  10.8   9.375  15.625  14.0
2   6.8  10.2   9.000  15.000  19.0
3   7.6  11.4  10.125  16.875  25.0
4   8.8  13.2  11.250  18.750  22.0
5  10.0  15.0  12.000  20.000  20.0
  • Related