I have the following two dataframes. The first one, maps some nodes to area number and the maximum electric load of that node.
bus = pd.DataFrame(data={'Node':[101, 102, 103, 104, 105], 'Area':[1, 1, 2, 2, 3], 'Load':[10, 15, 12, 20, 25]})
which gives us:
Bus Area Load
0 101 1 10
1 102 1 15
2 103 2 12
3 104 2 20
4 105 3 25
The second dataframe, shows the total electric load of each area over a time period (from hour 0 to 5). The column names are the areas (matching the column Area
in dataframe bus
.
load = pd.DataFrame(data={1:[20, 18, 17, 19, 22, 25], 2:[23, 25,24, 27, 30, 32], 3:[10, 14, 19, 25, 22, 20]})
which gives us:
1 2 3
0 20 23 10
1 18 25 14
2 17 24 19
3 19 27 25
4 22 30 22
5 25 32 20
I would like to have a dataframe that shows the electric load of each bus over the 6 hours.
Assumption: The percentage of the load over time is the same as the percentage of the maximum load shown in bus
; e.g., bus 101 has 10/(10 15)=0.4 percent of the electric load of area 1, therefore, to calculate its hourly load, 10/(10 15) should be multiplied by the column corresponding to area 1 in load
.
The desired output should be of the following format:
101 102 103 104 105
0 8 12 8.625 14.375 10
1 7.2 10.8 9.375 15.625 14
2 6.8 10.2 9 15 19
3 7.6 11.4 10.125 16.875 25
4 8.8 13.2 11.25 18.75 22
5 10 15 12 20 20
For column 101, we have 0.4 multiplied by column 1 of load
.
Any help is greatly appreaciated.
CodePudding user response:
You can calculate the ratio in bus, transpose load, merge the two and multiply the ratio by the load, here goes:
bus['area_sum'] = bus.groupby('Area')['Load'].transform('sum')
bus['node_ratio'] = bus['Load'] / bus['area_sum']
full_data = bus.merge(load.T.reset_index(), left_on='Area', right_on='index')
result = pd.DataFrame([full_data['node_ratio'] * full_data[x] for x in range(6)])
result.columns = full_data['Node'].values
result:
101 | 102 | 103 | 104 | 105 | |
---|---|---|---|---|---|
0 | 8 | 12 | 8.625 | 14.375 | 10 |
1 | 7.2 | 10.8 | 9.375 | 15.625 | 14 |
2 | 6.8 | 10.2 | 9 | 15 | 19 |
3 | 7.6 | 11.4 | 10.125 | 16.875 | 25 |
4 | 8.8 | 13.2 | 11.25 | 18.75 | 22 |
5 | 10 | 15 | 12 | 20 | 20 |
CodePudding user response:
One option is to get the Load
divided by the sum
, then pivot, get the index matching for both load
and bus
, before multiplying on the matching levels:
(bus.assign(Load = bus.Load.div(bus.groupby('Area').Load.transform('sum')))
.pivot(None, ['Area', 'Node'], 'Load')
.reindex(load.index)
.ffill() # get the data spread into all rows
.bfill()
.mul(load, level=0)
.droplevel(0,1)
.rename_axis(columns=None)
)
101 102 103 104 105
0 8.0 12.0 8.625 14.375 10.0
1 7.2 10.8 9.375 15.625 14.0
2 6.8 10.2 9.000 15.000 19.0
3 7.6 11.4 10.125 16.875 25.0
4 8.8 13.2 11.250 18.750 22.0
5 10.0 15.0 12.000 20.000 20.0