Column route represents the point for each location
So I have two tables like this, my question is how to calculate the sum of the load difference between the two tables if the route has to match with every location [1,2,3,4,5]
for example:
df1['Route'] = [1,5] and df2['Route'] = [2,3,4] then the result is [[1,5],[2,3,4]] with sum of load difference = 368.93 (-10325.19) = -9956,26
df1
index | Route | Total Distance | Total Demand | Total Time | Total Volume | Load Difference |
---|---|---|---|---|---|---|
8 | 1,5 | 2257 | 642.46 | 173 | 4631.07 | 368.93 |
18 | 1,3,4 | 2179 | 633.65 | 167 | 4409.15 | 590.85 |
7 | 1,4 | 2152 | 538.2 | 165 | 3700.13 | 1299.87 |
24 | 3,4,5 | 2086 | 508.09 | 160 | 3387.27 | 1612.73 |
6 | 1,3 | 2146 | 479.46 | 165 | 3296.29 | 1703.71 |
0 | 1 | 2104 | 384.01 | 161 | 2576.07 | 2423.93 |
14 | 4,5 | 1775 | 412.64 | 136 | 2338.29 | 2661.71 |
13 | 3,5 | 2029 | 353.9 | 156 | 2300.35 | 2699.65 |
12 | 3,4 | 1881 | 249.64 | 144 | 1497.84 | 3502.16 |
4 | 5 | 1570 | 258.45 | 120 | 1292.25 | 3707.75 |
3 | 4 | 1570 | 154.19 | 120 | 770.95 | 4229.05 |
2 | 3 | 1848 | 95.45 | 142 | 564.75 | 4435.25 |
19 | 1,3,5 | 2327 | 737.91 | 179 | 5503.58 | -503.58 |
20 | 1,4,5 | 2357 | 796.65 | 181 | 6008.07 | -1008.07 |
28 | 1,3,4,5 | 2384 | 892.1 | 183 | 6802.26 | -1802.26 |
1 | 2 | 1960 | 2510.39 | 150 | 15689.94 | -10689.94 |
9 | 2,3 | 1986 | 2605.84 | 152 | 16503.65 | -11503.65 |
10 | 2,4 | 1994 | 2664.58 | 153 | 16986.7 | -11986.7 |
21 | 2,3,4 | 2019 | 2760.03 | 155 | 17825.19 | -12825.19 |
11 | 2,5 | 2097 | 2768.84 | 161 | 18574.3 | -13574.3 |
5 | 1,2 | 2140 | 2894.4 | 164 | 19778.4 | -14778.4 |
22 | 2,3,5 | 2167 | 2864.29 | 166 | 19811.34 | -14811.34 |
23 | 2,4,5 | 2199 | 2923.03 | 169 | 20583.0 | -15583.0 |
15 | 1,2,3 | 2166 | 2989.85 | 166 | 20679.8 | -15679.8 |
16 | 1,2,4 | 2174 | 3048.59 | 167 | 21213.11 | -16213.11 |
29 | 2,3,4,5 | 2224 | 3018.48 | 171 | 21506.67 | -16506.67 |
25 | 1,2,3,4 | 2199 | 3144.04 | 169 | 22139.28 | -17139.28 |
17 | 1,2,5 | 2277 | 3152.85 | 175 | 22989.53 | -17989.53 |
26 | 1,2,3,5 | 2347 | 3248.3 | 180 | 24362.25 | -19362.25 |
27 | 1,2,4,5 | 2379 | 3307.04 | 183 | 25216.18 | -20216.18 |
30 | 1,2,3,4,5 | 2404 | 3402.49 | 184 | 26085.76 | -21085.76 |
df2
index | Route | Total Distance | Total Demand | Total Time | Total Volume | Load Difference |
---|---|---|---|---|---|---|
28 | 1,3,4,5 | 2384 | 892.1 | 183 | 6802.26 | 697.74 |
20 | 1,4,5 | 2357 | 796.65 | 181 | 6008.07 | 1491.93 |
19 | 1,3,5 | 2327 | 737.91 | 179 | 5503.58 | 1996.42 |
8 | 1,5 | 2257 | 642.46 | 173 | 4631.07 | 2868.93 |
18 | 1,3,4 | 2179 | 633.65 | 167 | 4409.15 | 3090.85 |
7 | 1,4 | 2152 | 538.2 | 165 | 3700.13 | 3799.87 |
24 | 3,4,5 | 2086 | 508.09 | 160 | 3387.27 | 4112.73 |
6 | 1,3 | 2146 | 479.46 | 165 | 3296.29 | 4203.71 |
0 | 1 | 2104 | 384.01 | 161 | 2576.07 | 4923.93 |
14 | 4,5 | 1775 | 412.64 | 136 | 2338.29 | 5161.71 |
13 | 3,5 | 2029 | 353.9 | 156 | 2300.35 | 5199.65 |
12 | 3,4 | 1881 | 249.64 | 144 | 1497.84 | 6002.16 |
4 | 5 | 1570 | 258.45 | 120 | 1292.25 | 6207.75 |
3 | 4 | 1570 | 154.19 | 120 | 770.95 | 6729.05 |
2 | 3 | 1848 | 95.45 | 142 | 564.75 | 6935.25 |
1 | 2 | 1960 | 2510.39 | 150 | 15689.94 | -8189.94 |
9 | 2,3 | 1986 | 2605.84 | 152 | 16503.65 | -9003.65 |
10 | 2,4 | 1994 | 2664.58 | 153 | 16986.7 | -9486.7 |
21 | 2,3,4 | 2019 | 2760.03 | 155 | 17825.19 | -10325.19 |
11 | 2,5 | 2097 | 2768.84 | 161 | 18574.3 | -11074.3 |
5 | 1,2 | 2140 | 2894.4 | 164 | 19778.4 | -12278.4 |
22 | 2,3,5 | 2167 | 2864.29 | 166 | 19811.34 | -12311.34 |
23 | 2,4,5 | 2199 | 2923.03 | 169 | 20583.0 | -13083.0 |
15 | 1,2,3 | 2166 | 2989.85 | 166 | 20679.8 | -13179.8 |
16 | 1,2,4 | 2174 | 3048.59 | 167 | 21213.11 | -13713.11 |
29 | 2,3,4,5 | 2224 | 3018.48 | 171 | 21506.67 | -14006.67 |
25 | 1,2,3,4 | 2199 | 3144.04 | 169 | 22139.28 | -14639.28 |
17 | 1,2,5 | 2277 | 3152.85 | 175 | 22989.53 | -15489.53 |
26 | 1,2,3,5 | 2347 | 3248.3 | 180 | 24362.25 | -16862.25 |
27 | 1,2,4,5 | 2379 | 3307.04 | 183 | 25216.18 | -17716.18 |
30 | 1,2,3,4,5 | 2404 | 3402.49 | 184 | 26085.76 | -18585.76 |
CodePudding user response:
Answer
import numpy as np
prime_map = {k:v for k,v in zip("12345", [2,3,5,7,11])}
def road_key_to_number(x):
return np.prod(list(prime_map[k] for k in x.split(",")))
df1["key"] = df1["Route"].apply(road_key_to_number)
df2["key"] = df2["Route"].apply(road_key_to_number)
df2["pair_key"] = np.prod(list(prime_map.values())) / df2["key"]
data = df1.merge(df2, left_on = "key", right_on="pair_key")
data["diff"] = (data["Load Difference_x"] data["Load Difference_y"])
Output
Route_x Route_y diff
0 1,5 2,3,4 -9956.26
1 1,3,4 2,5 -10483.45
2 1,4 2,3,5 -11011.47
3 3,4,5 1,2 -10665.67
4 1,3 2,4,5 -11379.29
5 1 2,3,4,5 -11582.74
6 4,5 1,2,3 -10518.09
7 3,5 1,2,4 -11013.46
8 3,4 1,2,5 -11987.37
9 5 1,2,3,4 -10931.53
10 4 1,2,3,5 -12633.20
11 3 1,2,4,5 -13280.93
12 1,3,5 2,4 -9990.28
13 1,4,5 2,3 -10011.72
14 1,3,4,5 2 -9992.20
15 2 1,3,4,5 -9992.20
16 2,3 1,4,5 -10011.72
17 2,4 1,3,5 -9990.28
18 2,3,4 1,5 -9956.26
19 2,5 1,3,4 -10483.45
20 1,2 3,4,5 -10665.67
21 2,3,5 1,4 -11011.47
22 2,4,5 1,3 -11379.29
23 1,2,3 4,5 -10518.09
24 1,2,4 3,5 -11013.46
25 2,3,4,5 1 -11582.74
26 1,2,3,4 5 -10931.53
27 1,2,5 3,4 -11987.37
28 1,2,3,5 4 -12633.20
29 1,2,4,5 3 -13280.93
Explanation
- The key idea is to match two
df.Route
to calculate the diff between them. - Here
prime_map
is using a method calledArithmeticization
, it has been used inMetamathematics
to proveGödel's incompleteness theorems
- Each node in the route has a unique prime number, so, we can use a unique product from them to represent a road, and it is order-independent, which means
1,2,4
equal4,2,1
- You also could use another method like
intersection
orunion
to represent it.
- Each node in the route has a unique prime number, so, we can use a unique product from them to represent a road, and it is order-independent, which means