I have a pandas dataframe df
as shown (Note: df.to_dict()
and df2.to_dict()
is given at the end of the question):
Characters Sum
L1 b3 b1
L2 b5 b1 b2 b6 b9
L3 b1 b2 b3 b6 b7 b9
L4 b1 b5 b12
The characters b1, b2, b3 and so on have numeric values embedded with them from another dataframe df2
.
Buses Values
0 bus_1 2
1 bus_2 3
2 bus_3 3
3 bus_4 1
4 bus_5 6
5 bus_6 4
6 bus_7 3
7 bus_8 7
8 bus_9 2
9 bus_10 4
10 bus_11 6
11 bus_12 7
For example,
b1 = df2["Values"][0]
b2 = df2["Values"][1]
b3 = df2["Values"][2]
b4 = df2["Values"][3]
b5 = df2["Values"][4]
b6 = df2["Values"][5]
b7 = df2["Values"][6]
b8 = df2["Values"][7]
b9 = df2["Values"][8]
b10 = df2["Values"][9]
b11 = df2["Values"][10]
b12 = df2["Values"][11]
In dataframe df
, in column Sum
, I'd like to have the real sum of the values of the characters which are given in Characters
column. For example, for index L1, I'd like to have value 5 which is corresponding to b3 b1 which is specified in the Characters
column.
I can split the characters from the string using split()
method. For example,
df["Characters"][0].split(" ")
gives me ['b3', 'b1']
But how can I get the numeric values of b3 and b1 and get their sum in another column in df
? Is it possible to get the sum using any other approaches like map()
?
df.to_dict()
is as shown:
{
'Characters': {'L1': 'b3 b1',
'L2': 'b5 b1 b2 b6 b9',
'L3': 'b1 b2 b3 b6 b7 b9',
'L4': 'b1 b5 b12'},
'Sum': {'L1': '', 'L2': '', 'L3': '', 'L4': ''}}
df2.to_dict()
is as shown:
{'Buses': {0: 'bus_1',
1: 'bus_2',
2: 'bus_3',
3: 'bus_4',
4: 'bus_5',
5: 'bus_6',
6: 'bus_7',
7: 'bus_8',
8: 'bus_9',
9: 'bus_10',
10: 'bus_11',
11: 'bus_12'},
'Demand (MW)': {0: 2,
1: 3,
2: 3,
3: 1,
4: 6,
5: 4,
6: 3,
7: 7,
8: 2,
9: 4,
10: 6,
11: 7}}
CodePudding user response:
One questionable solution would be to just
eval('b1 b2 b3 b6 b7 b9'.replace("b",""))
I would probably use regex here, but that might be overkill
import re
sum(int(i) for i in re.findall('\d ', 'b1 b2 b3 b6 b7 b9'))
CodePudding user response:
IIUC, you want to map the matching Values of Buses of df2 in the sums of df.
An approach would be to split
the string on
, explode
the column, then map
each bx
to the corresponding values of df2 (for this we need to rework a bit df2 to be a Series with the bx
names as indexes. Finally, groupby
"Character" and sum
:
(df.assign(Sum=df['Sum'].str.split(' '))
.explode('Sum')
.assign(Sum=lambda d: d['Sum'].map(df2.assign(Buses=df2['Buses'].str.replace('us_', ''))
.set_index('Buses')['Values']))
.groupby('Characters', as_index=False).sum()
)
output:
Characters Sum
0 L1 5
1 L2 17
2 L3 17
3 L4 15