Say I have a dataframe like this:
data = {'state':['FL', 'FL', 'FL'],
'territory':[1, 2, 3],
'proportion':['A-0.7/B-0.3', 'A-0.7/B-0.3', 'A-0.7/B-0.3'],
'value':[10, 10, 10]}
df = pd.DataFrame(data)
state territory proportion value
0 FL 1 A-0.7/B-0.3 10
1 FL 2 A-0.7/B-0.3 10
2 FL 3 A-0.7/B-0.3 10
For each row, I want to split the 'value' into two rows based on the 'proportion'
data = {'state':['FL', 'FL', 'FL', 'FL', 'FL', 'FL'],
'territory':[1, 1, 2, 2, 3, 3],
'proportion':['A', 'B', 'A', 'B', 'A', 'B'],
'value':[7, 3, 7, 3, 7, 3,]}
pd.DataFrame(data)
state territory proportion value
0 FL 1 A 7
1 FL 1 B 3
2 FL 2 A 7
3 FL 2 B 3
4 FL 3 A 7
5 FL 3 B 3
How do I do this?
CodePudding user response:
You could split
proportion on /
, then explode
to create the new rows.
Then split
again to get the letter and number, assign the letter, convert the number to float and multiply to the original value:
df2 = (
df.assign(proportion=df['proportion'].str.split('/'))
.explode('proportion')
.reset_index(drop=True)
)
d = df2['proportion'].str.split('-', expand=True)
df2['proportion'] = d[0]
df2['value'] *= d[1].astype(float)
output:
state territory proportion value
0 FL 1 A 7.0
1 FL 1 B 3.0
2 FL 2 A 7.0
3 FL 2 B 3.0
4 FL 3 A 7.0
5 FL 3 B 3.0
CodePudding user response:
We do split
2 times , one follow with explode
to add additional rows another need to adding the additional columns
out = df.explode('proportion').reset_index(drop=True)
out = out.join(out.pop('proportion').str.split('-',expand=True).rename(columns = {0:'proportion',1:'values'}))
out
Out[217]:
state territory value proportion values
0 FL 1 10 A 0.7
1 FL 1 10 B 0.3
2 FL 2 10 A 0.7
3 FL 2 10 B 0.3
4 FL 3 10 A 0.7
5 FL 3 10 B 0.3