I want to split the record if it contains /
below into 2 rows with the same volume and premium values but with group values of 4942 and 5350. The second transformation
Current dataframe
Group Volume Premium
4941.0 5721.0 5057.76
4942/5350 6154.0 5462.46
transformed
Group Volume Premium
4941.0 5721.0 5057.76
4942 6154.0 5462.46
5350 6154.0 5462.46
I tried the code below and the first record group value should be 4941 but it returns nan instead
dfRosters=(dfRosters.set_index(['Volume', 'Premium'])
.apply(lambda x: x.str.split('/').explode())
.reset_index())
Volume Premium Group
0 5721.0 5057.76 NaN
1 6154.0 5462.46 4942
2 6154.0 5462.46 5350
CodePudding user response:
This should work:
df = pd.DataFrame({"group": ["4941.0", "4942/5350"], "volume": ["5721.0", "6154.0"], "Premium": ["5057.76", "5462.46"]})
df['group'] = df['group'].apply(lambda x: x.split("/"))
df = df.explode("group")
The explode method converts a list into a separate rows and duplicates the other rows for each item in the list.
CodePudding user response:
Use str.split
for vectorial and error-proof splitting, then explode
:
out = (df.assign(**{'Group': df['Group'].str.split('/')})
.explode('Group', ignore_index=True)
)
output:
Group Volume Premium
0 4941.0 5721.0 5057.76
1 4942 6154.0 5462.46
2 5350 6154.0 5462.46
To also convert the types:
out = (df
.assign(**{'Group': df['Group'].str.split('/')})
.explode('Group', ignore_index=True)
.astype({'Group': float})
)
output:
Group Volume Premium
0 4941.0 5721.0 5057.76
1 4942.0 6154.0 5462.46
2 5350.0 6154.0 5462.46
CodePudding user response:
dfRosters['Group'] = dfRosters['Group'].astype(str)
dfRosters=(dfRosters.set_index(['Volume', 'Premium'])
.apply(lambda x: x.str.split('/').explode())
.reset_index())
Had to format group as string to resolve the issue