I want to drop the relevant rows for the first 3 columns where the "Match1" column contains value "No" and insert them into new rows of the same data frame.
df2 = pd.DataFrame({ 'Name':['John', 'Tom', 'Tom' ,'Ole','Ole','Tom'],
'SomeQty':[100, 200, 300, 500,600, 400],
'Match':['Yes', 'No', 'Yes','No','No','No'],
'SomeValue':[100, 200, 200, 500, 600, 200],
'Match1':['Yes', 'Yes','Yes', 'No','No', 'Yes'],
})
The way I followed to do this is;
# Define a intermediary dataframe
df4 = pd.DataFrame(columns=['Name','SomeQty','Match','Match1','SomeValue'])
# Copy the relevant data in order to drop and assign
df4 = df4.append(df2.loc[df2['Name']== 'Ole',['Name','SomeQty','Match','Match1']].copy())
# Drop the data from main table
df2.iloc[:, 0:3] = df2.iloc[:, 0:3].drop(df2[df2['Name']== 'Ole'].index)
# Append the relevant data from intermediary table
df2 = df2.append([df4], ignore_index=True, sort=False)
del df4
I like to know a better way to achieve this. TIA
CodePudding user response:
A simpler version using a boolean mask would be:
cols = ['Name','SomeQty','Match']
mask = df2['Match1'].eq('No')
out = pd.concat(
[df2.mask(mask, df2.drop(cols, axis=1)),
df2.loc[mask, cols]
], ignore_index=True)
Output:
Name SomeQty Match SomeValue Match1
0 John 100.0 Yes 100.0 Yes
1 Tom 200.0 No 200.0 Yes
2 Tom 300.0 Yes 200.0 Yes
3 NaN NaN NaN 500.0 No
4 NaN NaN NaN 600.0 No
5 Tom 400.0 No 200.0 Yes
6 Ole 500.0 No NaN NaN
7 Ole 600.0 No NaN NaN