Home > Software engineering >  Drop a part of rows based on condition & insert them into a new rows of pandas data frame
Drop a part of rows based on condition & insert them into a new rows of pandas data frame

Time:02-28

I want to drop the relevant rows for the first 3 columns where the "Match1" column contains value "No" and insert them into new rows of the same data frame.

df2 = pd.DataFrame({ 'Name':['John', 'Tom', 'Tom' ,'Ole','Ole','Tom'],
                    'SomeQty':[100, 200, 300, 500,600, 400],
                     'Match':['Yes', 'No', 'Yes','No','No','No'],
                    'SomeValue':[100, 200, 200, 500, 600, 200],
                    'Match1':['Yes', 'Yes','Yes', 'No','No', 'Yes'],
                    })

My expected result is; enter image description here

The way I followed to do this is;

# Define a intermediary dataframe
df4 = pd.DataFrame(columns=['Name','SomeQty','Match','Match1','SomeValue'])
# Copy the relevant data in order to drop and assign
df4 = df4.append(df2.loc[df2['Name']== 'Ole',['Name','SomeQty','Match','Match1']].copy())

# Drop the data from main table
df2.iloc[:, 0:3] = df2.iloc[:, 0:3].drop(df2[df2['Name']== 'Ole'].index)

# Append the relevant data from intermediary table
df2 = df2.append([df4], ignore_index=True, sort=False)

del df4

I like to know a better way to achieve this. TIA

CodePudding user response:

A simpler version using a boolean mask would be:

cols = ['Name','SomeQty','Match']

mask = df2['Match1'].eq('No')

out = pd.concat(
      [df2.mask(mask, df2.drop(cols, axis=1)),
       df2.loc[mask, cols]
       ], ignore_index=True)

Output:

   Name  SomeQty Match  SomeValue Match1
0  John    100.0   Yes      100.0    Yes
1   Tom    200.0    No      200.0    Yes
2   Tom    300.0   Yes      200.0    Yes
3   NaN      NaN   NaN      500.0     No
4   NaN      NaN   NaN      600.0     No
5   Tom    400.0    No      200.0    Yes
6   Ole    500.0    No        NaN    NaN
7   Ole    600.0    No        NaN    NaN
  • Related