How to drop rows whose specific columns have special text?

I am trying to drop the rows whose "Column_A" includes ".2." and whose "Column_B" is "COMMIT".

import pandas as pd

data = {'Column_A':['L.9922070.128.1.020','L.9922080.125.2.001','F.1622002.001.2.001','F.1622002.001.2.001','F.1622002.001.2.001'],
      'Column_B':['COMMIT','COMMIT','Release','Release','Release']}
R003_data = pd.DataFrame(data)
R003_data.drop(R003_data[~((R003_data['Column_B']  == 'COMMIT') & (R003_data['Column_A'].str.contains(".2.", regex=False)))].index)
print(R003_data)

However, the output is as follows.

             Column_A Column_B
0  L.9922070.128.1.020   COMMIT
1  L.9922080.125.2.001   COMMIT
2  F.1622002.001.2.001  Release
3  F.1622002.001.2.001  Release
4  F.1622002.001.2.001  Release

The output that I want is as follows.

             Column_A Column_B
0  L.9922070.128.1.020   COMMIT
2  F.1622002.001.2.001  Release
3  F.1622002.001.2.001  Release
4  F.1622002.001.2.001  Release

Please help me to find the error. Thank you.

CodePudding user response:

Two things:

drop doesn't modify the original DataFrame; it returns a new one. So you either have to assign the result back to your original df or pass the parameter inplace=True, which I prefer.
If you want to drop the rows that match your condition, don't include the ~ operator, because the matching rows are exactly the ones you want to drop. With ~, the mask is inverted: the other rows are dropped and the matching rows are retained.

With those two changes, the line just before the print statement becomes:

R003_data.drop(R003_data[(R003_data['Column_B'] == 'COMMIT')
               & R003_data['Column_A'].str.contains('.2.',
               regex=False)].index, inplace=True)
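For comparison, here is a minimal sketch of the same fix written with a boolean mask and reassignment instead of drop with inplace=True. The variable name to_drop is only illustrative and is not from the original post; the data is copied from the question.

```python
import pandas as pd

data = {
    'Column_A': ['L.9922070.128.1.020', 'L.9922080.125.2.001',
                 'F.1622002.001.2.001', 'F.1622002.001.2.001',
                 'F.1622002.001.2.001'],
    'Column_B': ['COMMIT', 'COMMIT', 'Release', 'Release', 'Release'],
}
R003_data = pd.DataFrame(data)

# Flag the rows to remove: Column_B equals 'COMMIT' AND Column_A contains
# the literal substring '.2.' (regex=False, so '.' is not a wildcard).
# 'to_drop' is a hypothetical helper name used for illustration.
to_drop = (
    (R003_data['Column_B'] == 'COMMIT')
    & R003_data['Column_A'].str.contains('.2.', regex=False)
)

# Keep every row that is NOT flagged, and assign the result back so the
# change is visible in R003_data.
R003_data = R003_data[~to_drop]
print(R003_data)
```

Here the ~ is applied when keeping rows, not when dropping them, which is why it is correct in this form. The original index labels (0, 2, 3, 4) are preserved; appending .reset_index(drop=True) would renumber them if that is preferred.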