I am trying to drop the rows whose "Column_A" includes ".2." and whose "Column_B" is "COMMIT".
import pandas as pd

data = {'Column_A':['L.9922070.128.1.020','L.9922080.125.2.001','F.1622002.001.2.001','F.1622002.001.2.001','F.1622002.001.2.001'],
'Column_B':['COMMIT','COMMIT','Release','Release','Release']}
R003_data = pd.DataFrame(data)
R003_data.drop(R003_data[~((R003_data['Column_B'] == 'COMMIT') & (R003_data['Column_A'].str.contains(".2.", regex=False)))].index)
print(R003_data)
However, the output is as follows.
              Column_A Column_B
0  L.9922070.128.1.020   COMMIT
1  L.9922080.125.2.001   COMMIT
2  F.1622002.001.2.001  Release
3  F.1622002.001.2.001  Release
4  F.1622002.001.2.001  Release
The output that I want is as follows.
              Column_A Column_B
0  L.9922070.128.1.020   COMMIT
2  F.1622002.001.2.001  Release
3  F.1622002.001.2.001  Release
4  F.1622002.001.2.001  Release
Please help me to find the error. Thank you.
CodePudding user response:
Two things:
1. drop doesn't modify the original dataframe. So either assign the output back to your original df, or pass the parameter inplace=True, which I prefer.
2. If you want to drop the rows that match the condition, don't include the ~ sign, because the matching rows are exactly the ones you want to drop. If you include it, the condition is negated, so the other rows (the ones you want to keep) are selected for dropping instead.
With those two changes, the line just before the print statement becomes:
R003_data.drop(R003_data[(R003_data['Column_B'] == 'COMMIT')
& R003_data['Column_A'].str.contains('.2.',
regex=False)].index, inplace=True)
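As an alternative sketch (not the only way to do this), the same result can be obtained without drop by building a boolean mask for the rows to remove and keeping everything else. The names to_drop and result are just illustrative, and the data is copied from the question.

```python
import pandas as pd

data = {'Column_A': ['L.9922070.128.1.020', 'L.9922080.125.2.001',
                     'F.1622002.001.2.001', 'F.1622002.001.2.001',
                     'F.1622002.001.2.001'],
        'Column_B': ['COMMIT', 'COMMIT', 'Release', 'Release', 'Release']}
R003_data = pd.DataFrame(data)

# Mask of the rows to remove: Column_B is COMMIT and Column_A contains
# the literal substring ".2." (regex=False so "." is not a wildcard).
to_drop = (R003_data['Column_B'] == 'COMMIT') & \
          R003_data['Column_A'].str.contains('.2.', regex=False)

# Here ~ is appropriate: we select the rows to KEEP, i.e. the ones
# that do not match the drop condition.
result = R003_data[~to_drop]
print(result)
```

Here the ~ belongs on the mask because the indexing selects rows to keep rather than rows to drop. The original index labels are preserved, as in the desired output in the question.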