I have a Python Pandas dataframe with many rows. The first column is "test_idx". For example:
df =
   test_idx  sample  value
          1       1    1.2
          1       2   -3.0
          1       3    4.7
          2       1    1.5
          2       2    2.8
   etc...
Assume I know that the tests listed in invalid_tests = [2, 3, 7] are invalid.
I would like to create a new Pandas dataframe, cdf, which contains only the valid tests.
There is a straightforward way to do it, which is what I did here:
valid_tests_idx = []  # positions of rows that belong to valid tests
for i in range(len(df)):
    if df["test_idx"].iloc[i] not in invalid_tests:
        valid_tests_idx.append(i)
cdf = df.iloc[valid_tests_idx]
It works fine, but is there a more elegant way, or a one-liner, to do it?
CodePudding user response:
Use isin() to build a boolean mask of the invalid rows, then negate it with the tilde (~) operator:
cdf = df[~df["test_idx"].isin(invalid_tests)]
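To see why this works, here is a small self-contained sketch that runs the one-liner next to the loop from the question and checks that they agree. The sample frame is modelled on the question; the single row for test 3 is made up for illustration, and the names mask, valid_rows and cdf_loop are just placeholders.

```python
import pandas as pd

# Sample data modelled on the question; the row for test 3 is invented for illustration.
df = pd.DataFrame({
    "test_idx": [1, 1, 1, 2, 2, 3],
    "sample":   [1, 2, 3, 1, 2, 1],
    "value":    [1.2, -3.0, 4.7, 1.5, 2.8, 0.9],
})
invalid_tests = [2, 3, 7]

# isin() returns a boolean Series: True where test_idx is one of the invalid tests.
mask = df["test_idx"].isin(invalid_tests)

# ~ inverts the mask element-wise, so only rows from valid tests are kept.
cdf = df[~mask]

# The loop from the question, written as a list comprehension, for comparison.
valid_rows = [i for i in range(len(df)) if df["test_idx"].iloc[i] not in invalid_tests]
cdf_loop = df.iloc[valid_rows]

print(cdf)
```

Note that cdf keeps the original index labels of the rows it retains. If you would rather have a fresh 0..n-1 index, cdf.reset_index(drop=True) should do it.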