Pandas selecting rows with multiple conditions

Time:04-14

Let's make a dataframe:

>>> import numpy as np
>>> import pandas as pd
>>> np.random.seed(0)
>>> df = pd.DataFrame(np.random.randn(5,3), columns=list('ABC'))
>>> df
          A         B         C
0  1.764052  0.400157  0.978738
1  2.240893  1.867558 -0.977278
2  0.950088 -0.151357 -0.103219
3  0.410599  0.144044  1.454274
4  0.761038  0.121675  0.443863

I want to get all the rows where column 'C' is between 0 and 1 inclusive.

This code works:

>>> df[(df['C'] >= 0) & (df['C'] <= 1)]
          A         B         C
0  1.764052  0.400157  0.978738
4  0.761038  0.121675  0.443863

But this (what I feel is equivalent) code doesn't:

>>> df[(0 <= df['C'] <= 1)]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\panda\anaconda3\lib\site-packages\pandas\core\generic.py", line 1537, in __nonzero__
    raise ValueError(
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Do I really have to split every multi-condition boolean expression into separate conditions in pandas? Is there a better way to accomplish this?

Answer:

You can use Series.between. By default, it is inclusive on both ends.

out = df[df['C'].between(0,1)]
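
As a quick sanity check, here is a minimal sketch that rebuilds the question's dataframe and compares the explicit two-condition mask with between. The variable names explicit and via_between are only illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the dataframe from the question
np.random.seed(0)
df = pd.DataFrame(np.random.randn(5, 3), columns=list('ABC'))

# Explicit mask from the question: two comparisons joined with &
explicit = df[(df['C'] >= 0) & (df['C'] <= 1)]

# Same filter with Series.between (inclusive on both ends by default)
via_between = df[df['C'].between(0, 1)]

print(explicit.equals(via_between))   # True
print(via_between.index.tolist())     # [0, 4]
```

Both selections keep rows 0 and 4, so between can stand in for the pair of comparisons here.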

If you want only one bound to be inclusive, pass inclusive='left' or inclusive='right'. For example, the following includes 1 but excludes 0:

out = df[df['C'].between(0,1, inclusive='right')]

Output:

          A         B         C
0  1.764052  0.400157  0.978738
4  0.761038  0.121675  0.443863
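
As for why the chained form in the question fails: Python evaluates 0 <= x <= 1 as (0 <= x) and (x <= 1), and the "and" needs a single True/False from the left-hand side, which a Series cannot provide. The sketch below reproduces that and, as one possible alternative not mentioned above, uses DataFrame.query, which parses the expression string itself and accepts the chained comparison. The names chained_failed and via_query are only illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the dataframe from the question
np.random.seed(0)
df = pd.DataFrame(np.random.randn(5, 3), columns=list('ABC'))

# Python expands `0 <= s <= 1` to `(0 <= s) and (s <= 1)`.
# `and` calls bool() on the first boolean Series, which raises ValueError.
try:
    df[0 <= df['C'] <= 1]
    chained_failed = False
except ValueError:
    chained_failed = True

# query() evaluates the string with pandas' own parser,
# so the chained comparison is allowed there.
via_query = df.query('0 <= C <= 1')

print(chained_failed)              # True
print(via_query.index.tolist())    # [0, 4]
```

Whether query or between reads better is mostly a matter of taste; between avoids string expressions, while query stays close to the notation in the question.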