Let's make a dataframe:
>>> import numpy as np
>>> import pandas as pd
>>> np.random.seed(0)
>>> df = pd.DataFrame(np.random.randn(5,3), columns=list('ABC'))
>>> df
A B C
0 1.764052 0.400157 0.978738
1 2.240893 1.867558 -0.977278
2 0.950088 -0.151357 -0.103219
3 0.410599 0.144044 1.454274
4 0.761038 0.121675 0.443863
I want to get all the rows where column 'C' is between 0 and 1 inclusive.
This code works:
>>> df[(df['C'] >= 0) & (df['C'] <= 1)]
A B C
0 1.764052 0.400157 0.978738
4 0.761038 0.121675 0.443863
But this code, which I feel is equivalent, doesn't:
>>> df[(0 <= df['C'] <= 1)]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\panda\anaconda3\lib\site-packages\pandas\core\generic.py", line 1537, in __nonzero__
raise ValueError(
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Do I really have to split every multi-condition boolean expression into separate conditions in pandas? Is there a better way to accomplish this?
Answer:
You can use Series.between. By default, it is inclusive on both sides.
out = df[df['C'].between(0,1)]
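As a quick check, the sketch below rebuilds the frame from the question and confirms that between selects the same rows as the explicit two-condition mask. It also shows why the chained form fails: Python evaluates 0 <= s <= 1 as (0 <= s) and (s <= 1), and the `and` asks the Series for a single truth value. The names mask, same, rows and chained_failed are illustrative only.

```python
import numpy as np
import pandas as pd

np.random.seed(0)
df = pd.DataFrame(np.random.randn(5, 3), columns=list('ABC'))

# Explicit mask: two element-wise comparisons combined with &.
mask = (df['C'] >= 0) & (df['C'] <= 1)

# between builds the same boolean Series in one call
# (inclusive on both sides by default).
same = df['C'].between(0, 1)
assert mask.equals(same)

rows = df[same]

# The chained comparison expands to (0 <= s) and (s <= 1). The `and` needs a
# single True/False, which a multi-element Series cannot provide, so pandas
# raises ValueError.
try:
    df[0 <= df['C'] <= 1]
    chained_failed = False
except ValueError:
    chained_failed = True
```

So the chained syntax is not something pandas can support on a Series; between (or the explicit & mask) is the element-wise equivalent.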
If you want only one side inclusive, you can select that as well. For example, the following is only right-side inclusive:
out = df[df['C'].between(0,1, inclusive='right')]
Output:
A B C
0 1.764052 0.400157 0.978738
4 0.761038 0.121675 0.443863
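The data in the question has no value exactly at 0 or 1, so the inclusive setting makes no visible difference there. A small sketch with a made-up Series (s is hypothetical, not from the question) that contains both endpoints shows what each option keeps. It assumes pandas 1.3 or later, where inclusive accepts the strings 'both', 'neither', 'left' and 'right'.

```python
import pandas as pd

# Hypothetical values chosen so that both boundaries (0 and 1) are present.
s = pd.Series([-0.5, 0.0, 0.5, 1.0, 1.5])

both = s[s.between(0, 1)].tolist()                          # 0 <= s <= 1 (default)
left = s[s.between(0, 1, inclusive='left')].tolist()        # 0 <= s < 1
right = s[s.between(0, 1, inclusive='right')].tolist()      # 0 < s <= 1
neither = s[s.between(0, 1, inclusive='neither')].tolist()  # 0 < s < 1

print(both)     # [0.0, 0.5, 1.0]
print(left)     # [0.0, 0.5]
print(right)    # [0.5, 1.0]
print(neither)  # [0.5]
```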