I have a pandas DataFrame with a column of True/False values. I am trying to construct an if statement that tests that column, but I am not getting the desired result. I think I am using the .bool method incorrectly. The basic idea: if col1 is True in the current row and col1 was False in any of the three prior rows, set col2 to True.
from pandas import DataFrame
names = {'col1': [False, False, False, False, False, True, True,
True, False, False]}
df = DataFrame(names, columns =['col1'])
if df.col1.bool == True:
if df.col1.shift(1).bool == False:
df['col2'] = True
elif df.col1.shift(2).bool == False:
df['col2'] = True
elif df.col1.shift(3).bool == False:
df['col2'] = True
else:
df['col2'] = False
df
CodePudding user response:
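from pandas import DataFrame
names = {'col1': [False, False, False, False, False, True, True,
                  True, False, False]}
df = DataFrame(names, columns=['col1'])
df['col2'] = False
# Assign through .loc; chained indexing such as df['col2'][mask] = True may not write back to df.
# Each mask also requires the current row to be True, as the question asks.
df.loc[df['col1'] & (df['col1'].shift(1) == False), 'col2'] = True
df.loc[df['col1'] & (df['col1'].shift(2) == False), 'col2'] = True
df.loc[df['col1'] & (df['col1'].shift(3) == False), 'col2'] = True
df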
Or, if you want to compact it a bit:
from pandas import DataFrame
names = {'col1': [False, False, False, False, True, True, True,
                  True, False, False]}
df = DataFrame(names, columns=['col1'])
df['col2'] = False
# Combine the three shifted checks with |, then require the current row to be True with &.
df.loc[df['col1'] & ((df['col1'].shift(1) == False) | (df['col1'].shift(2) == False) | (df['col1'].shift(3) == False)), 'col2'] = True
CodePudding user response:
Here is one way to do it, using np.where and Series.rolling. Summing the boolean values of the previous three rows gives a result less than 3 unless all three are True. IIUC, you want the previous three rows, excluding the current row.
import numpy as np
df['col2'] = np.where((df['col1'].shift(1).rolling(3).sum() < 3) & (df['col1'] == True),
                      True,
                      False)
df
    col1   col2
0  False  False
1  False  False
2  False  False
3  False  False
4  False  False
5   True   True
6   True   True
7   True   True
8  False  False
9  False  False
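To see why the comparison with 3 works, it may help to look at the intermediate rolling sum. The following is a minimal sketch, assuming the sample data from the question. The `prev3` name is made up for illustration, and col1 is cast to int before rolling (an assumption, not part of the answer above) so that the sum runs on a numeric dtype.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': [False, False, False, False, False,
                            True, True, True, False, False]})

# Number of True values among the three rows before the current one.
# The first three rows have no full window of predecessors and come out as NaN.
prev3 = df['col1'].astype(int).rolling(3).sum().shift(1)

# NaN < 3 evaluates to False, so rows without three predecessors stay False.
df['col2'] = np.where((prev3 < 3) & df['col1'], True, False)

print(df.assign(prev3=prev3))
```

Note that with this approach a True value in one of the first three rows is never flagged, because its window is incomplete.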
CodePudding user response:
The lines df['col2'] = True and df['col2'] = False set the whole column to True and False, respectively. Since you want element-wise operations, you need to use the overloaded bitwise operators & (for AND) and | (for OR).
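The difference between scalar assignment and the element-wise operators can be sketched on a tiny frame. This is only an illustration; the column names `all_true`, `both` and `either` are made up for the example.

```python
import pandas as pd

df = pd.DataFrame({'col1': [False, True, True]})

# Assigning a scalar broadcasts it to every row of the column.
df['all_true'] = True

# Compare the previous row with False; row 0 has no predecessor,
# and NaN == False evaluates to False.
prev_false = df['col1'].shift(1) == False

# & and | combine two boolean Series row by row.
df['both'] = df['col1'] & prev_false
df['either'] = df['col1'] | prev_false

print(df)
```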
Your new column should be True when the current value of col1 is True AND at least one of the previous 3 values is False, so that would be encoded as:
df['col2'] = df.col1 & (
    (df.col1.shift(1) == False) |
    (df.col1.shift(2) == False) |
    (df.col1.shift(3) == False)
)
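If the number of prior rows might change, the same expression can be built in a loop. This is a sketch under that assumption: `any_prior_false` is a hypothetical helper, not a pandas function, and it expects n to be at least 1.

```python
from functools import reduce
import operator

import pandas as pd


def any_prior_false(s: pd.Series, n: int) -> pd.Series:
    """Return True where at least one of the n preceding values is False."""
    # One mask per lag; rows with no predecessor compare NaN == False, giving False.
    masks = [s.shift(k) == False for k in range(1, n + 1)]
    # OR all masks together element-wise.
    return reduce(operator.or_, masks)


df = pd.DataFrame({'col1': [False, False, False, False, False,
                            True, True, True, False, False]})
df['col2'] = df['col1'] & any_prior_false(df['col1'], 3)
print(df)
```

With n=3 this should match the hand-written version above on the question's data.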
Be careful with operator precedence when using the bitwise operators: & and | bind more tightly than the comparison operators, so a == False | b == False is parsed as a == (False | b) == False. Being liberal with parentheses is advised.
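The precedence pitfall can be reproduced with two small Series. This is an illustrative sketch; `a`, `b`, `ok` and `raised` are names invented for the example.

```python
import pandas as pd

a = pd.Series([True, False, True])
b = pd.Series([False, False, True])

# With parentheses: each comparison is evaluated first, then combined.
ok = (a == False) | (b == False)

# Without parentheses, | is applied first, so Python reads the line as
# a == (False | b) == False. That is a chained comparison, which needs the
# truth value of a whole Series and therefore raises ValueError.
try:
    bad = a == False | b == False
    raised = False
except ValueError:
    raised = True

print(ok.tolist(), raised)
```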