Say I have a boolean dataframe like this (the original has 91 columns and 1000 rows):
0 1 2 3
0 False False False True
1 True False False False
2 True False False False
3 False False True False
4 False True True False
5 False False False False
6 True True True True
I need the row-wise OR and AND across all the columns in my dataframe. So the resulting OR and AND values would be:
OR AND
0 True False
1 True False
2 True False
3 True False
4 True False
5 False False
6 True True
I can do this by looping over all my columns and combining the booleans column by column, but I am looking for a dataframe-level approach that avoids iterating over the columns explicitly.
CodePudding user response:
You can use any and all along axis=1:
df = df.assign(OR=df.any(axis=1), AND=df.all(axis=1))
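As a quick check, here is a minimal sketch that rebuilds the sample frame from the question and applies the same call. The names data and out are only illustrative, and the result is stored in a new variable instead of overwriting df.

```python
import pandas as pd

# Sample frame from the question: 7 rows, 4 boolean columns.
data = [
    [False, False, False, True],
    [True, False, False, False],
    [True, False, False, False],
    [False, False, True, False],
    [False, True, True, False],
    [False, False, False, False],
    [True, True, True, True],
]
df = pd.DataFrame(data)

# Both reductions are evaluated on the original columns before assign()
# attaches them, so the new OR column does not leak into the AND result.
out = df.assign(OR=df.any(axis=1), AND=df.all(axis=1))

print(out[["OR", "AND"]])
```

Only row 5 (all False) gives OR == False, and only row 6 (all True) gives AND == True, which matches the expected table in the question.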
CodePudding user response:
You can sum along the columns; the OR is then indicated by sum > 0, and the AND is indicated by sum == len(df.columns):
import pandas as pd

total = df.sum(axis=1)  # number of True values in each row
res = pd.DataFrame({"OR": total > 0, "AND": total == len(df.columns)})
If you have many columns, this can be more efficient, since it only iterates over the entire matrix once. That is a worst-case argument, though: depending on the input distribution and on how any/all are implemented, iterating twice can still be faster.
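To see how the two approaches compare on your own data, a rough sketch along these lines could help. The random test frame, the seed, and the helper names via_sum and via_any_all are assumptions made for illustration; the timings depend on your machine and pandas version, so none are quoted here.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical test data with the shape mentioned in the question.
rng = np.random.default_rng(0)
values = rng.random((1000, 91)) < 0.5
values[0, :] = True   # force one all-True row
values[1, :] = False  # force one all-False row
df = pd.DataFrame(values)


def via_sum(frame):
    # Single reduction: count the True values per row.
    total = frame.sum(axis=1)
    return pd.DataFrame({"OR": total > 0, "AND": total == len(frame.columns)})


def via_any_all(frame):
    # Two separate reductions over the same data.
    return pd.DataFrame({"OR": frame.any(axis=1), "AND": frame.all(axis=1)})


by_sum = via_sum(df)
by_any_all = via_any_all(df)

# Both approaches should agree exactly.
same = by_sum.equals(by_any_all)
print("results identical:", same)

# Rough timing; treat the numbers as indicative only.
print("sum    :", timeit.timeit(lambda: via_sum(df), number=20))
print("any/all:", timeit.timeit(lambda: via_any_all(df), number=20))
```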