Pandas: Get binary OR/AND for all the columns in a dataframe

Time:08-04

Say I have a boolean dataframe like this (the original has 91 columns and 1000 rows):

       0      1      2      3
0  False  False  False   True
1   True  False  False  False
2   True  False  False  False
3  False  False   True  False
4  False   True   True  False
5  False  False  False  False
6   True   True   True   True

For each row, I need the OR and the AND of the values across all the columns in my dataframe. The resulting OR and AND values would be:

      OR    AND
0   True  False
1   True  False
2   True  False
3   True  False
4   True  False
5  False  False
6   True   True

I can do this by looping over all my columns and accumulating the boolean result column by column, but I am looking for a dataframe-level approach that does not explicitly iterate over the columns.

CodePudding user response:

You can use DataFrame.any and DataFrame.all with axis=1, which reduce across the columns and return one value per row:

df = df.assign(OR=df.any(axis=1), AND=df.all(axis=1))
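
As a minimal sketch, here is the same idea applied to the sample frame from the question. The name `result` is only illustrative, and the OR/AND values are collected in a separate frame rather than appended to `df`:

```python
import pandas as pd

# Sample frame copied from the question (columns 0..3, rows 0..6).
df = pd.DataFrame(
    [
        [False, False, False, True],
        [True, False, False, False],
        [True, False, False, False],
        [False, False, True, False],
        [False, True, True, False],
        [False, False, False, False],
        [True, True, True, True],
    ]
)

# axis=1 reduces across the columns, giving one boolean per row.
result = pd.DataFrame({"OR": df.any(axis=1), "AND": df.all(axis=1)})
print(result)
```

Building a separate frame avoids mixing the derived columns into the input; use `df.assign(...)` as above if you want them alongside the original columns.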

CodePudding user response:

You can sum across the columns of each row (True counts as 1). The OR is then sum > 0, and the AND is sum == len(df.columns):

import pandas as pd

total = df.sum(axis=1)  # number of True values in each row
res = pd.DataFrame({"OR": total > 0, "AND": total == len(df.columns)})

If you have many columns this can be more efficient, because it makes a single pass over the whole matrix instead of two. That is only a worst-case argument, though: depending on the distribution of the input and on how any/all are implemented, making two passes with any and all can still be faster.
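
A quick way to convince yourself that the sum-based version agrees with any/all is to compare both on a larger frame. The sketch below uses a randomly generated boolean frame with the shape mentioned in the question; the data and the names `big`, `by_sum` and `by_any_all` are assumptions made for illustration, and the check assumes purely boolean columns without missing values:

```python
import numpy as np
import pandas as pd

# Hypothetical test data: 1000 rows x 91 boolean columns.
rng = np.random.default_rng(0)
arr = rng.random((1000, 91)) < 0.5
arr[0, :] = True   # force one all-True row so AND is exercised
arr[1, :] = False  # force one all-False row so OR is exercised
big = pd.DataFrame(arr)

# Sum-based approach: count the True values in each row once.
total = big.sum(axis=1)
by_sum = pd.DataFrame({"OR": total > 0, "AND": total == len(big.columns)})

# any/all approach: two separate row-wise reductions.
by_any_all = pd.DataFrame({"OR": big.any(axis=1), "AND": big.all(axis=1)})

# True when both approaches give identical results.
print(by_sum.equals(by_any_all))
```

Which of the two is faster depends on your data and pandas version, so it is worth timing both (for example with `timeit`) on the real frame before choosing.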
