Remove rows in a dataframe which are not all 1 or all 0

Time:03-31

I need to keep only the rows of the dataframe whose values are all 0 or all 1.

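import numpy as np
import pandas as pd

a = np.repeat(0,10)  # ten zeros
b = np.repeat(1,10)  # ten ones
# transpose so that 'col1' and 'col2' become the rows
ab = pd.DataFrame({'col1':a,'col2':b}).transpose()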

CodePudding user response:

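One option is to take the difference between consecutive values along each row and check that it is always 0. Note that this keeps any row whose values are all equal, not only rows of 0s or 1s: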

import numpy as np
np.all(np.diff(ab.values, 1)==0, 1)

Output:

array([ True,  True])

Then use this to slice:

ab[np.all(np.diff(ab.values, 1)==0, 1)]

Another option is to use nunique and keep the rows that have exactly one distinct value:

ab[ab.nunique(1).eq(1)]
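
Both options above test whether a row is constant, so a row of all 5s would also pass. As a minimal sketch, assuming the data may contain values other than 0 and 1, the nunique check can be combined with an isin check. The frame `demo` and its row labels are made up for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical test frame: all-0 row, all-1 row, mixed row, constant row of 5s
demo = pd.DataFrame(
    [[0, 0, 0], [1, 1, 1], [0, 1, 0], [5, 5, 5]],
    index=["zeros", "ones", "mixed", "fives"],
)

# True where the row holds a single distinct value (same idea as the diff check)
constant = demo.nunique(axis=1).eq(1)
diff_based = np.all(np.diff(demo.values, axis=1) == 0, axis=1)

# True where every value in the row is either 0 or 1
binary = demo.isin([0, 1]).all(axis=1)

constant_rows = demo[constant]          # still includes the row of 5s
strict_rows = demo[constant & binary]   # only all-0 and all-1 rows
```

If the dataframe is known to contain only 0s and 1s, as in the question, the extra isin check is unnecessary.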

CodePudding user response:

A possible solution is the following:

import pandas as pd

# create test dataframe
df = pd.DataFrame({'col1':[0,0,0,0],'col2':[1,1,1,1],'col3':[0,1,0,1],'col4':['a','b',0,1],'col5':['a','a','a','a']}).transpose()
df


# filter rows of dataframe
df = df[df.eq(0).all(axis=1) | df.eq(1).all(axis=1)]
df

Returns only the rows col1 (all 0) and col2 (all 1):

      0  1  2  3
col1  0  0  0  0
col2  1  1  1  1
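
The filter above can be wrapped in a small helper so the allowed values are not hard-coded. This is only a sketch: `keep_constant_rows` and its `values` parameter are hypothetical names, not part of pandas.

```python
import pandas as pd

def keep_constant_rows(frame, values=(0, 1)):
    """Keep rows in which every cell equals the same one of `values`."""
    # Start with an all-False mask and OR in one condition per allowed value
    mask = pd.Series(False, index=frame.index)
    for value in values:
        mask |= frame.eq(value).all(axis=1)
    return frame[mask]

# Same test data as in the answer above
df = pd.DataFrame({
    'col1': [0, 0, 0, 0],
    'col2': [1, 1, 1, 1],
    'col3': [0, 1, 0, 1],
    'col4': ['a', 'b', 0, 1],
    'col5': ['a', 'a', 'a', 'a'],
}).transpose()

result = keep_constant_rows(df)
only_a = keep_constant_rows(df, values=('a',))
```

With the default values this keeps col1 and col2. Passing `values=('a',)` keeps col5 instead.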
