Pandas not all but two of True

Time:06-13

I want to delete rows that have exactly two True values, like the row at index 1.

Note: this is only the tip of the dataframe iceberg. There are more rows, so I need a condition or something similar.

CodePudding user response:

Generate the dataframe:

import pandas as pd
df = pd.DataFrame({
    "0": [True, True, True, True, True],
    "1": [False, False, False, False, False],
    "2": [False, False, False, False, False],
    "3": [False, False, False, False, False],
    "4": [False, True, False, False, False],
})

Count the True values in each row:

df = df.assign(number_of_true=lambda x: x.sum(axis=1))

      0      1      2      3      4  number_of_true
0  True  False  False  False  False               1
1  True  False  False  False   True               2
2  True  False  False  False  False               1
3  True  False  False  False  False               1
4  True  False  False  False  False               1

Select the rows that do not contain exactly two True values:

df = df.query("number_of_true != 2")

As a single chained expression:

(
    df
    .assign(number_of_true=lambda x: x.sum(axis=1))
    .query("number_of_true != 2")
    .drop(columns="number_of_true")  # remove the helper column again
)

Output:
      0      1      2      3      4
0  True  False  False  False  False
2  True  False  False  False  False
3  True  False  False  False  False
4  True  False  False  False  False
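
The same filter can probably be written without the helper column, by using the row sums directly as a boolean mask. This is a minimal sketch on the sample data from this answer; the names true_per_row and result are only illustrative and do not come from the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "0": [True, True, True, True, True],
    "1": [False, False, False, False, False],
    "2": [False, False, False, False, False],
    "3": [False, False, False, False, False],
    "4": [False, True, False, False, False],
})

# Count the True values in each row (True counts as 1, False as 0).
true_per_row = df.sum(axis=1)

# Keep only the rows whose count is not exactly 2.
result = df[true_per_row != 2]
print(result)
```

Because no column is added, there is nothing to drop afterwards, and the original df stays unchanged.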

CodePudding user response:

One approach is to count the True occurrences in each row and then drop rows based on that count: df['count'] = df.sum(axis=1), then filter with df3 = df[df['count'] != 2]

i hope i didn't solve your homework xD
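
If the real dataframe also contains non-boolean columns, the count could be restricted to the boolean ones. The sketch below assumes a made-up frame with a hypothetical "label" column and columns x, y, z; none of these names appear in the question.

```python
import pandas as pd

# Hypothetical data: one text column plus three boolean columns.
df = pd.DataFrame({
    "label": ["a", "b", "c"],
    "x": [True, True, False],
    "y": [False, True, False],
    "z": [False, False, True],
})

# Only the boolean columns take part in the count.
bool_columns = df.select_dtypes(include="bool").columns
df["count"] = df[bool_columns].sum(axis=1)

# Keep rows that do not have exactly two True values, then remove the helper column.
filtered = df[df["count"] != 2].drop(columns="count")
print(filtered)
```

Here only the row labelled "b" has two True values, so it is the one that gets filtered out.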

CodePudding user response:

import pandas as pd


df = {
    "0": [True, True, True, True, True],
    "1": [False, False, False, False, False],
    "2": [False, False, False, False, False],
    "3": [False, False, False, False, False],
    "4": [False, True, False, False, False]

}

df = pd.DataFrame(df)
print(df)

criteria_value = 2  # drop rows with exactly this many True values

for i in range(df.shape[0]):
    count = 0
    # Count the True values in row i
    for column_name in df.columns:
        if df.loc[i, column_name]:
            count += 1
    if count == criteria_value:
        df.drop(index=i, inplace=True)

print(df)

Output

      0      1      2      3      4
0  True  False  False  False  False
2  True  False  False  False  False
3  True  False  False  False  False
4  True  False  False  False  False
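
For larger dataframes, the loop above could likely be replaced by a small vectorized helper that keeps the criteria_value idea. This is only a sketch: the function name drop_rows_with_true_count is hypothetical, and it assumes every column is boolean.

```python
import pandas as pd


def drop_rows_with_true_count(frame, criteria_value=2):
    """Return a copy of frame without rows that have exactly criteria_value True values."""
    true_per_row = frame.sum(axis=1)
    # Collect the index labels of the matching rows and drop them in one call.
    labels_to_drop = frame.index[true_per_row == criteria_value]
    return frame.drop(index=labels_to_drop)


df = pd.DataFrame({
    "0": [True, True, True, True, True],
    "1": [False, False, False, False, False],
    "2": [False, False, False, False, False],
    "3": [False, False, False, False, False],
    "4": [False, True, False, False, False],
})

result = drop_rows_with_true_count(df, criteria_value=2)
print(result)
```

Unlike the loop, this version does not modify df in place and does not depend on the index being 0, 1, 2, and so on.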