Home > database >  Why lambda x: all(x[x == 0]) doesn't work as expected in pandas
Why lambda x: all(x[x == 0]) doesn't work as expected in pandas

Time:08-06

So I have a dataframe, sample of which is as below:

    ID  Date        Week_1  Week_2  Week_3  Week_4  Week_5  Week_6  Week_7  Week_8  Week_9
0   A   2022-01-01  0       0       0       1       0       0       0       1       1
1   A   2022-01-02  0       0       0       1       0       0       0       1       1
2   B   2022-01-01  0       0       0       0       0       1       1       1       1
3   A   2022-02-03  0       0       0       1       0       0       0       1       1
4   B   2022-01-04  0       0       0       0       0       1       1       1       1
5   B   2022-01-15  0       0       0       0       0       1       1       1       1
6   C   2022-01-02  0       0       0       0       0       0       0       1       1
7   C   2022-01-02  0       0       0       0       0       0       0       1       1
8   D   2022-01-02  0       0       0       0       0       0       0       1       1
9   D   2022-01-01  0       0       0       0       0       0       0       1       1
10  E   2022-03-01  0       0       0       0       0       0       0       0       0

I get expected output when I try to find out if any of Week_ columns is 1 I use:

df.loc[df.iloc[:, 2:].apply(lambda x : any(x[x == 1]), axis = 1), :]

But when I try to find out if not even a single 1 is present across all columns, i.e. if all values in Week_ columns are 0s using:

df.loc[df.iloc[:, 2:].apply(lambda x : all(x[x == 0]), axis = 1), :]

I even tried below codes and still don't get the expected result.

df.loc[df.iloc[:, 2:].apply(lambda x : all(x[x != 1]), axis = 1), :]
df.loc[df.iloc[:, 2:].apply(lambda x : all(x[x < 1]), axis = 1), :]

I get an empty result set but I should be getting row 10 with ID E.

Could someone please let me know where I am going wrong.

CodePudding user response:

Rename x to row for better meaning

df.loc[df.iloc[:, 2:].apply(lambda row: all(row[row == 0]), axis = 1), :]

row == 0 will return True if value is 0. So row[row == 0] will only select 0. all on 0 array will return False in any case.

In fact, row == 0 already returns boolean Series, you can use

df.loc[df.iloc[:, 2:].apply(lambda row: row.eq(0).all(), axis = 1), :]

Personally speaking, I would write as

m = df.filter(like='Week').eq(0).all(axis=1)
out = df[m]
print(out)

   ID        Date  Week_1  Week_2  Week_3  Week_4  Week_5  Week_6  Week_7  Week_8  Week_9
10  E  2022-03-01       0       0       0       0       0       0       0       0       0
  • Related