How to check for not-NA and non-empty lists in a DataFrame column?

Time:10-04

import numpy as np
import pandas as pd
d = {'status': {0: 'No', 1: 'No', 2: 'Yes', 3: 'No'}, 'time': {0: "['Morning', 'Midday', 'Afternoon']", 1: np.nan, 2: "[]", 3: np.nan}, 'id': {0: 1, 1: 5, 2: 2, 3: 3}}
df = pd.DataFrame(d)

df is the dataframe. All columns are object types.

I need to flag the rows that contain neither NaN nor an empty list in any column of the dataframe. This is what I tried:

df['no_nans'] = ~pd.isna(df).any(axis = 1)
print(df['no_nans'])

True
False
True
False

The expected output is:

True
False
False
False

The time column has an empty list [] in the third row, but isna() does not treat it as missing.

Is there a simple way to do this check properly? Thanks in advance for any help.

CodePudding user response:

Since you sometimes have empty lists (stored as the string '[]') instead of NaN, you can replace '[]' with NaN to get the expected result:

df = df.replace('[]', np.nan)
df['no_nans'] = ~pd.isna(df).any(axis = 1)

Output:

0     True
1    False
2    False
3    False
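
If overwriting df is not wanted, the same idea can be written as one chained expression. This is a minimal sketch, assuming the empty lists are stored as the literal string '[]' as in the question; the variable name no_nans is only illustrative:

```python
import numpy as np
import pandas as pd

# Same data as in the question: the "time" column holds strings, not lists.
df = pd.DataFrame({
    "status": ["No", "No", "Yes", "No"],
    "time": ["['Morning', 'Midday', 'Afternoon']", np.nan, "[]", np.nan],
    "id": [1, 5, 2, 3],
})

# Treat the string '[]' as missing on a temporary copy, then require
# every cell in the row to be non-missing. df itself is left unchanged.
no_nans = df.replace("[]", np.nan).notna().all(axis=1)
print(no_nans.tolist())
```

notna().all(axis=1) is the logical equivalent of ~isna().any(axis=1) used above.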

CodePudding user response:

As you have strings, you need to compare to '[]':

~(df.eq('[]')|df.isna()).any(axis=1)

Output:

0     True
1    False
2    False
3    False
dtype: bool
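
To see why the comparison has to be against the string '[]', it can help to inspect the cell types and, if real lists are preferred, to parse the strings. The sketch below is one possible way to do that, assuming every list-like cell is a valid Python literal; parse_cell is a hypothetical helper, not a pandas function:

```python
import ast

import numpy as np
import pandas as pd

df = pd.DataFrame({
    "status": ["No", "No", "Yes", "No"],
    "time": ["['Morning', 'Midday', 'Afternoon']", np.nan, "[]", np.nan],
    "id": [1, 5, 2, 3],
})

def parse_cell(value):
    # Hypothetical helper: turn list-like strings into real lists and
    # leave everything else (including NaN) untouched.
    if isinstance(value, str) and value.strip().startswith("["):
        return ast.literal_eval(value)
    return value

# Before parsing, the non-missing cells are plain strings.
raw_types = [type(v).__name__ for v in df["time"]]

# After parsing, they are lists; NaN stays a float.
parsed_time = df["time"].map(parse_cell)
parsed_types = [type(v).__name__ for v in parsed_time]
print(raw_types, parsed_types)
```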

If you really had lists:

m1 = (df.select_dtypes(object)
        .apply(lambda s: s.str.len().eq(0))
        .reindex_like(df)
        .fillna(False)
      )

m2 = df.isna()

~(m1|m2).any(axis=1)

Alternative input for lists:

d = {'status': {0: 'No', 1: 'No', 2: 'Yes', 3: 'No'}, 'time': {0: ['Morning', 'Midday', 'Afternoon'], 1: nan, 2: [], 3: nan}, 'id': {0: 1, 1: 5, 2: 2, 3: 3}}
df = pd.DataFrame(d)
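
With this list-based input, an explicit per-cell test is another option. This is only a sketch of an alternative to the str.len() approach above; is_empty_list is a hypothetical helper, and unlike str.len() it does not count empty strings as empty:

```python
import numpy as np
import pandas as pd

# The "time" column now holds real Python lists.
df = pd.DataFrame({
    "status": ["No", "No", "Yes", "No"],
    "time": [["Morning", "Midday", "Afternoon"], np.nan, [], np.nan],
    "id": [1, 5, 2, 3],
})

def is_empty_list(value):
    # Hypothetical helper: True only for a list with no elements.
    return isinstance(value, list) and len(value) == 0

# Element-wise mask of empty lists, built column by column.
empty_mask = df.apply(lambda col: col.map(is_empty_list))

# A row passes only if no cell is NaN and no cell is an empty list.
no_nans = ~(empty_mask | df.isna()).any(axis=1)
print(no_nans.tolist())
```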