import numpy as np
import pandas as pd
d = {'status': {0: 'No', 1: 'No', 2: 'Yes', 3: 'No'}, 'time': {0: "['Morning', 'Midday', 'Afternoon']", 1: np.nan, 2: "[]", 3: np.nan}, 'id': {0: 1, 1: 5, 2: 2, 3: 3}}
df = pd.DataFrame(d)
df is the DataFrame; all columns are of object dtype.
For each row, I need to check that no column holds NaN or an empty list. This is what I tried:
df['no_nans'] = ~pd.isna(df).any(axis = 1)
print(df['no_nans'])
True
False
True
False
The expected output is:
True
False
False
False
The time column holds an empty list [] in the third row, and isna() does not flag it as missing.
Is there a simple way to make this check handle both cases? Thanks in advance for any help.
CodePudding user response:
Since some cells hold the string '[]' instead of NaN, you can replace '[]' with NaN first and then run the same check:
df = df.replace('[]', np.nan)
df['no_nans'] = ~pd.isna(df).any(axis = 1)
Output:
0 True
1 False
2 False
3 False
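Note that the snippet above overwrites df, so the '[]' strings are lost. A minimal sketch of the same idea that builds the mask on a temporary frame instead (the names cleaned and no_nans are only illustrative, and the data is the sample from the question):

```python
import numpy as np
import pandas as pd

d = {
    'status': {0: 'No', 1: 'No', 2: 'Yes', 3: 'No'},
    'time': {0: "['Morning', 'Midday', 'Afternoon']", 1: np.nan, 2: "[]", 3: np.nan},
    'id': {0: 1, 1: 5, 2: 2, 3: 3},
}
df = pd.DataFrame(d)

# Treat the string '[]' as missing, but only in a temporary copy.
cleaned = df.replace('[]', np.nan)

# A row passes when every one of its cells is non-missing.
no_nans = cleaned.notna().all(axis=1)
df['no_nans'] = no_nans

print(df['no_nans'].tolist())  # [True, False, False, False]
```

The original time column still contains '[]' afterwards, which may matter if the raw values are needed later.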
CodePudding user response:
As your values are strings, you need to compare against the string '[]':
~(df.eq('[]')|df.isna()).any(axis=1)
Output:
0 True
1 False
2 False
3 False
dtype: bool
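If more than one string can stand for "empty", the same mask can be built with isin. This is only a sketch: rows_without_missing and its empty_markers parameter are hypothetical names, not part of pandas, and the data is the sample from the question.

```python
import numpy as np
import pandas as pd


def rows_without_missing(frame, empty_markers=('[]',)):
    """Return a boolean Series: True where a row has no NaN and no empty marker."""
    # Cells whose value equals one of the marker strings.
    is_empty = frame.isin(list(empty_markers))
    # A row fails if any cell is an empty marker or NaN.
    return ~(is_empty | frame.isna()).any(axis=1)


d = {
    'status': {0: 'No', 1: 'No', 2: 'Yes', 3: 'No'},
    'time': {0: "['Morning', 'Midday', 'Afternoon']", 1: np.nan, 2: "[]", 3: np.nan},
    'id': {0: 1, 1: 5, 2: 2, 3: 3},
}
df = pd.DataFrame(d)

result = rows_without_missing(df)
print(result.tolist())  # [True, False, False, False]
```

Passing, for example, empty_markers=('[]', '') would also treat empty strings as missing.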
If you really had lists:
m1 = (df.select_dtypes(object)
.apply(lambda s: s.str.len().eq(0))
.reindex_like(df)
.fillna(False)
)
m2 = df.isna()
~(m1|m2).any(axis=1)
Alternative input for lists:
d = {'status': {0: 'No', 1: 'No', 2: 'Yes', 3: 'No'}, 'time': {0: ['Morning', 'Midday', 'Afternoon'], 1: np.nan, 2: [], 3: np.nan}, 'id': {0: 1, 1: 5, 2: 2, 3: 3}}
df = pd.DataFrame(d)
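For real list objects, an element-wise check may be easier to read than the str.len() mask. A minimal sketch under that assumption (is_blank is a hypothetical helper, and it is likely slower than the vectorized version on large frames):

```python
import numpy as np
import pandas as pd


def is_blank(value):
    """Return True for an empty list/tuple or a missing scalar (NaN/None)."""
    if isinstance(value, (list, tuple)):
        return len(value) == 0
    return bool(pd.isna(value))


d = {
    'status': {0: 'No', 1: 'No', 2: 'Yes', 3: 'No'},
    'time': {0: ['Morning', 'Midday', 'Afternoon'], 1: np.nan, 2: [], 3: np.nan},
    'id': {0: 1, 1: 5, 2: 2, 3: 3},
}
df = pd.DataFrame(d)

# Apply the helper to every cell, column by column.
blank = df.apply(lambda col: col.map(is_blank))

# A row passes when none of its cells is blank.
no_nans = ~blank.any(axis=1)
print(no_nans.tolist())  # [True, False, False, False]
```

The list check runs before pd.isna because pd.isna returns an array, not a single boolean, when given a list.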