Why do I get a warning with this function?

Time:09-08

I am trying to generate a new column containing boolean values of whether a value of each row is Null or not. I wrote the following function,

def not_null(row):
   null_list = []
   for value in row:
       null_list.append(pd.isna(value))
   return null_list

df['not_null'] = df.apply(not_null, axis=1)

But I get the following warning message,

A value is trying to be set on a copy of a slice from a DataFrame.

Is there a better way to write this function?

Note: I want to be able to apply this function to each row without needing to know the column names.

Final output ->

Column1 | Column2 | Column3 | null_idx
NaN     | NaN     | NaN     | [0, 1, 2]
1       | 23      | 34      | []
test1   | NaN     | NaN     | [1, 2]

CodePudding user response:

First, the warning means that your DataFrame was filtered somewhere earlier in your code, so you need DataFrame.copy after that filtering step:

df = df[df['col'].gt(100)].copy()
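To make the pattern above concrete, here is a minimal sketch. The frame `df_full`, its columns, and the new column name `has_null` are hypothetical, chosen only for illustration; the point is that the filtered result is copied before any new column is assigned to it.

```python
import pandas as pd

# Hypothetical starting data; only the filter-then-copy pattern matters here
df_full = pd.DataFrame({'col': [50, 150, 200], 'other': [1.0, None, 3.0]})

# Filtering yields an object derived from df_full; .copy() makes it
# explicitly independent, so assigning a new column is unambiguous.
df = df_full[df_full['col'].gt(100)].copy()

# The new column lands on the copy only; df_full is left untouched
df['has_null'] = df.isna().any(axis=1)
```

Without the .copy() call, older pandas versions may emit the warning quoted in the question, because they cannot tell whether the assignment was meant to reach the original frame.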

Then your solution can be improved by working on the boolean mask directly:

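import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [np.nan, 1, np.nan],
                   'b': [np.nan, 4, 6],
                   'c': [4, 5, 3]})

# For each row of the boolean mask, keep only the True entries
df['list_boolean_for_missing'] = [x[x].tolist() for x in df.isna().to_numpy()]

print(df)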
     a    b  c list_boolean_for_missing
0  NaN  NaN  4             [True, True]
1  1.0  4.0  5                       []
2  NaN  6.0  3                   [True]

Your function:

# Same logic as the original not_null, written as a list comprehension
not_null = lambda x: [pd.isna(value) for value in x]
df['list_boolean_for_missing'] = df.apply(not_null, axis=1)
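As a quick sanity check, the row-wise apply version and the vectorized mask should give the same booleans per row. The sketch below assumes a fresh frame with only the data columns; the helper name `row_isna` is hypothetical.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [np.nan, 1, np.nan],
                   'b': [np.nan, 4, 6],
                   'c': [4, 5, 3]})

def row_isna(row):
    # One boolean per cell in the row: True where the value is missing
    return [bool(pd.isna(value)) for value in row]

# Row-wise version: apply returns a Series holding one list per row
via_apply = df.apply(row_isna, axis=1).tolist()

# Vectorized version: the whole mask is computed at once
via_numpy = df.isna().to_numpy().tolist()
```

Both produce one list of booleans per row; the vectorized form avoids calling a Python function for every row, which usually matters on larger frames.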

If you need exactly what the question describes:

I am trying to generate a new column containing boolean values of whether a value of each row is Null or not

df['not_null'] = df.notna().all(axis=1)

print(df)
     a    b  c  not_null
0  NaN  NaN  4     False
1  1.0  4.0  5      True
2  NaN  6.0  3     False

EDIT: For a list of positions, create a helper array with np.arange and filter it with each row's boolean mask:

arr = np.arange(len(df.columns))
df['null_idx'] = [arr[x].tolist() for x in df.isna().to_numpy()]

print(df)
     a    b  c null_idx
0  NaN  NaN  4   [0, 1]
1  1.0  4.0  5       []
2  NaN  6.0  3      [0]
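The same idea can be wrapped in a small helper that does not depend on the column names, which is what the question asks for. This is a sketch under assumptions: the function name `add_null_idx` and its `out_col` parameter are hypothetical, and the sample data mirrors the expected output shown in the question.

```python
import numpy as np
import pandas as pd

def add_null_idx(df, out_col='null_idx'):
    """Return a copy of df with a column listing the positions of missing cells per row."""
    mask = df.isna().to_numpy()             # 2-D boolean array, one row per record
    positions = np.arange(mask.shape[1])    # 0 .. number of columns - 1
    result = df.copy()                      # leave the caller's frame unchanged
    result[out_col] = [positions[row].tolist() for row in mask]
    return result

# Sample data modeled on the question's expected output
df = pd.DataFrame({'Column1': [np.nan, 1, 'test1'],
                   'Column2': [np.nan, 23, np.nan],
                   'Column3': [np.nan, 34, np.nan]})

out = add_null_idx(df)
print(out['null_idx'].tolist())  # [[0, 1, 2], [], [1, 2]]
```

Computing the mask before adding the output column keeps the new column from being counted among the positions.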