I have a sample of my Python code below:
...
for col in df.columns.tolist():
    if val in df[f"{col}"].values:
        if val.isna():
            my_list.append(col)
So, if a column in my DataFrame contains a NaN value, the name of that column should be added to "my_list".
I know that my DataFrame has columns with NaN values, but my code generates an empty "my_list". The error is probably in the line "if val.isna():". How can I modify that? How can I "tell" Python to pick up NaN values from the columns?
CodePudding user response:
Just use an if statement on the column itself, like this:
for col in df.columns.tolist():
    if df[col].isna().any():
        my_list.append(col)
I am not giving you the best way of doing it, just fixing your little list loop.
CodePudding user response:
If you iterate over the values in each column, append the column name to my_list as soon as a NaN is found, and then break, you get this:
my_list=['col1', 'col3']
My code:
import pandas as pd
from numpy import nan  # the NaN alias was removed in NumPy 2.0
df = pd.DataFrame(data={
    "col1": [10, 2.5, nan],
    "col2": [10, 2.5, 3.5],
    "col3": [5, nan, 1]})
my_list = []
for col in df.columns:
    for val in df[col].values:
        if pd.isna(val):
            my_list.append(col)
            break  # one NaN is enough, move on to the next column
print(f"{my_list=}")
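For what it's worth, here is a small sketch of why a membership test against .values finds nothing: NaN never compares equal to itself, so an equality-based check cannot match it, while pd.isna tests for missing values explicitly. The two-column DataFrame and the variable names are only illustrative.

```python
import numpy as np
import pandas as pd

# Illustrative data: only col1 contains a NaN.
df = pd.DataFrame({"col1": [10, 2.5, np.nan], "col2": [10, 2.5, 3.5]})

# NaN is not equal to itself, so equality-based checks never match it.
nan_equals_itself = np.nan == np.nan

# `in` on a NumPy array compares element-wise with ==, so the NaN is not found.
found_with_in = np.nan in df["col1"].values

# pd.isna checks for missing values directly and works on scalars and arrays.
found_with_isna = bool(pd.isna(df["col1"].values).any())

print(nan_equals_itself, found_with_in, found_with_isna)
```

So any approach that relies on == or `in` to spot NaN will quietly skip it, which is consistent with the empty list in the question.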
CodePudding user response:
You can fix your code with the changes that @Orange mentioned; I'm just adding this as an alternative. When working with data, you want to let the database or data analysis software do the heavy lifting. Looping over a cursor is something you should try to avoid as best you can.
The code you have can be changed to:
for col in df.columns:
    if df[col].hasnans:
        my_list.append(col)
The code below does functionally the same thing:
df.columns[[df[col].hasnans for col in df.columns]].to_list()
The code below calculates hasnans using isna and sum.
df.columns[df.isna().sum() > 0].to_list()
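To compare the variants side by side, here is a self-contained sketch on the same kind of sample data used earlier in the thread (the data and variable names are only for illustration). The isna().any() variant is not part of the answer above; it is included as an assumption-free alternative when you only need a yes/no per column rather than a count.

```python
import numpy as np
import pandas as pd

# Illustrative data: col1 and col3 each contain one NaN.
df = pd.DataFrame({
    "col1": [10, 2.5, np.nan],
    "col2": [10, 2.5, 3.5],
    "col3": [5, np.nan, 1],
})

# Per-column hasnans flags used as a boolean mask over the column index.
via_hasnans = df.columns[[df[col].hasnans for col in df.columns]].to_list()

# Count the NaN values per column and keep columns with at least one.
via_sum = df.columns[df.isna().sum() > 0].to_list()

# Same idea without counting: is there any NaN in the column?
via_any = df.columns[df.isna().any()].to_list()

print(via_hasnans, via_sum, via_any)
```

All three should give the same list of column names; which one to use is mostly a matter of taste and readability.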