How to delete rows which has nan or empty value in SPECIFIC column?-CodePudding

I have a dataframe which has nan or empty cell in specific column for example column index 2. unfortunately I don't have subset. I just have index. I want to delete the rows which has this features. in stackoverflow there are too many soluntions which are using subset

This is the dataframe for example:

12 125 36 45 665

15 212 12 65 62

65 9 nan 98 84

21 54 78 5 654

211 65 58 26 65

...

output:

12 125 36 45 665

15 212 12 65 62

21 54 78 5 654

211 65 58 26 65

CodePudding user response：

If need test third column (with index=2) use boolean indexing if nan is missing value np.nan or string nan:

idx = 2


df1 = df[df.iloc[:, idx].notna() & df.iloc[:, idx].ne('nan')]

#if no value is empty string or nan string or missing value NaN/None
#df1 = df[df.iloc[:, idx].notna() & ~df.iloc[:, idx].isin(['nan',''])]
print (df1)
     0    1     2   3    4
0   12  125  36.0  45  665
1   15  212  12.0  65   62
3   21   54  78.0   5  654
4  211   65  58.0  26   65

If nans are missing values:

df1 = df.dropna(subset=df.columns[[idx]])
print (df1)
     0    1     2   3    4
0   12  125  36.0  45  665
1   15  212  12.0  65   62
3   21   54  78.0   5  654
4  211   65  58.0  26   65

CodePudding user response：

Not sure what you mean by

there are too many soluntions which are using subset

but the way to do this would be

df[~df.isna().any(axis=1)]

CodePudding user response：

You can use notnull()

df = df.loc[df[df.columns[idx]].notnull()]