Delete rows that have no values in their columns in a pandas dataframe

Time:11-23

I am working with a dataframe in which some rows contain no values, as in the dataframe below (see the third row). The example shows only one such row, but in total I have many rows in which one or two columns have no value. I want to delete every row that is missing a value in at least one column.

df

    Thick   Max     Mean
19  0.7889  8172.58 2197.091
20  1.0603  9366.3  2781.3216
21  '-        '-       '-
22  1.0577  9347.46 2774.4086
23  0.8125  8243.45 2241.2326
24  0.924   8461.7  2484.9097

How can I delete these rows?

CodePudding user response:

  1. You could first replace the missing values with np.nan
  2. Remove the rows containing np.nan using dropna()

Here is the full code:

import numpy as np
import pandas as pd

df = pd.DataFrame({ 'Thick': ['0.7889', '1.0603', "'-", '1.0577', '0.8125', '0.924'],
                    'Max': ['8172.58', '9366.3', "'-", '9347.46', '8243.45', '8461.7'],
                    'Mean': ['2197.091', '2781.3216', "'-", '2774.4086', '2241.2326', '2484.9097']})

# Replace the "'-" placeholder with NaN, then drop every row containing NaN
df = df.replace("'-", np.nan)
df = df.dropna()

print(df)

OUTPUT:

    Thick      Max       Mean
0  0.7889  8172.58   2197.091
1  1.0603   9366.3  2781.3216
3  1.0577  9347.46  2774.4086
4  0.8125  8243.45  2241.2326
5   0.924   8461.7  2484.9097


You can do both actions on one line if necessary:

df = df.replace("'-", np.nan).dropna()
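
Note that after replace and dropna the columns still hold strings. If you also want real numbers, one possible variation is sketched below. It assumes that every valid entry parses as a number and that anything else (not only "'-") should count as missing; the names numeric and cleaned are just illustrative.

```python
import pandas as pd

# Same sample data as in the answer above; "'-" marks a missing value.
df = pd.DataFrame({
    'Thick': ['0.7889', '1.0603', "'-", '1.0577', '0.8125', '0.924'],
    'Max': ['8172.58', '9366.3', "'-", '9347.46', '8243.45', '8461.7'],
    'Mean': ['2197.091', '2781.3216', "'-", '2774.4086', '2241.2326', '2484.9097'],
})

# Convert each column to numbers; anything that cannot be parsed becomes NaN.
numeric = df.apply(pd.to_numeric, errors='coerce')

# Drop rows that have NaN in at least one column.
cleaned = numeric.dropna()

print(cleaned)
print(cleaned.dtypes)
```

The trade-off is that errors='coerce' silently turns any unparseable cell into NaN, so a typo in a number would also cause its row to be dropped.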

CodePudding user response:

Compare every value with the placeholder string using DataFrame.ne, then use boolean indexing to keep only the rows where all values are True, checked per row with DataFrame.all(axis=1):

df = df[df.ne("'-").all(axis=1)]
print(df)
    Thick      Max       Mean
0  0.7889  8172.58   2197.091
1  1.0603   9366.3  2781.3216
3  1.0577  9347.46  2774.4086
4  0.8125  8243.45  2241.2326
5   0.924   8461.7  2484.9097
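
Since the question mentions rows where only one or two columns are empty, here is a small sketch of how all and any differ on such rows. The data is made up for illustration (row 1 is missing only Max), and the variable names are hypothetical.

```python
import pandas as pd

# Hypothetical data: row 1 is missing only 'Max', row 2 is missing everything.
df = pd.DataFrame({
    'Thick': ['0.7889', '1.0603', "'-", '1.0577'],
    'Max': ['8172.58', "'-", "'-", '9347.46'],
    'Mean': ['2197.091', '2781.3216', "'-", '2774.4086'],
})

# True where a cell holds a real value, False where it holds the placeholder.
is_value = df.ne("'-")

# Keep rows where every column has a value (drops rows 1 and 2).
complete_rows = df[is_value.all(axis=1)]

# Keep rows where at least one column has a value (drops only row 2).
not_empty_rows = df[is_value.any(axis=1)]

print(complete_rows)
print(not_empty_rows)
```

For the requirement in the question (drop a row if at least one column is empty), all(axis=1) is the variant that applies.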