I am working with a DataFrame in which some rows contain no values, like the one below (see the third row). The example below shows only one such row, but in total I have many rows in which one or two columns have no value. I want to delete every row that has no value in at least one column.
df
Thick Max Mean
19 0.7889 8172.58 2197.091
20 1.0603 9366.3 2781.3216
21 '- '- '-
22 1.0577 9347.46 2774.4086
23 0.8125 8243.45 2241.2326
24 0.924 8461.7 2484.9097
How can I delete these rows?
CodePudding user response:
- You could first replace the missing values with np.nan
- Then remove the rows containing np.nan using dropna()
Here is the full code:
import numpy as np
import pandas as pd

df = pd.DataFrame({'Thick': ['0.7889', '1.0603', "'-", '1.0577', '0.8125', '0.924'],
                   'Max': ['8172.58', '9366.3', "'-", '9347.46', '8243.45', '8461.7'],
                   'Mean': ['2197.091', '2781.3216', "'-", '2774.4086', '2241.2326', '2484.9097']})
# Turn the "'-" placeholder into NaN, then drop every row that contains a NaN
df = df.replace("'-", np.nan)
df = df.dropna()
print(df)
OUTPUT:
Thick Max Mean
0 0.7889 8172.58 2197.091
1 1.0603 9366.3 2781.3216
3 1.0577 9347.46 2774.4086
4 0.8125 8243.45 2241.2326
5 0.924 8461.7 2484.9097
You can do both actions on one line if necessary:
df = df.replace("'-", np.nan).dropna()
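Note that after replace and dropna the columns still hold strings. If numeric columns are wanted, or if placeholders other than "'-" might appear, one possible variant is to coerce everything with pd.to_numeric and let unparseable cells become NaN. This is only a sketch, assuming every column is meant to be numeric; the names numeric_df and cleaned are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Thick': ['0.7889', '1.0603', "'-", '1.0577', '0.8125', '0.924'],
    'Max': ['8172.58', '9366.3', "'-", '9347.46', '8243.45', '8461.7'],
    'Mean': ['2197.091', '2781.3216', "'-", '2774.4086', '2241.2326', '2484.9097'],
})

# Assumption: all columns should be numeric.
# errors='coerce' turns anything that cannot be parsed (such as "'-") into NaN.
numeric_df = df.apply(pd.to_numeric, errors='coerce')

# Drop every row that has NaN in at least one column.
cleaned = numeric_df.dropna()
print(cleaned)
print(cleaned.dtypes)
```

The trade-off is that coercion silently discards any cell that is not a number, so it is worth checking how many rows were dropped before relying on the result.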
CodePudding user response:
Compare all values against the placeholder string with DataFrame.ne, then use boolean indexing to keep only the rows where every column is True, as tested by DataFrame.all:
df = df[df.ne("'-").all(axis=1)]
print(df)
Thick Max Mean
0 0.7889 8172.58 2197.091
1 1.0603 9366.3 2781.3216
3 1.0577 9347.46 2774.4086
4 0.8125 8243.45 2241.2326
5 0.924 8461.7 2484.9097
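To see why this also removes rows where only one column is missing, it may help to look at the intermediate mask. The sketch below uses a small hypothetical sample (not the asker's full data) in which row 1 lacks one value and row 2 lacks all three; the names not_placeholder, keep and filtered are illustrative.

```python
import pandas as pd

# Hypothetical sample: row 1 is missing one value, row 2 is missing all three.
df = pd.DataFrame({
    'Thick': ['0.7889', '1.0603', "'-", '1.0577'],
    'Max': ['8172.58', "'-", "'-", '9347.46'],
    'Mean': ['2197.091', '2781.3216', "'-", '2774.4086'],
})

# True wherever a cell is NOT the "'-" placeholder.
not_placeholder = df.ne("'-")

# True only for rows in which every cell holds a real value.
keep = not_placeholder.all(axis=1)
print(keep.tolist())  # [True, False, False, True]

# Boolean indexing keeps the rows marked True.
filtered = df[keep]
print(filtered)
```

Because all(axis=1) requires every column to pass, a single placeholder in a row is enough to drop it, which matches the "no value in at least one column" requirement.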