How to delete any row for which a corresponding column has a negative value using Pandas?

I have been given a CSV data file. Using JupyterHub, Python and Pandas, I have been able to read it into a dataframe and delete any rows with NaN values. I am looking to do the same for any rows with negative values. I have searched for a similar problem on this site, but can't seem to find a solution that fits well. Below is how I deleted the rows with NaNs. Please help!

df=pd.read_csv("cereal.csv")
df1=df.dropna(how='any',axis =0).reset_index(drop=True)
df1.shape
df1.head()

CodePudding user response：

You can drop the rows in which a specific column holds a negative value using pandas.DataFrame.drop, as follows:

import pandas as pd
df = pd.DataFrame({
    'colA': [-1, 2, 3, 4, None],
    'colB': [True, True, False, False, True],
})

df = df.drop(df.index[df['colA'] < 0])

Output:

>>> df
   colA   colB
1   2.0   True
2   3.0  False
3   4.0  False
4   NaN   True
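
If the goal is to drop a row when any of its numeric columns is negative, rather than checking a single column, the same idea can be applied to all numeric columns at once. This is only a rough sketch: the column names and values below are made up, since the real layout of cereal.csv isn't shown in the question.

```python
import pandas as pd

# Hypothetical stand-in for the cereal data; the real column names are not given.
df = pd.DataFrame({
    'name': ['a', 'b', 'c', 'd'],
    'sugars': [6, -1, 8, 3],
    'potass': [280, 135, -1, 95],
})

# Restrict the comparison to numeric columns so string columns are not compared with 0.
numeric = df.select_dtypes(include='number')

# True for each row in which at least one numeric column is negative.
has_negative = (numeric < 0).any(axis=1)

# Keep the rows without negatives and renumber the index, as in the question's dropna step.
cleaned = df[~has_negative].reset_index(drop=True)
print(cleaned)
```

Here rows 'b' and 'c' are removed because each has a negative value in one of the numeric columns.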

CodePudding user response：

Another option uses similar syntax but avoids .drop: keep the rows where the condition is not met (~ negates the boolean mask):

>>> df.loc[~(df['colA'] < 0)]  
   colA   colB
1   2.0   True
2   3.0  False
3   4.0  False
4   NaN   True

And one more, so you can choose depending on whether you want to keep the NaN values in "colA" or not. This variant also drops the NaN row, because NaN >= 0 evaluates to False:

>>> df.loc[df['colA'] >= 0]
   colA   colB
1   2.0   True
2   3.0  False
3   4.0  False
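
The difference between the two filters above comes down to how NaN behaves in comparisons: any comparison with NaN is False, so negating "< 0" keeps the NaN row while ">= 0" drops it. A small self-contained sketch of that, reusing the colA values from the first answer (the variable names are just for illustration):

```python
import pandas as pd

df = pd.DataFrame({'colA': [-1, 2, 3, 4, None]})

# Comparisons against NaN are always False, whichever direction they go.
is_negative = df['colA'] < 0
is_non_negative = df['colA'] >= 0

# Negating "is negative" turns the NaN row's False into True, so it is kept.
keeps_nan = df.loc[~is_negative]

# The NaN row is False under ">= 0", so it is dropped along with the negatives.
drops_nan = df.loc[is_non_negative]
```

So if NaNs were already removed with dropna, as in the question, both filters give the same result.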