When I searched for a way to remove an entire row in pandas if it contains a null/NaN value, the only appropriate function I found was dropna(). For some reason, it's not removing the entire row as intended, but instead replacing the null values with zero. Since I want to discard those rows and then compute the mean age of the animals in the dataframe, I need a way to not count the NaN values. Here's the code:
import numpy as np
import pandas as pd
data = {'animal': ['cat', 'cat', 'snake', 'dog', 'dog', 'cat', 'snake', 'cat', 'dog', 'dog'],
        'age': [2.5, 3, 0.5, np.nan, 5, 2, 4.5, np.nan, 7, 3],
        'visits': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
        'priority': ['yes', 'yes', 'no', 'yes', 'no', 'no', 'no', 'yes', 'no', 'no']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
df = pd.DataFrame(data, labels)
df.dropna(inplace=True)
df.head()
In this case, I need to delete dog 'd' and cat 'h'. But the output I get is:
Note that I have also tried this, and it didn't work either:
df2 = df.dropna()
CodePudding user response:
To remove columns that contain a NaN, you have to specify axis=1 and how='any'; see: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.dropna.html
df.dropna(axis=1, inplace=True, how='any')
If you just want to delete the rows:
df.dropna(inplace=True, how='any')
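For what it's worth, here is a minimal sketch built on the data from the question, showing both variants side by side along with the mean age. The variable names cleaned, mean_after_drop, mean_skipna and without_age are just illustrative, and restricting the check to the 'age' column with subset is my own assumption about what is wanted, not something from the question. Note that Series.mean() skips NaN by default, so dropping the rows is not strictly needed just to get the mean.

```python
import numpy as np
import pandas as pd

data = {'animal': ['cat', 'cat', 'snake', 'dog', 'dog', 'cat', 'snake', 'cat', 'dog', 'dog'],
        'age': [2.5, 3, 0.5, np.nan, 5, 2, 4.5, np.nan, 7, 3],
        'visits': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
        'priority': ['yes', 'yes', 'no', 'yes', 'no', 'no', 'no', 'yes', 'no', 'no']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
df = pd.DataFrame(data, index=labels)

# Drop the rows whose 'age' is NaN (rows 'd' and 'h').
# Without inplace=True, dropna returns a new DataFrame and leaves df untouched.
cleaned = df.dropna(subset=['age'])

# Mean age computed on the cleaned rows.
mean_after_drop = cleaned['age'].mean()

# mean() ignores NaN by default (skipna=True), so this gives the same value.
mean_skipna = df['age'].mean()

# axis=1 drops every column that contains a NaN instead (here: 'age').
without_age = df.dropna(axis=1, how='any')

print(cleaned)
print(mean_after_drop, mean_skipna)
print(without_age.columns.tolist())
```

Also keep in mind that df.head() only shows the first five rows, so printing the whole dataframe (or checking df.index) is a more reliable way to confirm which rows were removed.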