Trouble removing blank spaces and filling nans in a two column data frame using columns.fillna(colum

Time:10-13

I'm trying to replace the NaNs and blank spaces in two columns with the mean of the respective column using columns.fillna(column.mean), but I get the error "columns is not defined" when I run the following code.

How do I refer to the columns I passed as a parameter to my data frame so that fillna with the column mean is applied to them?

import pandas as pd
from pandas import DataFrame
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

points = data = pd.read_csv (r'brain_diseases.csv', index_col='id')

df = pd.DataFrame(data, columns= ['cancer','prions'])
columns.fillna(cancer.mean())
columns.fillna(pryons.mean())

kpoints = KMeans(n_clusters=3, init='random').fit(data)
center = kpoints.cluster_centers_
print(center)

plt.scatter(data['trestbps'], data['chol'], c=kpoints.labels_.astype(float), s=50, alpha=0.5)
plt.scatter(center[:, 0], center[:, 1], c='black', s=50)
plt.show()

Any help greatly appreciated.

CodePudding user response:

columns is not defined anywhere in your code, which is why you get that error.

The fillna method can be called on the data frame itself:

import pandas as pd
from pandas import DataFrame
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

points = data = pd.read_csv (r'brain_diseases.csv', index_col='id')

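df = pd.DataFrame(data, columns=['cancer', 'prions'])
# fillna returns a new object, so assign the result back;
# each column is selected from df before taking its mean
df['cancer'] = df['cancer'].fillna(df['cancer'].mean())
df['prions'] = df['prions'].fillna(df['prions'].mean())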

...
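
The question also mentions blank spaces, which fillna alone will not touch, because a cell holding " " is a string and not a NaN. Below is a minimal sketch of one way to handle both cases. The small raw frame is a made-up stand-in for brain_diseases.csv (its real contents are not shown in the question), and it assumes the blanks arrive as empty or whitespace-only strings:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for brain_diseases.csv; the values are invented.
raw = pd.DataFrame(
    {
        "cancer": ["1.0", " ", "3.0", None],
        "prions": ["10", "20", "", "  "],
    },
    index=pd.Index([1, 2, 3, 4], name="id"),
)

df = raw[["cancer", "prions"]]

# Turn empty or whitespace-only strings into NaN.
df = df.replace(r"^\s*$", np.nan, regex=True)

# Convert the remaining strings to numbers; anything unparseable becomes NaN.
df = df.apply(pd.to_numeric, errors="coerce")

# df.mean() returns one mean per column, and fillna matches them by column name.
df = df.fillna(df.mean())

print(df)
```

Passing df.mean() to fillna fills every column with its own mean in one call, so the columns do not have to be listed one by one.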