I have a data frame and I am trying to predict house prices from features such as square footage or whether the house has central air. To model it, I need to convert string values to numbers. For example, the values in the CentralAir column are 'N' or 'Y', which I want to map to 0 and 1 respectively.
# pull data into target (y) and predictors (X) using other standard predictors;
train_y2 = df_train.SalePrice
#convert strings to float so we can use predictors like the neighborhood and building type
central_air_mapping = {'N':0, 'Y':1}
df_train['CentralAir'] = df_train.map(central_air_mapping)
predictor_cols2 = ['CentralAir']
# Create training predictors data
train_X2 = df_train[predictor_cols2]
my_model2 = RandomForestRegressor()
my_model2.fit(train_X2, train_y2)
Running this raises:
AttributeError: 'DataFrame' object has no attribute 'map'
CodePudding user response:
def ToNum(c):
    if c == "Y":
        return 1
    else:
        return 0

df_train["CentralAir"] = df_train["CentralAir"].apply(ToNum)
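One thing to be aware of with this approach: the else branch sends every value that is not "Y" to 0, including missing values. Here is a small sketch of that behaviour. The toy frame df_demo and the name to_num are made up for illustration and stand in for df_train and the function from the answer.

```python
import pandas as pd

# Hypothetical toy data standing in for df_train (includes one missing value)
df_demo = pd.DataFrame({"CentralAir": ["Y", "N", "Y", None]})

def to_num(c):
    # Anything that is not exactly "Y" (including a missing value) becomes 0
    return 1 if c == "Y" else 0

# Apply the function element-wise to the single column (a Series)
encoded = df_demo["CentralAir"].apply(to_num)
print(encoded.tolist())
```

If the column can contain blanks or typos, it may be worth checking for them first, since they would otherwise be counted as "no central air".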
CodePudding user response:
.map() is defined for Series, not DataFrames. This is why you are getting an error.
central_air_mapping = {'N':0, 'Y':1}
df_train['CentralAir'] = df_train['CentralAir'].map(central_air_mapping)
The ['CentralAir'] column selection before .map() on the right-hand side is what was missing.
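As a quick sanity check after mapping with a dict, it can help to count values that were not in the mapping, because Series.map turns those into NaN and a model such as RandomForestRegressor may reject them at fit time. A minimal sketch, assuming a toy frame df_demo (hypothetical, not the asker's data) with one unexpected value:

```python
import pandas as pd

# Hypothetical toy data; "?" is deliberately not a key in the mapping
df_demo = pd.DataFrame({"CentralAir": ["Y", "N", "Y", "?"]})

central_air_mapping = {"N": 0, "Y": 1}

# Select the column first, so .map() is called on a Series
mapped = df_demo["CentralAir"].map(central_air_mapping)

# Values missing from the dict come back as NaN; count them before modelling
n_unmapped = int(mapped.isna().sum())
print(n_unmapped)
```

If the count is zero, the column is safe to assign back to the frame; otherwise the mapping probably needs more keys, or the rows need cleaning first.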