I'm trying to build my model. With one dataset and its encoding the code works, but in another script, where I use a different dataset and a different encoding for the same model, I get an error like this:
Traceback (most recent call last):
File "C:\Users\user\AppData\Local\conda\conda\envs\myenv\lib\site-packages\pandas\core\indexes\base.py", line 3361, in get_loc
return self._engine.get_loc(casted_key)
File "pandas\_libs\index.pyx", line 76, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\hashtable_class_helper.pxi", line 5198, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas\_libs\hashtable_class_helper.pxi", line 5206, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'fbs'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "heart_disease.py", line 11, in <module>
dummy = pd.get_dummies(df[col], prefix=col)
File "C:\Users\user\AppData\Local\conda\conda\envs\myenv\lib\site-packages\pandas\core\frame.py", line 3455, in __getitem__
indexer = self.columns.get_loc(key)
File "C:\Users\user\AppData\Local\conda\conda\envs\myenv\lib\site-packages\pandas\core\indexes\base.py", line 3363, in get_loc
raise KeyError(key) from err
KeyError: 'fbs'
Is something wrong with my code, or is the problem the different dataset? Here is my code and my dataset.
CodePudding user response:
The problem is that the column names are not what you would expect: most of them start with a space.
From your code:
# your DataFrame
penguins = pd.read_csv('file.csv')
Printing
penguins.columns
Returns
Index(['age', ' sex', ' cp', ' trestbps', ' chol', ' fbs', ' restecg',
' thalach', ' exang', ' oldpeak', ' slope', ' thal', ' diagnosis'],
dtype='object')
As you can see, the column names contain spaces, so the key 'fbs' does not exist (the actual name is ' fbs'). We can solve this by doing the following just after reading the file:
penguins.columns = penguins.columns.str.replace(' ', '')
which will solve your error.
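As a rough, self-contained sketch of the same idea: the snippet below uses a made-up in-memory CSV (csv_text, standing in for the real file, which is not shown here) with a space after each comma. It shows the header names before and after cleaning. It uses str.strip() rather than str.replace(), which removes only leading and trailing whitespace and leaves any spaces inside a name alone. It also shows the skipinitialspace=True option of pd.read_csv as an alternative, which is not part of the answer above.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for the real file;
# note the space after every comma, including in the header row.
csv_text = "age, sex, cp, fbs\n63, 1, 3, 1\n37, 1, 2, 0\n"

df = pd.read_csv(io.StringIO(csv_text))
print(list(df.columns))  # the names after the first keep their leading space

# Option 1: strip leading/trailing whitespace from the names after loading.
df.columns = df.columns.str.strip()

# Option 2: have the parser skip the spaces that follow each delimiter.
df2 = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)

# The lookup that raised KeyError: 'fbs' now finds the column.
dummy = pd.get_dummies(df["fbs"], prefix="fbs")
print(dummy.shape)
```

Either option should be enough on its own. Printing df.columns (or list(df.columns)) right after loading is a quick way to check for this kind of problem whenever a KeyError names a column you believe exists.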