How to resolve KeyError: "['column_name'] not in index"?

Time:11-05

I want to create a dataframe from specific input columns, but executing the code raises the following error.

Let me explain the sequence:

  1. Checking the columns in train_df.

    Code:

    train_df.columns
    

    Output:

    Index(['fare_amount', 'pickup_datetime', 'pickup_longitude', 'pickup_latitude',
       'dropoff_longitude', 'dropoff_latitude', 'passenger_count',
       'pickup_datetime_year', 'pickup_datetime_month', 'pickup_datetime_day',
       'pickup_datetime_weekday', 'pickup_datetime_hour', 'trip_distance',
       'jkf_drop_distance', 'lga_drop_distance', 'ewr_drop_distance',
       'met_drop_distance', 'wtc_drop_distance'],
      dtype='object')
    
  2. Selecting only the input columns required for the model.

    Code:

    input_cols = ['pickup_longitude', 'pickup_latitude',
       'dropoff_longitude', 'dropoff_latitude', 'passenger_count',
       'pickup_datetime_year', 'pickup_datetime_month', 'pickup_datetime_day',
       'pickup_datetime_weekday', 'pickup_datetime_hour', 'trip_distance',
       'jfk_drop_distance', 'lga_drop_distance', 'ewr_drop_distance',
       'met_drop_distance', 'wtc_drop_distance']
    
  3. Creation of training dataframe from the above specific columns.

    Code:

    train_inputs = train_df[input_cols]
    

    I'm getting the error in the 3rd step. The traceback is:

    ---------------------------------------------------------------------------
    KeyError                                  Traceback (most recent call last)
    <ipython-input-111-7f39184b2836> in <module>
    ----> 1 train_inputs = train_df[input_cols]
    
    ~\anaconda3\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
       3462             if is_iterator(key):
       3463                 key = list(key)
    -> 3464             indexer = self.loc._get_listlike_indexer(key, axis=1)[1]
       3465 
       3466         # take() does not accept boolean indexers
    
    ~\anaconda3\lib\site-packages\pandas\core\indexing.py in _get_listlike_indexer(self, key, axis)
       1312             keyarr, indexer, new_indexer = ax._reindex_non_unique(keyarr)
       1313 
    -> 1314         self._validate_read_indexer(keyarr, indexer, axis)
       1315 
       1316         if needs_i8_conversion(ax.dtype) or isinstance(
    
    ~\anaconda3\lib\site-packages\pandas\core\indexing.py in _validate_read_indexer(self, key, indexer, axis)
       1375 
       1376             not_found = list(ensure_index(key)[missing_mask.nonzero()[0]].unique())
    -> 1377             raise KeyError(f"{not_found} not in index")
       1378 
       1379 
    
    KeyError: "['jfk_drop_distance'] not in index"
    

CodePudding user response:

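You need to ensure that every item in input_cols is also in train_df.columns. Here 'jfk_drop_distance' is not: the DataFrame column is spelled 'jkf_drop_distance'. Conversely, these columns are in train_df.columns but not in input_cols: ['fare_amount', 'pickup_datetime', 'jkf_drop_distance'].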
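
One way to run that check before indexing is sketched below. It uses a small stand-in DataFrame, since the real train_df data isn't shown in the question, and a hypothetical helper named missing_columns. The standard-library difflib.get_close_matches call is only there to hint at likely misspellings; it is an illustration, not part of pandas.

```python
import difflib

import pandas as pd


def missing_columns(df, wanted):
    """Return the names in `wanted` that are not columns of `df`, in order."""
    return [col for col in wanted if col not in df.columns]


# Stand-in data (assumption): only the column names matter for this check.
train_df = pd.DataFrame(columns=['fare_amount', 'trip_distance',
                                 'jkf_drop_distance', 'lga_drop_distance'])
input_cols = ['trip_distance', 'jfk_drop_distance', 'lga_drop_distance']

missing = missing_columns(train_df, input_cols)
print(missing)  # ['jfk_drop_distance']

# For each missing name, look up the most similar existing column name.
suggestions = {col: difflib.get_close_matches(col, list(train_df.columns), n=1)
               for col in missing}
print(suggestions)
```

If missing comes back empty, train_df[input_cols] will not raise this KeyError.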

CodePudding user response:

This column from your input_cols doesn't exist in train_df (hence why you're getting that error):

'jfk_drop_distance'

The DataFrame has 'jkf_drop_distance' instead (note the swapped 'k' and 'f'). Every other name in input_cols is present.
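
The traceback names 'jfk_drop_distance' as the key that was not found, while train_df.columns lists 'jkf_drop_distance'. One possible fix, sketched here on stand-in data (the values are made up for illustration), is to rename the DataFrame column so it matches input_cols. Changing the spelling in input_cols to match the DataFrame would work just as well.

```python
import pandas as pd

# Stand-in data (assumption) carrying the misspelled column name from the question.
train_df = pd.DataFrame({'trip_distance': [1.2, 3.4],
                         'jkf_drop_distance': [20.1, 18.7]})
input_cols = ['trip_distance', 'jfk_drop_distance']

# Rename the misspelled column so it matches the name used in input_cols.
train_df = train_df.rename(columns={'jkf_drop_distance': 'jfk_drop_distance'})

# The selection from step 3 now succeeds.
train_inputs = train_df[input_cols]
print(train_inputs.columns.tolist())  # ['trip_distance', 'jfk_drop_distance']
```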