Python OOP using sklearn API

Time:07-15

I want to learn more advanced OOP techniques by writing a class that wraps the sklearn API. My idea is to integrate feature selection methods: I want to be able to call a feature selection method by name, then fit it and transform the data.

import inspect
import numpy as np
import pandas as pd
import sklearn.feature_selection as skfs

from sklearn.utils import all_estimators
from typing import Any, Dict

class FeatureSelection:

    def __init__(self, method_name: str, method_params: Dict[str, Any]):

        self.method_name = method_name
        self.method_params = method_params
        self.method = None


    def fit(self, X, y=None, **kwargs):

        if self.method_name in list(inspect.getmembers()):
            self.method = inspect.getmembers()[self.method_name]
        else:
            raise Exception("Method not available yet.")

        self.method.fit(X, y, **kwargs)
        
        
    def transform(self, X):
        return self.method.transform(X=X)


class SklearnFeatureSelection(FeatureSelection):

    def __init__(self, method_name, method_params=Dict[str, Any]):
        super().__init__(method_name=method_name, method_params=method_params)
        self._check_sklearn_methods()
        self._init_sklearn_method_object()

    
    def _check_sklearn_methods(self):
        estimators = all_estimators(type_filter="feature_selection")
        if self.method_name not in estimators:
            raise ValueError("The value is not found")

    # Instantiate the sklearn object:

    def _init_sklearn_method_object(self):
        self.method = getattr(skfs, self.method_name)(**self.method_params)
        

    def fit(self, X):
        self.method.fit(X)
   
        
    def transform(self, X):
        if self.method is None:
            raise Exception("Method not fitted yet.")
        return self.method.transform(X)


###Call methods

train_data = [[0, 0, 1], [0, 1, 0], [1, 0, 0], [0, 1, 1], [0, 1, 0], [0, 1, 1]]
sklearn_fs = SklearnFeatureSelection(method_name="variance_threshold", method_args={"threshold": 0.7})
sklearn_fs.fit(X=train_data)
sklearn_fs.transform(X=new_data)

I am not sure what I am doing wrong, but I currently get the following error, which I have not been able to solve:


AttributeError                            Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_24040/4237973329.py in <module>
      2 sklearn_fs = SklearnFeatureSelection(method_name="variance_threshold")
      3 train_data = [[0, 0, 1], [0, 1, 0], [1, 0, 0], [0, 1, 1], [0, 1, 0], [0, 1, 1]]
----> 4 sklearn_fs.fit(X=train_data)
      5 sklearn_fs.transform(X=train_data)

~\AppData\Local\Temp/ipykernel_24040/2489316783.py in fit(self, X)
     48 
     49     def fit(self, X):
---> 50         self.method.fit_transform(X)
     51 
     52 

AttributeError: 'NoneType' object has no attribute 'fit_transform'

CodePudding user response:

You never call _init_sklearn_method_object (nor _check_sklearn_methods), so the instance attribute method remains None from the parent class's __init__.
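The failure mode can be reproduced without sklearn at all. The following is a minimal sketch with hypothetical class names (Parent and Child are not from the question), assuming the subclass's __init__ never assigns self.method:

```python
class Parent:
    def __init__(self, method_name):
        self.method_name = method_name
        self.method = None  # placeholder until a concrete object is built


class Child(Parent):
    def __init__(self, method_name):
        super().__init__(method_name)
        # Nothing here assigns self.method, so it is still None.

    def fit(self, X):
        # Fails: None has no fit_transform attribute.
        return self.method.fit_transform(X)


try:
    Child("variance_threshold").fit([[0, 1], [1, 0]])
except AttributeError as exc:
    error_message = str(exc)

print(error_message)
```

The printed message matches the last line of the traceback in the question, which suggests the object was never built before fit was called.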

Separately, FeatureSelection.fit will never run in what you've shown (though maybe you intend to use the parent class directly at some point?). Also, SklearnFeatureSelection.fit confusingly calls the sklearn object's fit_transform rather than just fit, and doesn't return anything.
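One possible way to restructure the wrapper is sketched below. It is not the only design, and it rests on a few assumptions that are not in the question: the selector is looked up with getattr on sklearn.feature_selection using its class name ("VarianceThreshold" rather than "variance_threshold"), the helper name _build_method is made up for illustration, and the threshold is lowered to 0.2 because a 0/1 feature has variance of at most 0.25, so 0.7 would reject every column.

```python
from typing import Any, Dict, Optional

import sklearn.feature_selection as skfs
from sklearn.base import BaseEstimator


class SklearnFeatureSelection:
    """Look up a selector class in sklearn.feature_selection by name."""

    def __init__(self, method_name: str,
                 method_params: Optional[Dict[str, Any]] = None):
        self.method_name = method_name
        self.method_params = method_params or {}
        # Build the sklearn object right away so self.method is never None.
        self.method = self._build_method()

    def _build_method(self):
        # Hypothetical helper: accept only estimator classes, not functions
        # such as chi2 that live in the same module.
        candidate = getattr(skfs, self.method_name, None)
        if not (isinstance(candidate, type)
                and issubclass(candidate, BaseEstimator)):
            raise ValueError(
                f"{self.method_name!r} is not an estimator class in "
                "sklearn.feature_selection"
            )
        return candidate(**self.method_params)

    def fit(self, X, y=None, **kwargs):
        self.method.fit(X, y, **kwargs)
        return self  # returning self allows fit(...).transform(...)

    def transform(self, X):
        return self.method.transform(X)


train_data = [[0, 0, 1], [0, 1, 0], [1, 0, 0], [0, 1, 1], [0, 1, 0], [0, 1, 1]]
selector = SklearnFeatureSelection("VarianceThreshold", {"threshold": 0.2})
reduced = selector.fit(train_data).transform(train_data)
print(reduced.shape)
```

Building the object in __init__ and returning self from fit follows the usual sklearn convention, so the wrapper can be chained the same way as the estimator it wraps.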
