TensorFlow Keras RandomForestModel get_config() is empty

I want to be able to review the hyperparameters passed to the Keras RandomForestModel from TensorFlow Decision Forests (tfdf.keras.RandomForestModel). I think this should be possible with model.get_config(). However, after creating and training the model, get_config() always returns an empty dictionary.

This is the function that creates the model in my RandomForestWrapper class:

def add_new_model(self, model_name, params):
    # Split the data before building the model.
    self.train_test_split()

    # Build the forest from the supplied hyperparameters.
    model = tfdf.keras.RandomForestModel(
        random_seed=params["random_seed"],
        num_trees=params["num_trees"],
        categorical_algorithm=params["categorical_algorithm"],
        compute_oob_performances=params["compute_oob_performances"],
        growing_strategy=params["growing_strategy"],
        honest=params["honest"],
        max_depth=params["max_depth"],
        max_num_nodes=params["max_num_nodes"],
    )

    print(model.get_config())
    self.models.update({model_name: model})
    print(f"{model_name} added")

Example parameters:

params_v2 = {
    "random_seed": 123456,
    "num_trees": 1000,
    "categorical_algorithm": "CART",
    "compute_oob_performances": True,
    "growing_strategy": "LOCAL",
    "honest": True,
    "max_depth": 8,
    "max_num_nodes": None
}

I then instantiate the class and train the model:

rf_models = RF(data, obs_col="obs", class_col="cell_type")
rf_models.add_new_model("model_2", params_v2)
rf_models.train_model("model_2", verbose=False, metrics=["Accuracy"])

model = rf_models.models["model_2"]
model.get_config()

# Output:
{}

In the model summary I can see that the parameters are accepted.

Answer:

Regarding get_config(), notice what the docs state:

Returns the config of the Model.

Config is a Python dictionary (serializable) containing the configuration of an object, which in this case is a Model. This allows the Model to be reinstantiated later (without its trained weights) from this configuration.

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Developers of subclassed Model are advised to override this method, and continue to update the dict from super(MyModel, self).get_config() to provide the proper configuration of this Model. The default config is an empty dict. Optionally, raise NotImplementedError to allow Keras to attempt a default serialization.
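The override pattern described in that last paragraph can be sketched roughly as follows. This is only an illustration in plain Python: BaseModel and ConfiguredForest are hypothetical stand-ins, not the real Keras or TF-DF classes, and the default argument values are arbitrary.

```python
class BaseModel:
    """Hypothetical stand-in for a Keras-style base model (not keras.Model)."""

    def get_config(self):
        # Mirrors the documented default: an empty dict.
        return {}


class ConfiguredForest(BaseModel):
    """Hypothetical subclass that remembers its constructor arguments."""

    def __init__(self, num_trees=300, max_depth=16):
        self.num_trees = num_trees
        self.max_depth = max_depth

    def get_config(self):
        # Copy the parent's config so the parent's dict is never mutated,
        # then add this class's own hyperparameters.
        config = dict(super().get_config())
        config.update({"num_trees": self.num_trees, "max_depth": self.max_depth})
        return config


model = ConfiguredForest(num_trees=1000, max_depth=8)
config = model.get_config()

# The config is enough to rebuild an equivalent, untrained object.
rebuilt = ConfiguredForest(**config)
print(config)
```

A class that skips the override simply inherits the empty dict, which matches what the question observes.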

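So the empty dict is the default config, which suggests RandomForestModel does not override get_config(). I think what you can do instead is read the model.learner_params property to get the details you want: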

import tensorflow_decision_forests as tfdf
import pprint

params_v2 = {
    "random_seed": 123456,
    "num_trees": 1000,
    "categorical_algorithm": "CART",
    "compute_oob_performances": True,
    "growing_strategy": "LOCAL",
    "honest": True,
    "max_depth": 8,
    "max_num_nodes": None
}

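# Pass the hyperparameters straight to the constructor.
model = tfdf.keras.RandomForestModel(**params_v2)
# learner_params lists the hyperparameters, including defaults for anything not set.
pprint.pprint(model.learner_params)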

{'adapt_bootstrap_size_ratio_for_maximum_training_duration': False,
 'allow_na_conditions': False,
 'bootstrap_size_ratio': 1.0,
 'bootstrap_training_dataset': True,
 'categorical_algorithm': 'CART',
 'categorical_set_split_greedy_sampling': 0.1,
 'categorical_set_split_max_num_items': -1,
 'categorical_set_split_min_item_frequency': 1,
 'compute_oob_performances': True,
 'compute_oob_variable_importances': False,
 'growing_strategy': 'LOCAL',
 'honest': True,
 'honest_fixed_separation': False,
 'honest_ratio_leaf_examples': 0.5,
 'in_split_min_examples_check': True,
 'keep_non_leaf_label_distribution': True,
 'max_depth': 8,
 'max_num_nodes': None,
 'maximum_model_size_in_memory_in_bytes': -1.0,
 'maximum_training_duration_seconds': -1.0,
 'min_examples': 5,
 'missing_value_policy': 'GLOBAL_IMPUTATION',
 'num_candidate_attributes': 0,
 'num_candidate_attributes_ratio': -1.0,
 'num_oob_variable_importances_permutations': 1,
 'num_trees': 1000,
 'pure_serving_model': False,
 'random_seed': 123456,
 'sampling_with_replacement': True,
 'sorting_strategy': 'PRESORT',
 'sparse_oblique_normalization': None,
 'sparse_oblique_num_projections_exponent': None,
 'sparse_oblique_projection_density_factor': None,
 'sparse_oblique_weights': None,
 'split_axis': 'AXIS_ALIGNED',
 'uplift_min_examples_in_treatment': 5,
 'uplift_split_score': 'KULLBACK_LEIBLER',
 'winner_take_all': True}
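Since that output mixes the values you passed with all the defaults, it may help to filter it down to the keys you actually set. Below is a minimal sketch; compare_params is a hypothetical helper (not part of TF-DF), and learner_params_excerpt is a hand-copied excerpt of the output above, used so the snippet runs without TF-DF installed. With a real model you would pass model.learner_params instead.

```python
def compare_params(requested, learner_params):
    """Return (effective, mismatches) for the requested hyperparameters."""
    # Values the model reports for each key that was requested.
    effective = {
        name: learner_params[name] for name in requested if name in learner_params
    }
    # Requested keys that are missing or differ: name -> (requested, reported).
    mismatches = {
        name: (value, learner_params.get(name))
        for name, value in requested.items()
        if name not in learner_params or learner_params[name] != value
    }
    return effective, mismatches


params_v2 = {
    "random_seed": 123456,
    "num_trees": 1000,
    "categorical_algorithm": "CART",
    "compute_oob_performances": True,
    "growing_strategy": "LOCAL",
    "honest": True,
    "max_depth": 8,
    "max_num_nodes": None,
}

# Excerpt copied from the learner_params output above (not the full dict).
learner_params_excerpt = {
    "categorical_algorithm": "CART",
    "compute_oob_performances": True,
    "growing_strategy": "LOCAL",
    "honest": True,
    "max_depth": 8,
    "max_num_nodes": None,
    "min_examples": 5,
    "num_trees": 1000,
    "random_seed": 123456,
    "winner_take_all": True,
}

effective, mismatches = compare_params(params_v2, learner_params_excerpt)
print(effective)
print(mismatches)
```

An empty mismatches dict means every requested value shows up unchanged in the reported parameters.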