I want to review the hyperparameters passed to the Keras RandomForestModel from TensorFlow Decision Forests (tfdf.keras.RandomForestModel). I think this should be possible with model.get_config(). However, after creating and training the model, get_config() always returns an empty dictionary.
This is the function that creates the model in my RandomForestWrapper class:
def add_new_model(self, model_name, params):
    self.train_test_split()
    model = tfdf.keras.RandomForestModel(
        random_seed=params["random_seed"],
        num_trees=params["num_trees"],
        categorical_algorithm=params["categorical_algorithm"],
        compute_oob_performances=params["compute_oob_performances"],
        growing_strategy=params["growing_strategy"],
        honest=params["honest"],
        max_depth=params["max_depth"],
        max_num_nodes=params["max_num_nodes"]
    )
    print(model.get_config())
    self.models.update({model_name: model})
    print(f"{model_name} added")
Example parameters:
params_v2 = {
    "random_seed": 123456,
    "num_trees": 1000,
    "categorical_algorithm": "CART",
    "compute_oob_performances": True,
    "growing_strategy": "LOCAL",
    "honest": True,
    "max_depth": 8,
    "max_num_nodes": None
}
I then instantiate the class and train the model:
rf_models = RF(data, obs_col="obs", class_col="cell_type")
rf_models.add_new_model("model_2", params_v2)
rf_models.train_model("model_2", verbose=False, metrics=["Accuracy"])
model = rf_models.models["model_2"]
model.get_config()
# Output: {}
In the model summary I can see that the parameters are accepted.
CodePudding user response:
Regarding get_config(), notice what the docs state:
Returns the config of the Model.
Config is a Python dictionary (serializable) containing the configuration of an object, which in this case is a Model. This allows the Model to be reinstantiated later (without its trained weights) from this configuration.
Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.
Developers of subclassed Model are advised to override this method, and continue to update the dict from super(MyModel, self).get_config() to provide the proper configuration of this Model. The default config is an empty dict. Optionally, raise NotImplementedError to allow Keras to attempt a default serialization.
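To illustrate the last paragraph of that quote, here is a minimal pure-Python sketch of the override pattern. BaseModel and ConfiguredForest are hypothetical stand-ins I made up for illustration, not Keras or TF-DF classes; the only thing they mirror is the documented behaviour that the default config is an empty dict unless a subclass overrides get_config().

```python
class BaseModel:
    """Hypothetical stand-in for a Keras-style base class (not keras.Model)."""

    def get_config(self):
        # Mirrors the documented default: an empty dict.
        return {}


class ConfiguredForest(BaseModel):
    """Hypothetical subclass that remembers its constructor arguments."""

    def __init__(self, num_trees=300, max_depth=16):
        self.num_trees = num_trees
        self.max_depth = max_depth

    def get_config(self):
        # Copy the parent's config, then add this class's own settings.
        config = dict(super().get_config())
        config.update({"num_trees": self.num_trees, "max_depth": self.max_depth})
        return config

    @classmethod
    def from_config(cls, config):
        # Rebuild an untrained instance from a config dict.
        return cls(**config)


plain = BaseModel()
forest = ConfiguredForest(num_trees=1000, max_depth=8)
rebuilt = ConfiguredForest.from_config(forest.get_config())

print(plain.get_config())   # nothing was overridden, so the config is empty
print(forest.get_config())  # the override exposes the hyperparameters
```

A class that does not override get_config() behaves like plain here, which matches the empty dictionary you are seeing.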
I think you can just read the model.learner_params attribute to get the details you want:
import pprint

import tensorflow_decision_forests as tfdf

params_v2 = {
    "random_seed": 123456,
    "num_trees": 1000,
    "categorical_algorithm": "CART",
    "compute_oob_performances": True,
    "growing_strategy": "LOCAL",
    "honest": True,
    "max_depth": 8,
    "max_num_nodes": None
}

# Pass the hyperparameters directly to the constructor.
model = tfdf.keras.RandomForestModel(**params_v2)

# learner_params holds the hyperparameters, including defaults you did not set.
pprint.pprint(model.learner_params)

Output:
{'adapt_bootstrap_size_ratio_for_maximum_training_duration': False,
'allow_na_conditions': False,
'bootstrap_size_ratio': 1.0,
'bootstrap_training_dataset': True,
'categorical_algorithm': 'CART',
'categorical_set_split_greedy_sampling': 0.1,
'categorical_set_split_max_num_items': -1,
'categorical_set_split_min_item_frequency': 1,
'compute_oob_performances': True,
'compute_oob_variable_importances': False,
'growing_strategy': 'LOCAL',
'honest': True,
'honest_fixed_separation': False,
'honest_ratio_leaf_examples': 0.5,
'in_split_min_examples_check': True,
'keep_non_leaf_label_distribution': True,
'max_depth': 8,
'max_num_nodes': None,
'maximum_model_size_in_memory_in_bytes': -1.0,
'maximum_training_duration_seconds': -1.0,
'min_examples': 5,
'missing_value_policy': 'GLOBAL_IMPUTATION',
'num_candidate_attributes': 0,
'num_candidate_attributes_ratio': -1.0,
'num_oob_variable_importances_permutations': 1,
'num_trees': 1000,
'pure_serving_model': False,
'random_seed': 123456,
'sampling_with_replacement': True,
'sorting_strategy': 'PRESORT',
'sparse_oblique_normalization': None,
'sparse_oblique_num_projections_exponent': None,
'sparse_oblique_projection_density_factor': None,
'sparse_oblique_weights': None,
'split_axis': 'AXIS_ALIGNED',
'uplift_min_examples_in_treatment': 5,
'uplift_split_score': 'KULLBACK_LEIBLER',
'winner_take_all': True}
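If you only want to confirm that the values you passed were picked up, you can compare your dictionary against learner_params. This is a minimal sketch, assuming learner_params behaves like a plain dict. So that it runs without TF-DF installed, a hand-copied subset of the output above stands in for model.learner_params, and find_mismatches is a hypothetical helper name, not part of any library.

```python
def find_mismatches(requested, actual):
    """Return {name: (requested_value, actual_value)} for every differing key."""
    missing = object()  # sentinel, so a legitimate None value is not treated as absent
    mismatches = {}
    for name, wanted in requested.items():
        got = actual.get(name, missing)
        if got is missing or got != wanted:
            mismatches[name] = (wanted, None if got is missing else got)
    return mismatches


params_v2 = {
    "random_seed": 123456,
    "num_trees": 1000,
    "categorical_algorithm": "CART",
    "compute_oob_performances": True,
    "growing_strategy": "LOCAL",
    "honest": True,
    "max_depth": 8,
    "max_num_nodes": None,
}

# Stand-in for model.learner_params: a subset copied from the output above.
learner_params_subset = {
    "categorical_algorithm": "CART",
    "compute_oob_performances": True,
    "growing_strategy": "LOCAL",
    "honest": True,
    "max_depth": 8,
    "max_num_nodes": None,
    "min_examples": 5,
    "num_trees": 1000,
    "random_seed": 123456,
    "winner_take_all": True,
}

print(find_mismatches(params_v2, learner_params_subset))
```

With a real model you would pass model.learner_params as the second argument; an empty result means every requested value matches.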