I was using MLflow to log some parameters for my ML experiment and kept getting a BAD_REQUEST
error. The traceback is:
mlflow.exceptions.RestException: BAD_REQUEST: (pymysql.err.IntegrityError) (1062, "Duplicate entry 'hidden_Size-417080a853934d5d8a7cf5a27' for key 'params.PRIMARY'")
[SQL: INSERT INTO params (`key`, value, run_uuid) VALUES (%(key)s, %(value)s, %(run_uuid)s)]
[parameters: ({'key': 'dropout_rate', 'value': '0.1', 'run_uuid': '417080a853934d5d8a7cf5a27'}, {'key': 'model_type', 'value': '', 'run_uuid': '417080a853934d5d8a7cf5a27'}, {'key': 'model_name_or_path', 'value': '', 'run_uuid': '417080a853934d5d8a7cf5a27'}, {'key': 'model_checkpoint_path', 'value': '', 'run_uuid': '417080a853934d5d8a7cf5a27'}, {'key': 'hidden_size', 'value': '768', 'run_uuid': '417080a853934d5d8a7cf5a27'}, {'key': 'vocab_size', 'value': '49408', 'run_uuid': '417080a853934d5d8a7cf5a27'}, {'key': 'num_layers', 'value': '12', 'run_uuid': '417080a853934d5d8a7cf5a27'}, {'key': 'attn_heads', 'value': '8', 'run_uuid': '417080a853934d5d8a7cf5a27'} ... displaying 10 of 40 total bound parameter sets ... {'key': 'num_labels', 'value': '13', 'run_uuid': '417080a853934d5d8a7cf5a27'}, {'key': 'hidden_Size', 'value': '256', 'run_uuid': '417080a853934d5d8a7cf5a27'})]
(Background on this error at: https://sqlalche.me/e/14/gkpj)
I'm using an argument parser that contains all of these values and was logging them with mlflow.log_params(vars(args)). Somewhere in the code I was adding a new attribute to the Namespace called hidden_Size, which is the same as hidden_size except for the capital S in its name.
Doing some reading tells me that this is a SQL problem where I was probably inserting the same key twice for one run. But from what I've read, MySQL is case-sensitive on Linux, which is what I'm using. If MySQL is case-sensitive, why am I getting a duplicate entry error? Shouldn't hidden_size and hidden_Size be treated as separate entries?
Answer:
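String values are compared case-insensitively if the column uses a case-insensitive collation. This is the default on all operating systems, including Linux. Under such a collation, hidden_size and hidden_Size compare as equal, so the second insert violates the primary key on the params table.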
Case-sensitive collations end in _bin or _cs. See https://dev.mysql.com/doc/refman/8.0/en/case-sensitivity.html
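The effect of the collation on a primary key can be reproduced without a MySQL server. The sketch below is only an analogy: it uses Python's built-in sqlite3 module, with SQLite's NOCASE collation standing in for a MySQL case-insensitive collation and BINARY standing in for a _bin one. The table layout is a simplified guess at the params table from the traceback, and insert_keys is a hypothetical helper name.

```python
import sqlite3


def insert_keys(collation):
    """Insert two keys that differ only by case.

    Returns True if both inserts succeed, False if the second one
    is rejected as a duplicate primary key.
    """
    conn = sqlite3.connect(":memory:")
    # Simplified stand-in for MLflow's params table (assumed layout).
    conn.execute(
        f'CREATE TABLE params ("key" TEXT COLLATE {collation}, '
        'run_uuid TEXT, value TEXT, PRIMARY KEY ("key", run_uuid))'
    )
    insert = 'INSERT INTO params ("key", run_uuid, value) VALUES (?, ?, ?)'
    try:
        conn.execute(insert, ("hidden_size", "run1", "768"))
        conn.execute(insert, ("hidden_Size", "run1", "256"))
        return True
    except sqlite3.IntegrityError:
        # The primary key index compares keys using the column's collation.
        return False
    finally:
        conn.close()


if __name__ == "__main__":
    print("NOCASE accepts both keys:", insert_keys("NOCASE"))
    print("BINARY accepts both keys:", insert_keys("BINARY"))
```

With the case-insensitive collation the second insert fails in the same way as in the question, while the binary collation accepts both rows.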
Case-sensitivity of string values is independent of case-sensitivity of table identifiers.
Linux has case-sensitive table identifiers by default. For full details on this, read: https://dev.mysql.com/doc/refman/8.0/en/identifier-case-sensitivity.html
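On the Python side, one way to catch this before it reaches the database is to check the parameter dict for keys that differ only by case. This is a minimal sketch, not part of MLflow: find_case_collisions is a hypothetical helper, and str.casefold() only approximates what a MySQL case-insensitive collation treats as equal (many of those collations also ignore accents, for example).

```python
from collections import defaultdict


def find_case_collisions(params):
    """Group keys that become identical once case is ignored.

    Returns a dict mapping each case-folded key to the list of original
    keys that share it, for groups with more than one member.
    """
    groups = defaultdict(list)
    for key in params:
        groups[key.casefold()].append(key)
    return {folded: keys for folded, keys in groups.items() if len(keys) > 1}


if __name__ == "__main__":
    # Example values taken from the traceback in the question.
    params = {"hidden_size": 768, "hidden_Size": 256, "dropout_rate": 0.1}
    print(find_case_collisions(params))
```

Calling it on vars(args) before mlflow.log_params(vars(args)) would flag hidden_size and hidden_Size, so one of them can be renamed or dropped before logging.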