If MySQL is case-sensitive on Linux, why am I getting a duplicate entry error?

Time:01-18

I was using MLflow to log some parameters for my ML experiment and kept experiencing a BAD_REQUEST error. The specific traceback is:

mlflow.exceptions.RestException: BAD_REQUEST: (pymysql.err.IntegrityError) (1062, "Duplicate entry 'hidden_Size-417080a853934d5d8a7cf5a27' for key 'params.PRIMARY'")
[SQL: INSERT INTO params (`key`, value, run_uuid) VALUES (%(key)s, %(value)s, %(run_uuid)s)]
[parameters: ({'key': 'dropout_rate', 'value': '0.1', 'run_uuid': '417080a853934d5d8a7cf5a27'}, {'key': 'model_type', 'value': '', 'run_uuid': '417080a853934d5d8a7cf5a27'}, {'key': 'model_name_or_path', 'value': '', 'run_uuid': '417080a853934d5d8a7cf5a27'}, {'key': 'model_checkpoint_path', 'value': '', 'run_uuid': '417080a853934d5d8a7cf5a27'}, {'key': 'hidden_size', 'value': '768', 'run_uuid': '417080a853934d5d8a7cf5a27'}, {'key': 'vocab_size', 'value': '49408', 'run_uuid': '417080a853934d5d8a7cf5a27'}, {'key': 'num_layers', 'value': '12', 'run_uuid': '417080a853934d5d8a7cf5a27'}, {'key': 'attn_heads', 'value': '8', 'run_uuid': '417080a853934d5d8a7cf5a27'}  ... displaying 10 of 40 total bound parameter sets ...  {'key': 'num_labels', 'value': '13', 'run_uuid': '417080a853934d5d8a7cf5a27'}, {'key': 'hidden_Size', 'value': '256', 'run_uuid': '417080a853934d5d8a7cf5a27'})]
(Background on this error at: https://sqlalche.me/e/14/gkpj)

I'm using an argument parser that contains all of these values, and I log them with mlflow.log_params(vars(args)). Somewhere in the code I was also assigning a new attribute on the Namespace called hidden_Size, whose name differs from hidden_size only by the capital S.

From what I have read, this is a SQL error raised when two rows end up with the same key. But I have also read that MySQL is case-sensitive on Linux, which is what I'm using. If MySQL is case-sensitive, why am I getting a duplicate entry error? Shouldn't hidden_size and hidden_Size be treated as separate entries?

Answer:

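String data values compare case-insensitively if their column uses a case-insensitive collation (one whose name ends in _ci). MySQL's default collations are case-insensitive on all operating systems, for example utf8mb4_0900_ai_ci in MySQL 8.0, so 'hidden_size' and 'hidden_Size' compare as equal and collide in the primary key.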

Case-sensitive collations end in _bin or _cs. See https://dev.mysql.com/doc/refman/8.0/en/case-sensitivity.html
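
The effect of a case-insensitive collation on a primary key can be reproduced without a MySQL server. The sketch below is only an analogy: it uses Python's built-in sqlite3 module, with SQLite's COLLATE NOCASE standing in for a MySQL _ci collation. The table is a simplified stand-in for the params table, with column names taken from the INSERT statement in the traceback and the composite primary key inferred from the error message.

```python
import sqlite3

# In-memory SQLite database. COLLATE NOCASE is SQLite's ASCII
# case-insensitive collation; it stands in here for a MySQL *_ci collation.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE params (
        "key"    TEXT COLLATE NOCASE,
        value    TEXT,
        run_uuid TEXT,
        PRIMARY KEY ("key", run_uuid)
    )
    """
)

run = "417080a853934d5d8a7cf5a27"
conn.execute("INSERT INTO params VALUES (?, ?, ?)", ("hidden_size", "768", run))

try:
    # Differs from the existing key only by letter case.
    conn.execute("INSERT INTO params VALUES (?, ?, ?)", ("hidden_Size", "256", run))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM params").fetchone()[0]
print(duplicate_rejected, row_count)
```

The second insert is rejected because the primary key index compares the two keys with the column's collation, not byte for byte.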

Case-sensitivity of string values is independent of case-sensitivity of table identifiers.

Linux has case-sensitive table identifiers by default. For full details on this, read: https://dev.mysql.com/doc/refman/8.0/en/identifier-case-sensitivity.html
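
Since the parameter keys are stored as string values, one way to catch this kind of clash before it reaches the database is to check the dict for keys that differ only by letter case. This is a minimal sketch: find_case_collisions is a hypothetical helper, not part of MLflow, and str.casefold() only approximates what a MySQL _ci collation treats as equal (it does not cover accent-insensitivity, for instance).

```python
from collections import defaultdict


def find_case_collisions(params):
    """Return groups of keys in `params` that differ only by letter case."""
    groups = defaultdict(list)
    for key in params:
        groups[key.casefold()].append(key)
    return [keys for keys in groups.values() if len(keys) > 1]


# Example dict shaped like vars(args) from the question (values are illustrative).
params = {"hidden_size": 768, "dropout_rate": 0.1, "hidden_Size": 256}
collisions = find_case_collisions(params)
print(collisions)  # [['hidden_size', 'hidden_Size']]
```

Calling this on vars(args) before mlflow.log_params and raising or renaming when the result is non-empty would surface the problem in Python rather than as an IntegrityError from the backend.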
