I have two tables in a one-to-many relation, created a while ago, not very well thought out.
CREATE TABLE DATA_CONNECTION (
ID bigint(20) NOT NULL,
OWNER_ID bigint(20) NOT NULL,
PROTOCOL_ID bigint(20) NOT NULL,
CONNECTION_METHOD smallint(6) NOT NULL,
ADDRESS varchar(255),
PORT INTEGER,
USERNAME varchar(255),
PASSWORD varchar(255),
FILE_ENCRYPTION_PASSWORD varchar(255),
SECONDARY_PASSWORD varchar(255),
HOST_KEY varchar(255),
ENCRYPTION_TYPE varchar(255),
PUBLIC_KEY_FILE_NAME varchar(255),
PRIVATE_KEY_FILE_NAME varchar(255),
DATA_FILE_NAME_REGEX varchar(255),
METADATA_FILE_NAME_REGEX varchar(255),
REMOTE_FILE_NAME_REGEX varchar(255),
TIME_ZONE varchar(255),
IS_ENABLED tinyint(1) NOT NULL DEFAULT 0,
PRIMARY KEY (ID),
CONSTRAINT FK_OWNER_ID FOREIGN KEY (OWNER_ID) REFERENCES USER (ID) ON DELETE CASCADE,
CONSTRAINT FK_PROTOCOL_ID FOREIGN KEY (PROTOCOL_ID) REFERENCES PROTOCOL (ID) ON DELETE RESTRICT
);
CREATE TABLE DATA_CONNECTION_FRAGMENT (
ID bigint(20) NOT NULL,
DATA_CONNECTION_ID bigint(20) NOT NULL,
NAME varchar(255),
AUTH varchar(255),
IS_ENABLED tinyint(1),
PRIMARY KEY (ID),
CONSTRAINT FK_DATA_CONNECTION_ID FOREIGN KEY (DATA_CONNECTION_ID) REFERENCES DATA_CONNECTION (ID) ON DELETE CASCADE
);
The problem is that a many of the fields in DATA_CONNECTION
are optional. Depending on the PROTOCOL_ID
and the CONNECTION_METHOD
, some are used and the others are null. I would like to just have a CONNECTION_DETAILS
column as a json type, including the DATA_CONNECTION_FRAGMENT
table. Like this.
CREATE TABLE DATA_INTEGRATION (
ID bigint(20) NOT NULL,
OWNER_ID bigint(20) NOT NULL,
PROTOCOL_ID bigint(20) NOT NULL,
CONNECTION_METHOD smallint(6) NOT NULL,
IS_ENABLED tinyint(1) NOT NULL DEFAULT 0,
CONNECTION_DETAILS JSON, -- Something like this {address: string, password: string, fragments: [{name: string, auth: string}, ...etc], ...etc}
CONSTRAINT FK_OWNER_ID FOREIGN KEY (OWNER_ID) REFERENCES USER (ID) ON DELETE CASCADE,
CONSTRAINT FK_PROTOCOL_ID FOREIGN KEY (PROTOCOL_ID) REFERENCES PROTOCOL (ID) ON DELETE RESTRICT
);
So, is there any way to migrate the data with only a sql command? Something like this maybe?
INSERT INTO DATA_INTEGRATION (
ID,
PROTOCOL_ID,
CONNECTION_METHOD,
OWNER_ID,
IS_ENABLED,
CONNECTION_DETAILS
) SELECT
C.ID,
C.PROTOCOL_ID,
C.CONNECTION_METHOD,
C.OWNER_ID,
C.IS_ENABLED,
JSON_OBJECT (
'address', C.ADDRESS,
'port', C.PORT,
'username', C.USERNAME,
'password', C.PASSWORD,
'secondaryPassword', C.SECONDARY_PASSWORD,
'hostKey', C.HOST_KEY,
'encryptionType', C.ENCRYPTION_TYPE,
'publicKeyFileName', C.PUBLIC_KEY_FILE_NAME,
'privateKeyFileName', C.PRIVATE_KEY_FILE_NAME,
'dataFilenameRegex', C.DATA_FILENAME_REGEX,
'metadataFilenameRegex', C.METADATA_FILENAME_REGEX,
'remoteFilenameRegex', C.REMOTE_FILENAME_REGEX,
'timeZone', C.TIME_ZONE,
'fragments', JSON_ARRAYAGG(JSON_OBJECT(
'name', F.NAME,
'auth', F.AUTH,
'isEnabled', F.IS_ENABLED
))
)
FROM DATA_CONNECTION C
JOIN DATA_CONNECTION_FRAGMENT F ON F.DATA_CONNECTION_ID = C.ID;
The above isn't working for me, but I'm not sure what I can do to fix it. It only returns one row. Maybe the JSON_ARRAYAGG is breaking it? Maybe I'm not using the right kind of JOIN?
The existing tables have quite a small number of records, so I'm not worried about stressing the db with this.
Thanks!
CodePudding user response:
You can get the results you need by using a sub-select instead of a join:
SELECT
C.ID,
C.PROTOCOL_ID,
C.CONNECTION_METHOD,
C.OWNER_ID,
C.IS_ENABLED,
JSON_OBJECT (
'address', C.ADDRESS,
'port', C.PORT,
'username', C.USERNAME,
'password', C.PASSWORD,
'secondaryPassword', C.SECONDARY_PASSWORD,
'hostKey', C.HOST_KEY,
'encryptionType', C.ENCRYPTION_TYPE,
'publicKeyFileName', C.PUBLIC_KEY_FILE_NAME,
'privateKeyFileName', C.PRIVATE_KEY_FILE_NAME,
'dataFilenameRegex', C.DATA_FILENAME_REGEX,
'metadataFilenameRegex', C.METADATA_FILENAME_REGEX,
'remoteFilenameRegex', C.REMOTE_FILENAME_REGEX,
'timeZone', C.TIME_ZONE,
'fragments', (
SELECT CAST(
CONCAT('[',
GROUP_CONCAT(
JSON_OBJECT(
'name', F.NAME,
'auth', F.AUTH,
'isEnabled', F.IS_ENABLED
)
),
']'
) AS JSON)
FROM DATA_COLLECTION_FRAGMENT F
WHERE F.DATA_COLLECTION_ID = C.ID
)
)
FROM DATA_COLLECTION C;
See How do I generate nested json objects using mysql native json functions?
CodePudding user response:
The JSON_ARRAYAGG() is an aggregating function, like COUNT() or SUM(). If you use this in your query without specifying a GROUP BY, the default is to reduce the result to a single "group" (i.e. a single row).
I think in this case you want to GROUP BY C.ID
so the result has one row per collection, and then the JSON_ARRAYAGG() will aggregate into an array only the fragments for the respective collection.
SELECT
C.ID,
...
JSON_OBJECT (
...
'fragments', JSON_ARRAYAGG(JSON_OBJECT(
'name', F.NAME,
'auth', F.AUTH,
'isEnabled', F.IS_ENABLED
))
)
FROM DATA_CONNECTION C
JOIN DATA_CONNECTION_FRAGMENT F
ON F.DATA_CONNECTION_ID = C.ID
GROUP BY C.ID;
I have to comment that I wouldn't recommend to convert a perfectly normal table into JSON. There is no problem leaving several columns NULL in a table.