I was given a query and I am attempting to modify it in order to get the most recent version of each COMP_ID. The original query:
SELECT
ANY_VALUE(DATA_INDEX)::string AS DATA_INDEX,
COMP_ID::string AS COMP_ID,
ANY_VALUE(ACCOUNT_ID)::string AS ACCOUNT_ID,
ANY_VALUE(COMP_VERSION)::string AS COMP_VERSION,
ANY_VALUE(NAME)::string AS NAME,
ANY_VALUE(DESCRIPTION)::string AS DESCRIPTION,
MAX(OBJECT_DICT:"startshape-type")[0]::string AS STARTSHAPE_TYPE,
MAX(OBJECT_DICT:"startshape-connector-type")[0]::string AS STARTSHAPE_CONNECTOR_TYPE ,
MAX(OBJECT_DICT:"startshape-action-type")[0]::string AS STATSHAPE_ACTION_TYPE,
MAX(OBJECT_DICT:"overrides-enabled")[0]::string AS OVERRIDES_ENABLED
FROM COMP_DATA
GROUP BY COMP_ID
ORDER BY COMP_ID;
I then attempted to use a window function to grab only the highest version for each comp_id. This is the modified query:
SELECT
ANY_VALUE(DATA_INDEX)::string AS DATA_INDEX,
COMP_ID::string AS COMP_ID,
ANY_VALUE(ACCOUNT_ID)::string AS ACCOUNT_ID,
ANY_VALUE(COMP_VERSION)::string AS COMP_VERSION,
ANY_VALUE(NAME)::string AS NAME,
ANY_VALUE(DESCRIPTION)::string AS DESCRIPTION,
MAX(OBJECT_DICT:"startshape-type")[0]::string AS STARTSHAPE_TYPE,
MAX(OBJECT_DICT:"startshape-connector-type")[0]::string AS STARTSHAPE_CONNECTOR_TYPE ,
MAX(OBJECT_DICT:"startshape-action-type")[0]::string AS STATSHAPE_ACTION_TYPE,
MAX(OBJECT_DICT:"overrides-enabled")[0]::string AS OVERRIDES_ENABLED,
ROW_NUMBER() OVER (PARTITION BY COMP_ID ORDER BY COMP_VERSION DESC) AS ROW_NUM
FROM COMP_DATA
QUALIFY 1 = ROW_NUM;
When attempting to compile the below error is given:
SQL compilation error: [COMP_DATA.COMP_ID] is not a valid group by expression
I had originally thought the issue was the ANY_VALUE on COMP_VERSION, but after removing the ANY_VALUE the same error was given. The only way I found to not get an error was removing the 4 MAX fields and all of the ANY_VALUE()'s, as shown below:
SELECT
DATA_INDEX::string AS DATA_INDEX,
COMP_ID::string AS COMP_ID,
ACCOUNT_ID::string AS ACCOUNT_ID,
COMP_VERSION::string AS COMP_VERSION,
NAME::string AS NAME,
DESCRIPTION::string AS DESCRIPTION,
ROW_NUMBER() OVER (PARTITION BY COMP_ID ORDER BY COMP_VERSION DESC) AS ROW_NUM
FROM COMP_DATA
QUALIFY 1 = ROW_NUM;
Of course this is not at all sufficient since I need the 4 max fields.
I have also tried creating the table with the max fields and from that new table using the window function to select the highest COMP_VERSION of each COMP_ID, but the same error was given.
CodePudding user response:
When you added your QUALIFY clause you dropped the GROUP BY clause from your SQL, aggregate function like MAX, need all selections to be aggregate function OR to have a GROUP BY clause.
So if you only want the best row per the grouping clause, which you note, you aggregate functions need to be explicitly windowed. Thus
SELECT
data_index::string AS data_index,
comp_id::string AS comp_id,
account_id::string AS account_id,
comp_version::string AS comp_version,
name::string AS name,
description::string AS description,
MAX(object_dict:"startshape-type")OVER(PARTITION BY comp_id)[0]::string AS startshape_type,
MAX(object_dict:"startshape-connector-type")OVER (PARTITION BY comp_id)[0]::string AS startshape_connector_type ,
MAX(object_dict:"startshape-action-type")OVER (PARTITION BY comp_id)[0]::string AS statshape_action_type,
MAX(object_dict:"overrides-enabled")OVER(PARTITION BY comp_id)[0]::string AS overrides_enabled,
FROM COMP_DATA
QUALIFY 1 = ROW_NUMBER() OVER (PARTITION BY comp_id ORDER BY comp_version DESC);
There is a small chance you will need to add a set of brackets around those MAX's like
(MAX(object_dict:"overrides-enabled")OVER(PARTITION BY comp_id))[0]::string AS overrides_enabled,
But I suspect it will work out of the box. And I assumed you don't want the row_number so pushed it into the qualify (because it will always be the value 1)