I'm using Hive hql. I am trying to inner join two tables filtering on issue_type='Impediments'
Now I have a new requirement to join dm_jira__label
to include the label
and issue_id
columns. I have tried having a subquery adding the issue_id
and label
by using a left join with dm_jira__label
on issue_id
INNER JOIN datamart_core.dm_jira__release
ON dm_jira.issue_id = dm_jira__release.issue_id;
(
SELECT b.issue_id, b.label AS jira_label
FROM datamart_core.dm_jira__label as B, datamart_core.dm_jira__release AS K
LEFT JOIN b
ON b.issue_id=k.issue_id
);
WHERE dm_jira.issue_type = 'Impediment') AS J
I am getting the following error:
AnalysisException: Illegal table reference to non-collection type: 'b' Path resolved to type: STRUCT<issue_id:DOUBLE,label:STRING>
See the full code below. thanks in advance.
SELECT DISTINCT
j.project_key AS jira_project_key,
j.issue_type,
j.issue_assignee AS impediment_owner,
j.issue_status AS impediment_status,
j.issue_priority AS impediment_priority,
j.issue_summary AS impediment_summary,
j.`release` AS jira_release,
j.sow AS sow_num,
j.issue_due_date_utc AS jira_issue_due_date_utc,
j.issue_id AS jira_issue_id,
s.sow_family
from (
--Subquery to combine dm_jira and dm_jira__release
SELECT dm_jira.project_key,
dm_jira.issue_type,
dm_jira.issue_assignee,
dm_jira.issue_status,
dm_jira.issue_priority,
dm_jira.issue_summary,
dm_jira.issue_due_date_utc,
dm_jira.issue_id,
dm_jira__release.`release`,
dm_jira__release.sow
from datamart_core.dm_jira
INNER JOIN datamart_core.dm_jira__release
ON dm_jira.issue_id = dm_jira__release.issue_id;
(
SELECT b.issue_id, b.label AS jira_label
FROM datamart_core.dm_jira__label as B, datamart_core.dm_jira__release AS K
LEFT JOIN b
ON b.issue_id=k.issue_id
);
WHERE dm_jira.issue_type = 'Impediment') AS J
INNER JOIN datamart_core.dm_asoe_jira_scrum_summary AS S
ON j.`release` = s.jira_release
AND j.sow = s.sow_num
AND j.project_key = s.jira_project_key;
CodePudding user response:
;
ends a whole statement, don't use it at the end of sub-queries or joins.
Using meaningless aliases such as B
or K
or J
harms readability, don't do it.
FROM x, y
is the same as FROM x CROSS JOIN y
, it's not a list of tables you're going to join. This means that you have the following code...
(
SELECT
b.issue_id, b.label AS jira_label
FROM
datamart_core.dm_jira__label as B
CROSS JOIN
datamart_core.dm_jira__release AS K
LEFT JOIN
b
ON b.issue_id=k.issue_id
)
- The
b
in theLEFT JOIN
isn't a table, and causes your syntax error.
Then, your sub query just sits in the middle of the code, it's not joined on or used in any way. I think you intended a pattern more like this...
FROM
datamart_core.dm_jira
INNER JOIN
datamart_core.dm_jira__release
ON dm_jira.issue_id = dm_jira__release.issue_id;
LEFT JOIN
(
<your sub-query>
)
AS fubar
ON fubar.something = something.else
WHERE
dm_jira.issue_type = 'Impediment'
Even then, you don't actually need nested sub-queries at all. You can just keep adding joins, such as this...
SELECT
jira.project_key AS jira_project_key,
jira.issue_type,
jira.issue_assignee AS impediment_owner,
jira.issue_status AS impediment_status,
jira.issue_priority AS impediment_priority,
jira.issue_summary AS impediment_summary,
jrel.`release` AS jira_release,
jrel.sow AS sow_num,
jira.issue_due_date_utc AS jira_issue_due_date_utc,
jira.issue_id AS jira_issue_id,
jlab.label,
summ.sow_family
FROM
datamart_core.dm_jira AS jira
INNER JOIN
datamart_core.dm_jira__release AS jrel
ON jrel.issue_id = jira.issue_id
LEFT JOIN
datamart_core.dm_jira__label AS jlab
ON jlab.issue_id = jrel.issue_id
INNER JOIN
datamart_core.dm_asoe_jira_scrum_summary AS summ
ON summ.jira_release = jrel.`release`
AND summ.sow_num = jrel.sow
AND summ.jira_project_key = jira.project_key
WHERE
jira.issue_type = 'Impediment'
;