Home > Software design >  How to run a subquery in hive
How to run a subquery in hive

Time:02-13

I have this query that I am trying to run in HIVE:

select transaction_date, count(total_distinct) from (
SELECT transaction_date, concat(subid,'**', itemid) as total_distinct
FROM TBL_1
group by transaction_date, subid,itemid
) group by transaction_date

What I am trying to do it get the distinct combination of subid and itemid, but I need the total count per day. When I run the query above, I get this error:

Error while compiling statement: FAILED: ParseException line 6:2 cannot recognize input near 'group' 'by' 'TRANSACTION_DATE' in subquery source

The query looks correct to me though. Has anyone encountered this error?

CodePudding user response:

Hive requires subqueries to be aliased, so you need to specify a name for it:

select transaction_date, count(total_distinct) from (
SELECT transaction_date, concat(subid,'**', itemid) as total_distinct
FROM TBL_1
group by transaction_date, subid,itemid
) dummy  -- << note here
group by transaction_date

True, the error message is far from helpful.

  • Related