Home > Back-end >  How would I return only specific values from a comma separated column entry in HIVEQL?
How would I return only specific values from a comma separated column entry in HIVEQL?

Time:09-16

So I have a table which contains a column where each entry is a comma separated list. See below:

ID class dt
08 A,B,C,D 20220801
21 A,B,C,D,E,F,G 20220801
92 A,B,C 20220801
98 A,B,C,D,E,F,G,H 20220801

In the actual table, some of the class entries contain up to 232 different elements.

I'd like to return only specific values in a new column with the same comma separated format. So as an example from the above, I'd like ID 21 to return a column with 'C,D,E'.

Any help is appreciated.

CodePudding user response:

You may try below query which returns class entries matching with any of C,D,E from each row.

WITH sample_table AS (
  SELECT 08 ID, 'A,B,C,D' class, '20220801' dt UNION ALL
  SELECT 21, 'A,B,C,D,E,F,G', '20220801' UNION ALL
  SELECT 92, 'A,B,C', '20220801' UNION ALL
  SELECT 98, 'A,B,C,D,E,F,G,H', '20220801'
)
SELECT ID, dt, CONCAT_WS(',', collect_list(cls)) clz
  FROM sample_table t LATERAL VIEW EXPLODE(SPLIT(class, ',')) c AS cls
 WHERE cls IN ('C', 'D', 'E')
 GROUP BY ID, dt;
Query results

enter image description here

  • Related