I try to use listagg
over xmlagg
to bypass the length limitation of 4000 bytes of listagg
.
My statement however includes a case when
statement, causing different results:
Select date_column as "Some Date",
sum(case when status = 'active' then 1 else 0 end) as "active id numbers",
listagg(case when status = 'active' then id_number END, ',' on overflow truncate) within group (order by id_number) as "Listagg List of active id numbers"
rtrim(xmlagg(xmlelement(e,case when status = 'active' then id_number END,',').extract('//text()') order by id_number).GetClobVal(),',') AS "Xmlagg List of active id numbers"
from mytable
group by date_column;
Result:
Date | Active id numbers | Listagg List of active id numbers | Xmlagg List of active id numbers |
---|---|---|---|
01/01/2021 | 4 | 2334,12333,512333,611651 | 2334,,,,,12333,,,,512333,,611651,, |
01/02/2021 | 6 | 4213,6481,7900,16515,22060 | ,,,4213,,,6481,,,,,,7900,,16515,22060,,,,,, |
01/03/2021 | 2 | 416,754 | ,,,416,,,,,,,,754,,, |
I face two issues with xmlagg over listagg:
- apparently, the function of xmlagg would always trigger even if
case when
is not true, causing empty,
eliminators - performance: it is much slower (~3 times the execution time)
Did I do some mistake with using case when
within xmlagg
?
Thanks in advance!
CodePudding user response:
As I said in my previous answer, you could alternatively filter in the WHERE
clause rather than using a CASE
expression:
Select date_column as "Some Date",
COUNT(*) as "active id numbers",
RTRIM(
XMLAGG(
XMLELEMENT(
E,
id_number,
','
).EXTRACT('//text()')
ORDER BY id_number
).GetClobVal(),
','
) AS LIST
from mytable
where status = 'active'
group by date_column;
That way you will not include the non-active values that will be converted to null
by the CASE
expression.
db<>fiddle here
Or, you can use a custom aggregation function.
As for the difference, LISTAGG
is documented to ignore NULL
values:
The arguments to the function are subject to the following rules:
- The
measure_expr
can be any expression. Null values in the measure column are ignored.
XMLAGG
does the same:
XMLAgg
is an aggregate function. It takes a collection of XML fragments and returns an aggregated XML document. Any arguments that return null are dropped from the result.
However, if you do:
SELECT DUMP(XMLELEMENT(E, NULL, ',').EXTRACT('//text()')) FROM DUAL;
You will find that the output has Typ=58
which is an XMLTYPE
data type (and not a string) so the values being aggregated are not NULL
and are not going to be ignored.
For example:
CREATE TABLE table_name (value) AS
SELECT 1 FROM DUAL UNION ALL
SELECT 2 FROM DUAL UNION ALL
SELECT NULL FROM DUAL UNION ALL
SELECT 4 FROM DUAL;
Select LISTAGG(value, ',') WITHIN GROUP (ORDER BY ROWNUM) AS listagg_list,
RTRIM(
XMLAGG(
XMLELEMENT(
E,
value,
','
).EXTRACT('//text()')
ORDER BY ROWNUM
).GetClobVal(),
','
) AS xmlagg_list
from table_name;
Outputs:
LISTAGG_LIST XMLAGG_LIST 1,2,4 1,2,,4
db<>fiddle here
And regarding performance, you are asking the XMLAGG
solution to create XML elements and then extract the text from the XML element and then aggregate and convert the XMLType to a CLOB (rather than a smaller VARCHAR2) and then to strip trailing commas; it is unsurprising that, when it doing so much more work, that it takes longer.