Say I have a table:
CREATE TABLE staff (
id INT,
name CHAR(9)
);
With data:
INSERT INTO staff (id, name) VALUES (1, 'Joe');
INSERT INTO staff (id, name) VALUES (2, 'Bob');
INSERT INTO staff (id, name) VALUES (3, 'Alice');
I need to create a multi row UDF, something like the built-in AVG
function, such that I can call it in the following manner:
SELECT vowel_count(name) FROM staff;
And assuming vowels are [AaEeIiOoUu]
, get the following result:
| vowel_count(name) |
|-------------------|
| 6 |
What is the syntax to take a table column as input to a UDF?
CREATE OR REPLACE FUNCTION vowel_cnt(/* what goes here? */)
RETURN NUMBER
IS
...
BEGIN
...
END;
The function must be table agnostic, just like SUM
, AVG
, etc.
I am using Oracle PL/SQL and SQL Developer.
CodePudding user response:
As @WilliamRoberston said, you can define your own aggregate function with OCI Data Cartridge.
Partly because I haven't done this for a while and wanted to remind myself, here's a working implementation to count vowels.
First, create an object type with the required functions, and a numeric variable to hold the total count:
create or replace type t_vowel_count as object (
g_count number,
static function ODCIAggregateInitialize(
p_ctx in out t_vowel_count
) return number,
member function ODCIAggregateIterate(
self in out t_vowel_count, p_string varchar2
) return number,
member function ODCIAggregateTerminate(
self in out t_vowel_count, p_result out number, p_flags in number
) return number,
member function ODCIAggregateMerge(
self in out t_vowel_count, p_ctx in t_vowel_count
) return number
);
/
Then create the type body, with - in this case - fairly straightforward function bodies:
create or replace type body t_vowel_count as
static function ODCIAggregateInitialize(
p_ctx in out t_vowel_count
) return number is
begin
p_ctx := t_vowel_count(null);
-- initialise count to zero
p_ctx.g_count := 0;
return ODCIConst.success;
end ODCIAggregateInitialize;
member function ODCIAggregateIterate(
self in out t_vowel_count, p_string varchar2
) return number is
begin
-- regex is clearer...
-- self.g_count := self.g_count regexp_count(p_string, '[aeiou]', 1, 'i');
-- but translate is faster...
self.g_count := self.g_count
coalesce(length(p_string), 0)
- coalesce(length(translate(p_string, 'xaAeEiIoOuU', 'x')), 0);
return ODCIConst.success;
end ODCIAggregateIterate;
member function ODCIAggregateTerminate(
self in out t_vowel_count, p_result out number, p_flags in number
) return number is
begin
p_result := self.g_count;
return ODCIConst.success;
end ODCIAggregateTerminate;
member function ODCIAggregateMerge(
self in out t_vowel_count, p_ctx in t_vowel_count
) return number is
begin
self.g_count := self.g_count p_ctx.g_count;
return ODCIConst.success;
end ODCIAggregateMerge;
end t_vowel_count;
/
The count of vowels in each individual string could be done in various ways; regular expressions are clear but slow, so I've shown a translate()
version which should be fast. I've included @MTO's suggestion to wrap that in coalesce for a null result (for the edge case where the string consists only of vowels, and also to handle null inputs (though it gets the right result without those changes; it's safer to assume it might not one day...).
And finally create the function that uses that type:
create or replace function vowel_count (p_string varchar2)
return number
parallel_enable
aggregate using t_vowel_count;
/
With your sample data you can now do:
SELECT vowel_count(name) FROM staff;
VOWEL_COUNT(NAME) |
---|
6 |
fiddle including the edge cases.
CodePudding user response:
You can't get your desired result with:
SELECT vowel_count(name) FROM staff;
becouse select works row by row meaning that the function will get parameters:
- 'Joe' for row 1
- 'Bob' for row 2
- 'Alice' for row 3 The result would be:
vowel_count(name) |
---|
2 |
1 |
3 |
If you sum it up afterwords you will get 6.
Here are 2 functions:
One to count vowels from string
and
One to get you count of 6 vowels from table (I named it A_TBL) column "NAME"
...
create or replace Function VOWEL_COUNT_FROM_STRING(p_string VarChar2) RETURN Number IS
BEGIN
Declare
vowels VarChar2(5) := 'AEIOU';
mRet Number(6) := 0;
Begin
For i in 1..Length(p_string) Loop
If InStr(vowels, SubStr(Upper(p_string), i, 1)) > 0 Then
mRet := mRet 1;
End If;
End Loop;
RETURN mRet;
End;
END VOWEL_COUNT_FROM_STRING;
-- ----------------------------------------------------------
create or replace Function VOWEL_COUNT RETURN Number IS
BEGIN
Declare
CURSOR c IS SELECT Upper(NAME) FROM A_TBL;
vowels VarChar2(5) := 'AEIOU';
mName A_TBL.NAME%TYPE;
mRet Number(6) := 0;
Begin
OPEN c;
LOOP
FETCH c InTo mName;
EXIT WHEN c%NOTFOUND;
For i in 1..Length(mName) Loop
If InStr(vowels, SubStr(mName, i, 1)) > 0 Then
mRet := mRet 1;
End If;
End Loop;
END LOOP;
Close c;
RETURN mRet;
End;
END VOWEL_COUNT;
If you call them from select statement like here:
Select ID, NAME, VOWEL_COUNT_FROM_STRING(NAME) "VOWELS", VOWEL_COUNT() "TOTAL_VOWELS" From A_TBL
... the result would be
ID | NAME | VOWELS | TOTAL_VOWELS |
---|---|---|---|
1 | Joe | 2 | 6 |
2 | Bob | 1 | 6 |
3 | Alice | 3 | 6 |
But, the same result you can get using analytic function Sum() Over() with the first function.
Select ID, NAME, VOWEL_COUNT_FROM_STRING(NAME) "VOWELS", Sum(VOWEL_COUNT_FROM_STRING(NAME)) OVER() "TOTAL_VOWELS" From A_TBL
The second one is here just as an example as it doesn't make sense anyway for it is fixed to one table and one column of the table.
It would make more sense to create function that will count vowels from any table's column:
create or replace Function VOWEL_COUNT_FROM_TABLE_COLUMN(p_table VarChar2, p_column VarChar2) RETURN Number IS
BEGIN
Declare
vowels VarChar2(5) := 'AEIOU';
mCmd VarChar2(1000);
mString VarChar2(32000); -- NOTE the limitation in this variable length
mRet Number(6) := 0;
Begin
mCmd := 'Select LISTAGG(' || p_column || ', '','') WITHIN GROUP (Order By ' || p_column || ') From ' || p_table;
Execute Immediate mCmd Into mString;
--
For i in 1..Length(mString) Loop
If InStr(vowels, SubStr(Upper(mString), i, 1)) > 0 Then
mRet := mRet 1;
End If;
End Loop;
--
RETURN mRet;
End;
END VOWEL_COUNT_FROM_TABLE_COLUMN;
NOTE: there is a limitation using LISTAGG into VarChar2 variable - for many rows or aggregated string too long - the function should be changed to do some looping instead...
usage in this sample
Select VOWEL_COUNT_FROM_TABLE_COLUMN('A_TBL', 'NAME') "TOTAL_VOWELS" From Dual;
-- Result:
-- TOTAL_VOWELS
-- ------------
-- 6
CodePudding user response:
Don't write a custom aggregation function.
Write a scalar function:
CREATE FUNCTION vowel_count(
value IN VARCHAR2
) RETURN NUMBER DETERMINISTIC
IS
BEGIN
RETURN LENGTH(value) - COALESCE(LENGTH(TRANSLATE(value, '_AaEeIiOoUu', '_')), 0);
END;
/
Then, if you want to aggregate an entire column you can use:
SELECT SUM(vowel_count(name)) AS total_vowel_count
FROM staff;
Which, for the sample data:
CREATE TABLE staff (id, name) AS
SELECT 1, 'Alice' FROM DUAL UNION ALL
SELECT 2, 'Betty' FROM DUAL UNION ALL
SELECT 3, 'Carol' FROM DUAL UNION ALL
SELECT 4, 'Aeia' FROM DUAL;
Outputs:
TOTAL_VOWEL_COUNT |
---|
10 |
Then if you want to find the minimum, maximum, average, etc. number of vowels then you can easily find it using the scalar function and wrapping it with a built-in aggregation function rather than having to create many different user-defined aggregation functions for each individual use-case.