I want to parse query in postgres
I've got string query, for example Select * FROM dm.lib Update sc.qar Join ws.gys
How can I get first word after following pattern words - SELECT, UPDATE, JOIN?
I use this query and get many NULL values.
with src(txt) as (
values ('Select * FROM dm.lib Update sc.qar Join ws.gys')
)
select unnest(regexp_matches (txt, 'FROM ([^ ] ) | SELECT ([^ ] ) | UPDATE ([^ ] ) | JOIN ([^ ] ) | DELETE ([^ ] ) | MERGE ([^ ] )','g'))
from src
where ;
it is important that upper case of lower doesn't matter. My query doesn't work correctly
CodePudding user response:
demo
Probably over engineered. There should be a simple way, but the following is how I do.
WITH src (txt) AS (
VALUES ('Select * FROM dm.lib Update sc.qar Join ws.gys'))
,cte (keywords) AS (
SELECT unnest('{FROM,select,UPDATE,JOIN,DELETE,MERGE}'::text[]))
,cte1 AS (
SELECT
regexp_instr (lower(txt), lower(keywords) || '\s ') AS BEGIN
,lower(keywords) || '\s ' AS pattern
,lower(keywords) AS keywords
,lower(txt) AS src
FROM cte, src
)
,cte2 AS (
SELECT
*
,regexp_instr (src, pattern,BEGIN, 1, 1) AS pattern_end
,coalesce(lead(BEGIN) OVER (ORDER BY BEGIN),length(src) 2)
- 1
- regexp_instr (src, pattern,BEGIN, 1,1) AS gap
FROM cte1 WHERE BEGIN != 0
ORDER BY 1
)
SELECT
*, substring(src FROM pattern_end FOR gap)
FROM cte2;
src, cte, cte1 used to normalized strings.
Key function is regexp_instr (only in postgres 15: https://www.postgresql.org/docs/current/functions-matching.html).
Key gotcha is that You can use regexp_instr get the match substring begin char position and end char position, It feel like char pointer walk through in C.