Home > database >  regex expression to include german characters
regex expression to include german characters

Time:09-27

I have a string that looks like this:

{"id":"1","name":"simpleword","type":"test"},{"id":"123","name":"f23ÜÜ","type":"prod"}

I want to do a REGEXP_SUBSTR_ALL such that I can extract all "name" values into a list.

Something like this allows me to extract some names but not all. This is because some names include german characters like "Ü" which are not included with '\w '

SELECT REGEXP_SUBSTR_ALL(ARRAY_TO_STRING(REGEXP_SUBSTR_ALL((ARRAY_TO_STRING(REGEXP_SUBSTR_ALL(ARTICLE_TAGS, '"name\\W \\w "'),',')), ':"\\w '),','),'\\w ') FROM TABLE

For example, the expression above would give me this output:

[
    "simpleword"
]

while my desired output is this:

[
    "simpleword", "f23ÜÜ"
]

CodePudding user response:

Just match everything that is not a ":

with data(s) as (
    select $${"id":"1","name":"simpleword","type":"test"},{"id":"123","name":"f23ÜÜ","type":"prod"}$$
)

select regexp_substr_all(s, 'name":"([^"]*)"', 1, 1, '', 1)
from data

-- [   "simpleword",   "f23ÜÜ" ]
  • Related