Home > Software design >  Finding Substring Using REGEX_SUBSTR
Finding Substring Using REGEX_SUBSTR

Time:06-01

I am trying to extract the sub-string of Twitter and Fortnite, Facebook and Words with Friends, & Expedia and Notion from the strings below. These are just examples and the common pattern amongst all of them is "AppNames";s:xx:" with xx being a different number for each string. I'm having a difficult time accounting for the different numbers. Here is my current REGEX_SUBSTR query that doesn't take into account the varying numerical value in s:xx

 REGEXP_SUBSTR(FORM_FIELDS, '"AppNames";s:17:\"([^\"] )', 1, 1, 'e')

Just to reiterate, the above query will output Twitter and Fortnite, which is correct for the first string but I have many other strings that are similar but have a different numerical patterns such as...

  • "AppNames";s:35
  • "AppNames";s:50
  • "AppNames";s:44

STRINGS

s:25:"Social/Games";s:14:"AppNames";s:17:"Twitter and Fortnite";s:12:"Audience";s:20;}
s:25:"Social/Games";s:14:"AppNames";s:35:"Facebook and Words with Friends";s:12:"Audience";s:20;}
s:15:"Travel/Productivity";s:19:"AppNames";s:50:"Expedia and Notion";s:12:"Audience";s:20;}

CodePudding user response:

with data(FORM_FIELDS) as (
    SELECT * from values
    ('s:25:"Social/Games";s:14:"AppNames";s:17:"Twitter and Fortnite";s:12:"Audience";s:20;}'),
    ('s:25:"Social/Games";s:14:"AppNames";s:35:"Facebook and Words with Friends";s:12:"Audience";s:20;}'),
    ('s:15:"Travel/Productivity";s:19:"AppNames";s:50:"Expedia and Notion";s:12:"Audience";s:20;}')
)
select 
    REGEXP_SUBSTR(FORM_FIELDS, '"AppNames";s:[0-9] :\"([^\"] )', 1, 1, 'e') as new
    ,REGEXP_SUBSTR(FORM_FIELDS, '"AppNames";s:17:\"([^\"] )', 1, 1, 'e') as old
from data;

gives:

NEW OLD
Twitter and Fortnite Twitter and Fortnite
Facebook and Words with Friends null
Expedia and Notion null

so [0-9] will match 1 digits, you can also use \\d but if you only want 2 \\d{2} will force that.

  • Related