I want to get text before certain part of the string. In fact, it is name of the company in each case which appears at the start of the string (it can be multiple words). Space characters may vary. Some expected outputs are:
Text output
Company-1 Gift Card $100 Company-1
Company 2 Digital Gift Card $100 Company 2
Company2 Digital Gift Card $50 Company2
Company3 Digital Gift Card $100 Company3
Company4 Gift Cards $100 - Digital Company4
Company2 Digital Gift Card $250 Company2
Company1 Digital Gift Card $50 Company1
Company4 Digital Gift Card $100 Company4
Company_5 Digital Gift Cards $100 Company_5
Company1 Digital Gift Card $250 Company1
Company 6 Gift Card $100 - Physical Company 6
Company2 Gift Card $100 - Digital Company2
I have done the following, but it seems that non-capturing mode of regular expression (?:) doesn't work in redshift. So, don't know how to solve this.
(^.*)(?:[\\s] (?:Digital)?[\\s].*)
Thanks for your help.
CodePudding user response:
I still hope someone can find an answer using regex.
I found a non regex solution using split_part
for now:
CASE WHEN LOWER(text) LIKE '%digital' then split_part(LOWER(text), 'gift',1)
WHEN LOWER(text) LIKE '%digital%' then split_part(LOWER(text), 'digital',1)
ELSE split_part(LOWER(text), 'gift',1) END AS company_name
CodePudding user response:
I'm no heavy hitter with SQL nor am I familiar with Redshift, but reading some documentation makes me believe that redshift's regex functions actually do support non-capture groups a.o. different to the POSIX ERE standard.
Use the 5th parameter in the REGEX_REPLACE() function. We simply write 'p' to interpret the pattern with Perl Compatible Regular Expression (PCRE) dialect.
In your case, this should then work:
SELECT regexp_replace(text, '(\\s Digital)?\\s Gift Card.*', '', 1, 'p') AS company_name
FROM YourTable;
Matching left-to-right:
(\\s Digital)?
- Optionally match: 1 whitespace characters before a literal 'Digital';\\s Gift Card
- Match: 1 whitespace characters before literally 'Gift Card';.*
- Match 0 characters other than newline (remainder of string).
Replace matched substring with an empty string. See an online regex101 demo.