How could I extract only alphabets of this column not considering characters after number? Also, there are situations where there are no numbers. In this case, is it possible to create a condition using CASE and WHEN to apply extraction condition only in cases where there are numbers?
Input:
medication_title
GLYXAMBI 25 MG E 5 MG
GLIFAGE XR 750 MG
AD TIL
SIMETICON GOTAS
Output:
medication_title | medication_commercialname
GLYXAMBI 25 MG E 5 MG GLYXAMBI
GLIFAGE XR 750 MG GLIFAGE XR
AD TIL AD TIL
SIMETICON GOTAS SIMETICON GOTAS
Thanks!
CodePudding user response:
What about REGEXP_REPLACE()
looking for \s*\d.*
(taking any string from 0 whitespace chars a digit and 0 more chars):
WITH input(medication_title) AS (
VALUES ('GLYXAMBI 25 MG E 5 MG')
, ('GLIFAGE XR 750 MG')
, ('AD TIL')
, ('SIMETICON GOTAS')
)
SELECT REGEXP_REPLACE(medication_title, '\s*\d.*','', 'g') FROM input;
Would return:
GLYXAMBI
GLIFAGE XR
AD TIL
SIMETICON GOTAS
See an online fiddle
CodePudding user response:
SELECT trim(substring(medication_title from '[^\d]*')) FROM tablename;
Output :
GLYXAMBI
GLIFAGE XR
AD TIL
SIMETICON GOTAS
CodePudding user response:
"The substring function with two parameters, substring(string from pattern), provides extraction of a substring that matches a POSIX regular expression pattern." (see: docs)
The query:
SELECT rtrim(substring(medication_title from '[A-Z ]*')) FROM tablename:
should return the characters before any number.
The regular expression '[A-Z ]*' will match all (capital) letters from 'A' to 'Z' and a space, any number of times (because of the *
).
The function rtrim()
will remove the spaces at the end.