I have a data in Telephone column, something as below:
with t (id,telephone) as (values
(1234560,'(345) 908-1234 '),
(1234561,'(345) 908-1234;ext=1234'),
(1234562 ,'(345) 908-1234;ext=12345'),
(1234563 ,'(345) 908-12345;ext=12345'),
(1234564 ,'1234567;ext=12345'),
(1234565 ,'123ABC;ext=12345'),
(1234566 ,'Ext=12345'),
(1234567 ,' 1123456789')
)
select * from t
Any string in the format (XXX) XXX-XXXX;ext=XXXX(not definite)
or (XXX) XXX-XXXX
will be considered as valid number and remaining cases will be considered as invalid
Below is what I have tried
select ID,
TELEPHONE,
/* To check format of the telephone */
case when TELEPHONE similar to '\(\d{3}\) \d{3}-\d{4}' then 0
when TELEPHONE similar to '\(\d{3}\) \d{3}-\d{4};ext=%' then 0
else 1
end as TEL_IND
from t
In the above mentioned samples #1,#2,#3 are valid formats and remaining are invalid formats Appreciate any help!
CodePudding user response:
with t (id,telephone) as (values
(1234560,'(345) 908-1234 '),
(1234561,'(345) 908-1234;ext=1234'),
(1234562 ,'(345) 908-1234;ext=12345'),
(1234562 ,'(345) 908-1234;ex=12345'),
(1234563 ,'(345) 908-12345;ext=12345'),
(1234564 ,'1234567;ext=12345'),
(1234565 ,'123ABC;ext=12345'),
(1234566 ,'Ext=12345'),
(1234567 ,' 1123456789')
),
captures AS
(
SELECT
t.*,
REGEXP_MATCHES(trim(telephone), '\(([0-9]{3})\) ([0-9]{3})-([0-9]{4})(?:;ext=([0-9]*))?|.*') AS parts
FROM
t
)
SELECT
*,
Parts[1],
Parts[2],
Parts[3],
Parts[4]
FROM
captures