SQL REGEX not working like its expected to-CodePudding

I'm having issues finding a solution for this

set serveroutput on;
declare
EmailRegexp CONSTANT VARCHAR2(1000) :='^[a-z0-9!#$%&''* /=?^_`{|}~-] (\.[a-z0-9!#$%&''* /=?^_`{|}~-] )*@([a-z0-9]([a-z0-9-]*[a-z0-9])?\.) ([A-Z]{2,})$';
p_vInEmailAddress VARCHAR2(30) := '[email protected]';

begin 
dbms_output.put_line('regex: '||LOWER(SUBSTR(REGEXP_SUBSTR(p_vInEmailAddress, '\.([^.\n\s]*)$'), 2))||''''); 
end;

--results:
--regex: '
--PL/SQL procedure successfully completed.

--now if i do this instead: 
p_vInEmailAddress VARCHAR2(30) := '[email protected]';
--I get results back

--PL/SQL procedure successfully completed.

--regex: com'

What am I doing wrong and why does it not like '.net' part?

How can I fix this?

Thanks

CodePudding user response：

The problematic part of your query can be reduced to:

begin 
  dbms_output.put_line(
    REGEXP_SUBSTR(
      '[email protected]',
      '\.([^.\n\s]*)$'
    )
  ); 
end;
/

The regular expression \.([^.\n\s]*)$ is looking for:

A dot character \.; then
Zero-or-more characters that are not a dot . or a slash \ or an n or a slash \ or an s; then
The end-of-the-string.

The problem is that your string has an n character and the regular expression is excluding n as \n is interpreted as two characters and not the perl-like expression representing a single newline character. You want to replace \n with the CHR(10) character outside the string literal (or a newline inside it) and \s with the posix-expression [:space:].

What you want is:

begin 
  dbms_output.put_line(
    REGEXP_SUBSTR(
      '[email protected]',
      '\.([^.' || CHR(10) || '[:space:]]*)$'
    )
  ); 
end;
/

begin 
  dbms_output.put_line(
    REGEXP_SUBSTR(
      '[email protected]',
      '\.([^.
[:space:]]*)$'
    )
  ); 
end;
/

db<>fiddle here

CodePudding user response：

Just use

REGEXP_SUBSTR(p_vInEmailAddress,'\.([^.[:space:]]*)$')

\s does not match whitespace in character class in Oracle regex.

[:space:] matches both horizontal and vertical whitespace.

EXPLANATION

--------------------------------------------------------------------------------
  \.                       '.'
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    [^.[:space:]]*           any character except: '.', whitespace
                             characters (like \s) (0 or more times
                             (matching the most amount possible))
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  $                        before an optional \n, and the end of the
                           string