Home > database >  extract first two word from an string exclude comma using regex in Postgresql
extract first two word from an string exclude comma using regex in Postgresql

Time:09-23

select regexp_matches('Hi, I am Harry Potter', '^[a-zA-Z0-9]*\W \w ');
select regexp_matches('Hi, I am Harry Potter', '\w \W \w ');

Both way returns {Hi, I} But expect {Hi I}. Related Question: extract first word in an String: extract the first word from a string - regex

CodePudding user response:

You cannot match disjoint (non-adjoining) parts of a string into a single group.

You can use REGEXP_REPLACE to capture the first two words into separate groups and then use two backreferences to the group in the replacement pattern to get what you need:

select regexp_replace('Hi, I am Harry Potter', '^\W*(\w )\W (\w ).*', '\1 \2');

See the online demo. The regex means

  • ^ - start of string
  • \W* - zero or more non-word chars
  • (\w ) - Group 1 (\1): one or more word chars
  • \W - one or more non-word chars
  • (\w ) - Group 2 (\2): one or more word chars
  • .* - the rest of the string.

CodePudding user response:

You can use this pattern:

select regexp_match(
          'Hi, I am Harry Potter',
          '^([[:alnum:]] )[^[:alnum:]] ([[:alnum:]] )'
       );

 regexp_matches 
════════════════
 {Hi,I}
(1 row)

The pattern matches the first sequence of alphanumerical characters, then a sequence of non-alphanumerical characters and another sequence of alphanumerical characters. The result is an array with the first and third expression, which are parenthesized in the pattern.

  • Related