How to capture all phrases that don't have a pattern in the middle?

Time:09-26

I want to capture all strings that don't have the pattern a[a-z]* in the middle position (between the hyphens) in the example below:

<?php 

$myStrings = array(
  "123-456",
  "123-7-456",
  "123-Apple-456",
  "123-0-456",
  "123-Alphabet-456"
);

foreach($myStrings as $myStr){
  var_dump(
    preg_match("/123-(?!a[a-z]*)-456/i", $myStr)
  );
}

?>

CodePudding user response:

A lookahead is a zero-length assertion: it checks what follows but consumes nothing. The middle part therefore still has to be consumed before 456 can match. To consume it, use e.g. \w+- (one or more word characters followed by a hyphen) inside an optional group that starts with your lookahead condition. The i flag makes the match caseless.

To search an array, preg_grep can be used:

preg_grep('~^123-(?:(?!a[a-z]*-)\w+-)?456$~i', $myStrings);

There is also an invert option, PREG_GREP_INVERT, which returns the entries that do not match. If you don't need to check for the start and end, a simpler pattern such as -a[a-z]*- without a lookahead can be used with it.
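As a minimal sketch of the inverted approach, assuming the sample array from the question: preg_grep with PREG_GREP_INVERT keeps the entries that do not contain -a[a-z]*-. The variable name $kept is only illustrative.

```php
<?php
// Sample data taken from the question.
$myStrings = array(
    "123-456",
    "123-7-456",
    "123-Apple-456",
    "123-0-456",
    "123-Alphabet-456"
);

// Keep every entry that does NOT contain -a[a-z]*- (caseless).
// preg_grep preserves the original keys, so array_values() reindexes.
$kept = array_values(
    preg_grep('~-a[a-z]*-~i', $myStrings, PREG_GREP_INVERT)
);

print_r($kept);
```

With this input, the entries containing Apple and Alphabet are dropped and the other three remain.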

CodePudding user response:

The following pattern is one possible solution:

^(?:(123-(?:(?![aA][a-zA-Z]*).*)-456)|(123-456))$

It uses a non-capturing group (?:...) and a negative lookahead (?!...) to accept only inner sections that do not start with 'a' (or 'A') followed by any letters. The case with no inner section (123-456) is added with | as a second alternative.

CodePudding user response:

Match the pattern and invert the result:

!preg_match('/a[a-z]*/i', $yourStr);

Don't try to do everything with a regex when programming languages exist to do the job.
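A sketch of how the inverted preg_match could be applied to the whole array with array_filter. The helper name lacksAWord is hypothetical, and the hyphens around the pattern are an added assumption so that only the middle segment is tested.

```php
<?php
// Sample data taken from the question.
$myStrings = array(
    "123-456",
    "123-7-456",
    "123-Apple-456",
    "123-0-456",
    "123-Alphabet-456"
);

// Hypothetical helper: true when the string has no hyphen-delimited
// segment that starts with "a" and continues with letters (caseless).
function lacksAWord($str)
{
    // preg_match returns 1 on a match and 0 when nothing matches.
    return preg_match('/-a[a-z]*-/i', $str) === 0;
}

// array_filter preserves keys, so array_values() reindexes the result.
$kept = array_values(array_filter($myStrings, 'lacksAWord'));

print_r($kept);
```

The regex stays trivial here, and the "not" part of the requirement is expressed in plain PHP instead of a lookahead.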

CodePudding user response:

You are not getting a match because, in the pattern 123-(?!a[a-z]*)-456, the lookahead assertion (?!a[a-z]*) consumes nothing. Directly after matching the first -, the pattern has to match another hyphen, so it effectively behaves like 123--456.

If you move the last hyphen inside the lookahead, as in 123-(?!a[a-z]*-)456, you get only one match, for 123-456, because the pattern still does not match the middle part of the string.
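To see both behaviours side by side, here is a small sketch that runs the question's pattern and the variant with the hyphen moved into the lookahead against the sample strings. The variable names are illustrative only.

```php
<?php
// Sample data taken from the question.
$myStrings = array(
    "123-456",
    "123-7-456",
    "123-Apple-456",
    "123-0-456",
    "123-Alphabet-456"
);

$original = '/123-(?!a[a-z]*)-456/i'; // pattern from the question
$moved    = '/123-(?!a[a-z]*-)456/i'; // last hyphen moved into the lookahead

$originalHits = array();
$movedHits    = array();

foreach ($myStrings as $myStr) {
    // preg_match returns 1 on a match and 0 when nothing matches.
    $originalHits[] = preg_match($original, $myStr);
    $movedHits[]    = preg_match($moved, $myStr);

    printf(
        "%-18s original: %d  moved: %d\n",
        $myStr,
        end($originalHits),
        end($movedHits)
    );
}
```

The original pattern matches none of the strings, and the moved variant matches only 123-456, since neither pattern consumes a middle segment.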

Another option in PHP is to consume the part that you don't want and then discard it with (*SKIP)(*F):

^123-(?:a[a-z]*-(*SKIP)(*F)|\w+-)?456$

Explanation

  • ^ Start of string
  • 123- Match literally
  • (?: Non capture group for the alternation
    • a[a-z]*-(*SKIP)(*F) Match a, then optional chars a-z, then match -, then skip what was matched and fail
    • | Or
    • \w+- Match 1 or more word chars followed by -
  • )? Close the non capture group and make it optional to also match when there is no middle part
  • 456 Match literally
  • $ End of string


Example

$myStrings = array(
    "123-456",
    "123-7-456",
    "123-Apple-456",
    "123-0-456",
    "123-Alphabet-456",
    "123-b-456"
);

foreach($myStrings as $myStr) {
    if (preg_match("/^123-(?:a[a-z]*-(*SKIP)(*F)|\w+-)?456$/i", $myStr, $match)) {
        echo "Match for $match[0]" . PHP_EOL;
    } else {
        echo "No match for $myStr" . PHP_EOL;
    }
}

Output

Match for 123-456
Match for 123-7-456
No match for 123-Apple-456
Match for 123-0-456
No match for 123-Alphabet-456
Match for 123-b-456