PHP preg_match_all returning blank strings in match group

Time:12-22

I have the following string:

$string = "5A3BB0020221209DES.ACT";

And I'm running the following regex:

preg_match_all('/(00)|(?<!^)(?<date>2\d{7}|\d{2}\.\d{2}\.\d{2}|\d{8}|\d{6})/', $string, $m);

When I dump the output of $m['date'], I get an array like this:

array(2) {
  [0]=>
  string(0) ""
  [1]=>
  string(8) "20221209"
}

I only want the second result. If I remove the (00) match group, or if nothing in the string matches (00), I don't get the extra blank string. Why are other match groups polluting the date group's results with blank strings? I tried adding more match groups, and each one that found a match added another blank string to the date results. I could make my code ignore the extra blank matches, but that seems like it should be unnecessary. The preg_match_all docs show this same behavior in their examples, but I didn't see any explanation of why it happens or how to get rid of it.

CodePudding user response:

You likely want to be using a non-capturing group, which is written (?:...).

E.g.: /(?:00)|(?<!^)(?<date>2\d{7}|\d{2}\.\d{2}\.\d{2}|\d{8}|\d{6})/

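Although I am not sure that the expression does what you think it does. E.g., if the input contains 00, that 00 is a complete match on its own. The date group takes no part in that match, so preg_match_all still records an empty string for it in the date results.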
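As a quick check of how this variant behaves, here is a minimal sketch using the string from the question and PHP's default PREG_PATTERN_ORDER. It suggests that the non-capturing group removes the numbered group for 00 but not the 00 match itself, so the date results still get a slot for it.

```php
<?php
$string = "5A3BB0020221209DES.ACT";

// Same pattern as in the question, with (00) made non-capturing.
$pattern = '/(?:00)|(?<!^)(?<date>2\d{7}|\d{2}\.\d{2}\.\d{2}|\d{8}|\d{6})/';
preg_match_all($pattern, $string, $m);

// $m[0] holds every full match; $m['date'] has one slot per full match.
var_dump($m[0]);      // "00" and "20221209"
var_dump($m['date']); // "" and "20221209"
```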

I would wager that the following is more what you might be after:

(?<!^)(?:00)?(?<date>(?:2\d{7}|\d{2}\.\d{2}\.\d{2}|\d{8}|\d{6}))

Which works out like this (regex diagram via Debuggex, image not included).
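To see what the combined pattern does with the string from the question, here is a small sketch. The delimiters and the variable names ($pattern, $count) are my own additions for illustration.

```php
<?php
$string = "5A3BB0020221209DES.ACT";

// The optional "00" prefix is consumed outside the named group,
// so it is part of the same match as the date instead of a separate match.
$pattern = '/(?<!^)(?:00)?(?<date>(?:2\d{7}|\d{2}\.\d{2}\.\d{2}|\d{8}|\d{6}))/';
$count = preg_match_all($pattern, $string, $m);

var_dump($count);     // number of full matches
var_dump($m[0]);      // the full match includes the "00" prefix
var_dump($m['date']); // only the date part
```

With this input there is a single match, so $m['date'] holds just "20221209" and no blank entry.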

CodePudding user response:

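Because this string has two matches, "00" and "20221209". preg_match_all records an entry for every group on every match, so the "00" match, in which the date group takes no part, contributes an empty string to the date results.

You may not be aware that the alternation operator has the lowest precedence of all the regex operators. You probably wanted: ("00" OR a lookbehind asserting you are not at the beginning), followed by what you're interested in. Instead you got: either "00" by itself is a complete match, OR a position that is not the beginning followed by one of the date patterns is a match.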
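The two separate matches can be made visible by printing every group side by side. This is a minimal sketch against the question's string and pattern; the loop and output format are only for illustration.

```php
<?php
$string = "5A3BB0020221209DES.ACT";
$pattern = '/(00)|(?<!^)(?<date>2\d{7}|\d{2}\.\d{2}\.\d{2}|\d{8}|\d{6})/';
preg_match_all($pattern, $string, $m);

// Walk the full matches and show which group took part in each one.
foreach ($m[0] as $i => $full) {
    printf(
        "match %d: full=%s group1=%s date=%s\n",
        $i,
        var_export($full, true),
        var_export($m[1][$i], true),
        var_export($m['date'][$i], true)
    );
}
```

The first match is the bare "00" (group 1 filled, date empty) and the second is the date (group 1 empty, date filled), which is where the blank string comes from.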

I'm guessing what you really want is something like:

(^|(?<=00))(?<date>2\d{7}|\d{2}\.\d{2}\.\d{2}|\d{8}|\d{6})
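A short sketch of this pattern against the question's string, with delimiters added by me. Because the "00" is only checked by a lookbehind and never matched on its own, there should be no separate "00" match to produce a blank date entry.

```php
<?php
$string = "5A3BB0020221209DES.ACT";

// Group 1 is either the start of the string or a position preceded by "00";
// both alternatives are zero-width, so the full match is just the date.
$pattern = '/(^|(?<=00))(?<date>2\d{7}|\d{2}\.\d{2}\.\d{2}|\d{8}|\d{6})/';
preg_match_all($pattern, $string, $m);

var_dump($m[0]);      // full matches
var_dump($m['date']); // date group, one entry per full match
```

Note that this accepts a date at the very start of the string, whereas the original (?<!^) excluded that position, so it is a guess at the intent rather than a drop-in equivalent.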