How to write preg_match for a date followed by specific string?

Time:03-29

I want to extract a date from several HTML documents. The date always follows this pattern:

  1. The first three letters of the month, with the first letter in uppercase, e.g. Jan.
  2. The two-digit day of the month, e.g. 09.
  3. A comma as a separator.
  4. The four-digit year, e.g. 2022.

A sample of a complete date is Jan 09, 2022

I want to extract only those dates which are wrapped in span tags. So, the complete pattern is

<span>Jan 09, 2022</span>

I am not good at writing preg_match patterns. Can anyone please help me?

CodePudding user response:

<span>(\w{3} \d{1,2}, \w{4})<\/span>

\w is a meta-character for the set [a-zA-Z0-9_], so it matches digits and underscores as well as letters.

{3} means exactly three times.

\d is a meta-character for the set [0-9].

{1,2} means once or twice.

Try it: https://regex101.com/r/tNRa73/1
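
Since \w accepts digits and underscores too, the pattern above also matches strings that are not dates. A stricter variant is sketched below, assuming (as the question states) that the month is always one uppercase letter followed by two lowercase letters, the day is always two digits, and the year is always four digits. The variable names and the sample strings are illustrative only, not part of the original answer.

```php
<?php
// Loose pattern from the answer above.
$loose = '/<span>(\w{3} \d{1,2}, \w{4})<\/span>/';

// Stricter sketch (assumption): capitalised three-letter month,
// two-digit day, four-digit year.
$strict = '/<span>([A-Z][a-z]{2} \d{2}, \d{4})<\/span>/';

$good = '<span>Jan 09, 2022</span>';
$bad  = '<span>123 09, ab_1</span>'; // not a date, but \w accepts it

var_dump(preg_match($loose, $good));  // int(1)
var_dump(preg_match($loose, $bad));   // int(1), a false positive
var_dump(preg_match($strict, $good)); // int(1)
var_dump(preg_match($strict, $bad));  // int(0)
```

Note that even the stricter pattern only checks the shape of the text; it would still accept something like Xyz 99, 0000.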

$pattern = '/<span>(\w{3} \d{1,2}, \w{4})<\/span>/'; 

preg_match(
  $pattern,
  $html,
  $matches // <-- The results will be added to this new variable.
);

$matches[1]; // The date will be at index 1 because it was captured by
             // the first "capture group", i.e. set of parens.
             // Index 0 holds the whole match, including the span tags.


// If you expect multiple dates in one HTML document, then use:
preg_match_all(
  $pattern,
  $html,
  $matches
);

$matches[1]; // Now, index 1 is an array of all matches of
             // the first "capture group".
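
The snippets above can be combined into a small runnable script. This is only a sketch: the function name extractSpanDates and the sample HTML are hypothetical, and it assumes the span tags carry no attributes and wrap the date with no extra whitespace, exactly as in the question.

```php
<?php
// Hypothetical helper (not from the original answer): returns every date
// found between <span> and </span> in the given HTML string.
function extractSpanDates(string $html): array
{
    $pattern = '/<span>(\w{3} \d{1,2}, \w{4})<\/span>/';

    // preg_match_all() returns the number of full matches, or false on error.
    $count = preg_match_all($pattern, $html, $matches);
    if ($count === false || $count === 0) {
        return [];
    }

    // $matches[0] holds the full matches (with the tags),
    // $matches[1] holds the text captured by the first group.
    return $matches[1];
}

// Illustrative input: only the two dates wrapped in span tags should be returned.
$html = '<p>Mar 01, 2021</p><span>Jan 09, 2022</span><div><span>Feb 14, 2023</span></div>';

print_r(extractSpanDates($html)); // two entries: Jan 09, 2022 and Feb 14, 2023
```

Returning an empty array when nothing matches keeps the caller from reading an undefined index.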