Regex: whitespace and &nbsp; in non-capturing group

Time:03-05

I am using a regex in PHP to match time strings. I would like to allow both whitespace and &nbsp; in the non-capturing group so that I get the following matches:

Match: 10pm
Match: 10 pm

This is the regex I'm using, but it is not matching items with &nbsp;:

(\b)(\d{1,2}:\d\d|\d{1,2})(?:\s|&nbsp;)(a\.?m\.?|p\.?m\.?)(\s|<|$|,)
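
To see what this pattern accepts, here is a small sketch in JavaScript rather than PHP (the pattern uses only syntax common to both flavours; the helper name matchesOriginal and the sample strings are made up for illustration). It suggests that the separator group (?:\s|&nbsp;) is mandatory and consumes exactly one separator, so 10pm and 10&nbsp; pm are rejected, while a single space or a single literal &nbsp; is accepted:

```javascript
// The question's pattern, unchanged. Checked here with JavaScript's regex
// engine as an illustration; PHP (PCRE) may treat \s differently for
// non-ASCII whitespace.
const originalPattern =
  /(\b)(\d{1,2}:\d\d|\d{1,2})(?:\s|&nbsp;)(a\.?m\.?|p\.?m\.?)(\s|<|$|,)/;

// Hypothetical helper: true when the pattern finds a time in `text`.
function matchesOriginal(text) {
  return originalPattern.test(text);
}

console.log(matchesOriginal("10 pm"));       // true  (one space)
console.log(matchesOriginal("10&nbsp;pm"));  // true  (literal entity)
console.log(matchesOriginal("10pm"));        // false (no separator)
console.log(matchesOriginal("10&nbsp; pm")); // false (two separators)
```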

CodePudding user response:

If you want to match both values, you could write and shorten the pattern to:

\b\d{1,2}(?::\d\d)?(?:\s?|&nbsp;)[ap]\.?m\b
  • \b A word boundary
  • \d{1,2} Match 1-2 digits
  • (?::\d\d)? Optionally match : followed by exactly 2 digits
  • (?:\s?|&nbsp;) Match an optional whitespace char or the literal text &nbsp;
  • [ap]\.?m Match either a or p, an optional dot, and then m
  • \b A word boundary; alternatively, use (?:\s|<|$|,) as in the original pattern
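
The shortened pattern can be tried out with a small sketch, written here in JavaScript for convenience (the syntax used is shared with PCRE; the helper name findTimes and the sample strings are illustrative assumptions, not part of the answer). Note that \s? allows at most one whitespace character, so two spaces between the digits and the suffix are not matched:

```javascript
// The answer's shortened pattern, with the g flag so every match is returned.
const timePattern = /\b\d{1,2}(?::\d\d)?(?:\s?|&nbsp;)[ap]\.?m\b/g;

// Hypothetical helper: returns all time-like substrings found in `text`,
// or an empty array when there are none.
function findTimes(text) {
  return text.match(timePattern) || [];
}

console.log(findTimes("Match: 10pm"));       // ["10pm"]
console.log(findTimes("Match: 10 pm"));      // ["10 pm"]
console.log(findTimes("Match: 10&nbsp;pm")); // ["10&nbsp;pm"]
console.log(findTimes("Match: 9:30 a.m"));   // ["9:30 a.m"]
console.log(findTimes("Match: 10  pm"));     // [] (two spaces)
```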

CodePudding user response:

/\b\d{1,2}(?:\s*(?:&nbsp;)?\s*)?(?:[ap]m\b|[ap]\.m\.)/

  • \b asserts position at a word boundary: (^\w|\w$|\W\w|\w\W)
  • \d matches a digit (equivalent to [0-9])
    • {1,2} matches the previous token between 1 and 2 times, as many times as possible, giving back as needed (greedy)
  • Non-capturing group (?:\s*(?:&nbsp;)?\s*)?
    • ? matches the previous token between zero and one times, as many times as possible, giving back as needed (greedy)
    • \s matches any whitespace character (equivalent to [\r\n\t\f\v ])
      • * matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
    • Non-capturing group (?:&nbsp;)?
      • ? matches the previous token between zero and one times, as many times as possible, giving back as needed (greedy)
      • &nbsp; matches the characters &nbsp; literally
    • \s matches any whitespace character (equivalent to [\r\n\t\f\v ])
      • * matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
  • Non-capturing group (?:[ap]m\b|[ap]\.m\.)
    • 1st alternative [ap]m\b
      • [ap] matches a single character in the list ap
      • m matches the character m literally
      • \b asserts position at a word boundary: (^\w|\w$|\W\w|\w\W)
    • 2nd alternative [ap]\.m\.
      • [ap] matches a single character in the list ap
      • \. matches the character . literally
      • m matches the character m literally
      • \. matches the character . literally
  • Global pattern flags
    • g modifier: global. All matches (don't return after first match)

console.log(`
  Match10pm<br>
  Match:100pm<br>
  Match:10pm<br>          - match
  Match:10  pm<br>        - match
  Match:10 pmm<br>
  Match: 10p.m<br>
  Match: 10p.m.<br>       - match
  Match: 10 pm <br>       - match
  Match: 10&nbsp;pm       - match
  Match: 10&nbsp;pmm
  Match: 10&nbsp; pm<br>  - match
  Match: 10 &nbsp;pm<br>  - match
  Match: 10 &nbsp; pm<br> - match`
    // see ... [https://regex101.com/r/9186yf/2]
    .match(/\b\d{1,2}(?:\s*(?:&nbsp;)?\s*)?(?:[ap]m\b|[ap]\.m\.)/g)
);