Find if RegEx matches exactly

Time:04-08

I'm new to RegEx, and wanted to make an expression that checks if a date is formatted properly (in the format of "Mar 22, 2022", for example).

This is the expression that I wrote:

const dateRegex = /[A-z][A-z][A-z]\s([0-9][0-9]|[0-9]),\s202[2-9]/g;

The problem is that although something like "Apr 24, 2022".match(dateRegex) works as needed, "Apr 24, 202222".match(dateRegex) matches with dateRegex and I don't want it to.

Can anyone please suggest a way to rewrite the RegEx, or an alternative solution to this problem? Thank you.

CodePudding user response:

Try this:

    const dateRegex = /^[A-Za-z]{3}\s[1-3]{0,1}[0-9]{1},\s202[2-9]{1}$/i
    
    var cases = [
        "Apr 1, 2022",
        "april, 1, 220222",
        "apr 22, 202",
        "ap 44 2022",
        "apr 44, 2022",
        "apr aa, 2022",
        "apr aa, 202a",
        "ap1 aa, 2022"
    ]
    
    cases.forEach(item => {
        var res = dateRegex.test(item);
        console.log("------------------")
        console.log("case: " + item)
        console.log(res)
    });

Breakdown:

        /^[A-Za-z]{3}\s[1-3]{0,1}[0-9]{1},\s202[2-9]{1}$/i
        --------------------------------------------------------------
        ^        - start of the string (no m flag, so not the start of every line)
        [A-Za-z] - any ASCII letter, upper or lower case
        {3}      - exactly 3 of the previous item
        \s       - one whitespace character (space, tab, newline, etc.)
        [1-3]    - one digit, 1 through 3
        {0,1}    - previous item may occur 0 or 1 times (same as ?)
        [0-9]    - one digit, 0 through 9
        {1}      - exactly 1 of the previous item (redundant)
        ,        - literal comma
        \s       - one whitespace character
        202      - literal 202 (restricting to this decade :)
        [2-9]    - one digit, 2 through 9
        {1}      - exactly 1 of the previous item (redundant)
        $        - end of the string
        /i       - case-insensitive flag (redundant, [A-Za-z] already covers both cases)
        
    

result:

        ------------------
        case: Apr 1, 2022
        true
        ------------------
        case: april, 1, 220222
        false
        ------------------
        case: apr 22, 202
        false
        ------------------
        case: ap 44 2022
        false
        ------------------
        case: apr 44, 2022
        false
        ------------------
        case: apr aa, 2022
        false
        ------------------
        case: apr aa, 202a
        false
        ------------------
        case: ap1 aa, 2022
        false
        ------------------
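Note that the pattern above only checks the shape of the string: it also accepts inputs such as "Apr 39, 2022", "Apr 0, 2022", and "Xyz 1, 2022". If real calendar dates matter, one option is to capture the parts and check them separately. The following is a minimal sketch, not part of the answer above; `MONTHS`, `SHAPE`, and `isValidDate` are hypothetical names, and it assumes the month must be a capitalized English three-letter abbreviation.

```javascript
// Hypothetical helper: shape check via regex, then calendar check via Date.
const MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
                "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"];

// Groups: 1 = month abbreviation, 2 = day (1-31, no leading zero), 3 = year (2022-2029).
const SHAPE = /^([A-Z][a-z]{2})\s([1-9]|[12][0-9]|3[01]),\s(202[2-9])$/;

function isValidDate(text) {
  const m = SHAPE.exec(text);
  if (m === null) return false;

  const monthIndex = MONTHS.indexOf(m[1]);
  if (monthIndex === -1) return false; // three letters, but not a month

  const day = Number(m[2]);
  const year = Number(m[3]);

  // Day 0 of the following month is the last day of this month.
  const daysInMonth = new Date(year, monthIndex + 1, 0).getDate();
  return day <= daysInMonth;
}

console.log(isValidDate("Apr 1, 2022"));    // true
console.log(isValidDate("Apr 31, 2022"));   // false, April has 30 days
console.log(isValidDate("Apr 24, 202222")); // false, trailing characters
```

The regex rejects malformed strings cheaply, and the `Date` lookup handles month lengths and leap years, which would be awkward to encode in the pattern itself.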

CodePudding user response:

Add a beginning "^" and an end "$" anchor so that the regex has to match the whole string.

It should look like this: ^[A-z][A-z][A-z]\s([0-9][0-9]|[0-9]),\s202[2-9]$

The problem with your current regex is that it succeeds whenever any part of the string matches. "Apr 24, 202222" contains "Apr 24, 2022", so match() finds that substring and ignores the extra digits.
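To illustrate the difference, here is a small sketch comparing an unanchored and an anchored pattern on the same inputs. Two deliberate changes from the pattern in the question are assumptions of this sketch: it uses [A-Za-z] rather than [A-z], because [A-z] also matches the characters `[`, `\`, `]`, `^`, `_`, and the backtick that sit between "Z" and "a" in ASCII, and it drops the g flag, because test() on a global regex keeps state in lastIndex between calls.

```javascript
// Unanchored: the pattern may match anywhere inside the string.
const unanchored = /[A-Za-z]{3}\s[0-9]{1,2},\s202[2-9]/;
// Anchored: ^ and $ force the pattern to cover the whole string.
const anchored = /^[A-Za-z]{3}\s[0-9]{1,2},\s202[2-9]$/;

const samples = ["Apr 24, 2022", "Apr 24, 202222", "on Apr 24, 2022"];

const results = samples.map((s) => ({
  input: s,
  unanchored: unanchored.test(s),
  anchored: anchored.test(s),
}));

// Only the first sample passes the anchored check; all three pass the unanchored one.
console.table(results);
```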

CodePudding user response:

Updated with word boundaries \b, positive lookbehind (?<=...) and lookahead (?=...)

You need quantifiers:

  • ? zero or one time

  • * zero or more times

  • {2} exactly two times

  • {1,5} one to five times

A quantifier declares how many times the item it follows may occur.
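As a quick illustration of the four quantifiers listed above, the sketch below tests each one against a passing and a failing input. The patterns and the `quantifierDemo` name are made up for this example; the anchors are there only so that each test checks the whole string.

```javascript
// Each entry is [should match, should not match].
const quantifierDemo = {
  // ?  : "b" may appear zero or one time
  optional: [/^ab?c$/.test("ac"), /^ab?c$/.test("abbc")],
  // *  : "b" may appear zero or more times
  star: [/^ab*c$/.test("abbbc"), /^ab*c$/.test("abxc")],
  // {2} : exactly two digits
  exactlyTwo: [/^[0-9]{2}$/.test("24"), /^[0-9]{2}$/.test("2")],
  // {1,5} : between one and five digits
  oneToFive: [/^[0-9]{1,5}$/.test("12345"), /^[0-9]{1,5}$/.test("123456")],
};

console.log(quantifierDemo);
```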

In your case:

  • A space or word boundary must precede the match: (?<=\s|\b)

  • Three letter month, first letter uppercase: [A-Z][a-z]{2}

  • Space, a number 1 to 31, comma, space: \s([1-2]?[1-9]|10|20|30|31),\s

  • 202, and 2 thru 9 (inclusive): 202[2-9]

  • Whitespace must follow the match: (?=\s)

/(?<=\s|\b)[A-Z][a-z]{2}\s([1-2]?[1-9]|10|20|30|31),\s202[2-9](?=\s)/g

Review: https://regex101.com/r/8ECRbd/1
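One thing to be aware of: because the lookahead (?=\s) requires a whitespace character after the date, a date at the very end of the input (such as the bare string "Apr 24, 2022" from the question) does not match. The sketch below shows this and a possible variant that also accepts the end of the input via (?=\s|$). The names `strict` and `relaxed` are made up here, and the variant is a suggestion rather than part of the answer.

```javascript
// Pattern from the answer: whitespace is required after the date.
const strict = /(?<=\s|\b)[A-Z][a-z]{2}\s([1-2]?[1-9]|10|20|30|31),\s202[2-9](?=\s)/g;
// Assumed variant: whitespace OR end of input may follow the date.
const relaxed = /(?<=\s|\b)[A-Z][a-z]{2}\s([1-2]?[1-9]|10|20|30|31),\s202[2-9](?=\s|$)/g;

const text = "Aug 2, 2022 to Aug 12, 2022";

// match() returns null when nothing matches, so fall back to an empty array.
const strictMatches = text.match(strict) || [];
const relaxedMatches = text.match(relaxed) || [];

console.log(strictMatches);  // [ 'Aug 2, 2022' ]
console.log(relaxedMatches); // [ 'Aug 2, 2022', 'Aug 12, 2022' ]

// Extra digits after the year are still rejected by the variant.
const trailingDigits = "Aug 2, 202222".match(relaxed);
console.log(trailingDigits); // null
```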

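    // Read the sample text from the <pre> element below.
    const pre = document.querySelector('pre');
    const txt = pre.innerText;

    // Capture group 1 wraps the whole date so it can be re-inserted as $1.
    const rgx = /(?<=\s|\b)([A-Z][a-z]{2}\s([1-2]?[1-9]|10|20|30|31),\s202[2-9])(?=\s)/g;

    // Wrap every match in <mark> to highlight it.
    const highlighted = txt.replaceAll(rgx, `<mark>$1</mark>`);
    pre.innerHTML = highlighted;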
<pre>
Matches Mmm (1-31), 202(2-9)
----------------------------
Mar 3, 2022
May 10, 2026
Jan 1, 2023
Jan 31, 2023
Anything at the beginning or the end of pattern must be whitespace.
 Aug 2, 2022
Aug 2, 2022 
Valid matches can be on the same line 
Aug 2, 2022 to Aug 12, 2022

These will not match
--------------------
First letter is lowercase
aug 2, 2022
Second and/or third letter is uppercase
AUG 2, 2022
Year is not within 2022 and 2029
Aug 2, 2021
Day is padded with a zero
Aug 02, 2022
Day exceeds 31
Aug 39, 2022
Extra character before match
qAug 2, 2022
AAug 2, 2022
Extra character after match
Aug 2, 202222
Aug 2, 2022$
</pre>
