How do I get C to successfully match a regex?

Time:12-29

So, I am trying to check the format of a key using the regex.h library in C. This is my code:

#include <stdio.h>
#include <regex.h>

int match(char *reg, char *string)
{
    regex_t regex;
    int res;

    res = regcomp(&regex, reg, 0);
    if (res)
    {
        fprintf(stderr, "Could not compile regex\n");
        return 1;
    }

    res = regexec(&regex, string, 0, NULL, 0);
    return res;
}

int main(void)
{
    char *regex = "[\\w-]{24}\\.[\\w-]{6}\\.[\\w-]{27}|mfa\\.[\\w-]{84}";
    char *key = "xxxxxxxxxxxxxxxxxxxxxxxx.xxxxxx.xxxxxxxxxxxxxxxxxxxxxxxxxxx";

    if (match(regex, key) == 0) printf("Valid key!\n");
    else printf("Invalid key!\n");

    return 0;
}

When I run this code, I get the output:

Invalid key!

Why is this happening? If I test the same key with the same regex in Node.js, the key does match:

> const regex = new RegExp("[\\w-]{24}\\.[\\w-]{6}\\.[\\w-]{27}|mfa\\.[\\w-]{84}");
undefined
> const key = "xxxxxxxxxxxxxxxxxxxxxxxx.xxxxxx.xxxxxxxxxxxxxxxxxxxxxxxxxxx";
undefined
> regex.test(key)
true

How could I get the right result using C?

Thanks in advance,
Robin

CodePudding user response:

There are at least two issues here and one extra potential problem:

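  • Limiting quantifiers such as {24} and the alternation operator | are only treated as operators in the POSIX ERE flavor. With cflags set to 0, regcomp compiles the pattern as a POSIX BRE, where { and } are literal characters (the interval syntax there is \{n\}) and | is a literal too. As pointed out in the comments, you need to compile the pattern with the REG_EXTENDED option (i.e. res = regcomp(&regex, reg, REG_EXTENDED)).
  • The \w shorthand character class does not work as a word-character pattern inside POSIX bracket expressions. There the backslash is a literal character, so [\w-] matches only a backslash, w, or -. Replace \w with [:alnum:]_, i.e. [\w-] must become [[:alnum:]_-]. The pattern will be: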
char *regex = "[[:alnum:]_-]{24}\\.[[:alnum:]_-]{6}\\.[[:alnum:]_-]{27}|mfa\\.[[:alnum:]_-]{84}";
  • Besides, if the whole string must match one of the two alternatives exactly, wrap the entire pattern in a group and add the ^ and $ anchors at both ends. Otherwise regexec reports success as soon as any substring matches. The pattern will be:
char *regex = "^([[:alnum:]_-]{24}\\.[[:alnum:]_-]{6}\\.[[:alnum:]_-]{27}|mfa\\.[[:alnum:]_-]{84})$";
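The three points above can be checked in isolation with short patterns. The following is a minimal sketch, not part of the original answer: regex_matches is a hypothetical helper name, and the behaviour described after the block assumes a POSIX-conforming regex.h.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper (not from the original answer): returns 1 if `text`
   matches `pattern` compiled with `cflags`, 0 if it does not match, and -1
   if the pattern fails to compile. */
int regex_matches(const char *pattern, int cflags, const char *text)
{
    regex_t re;
    int rc;

    /* REG_NOSUB: only a yes/no answer is needed, no submatch offsets */
    if (regcomp(&re, pattern, cflags | REG_NOSUB) != 0)
        return -1;

    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re); /* release what regcomp allocated */
    return rc == 0 ? 1 : 0;
}
```

With this helper, regex_matches("^a{2}$", 0, "aa") should return 0, because BRE reads the braces literally, while the same call with REG_EXTENDED should return 1. regex_matches("^[\\w-]+$", REG_EXTENDED, "xyz") should return 0, since the bracket expression only contains a backslash, w, and -. Finally, "[[:alnum:]_-]{3}" matches "!!abc!!" as a substring, whereas "^[[:alnum:]_-]{3}$" does not.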

See this C demo:

#include <stdio.h>
#include <regex.h>

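int match(const char *reg, const char *string)
{
    regex_t regex;
    int res;

    /* REG_EXTENDED selects POSIX ERE, so {n} and | work unescaped */
    res = regcomp(&regex, reg, REG_EXTENDED);
    if (res)
    {
        fprintf(stderr, "Could not compile regex\n");
        return 1;
    }

    res = regexec(&regex, string, 0, NULL, 0);
    regfree(&regex); /* release the memory allocated by regcomp */
    return res;
}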

int main(void)
{
    char *regex = "^([[:alnum:]_-]{24}\\.[[:alnum:]_-]{6}\\.[[:alnum:]_-]{27}|mfa\\.[[:alnum:]_-]{84})$";
    char *key = "xxxxxxxxxxxxxxxxxxxxxxxx.xxxxxx.xxxxxxxxxxxxxxxxxxxxxxxxxxx";

    if (match(regex, key) == 0) printf("Valid key!\n");
    else printf("Invalid key!\n");

    return 0;
}
// => Valid key!
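
The demo above prints only "Could not compile regex" when regcomp fails, and its match function cannot distinguish an invalid pattern from a key that does not match. A possible refinement, sketched here with the hypothetical helper names compile_ere and key_is_valid (neither comes from the original answer), uses the standard regerror function to report why compilation failed:

```c
#include <regex.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: compiles `pattern` as a POSIX ERE. Returns 0 on
   success. On failure, returns the regcomp error code and writes a
   human-readable message into errbuf. */
int compile_ere(regex_t *re, const char *pattern,
                char *errbuf, size_t errbuf_size)
{
    int rc = regcomp(re, pattern, REG_EXTENDED | REG_NOSUB);

    if (rc != 0)
        regerror(rc, re, errbuf, errbuf_size);
    return rc;
}

/* Hypothetical helper: returns 1 if `key` matches, 0 if it does not,
   and -1 if the pattern itself is invalid (the reason goes to stderr). */
int key_is_valid(const char *pattern, const char *key)
{
    regex_t re;
    char msg[128];
    int rc;

    if (compile_ere(&re, pattern, msg, sizeof msg) != 0)
    {
        fprintf(stderr, "Could not compile regex: %s\n", msg);
        return -1;
    }

    rc = regexec(&re, key, 0, NULL, 0);
    regfree(&re); /* only free a successfully compiled regex */
    return rc == 0 ? 1 : 0;
}
```

The exact wording of the message is implementation-defined, so it is best used for diagnostics only and not compared against fixed strings.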