Regex To Match String With All Words Contains Certain Format

Time:01-28

I want to validate a string field so that it only accepts strings in which every word has a certain format.
Example accepted string:

  1. #key;
  2. #key1; #key2;#key3;

Example rejected string:

  1. key;
  2. %key1X @key2X$key3X

My regex:

\B(\#[a-zA-Z0-9_; ]+\b)(\;)

It seems my regex still accepts a string as long as it contains one word in the valid format, while I only want it to be accepted if all of its words are in the correct format.
Current example:

%key1; %key2 #keysz;#key3; @key4;

The current example above is still accepted because it contains #keysz; and #key3;, while I want it to be rejected because it also contains %key1;, %key2 and @key4;.

I've done some searching and the closest I could find is this question, but it gives a result similar to my current regex.
What did I do wrong in my regex? What is the right regex?
Sorry if this is a dumb question, but I'm a newbie at regex.

CodePudding user response:

The main things needed are the start ^ and end $ anchors. The rest can be simplified too:

^( *#\w+;)+$


Breaking it down:

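  • ^ = start of the string
  •  * = zero or more spaces
  • # = a literal hash (it doesn't need escaping in a regex)
  • \w+ = one or more word characters (letters, digits and the underscore)
  • ; = a literal semicolon
  • ( ... )+ = one or more repetitions of the whole group
  • $ = end of the string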
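
As a quick check of the breakdown above, the sketch below runs the anchored pattern (with its + quantifiers written out) against the strings from the question. The helper name isValidKeyList is made up for illustration.

```javascript
// Hypothetical helper: true only when the WHOLE string is a sequence of
// "#word;" items, each optionally preceded by spaces.
const KEY_LIST = /^( *#\w+;)+$/;

function isValidKeyList(input) {
  return KEY_LIST.test(input);
}

// Sample strings taken from the question.
const samples = [
  "#key;",
  "#key1; #key2;#key3;",
  "key;",
  "%key1X @key2X$key3X",
  "%key1; %key2 #keysz;#key3; @key4;",
];

for (const s of samples) {
  console.log(JSON.stringify(s), isValidKeyList(s));
}
```

The first two strings should print true and the other three false. Note that a trailing space after the last ; is rejected by this pattern; adding " *" just before $ would allow it.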

If underscores can appear in the input but must not be accepted (\w allows them), then use:

^( *#[A-Za-z0-9]+;)+$
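
A small sketch of the difference between the two variants, using a made-up input that contains an underscore:

```javascript
// \w accepts letters, digits and the underscore.
const withUnderscore = /^( *#\w+;)+$/;
// The explicit character class leaves the underscore out.
const withoutUnderscore = /^( *#[A-Za-z0-9]+;)+$/;

const underscoreInput = "#my_key;"; // hypothetical input for illustration

console.log(withUnderscore.test(underscoreInput));       // true
console.log(withoutUnderscore.test(underscoreInput));    // false
console.log(withoutUnderscore.test("#key1; #key2;"));    // true
```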

CodePudding user response:

Your regex accepts the whole sentence because in your pattern (\B(\#[a-zA-Z0-9_; ]+\b)(\;)) you haven't specified where the match must start and end. So the regex engine will try to match at every position of the string on which you run the match.

You specify where the regex should match by adding anchors (^ for the beginning and $ for the end) to the pattern.
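
To illustrate the effect of anchors, here is a minimal sketch using a deliberately simplified pattern (#\w+; for a single key, not the pattern from the question). Without anchors the engine is free to find a match anywhere inside the string.

```javascript
const mixedInput = "%key1; %key2 #keysz;#key3; @key4;"; // example string from the question

const unanchoredKey = /#\w+;/;  // may match anywhere in the string
const anchoredKey = /^#\w+;$/;  // must span the entire string

console.log(unanchoredKey.test(mixedInput));     // true: a valid key exists somewhere
console.log(mixedInput.match(unanchoredKey)[0]); // "#keysz;"
console.log(anchoredKey.test(mixedInput));       // false: the whole string is not one key
console.log(anchoredKey.test("#key;"));          // true
```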

You can edit your pattern to look like this: /(?:\s|^)(#[a-zA-Z0-9_; ]+?);(?:\s|$)/gm

Explanation:

/(?:\s|^)  
- (?: starts a non-capturing group, meaning that whatever is matched inside these parentheses is not included in the result. \s|^ means the match must begin at a whitespace character or at the start of the string.

(#[a-zA-Z0-9_; ]+);
- () is a regular capturing group, which means that whatever it matches is included in the result.
You don't need to insert a '\' before every symbol: # and ; have no special meaning in a regex.

(?:\s|$)/
- another non-capturing group, requiring a whitespace character or the end of the string.

gm
- the global and multiline flags of a JavaScript regex

Here is an example:

let regex_pattern = /(?:\s|^)(#[a-zA-Z0-9_; ]+);(?=\s|$)/gm

let input1 = "     #key;" // string with just one word
let input2 = "#key1; #key2;#key3;" // string with several words, some not separated by whitespace
let input3 = "soemthing random  #key;andjointstring" // a word that fits the format but is joined to other text, so it is not a whole word

console.log(input1.match(regex_pattern)) // non-null: it matches
console.log(input2.match(regex_pattern)) // non-null: it matches
console.log(input3.match(regex_pattern)) // null: it doesn't match
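
One detail worth knowing when inspecting results: with the g flag, String.prototype.match returns only the full matches, not the capture groups. A sketch of reading the captured group with matchAll instead (the variable names are illustrative):

```javascript
const keyPattern = /(?:\s|^)(#[a-zA-Z0-9_; ]+);(?=\s|$)/gm;
const keyText = "#key1; #key2;#key3;";

// match() with the g flag: an array of full matches only.
const fullMatches = keyText.match(keyPattern);
console.log(fullMatches); // [ '#key1;' ]

// matchAll(): one result per match, with the capture group at index 1.
const capturedGroups = [...keyText.matchAll(keyPattern)].map((m) => m[1]);
console.log(capturedGroups); // [ '#key1' ]
```

Only #key1; is reported here, because #key2; is not followed by whitespace and #key3; is not preceded by it.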
