Javascript regex to match more than one type of special characters between URLs

Time:05-16

I'm trying to make sure that at most two types of special character (out of semi-colon, comma, and space) are used to separate URLs in a string.

Valid case:

  • Can contain letters and digits
  • Can contain the special characters : / .
  • Can use up to two types of special characters to separate URLs: semi-colon alone, comma alone, spaces with semi-colon, or spaces with comma
  • Cannot mix the special characters semi-colon and comma when separating URLs

For example, this should match, as it uses only one type of special character (semi-colon):

https://hello.com/example1;https://hello.com.com/example2;https://hello.com.com/example3

This should pass as it mixes two types of allowed special characters (space and semi-colon):

https://hello.com/example1; https://hello.com.com/example2 ;https://hello.com.com/example3

This should pass as it mixes two types of special characters (comma and semi-colon):

https://hello.com/example1; https://hello.com.com/example2 , https://hello.com.com/example3 ; https://hello.com.com/example4

This should fail as it mixes two types of special characters that are not allowed (comma and semi-colon) between the URLs:

https://hello.com/example1;,https://hello.com.com/example2,https://hello.com.com/example3

This is my regex:

^[A-Za-z0-9:=_\/\\.\\?]+(?=([,;\s])|$)(?:\1[A-Za-z0-9:=_\/\\.\\?]+)*$

Currently it only matches when the string uses a single type of special character, but I need it to allow multiple types between the URLs, as shown here: https://regex101.com/r/i9BF6X/1
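A minimal sketch of why this happens, assuming each character class in the pattern is meant to be followed by a `+` quantifier: the lookahead captures the first delimiter character into group 1, and the backreference `\1` then requires that exact single character before every later URL. The variable name `askerPattern` and the short test strings are only illustrative.

```javascript
// The pattern from the question, assuming `+` after each character class.
const askerPattern =
  /^[A-Za-z0-9:=_\/\\.\\?]+(?=([,;\s])|$)(?:\1[A-Za-z0-9:=_\/\\.\\?]+)*$/;

// One delimiter type, no spaces: group 1 captures ";" and \1 matches each time.
console.log(askerPattern.test("https://a.com/1;https://a.com/2;https://a.com/3")); // true

// A space after the semi-colon: \1 matches ";" but the URL class
// cannot match the space that follows, so the whole match fails.
console.log(askerPattern.test("https://a.com/1; https://a.com/2")); // false

// Comma first, semi-colon later: \1 is fixed to "," so ";" is rejected.
console.log(askerPattern.test("https://a.com/1,https://a.com/2;https://a.com/3")); // false
```

Because the backreference pins the delimiter to one exact character, optional spaces and a second delimiter type cannot be accepted without changing the structure of the pattern.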

Answer:

Given the four examples that you provided, my understanding of the problem is that both delimiters (comma and semi-colon) can appear within the same match (as in the 3rd example) if and only if they do not separate the same two URLs (as in the 4th example).
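This reading of the rule can also be sketched without one large regex, by splitting on a single delimiter and checking each piece. This is only an illustration under that interpretation: `isUrlLike` and `hasValidSeparators` are hypothetical helper names, and the token check is deliberately loose (it only allows the characters the question lists).

```javascript
// Hypothetical helper: a loose check for one URL-like token, allowing
// only letters, digits, and the characters : / . from the question.
const isUrlLike = (token) => /^[A-Za-z0-9:\/.]+$/.test(token);

// Split on exactly one comma or semi-colon with optional spaces around it.
// Two delimiters in a row (";,") leave an empty token, which fails the check.
function hasValidSeparators(input) {
  const tokens = input.split(/ *[;,] */);
  return tokens.every(isUrlLike);
}

console.log(hasValidSeparators(
  "https://hello.com/example1; https://hello.com.com/example2 , https://hello.com.com/example3"
)); // true
console.log(hasValidSeparators(
  "https://hello.com/example1;,https://hello.com.com/example2"
)); // false
```

Unlike the regex in the answer, this sketch also accepts a single URL with no delimiter at all, and it does not require the `https:` prefix.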

One option could be using the following regex:

^https:\/(\/[\w\.]+)+( *(;|,) *https:\/(\/[\w\.]+)+)+$

Explanation:

  • ^: beginning of string
  • https:\/(\/[\w\.]+)+: first URL
    • https:\/: the literal text https:/
    • (\/[\w\.]+)+: one or more sequences of a slash followed by alphanumeric characters, underscores, and dots
  • ( *(;|,) *https:\/(\/[\w\.]+)+)+: one or more sequences of "delimiter URL"
    •  *(;|,) *: optional spaces, a comma or semi-colon, optional spaces
    • https:\/: the literal text https:/
    • (\/[\w\.]+)+: one or more sequences of a slash followed by alphanumeric characters, underscores, and dots
  • $: end of string
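As a quick check, the regex can be run against the four examples from the question. This is a small sketch assuming the `+` quantifiers and the trailing `$` described in the explanation above; the names `urlList` and `examples` are just for illustration.

```javascript
// The answer's pattern: one URL, then one or more "delimiter URL" groups.
const urlList = /^https:\/(\/[\w\.]+)+( *(;|,) *https:\/(\/[\w\.]+)+)+$/;

const examples = [
  // 1. semi-colons only
  "https://hello.com/example1;https://hello.com.com/example2;https://hello.com.com/example3",
  // 2. semi-colons with spaces
  "https://hello.com/example1; https://hello.com.com/example2 ;https://hello.com.com/example3",
  // 3. commas and semi-colons, never between the same two URLs
  "https://hello.com/example1; https://hello.com.com/example2 , https://hello.com.com/example3 ; https://hello.com.com/example4",
  // 4. ";," between the same two URLs
  "https://hello.com/example1;,https://hello.com.com/example2,https://hello.com.com/example3",
];

// Logs true, true, true, false (one result per example, in order).
for (const s of examples) {
  console.log(urlList.test(s), s);
}
```

Note that because the final group uses `+`, a string containing a single URL with no delimiter does not match; replacing that last `+` with `*` would accept it.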
