Regular expression that treats a string containing '#' as illegal input

Time:08-05

I wrote the regular expression (https?:\/\/) ([a-x]*)?.[a-z]*.(com|io|cn|net), which is meant to enforce these rules:

  1. Must start with http or https
  2. Must end with com, cn, io, or net
  3. Domain names can only consist of numbers, letters, and underscores
  4. The subdomain can be empty, so both 'http://123.cn' and 'https://www.123.cn' are valid

However, it also accepts 'http://ww#.123.com' as valid. What is wrong with my expression, and how can I reject input containing '#'?

CodePudding user response:

If you paste the expression into an online regex tester (such as regex101.com), it will show that the . is not escaped as \. An unescaped . matches any single character, so it also matches the # character.
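To see the difference in isolation, here is a minimal Python sketch. The two toy patterns and the matches helper are made up for illustration; they are not the asker's full expression.

```python
import re

# Hypothetical toy patterns for illustration only.
UNESCAPED = re.compile(r"ww.123")   # '.' is a wildcard: any character except a newline
ESCAPED = re.compile(r"ww\.123")    # '\.' matches only a literal dot


def matches(pattern: re.Pattern, text: str) -> bool:
    """Return True when the whole of text matches pattern."""
    return pattern.fullmatch(text) is not None


print(matches(UNESCAPED, "ww.123"))  # True
print(matches(UNESCAPED, "ww#123"))  # True: the wildcard accepts '#'
print(matches(ESCAPED, "ww.123"))    # True
print(matches(ESCAPED, "ww#123"))    # False: only a real dot is accepted
```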

Try: ^(https?:\/\/)([a-z0-9_]*\.)?[a-z0-9_]*\.(com|io|cn|net)$ and you may get what you're looking for. It escapes the dots and anchors the match to the whole string.

Note that your original regex did not allow digits or underscores in the domain names, even though rule 3 calls for them.
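As a quick check, the suggested expression can be run against the three URLs from the question. This is a sketch in Python (the question does not say which regex engine is in use), and the name is_valid_url is made up for the example.

```python
import re

# Pattern copied from the answer. The ^ and $ anchors are redundant with
# fullmatch() but are kept so the pattern reads the same as the answer's.
URL_PATTERN = re.compile(r"^(https?:\/\/)([a-z0-9_]*\.)?[a-z0-9_]*\.(com|io|cn|net)$")


def is_valid_url(url: str) -> bool:
    """Hypothetical helper: True when url satisfies the four rules."""
    return URL_PATTERN.fullmatch(url) is not None


for candidate in ("http://123.cn", "https://www.123.cn", "http://ww#.123.com"):
    print(candidate, is_valid_url(candidate))
# http://123.cn True
# https://www.123.cn True
# http://ww#.123.com False
```

One thing to be aware of: because the labels use *, an empty domain such as 'http://.com' also passes. If that matters, changing * to + should require at least one character in each label.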
