Why does <input type="email"> mark valid email addresses as invalid?

Time:01-03

Per Wikipedia, these are valid email addresses.

" "@example.org
"john..doe"@example.org
"very.(),:;<>[]\".VERY.\"very@\\ \"very\".unusual"@strange.example.com
postmaster@[123.123.123.123]
postmaster@[IPv6:2001:0db8:85a3:0000:0000:8a2e:0370:7334]

<input type="email"> rejects them as invalid.

Furthermore, per Wikipedia, this address is invalid.

1234567890123456789012345678901234567890123456789012345678901234 [email protected]

Yet <input type="email"> accepts it as valid.

Why is the implementation of <input type="email"> so imprecise?

I understand that the HTML standard specifies a particular validation regex that is consistent with this behaviour. But why is it formally incorrect?

Please note that I am not asking for a better regex than the default one or any other practical solution. I am asking for an explanation of why someone thought validating emails in a formally incorrect way is a good idea.

CodePudding user response:

The browser validates the value against a default regular expression.

According to the documentation, the default pattern is:

/^[a-zA-Z0-9.!#$%&'*+\/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/
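To see how this pattern treats the addresses from the question, here is a minimal sketch that runs the regex in plain JavaScript. It only exercises the published pattern, not a browser's actual implementation; the names `HTML_EMAIL_RE`, `matchesHtmlEmailPattern` and `overlongLocalPart` are made up for illustration, and the over-long address is a hypothetical stand-in with a 65-character local part.

```javascript
// Pattern published in the HTML Standard for <input type="email">.
const HTML_EMAIL_RE = /^[a-zA-Z0-9.!#$%&'*+\/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/;

// Illustrative helper (not a browser API).
function matchesHtmlEmailPattern(value) {
  return HTML_EMAIL_RE.test(value);
}

// Addresses from the question that Wikipedia lists as valid.
const rfcValidSamples = [
  '" "@example.org',
  '"john..doe"@example.org',
  'postmaster@[123.123.123.123]',
  'postmaster@[IPv6:2001:0db8:85a3:0000:0000:8a2e:0370:7334]',
];

// Hypothetical address whose local part exceeds the 64-octet limit.
const overlongLocalPart = 'a'.repeat(65) + '@example.com';

for (const address of [...rfcValidSamples, overlongLocalPart]) {
  console.log(JSON.stringify(address), matchesHtmlEmailPattern(address));
}
```

The pattern has no notion of quoted local parts, domain literals in square brackets, or a length limit on the local part, which is why the first four samples fail and the over-long one passes.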

To apply stricter rules of your own, set a regex with the pattern attribute:

<input pattern="regular_exp">

Example

<input
  type="email"
  pattern="^[a-zA-Z0-9]+(?:\.[a-zA-Z0-9]+)*@[a-zA-Z0-9]+(?:\.[a-zA-Z0-9]+)*$"
/>
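Note that on a type="email" input the pattern attribute is checked in addition to the built-in email check, so it can only narrow what is accepted, never widen it. The sketch below emulates that combination outside the browser. The function names are hypothetical, and compiling the pattern as `^(?:...)$` with the `u` flag is an approximation of how browsers anchor the attribute value.

```javascript
// Pattern published in the HTML Standard for <input type="email">.
const HTML_EMAIL_RE = /^[a-zA-Z0-9.!#$%&'*+\/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/;

// Approximation (assumption): the attribute value must match the whole input.
function compilePatternAttribute(pattern) {
  return new RegExp('^(?:' + pattern + ')$', 'u');
}

// Both constraints must hold for the field to count as valid.
function passesEmailInputWithPattern(value, pattern) {
  return HTML_EMAIL_RE.test(value) && compilePatternAttribute(pattern).test(value);
}

// The custom pattern from the example above, written as a JS string.
const customPattern =
  '^[a-zA-Z0-9]+(?:\\.[a-zA-Z0-9]+)*@[a-zA-Z0-9]+(?:\\.[a-zA-Z0-9]+)*$';

console.log(passesEmailInputWithPattern('john.doe@example.org', customPattern));
console.log(passesEmailInputWithPattern('john+tag@example.org', customPattern));
console.log(passesEmailInputWithPattern('" "@example.org', '.*'));
```

The last call shows that even a match-everything pattern cannot rescue an address the built-in check rejects.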

CodePudding user response:

The W3C HTML specification defines the expected email validation as follows:

A single e-mail address.

Value: Any string that matches the following [ABNF] production:

1*( atext / "." ) "@" ldh-str 1*( "." ldh-str )

…where atext is as defined in [RFC 5322], and ldh-str is as defined in [RFC 1034].

That is, any string which matches the following regular expression:

/^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9-]+(?:\.[a-zA-Z0-9-]+)*$/
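This regex is looser in the domain part than the pattern quoted in the other answer (the one with the {0,61} quantifiers). A small sketch of the difference, with hypothetical constant names and made-up test addresses:

```javascript
// Pattern quoted in this answer (domain labels are any run of letters, digits, hyphens).
const LOOSE_DOMAIN_RE = /^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9-]+(?:\.[a-zA-Z0-9-]+)*$/;

// Pattern quoted in the other answer (labels start and end with a letter or digit, max 63 chars).
const STRICT_DOMAIN_RE = /^[a-zA-Z0-9.!#$%&'*+\/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/;

const hyphenEdges = 'user@-example-.com';                // label starts and ends with "-"
const longLabel = 'user@' + 'a'.repeat(64) + '.com';     // 64-character label
const noDot = 'user@localhost';                          // domain without a dot

for (const address of [hyphenEdges, longLabel, noDot]) {
  console.log(address, LOOSE_DOMAIN_RE.test(address), STRICT_DOMAIN_RE.test(address));
}
```

Both regexes accept a domain without a dot, even though the ABNF production above requires at least one "." in the domain.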

According to MDN, the browser checks only whether the value is properly formatted. It does not check whether the address is actually valid.

If you want to accept a larger set of addresses, you can validate the value yourself in JavaScript or on the server.
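As a rough illustration of doing the check yourself, the sketch below uses a deliberately permissive rule: something before the last "@", something after it, and a local part of at most 64 octets. This is an assumption chosen for the example, not a full RFC 5322 parser, and `looseEmailCheck` is a hypothetical name. Because the built-in check on type="email" cannot be relaxed, a check like this would need a type="text" input.

```javascript
const MAX_LOCAL_PART_OCTETS = 64; // local-part limit from RFC 5321

// Permissive check: non-empty local part, non-empty domain, length limit.
function looseEmailCheck(value) {
  // The domain cannot contain "@", so the last "@" is the separator.
  const at = value.lastIndexOf('@');
  if (at < 1 || at === value.length - 1) return false;
  const localPart = value.slice(0, at);
  const octets = new TextEncoder().encode(localPart).length;
  return octets <= MAX_LOCAL_PART_OCTETS;
}

console.log(looseEmailCheck('" "@example.org'));
console.log(looseEmailCheck('postmaster@[123.123.123.123]'));
console.log(looseEmailCheck('a'.repeat(65) + '@example.com'));

// Possible browser wiring (the "#email" element is hypothetical):
// const input = document.querySelector('#email');
// input.addEventListener('input', () => {
//   input.setCustomValidity(looseEmailCheck(input.value) ? '' : 'Please enter an email address.');
// });
```

Whatever rule runs in the browser, the server should repeat its own validation, since client-side checks can be bypassed.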
