Validate only an email address username with BASH regex

Time:06-20

In: [email protected]...

I need to make sure that 'johnson' is valid by server/client standards (probably RFC 5322) for the username part of an email address. That is, Gmail and Thunderbird would accept it.

This question addresses full email addresses, which I don't need: How can I validate an email address using a regular expression?

This unpopular question is about JavaScript and doesn't have answers: Validating username part of email address

This answer to the first question above offers a semi-acceptable regex for a full email address, which I don't need. It seems there might be room for improvement, though I might not need it: https://stackoverflow.com/a/201378/10343144

My current best solution would be to take:

emailusername="$1"
testemail="${emailusername}@nonsense.com"
regex='some existing full-email regex'
if [[ "${testemail}" =~ ${regex} ]]; then
  echo "it works"
fi

But, that doesn't address the email username part specifically, and it's more expensive than validating only the username part.

Is adding a nonsense domain to the username for a regex check the best way?

Or is there a regex that can handle only the username part of an email address?

CodePudding user response:

Gmail is more restrictive than the RFC in respect of the accepted usernames (see Create a username):

  • “Abuse” and “Postmaster” are reserved
  • 6–30 characters long
  • can contain letters (a-z), numbers (0-9), and periods (.)
  • cannot contain [...] more than one period (.) in a row
  • can begin or end with [...] except periods (.)
  • periods (dots) don’t matter in Gmail addresses

Remark: dots are not counted toward the length of a username.

Then, for validating a Google username with bash you could do:

#!/bin/bash

username="$1"
username_nodots="${username//./}" 

if ! {
    (( ${#username_nodots} >=  6 )) && # this rule also excludes 'Abuse' 
    (( ${#username_nodots} <= 30 )) &&
    [[ $username =~ ^[[:alnum:]]+(\.[[:alnum:]]+)*$ ]] &&
    [[ $username != 'Postmaster' ]]
}
then
    echo "error: illegal google username: $username" >&2
    exit 1
fi
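
To try the same rules on several candidates without exiting the shell, the check can be wrapped in a function. This is only a minimal sketch: the name is_valid_gmail_username is hypothetical, forcing the C locale so that [[:alnum:]] matches ASCII only is my assumption, and so is comparing the reserved names case-insensitively.

```shell
#!/bin/bash

# Hypothetical helper (not part of the answer above): returns 0 when "$1"
# satisfies the Gmail username rules listed earlier, 1 otherwise.
is_valid_gmail_username() {
    local LC_ALL=C                      # assumption: restrict [[:alnum:]] to ASCII
    local username=$1
    local nodots=${username//./}        # dots do not count toward the length
    local re='^[[:alnum:]]+(\.[[:alnum:]]+)*$'

    # Length rule: 6 to 30 characters, dots excluded.
    (( ${#nodots} >= 6 && ${#nodots} <= 30 )) || return 1
    # Letters, digits and single dots only; no leading or trailing dot.
    [[ $username =~ $re ]] || return 1
    # Reserved names, compared case-insensitively (an assumption).
    case $(printf '%s' "$username" | tr '[:upper:]' '[:lower:]') in
        abuse|postmaster) return 1 ;;
    esac
    return 0
}

for candidate in john.smith j.doe .johnsmith john..smith postmaster; do
    if is_valid_gmail_username "$candidate"; then
        printf '%s: ok\n' "$candidate"
    else
        printf '%s: rejected\n' "$candidate"
    fi
done
```

With these inputs only john.smith passes: j.doe is too short once the dot is removed, .johnsmith and john..smith break the dot rules, and postmaster is reserved.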

Edit: following @tripleee's advice, i.e. using only standard shell constructs:

username="$1"
length=$(printf %s "$username" | tr -d '.' | wc -c)

[ "$length" -ge 6 ] || {
    printf '%s\n' 'too short' >&2
    exit 1
}
[ "$length" -le 30 ] || {
    printf '%s\n' 'too long' >&2
    exit 1
}
case $username in
    *[^[:alnum:].]*)
        printf '%s\n' 'illegal character' >&2
        exit 1
    ;;
    .*)
        printf '%s\n' 'starts with dot' >&2
        exit 1
    ;;
    *.)
        printf '%s\n' 'ends with dot' >&2
        exit 1
    ;;
    *..*)
        printf '%s\n' 'multiple dots in a row' >&2
        exit 1
    ;;
    Abuse|Postmaster)
        printf '%s\n' 'reserved username' >&2
        exit 1
    ;;
esac

CodePudding user response:

If your locale is C the following may work for you. It is inspired by the last regex you mention (which has not been checked against the RFC), and was not extensively tested:

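# Characters allowed in an unquoted (dot-atom) local part.
atext="A-Za-z0-9!#\$%&'*+/=?^_\`{|}~-"
# Not allowed unescaped inside a quoted string: TAB, LF, CR, space, '"', '\'.
qs1=$'\x09\x0a\x0d\x20\x22\x5c'
# Not allowed after a backslash inside a quoted string: LF, CR.
qs2=$'\x0a\x0d'
[[ "$localpart" =~ ^([$atext]+(\.[$atext]+)*|\"([^$qs1]|\\[^$qs2])*\")$ ]] && echo "yes"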
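
For reuse, the same test could be wrapped in a function, which also avoids appending a nonsense domain. This is a rough sketch under the same caveats as above (not checked against the RFC, not extensively tested). The name is_valid_localpart is hypothetical, and setting LC_ALL=C inside the function is my way of meeting the "locale is C" condition.

```shell
#!/bin/bash

# Hypothetical wrapper: returns 0 when "$1" matches the local-part regex,
# non-zero otherwise.
is_valid_localpart() {
    local LC_ALL=C                                  # ranges like A-Z assume the C locale
    local atext="A-Za-z0-9!#\$%&'*+/=?^_\`{|}~-"    # dot-atom characters
    local qs1=$'\x09\x0a\x0d\x20\x22\x5c'           # excluded unescaped in a quoted string
    local qs2=$'\x0a\x0d'                           # excluded after a backslash
    [[ $1 =~ ^([$atext]+(\.[$atext]+)*|\"([^$qs1]|\\[^$qs2])*\")$ ]]
}

for candidate in johnson john.smith .john john..smith 'john@smith'; do
    if is_valid_localpart "$candidate"; then
        printf '%s: yes\n' "$candidate"
    else
        printf '%s: no\n' "$candidate"
    fi
done
```

Note that this only checks the syntax of the local part; a provider such as Gmail may still refuse a name that passes.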