An underscore, a dot and a dash must always be followed by one or more alphanumeric characters witho

Time:02-24

I'm working on a school assignment that validates emails without using Regex. The premise of this exercise is to learn about methods and to practice our critical thinking. I understand that the code can be reduced to fewer lines.

Right now, I need to check all the conditions for the prefix of an email (the characters before '@'):

  • It contains at least one character.
  • It contains only alphanumeric characters, underscores(‘_’), periods(‘.’), and dashes(‘-’).
  • An underscore, a period, or a dash must always be followed by one or more alphanumeric characters.
  • The first character must be alphanumeric.

Examples of valid prefixes are: “abc-d”, “abc.def”, “abc”, “abc_def”.
Examples of invalid prefixes are: “abc-”, “abc..d”, “.abc”, “abc#def”.

I'm having a hard time figuring out the third condition. So far, I have these methods that meet the other conditions.

public static boolean isAlphanumeric(char c) {
    return Character.isLetterOrDigit(c);
}

public static boolean isValidPrefixChar(char preChar) {
    char[] prefixChar = new char[] {'-', '.', '_'};
    
    for (int i = 0; i < prefixChar.length; i++) {
        if (prefixChar[i] == preChar) {
            return true;
        } else if (isAlphanumeric(preChar)) {
            return true;
        }
    }
    return false;
}

public static boolean isValidPrefix(String emailPrefix) {
    boolean result = false;
    
    // To check if first character is alphanumeric
    if (isAlphanumeric(emailPrefix.charAt(0)) && emailPrefix.length() > 1) {
        for (int i = 0; i < emailPrefix.length(); i++) {
            // If email prefix respects all conditions, change result to true
            if (isValidPrefixChar(emailPrefix.charAt(i))) {
                result = true;
            } else {
                result = false;
                break;
            }
        }
    }
    return result;
}

CodePudding user response:

Let's look at your list:

  • It contains at least one character.
  • It contains only alphanumeric characters, underscores(‘_’), periods(‘.’), and dashes(‘-’).
  • An underscore, a period, or a dash must always be followed by one or more alphanumeric characters.
  • The first character must be alphanumeric.

As you say, 1, 2, and 4 are easy. Here's what I would do. My first line would check the length and return false if the prefix is empty. I would then set boolean lastWasSpecial = false and iterate over the characters. Inside the loop:

  • Check that it's a legal character (condition 2)
  • If index == 0, check that it's alphanumeric (condition 4)
  • If it's one of the specials:
    • If lastWasSpecial is set, return false
    • Set lastWasSpecial = true
  • Else set lastWasSpecial = false again

After the loop, return false if lastWasSpecial is still set, because a trailing special is not followed by anything.

Should be about 10 lines of easily-readable code.
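
The steps above could be sketched roughly as follows. This is only one possible reading of them: the class name PrefixCheck and the helper isSpecial are hypothetical names chosen for illustration, and the final return covers a prefix that ends with a special character such as "abc-".

```java
public class PrefixCheck {

    // Hypothetical helper: true for the three punctuation marks the assignment allows.
    static boolean isSpecial(char c) {
        return c == '_' || c == '.' || c == '-';
    }

    public static boolean isValidPrefix(String prefix) {
        // Condition 1: at least one character.
        if (prefix.isEmpty()) {
            return false;
        }
        boolean lastWasSpecial = false;
        for (int i = 0; i < prefix.length(); i++) {
            char c = prefix.charAt(i);
            boolean special = isSpecial(c);
            // Condition 2: only letters, digits, and the three specials.
            if (!special && !Character.isLetterOrDigit(c)) {
                return false;
            }
            // Condition 4: the first character must be alphanumeric.
            if (i == 0 && special) {
                return false;
            }
            // Condition 3, part one: two specials in a row are not allowed.
            if (special && lastWasSpecial) {
                return false;
            }
            lastWasSpecial = special;
        }
        // Condition 3, part two: the prefix must not end with a special.
        return !lastWasSpecial;
    }

    public static void main(String[] args) {
        String[] samples = {"abc-d", "abc.def", "abc", "abc_def", "abc-", "abc..d", ".abc", "abc#def"};
        for (String s : samples) {
            System.out.println(s + " -> " + isValidPrefix(s));
        }
    }
}
```

Running main on the examples from the question should report the first four as valid and the last four as invalid.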

CodePudding user response:

The algorithm can be optimised, but I tried to change just some lines and to use the same code style. I added the explanation in the comments.

public static boolean isAlphanumeric(char c) {
    return Character.isLetterOrDigit(c);
}

public static boolean isValidPrefixChar(char preChar) {
    char[] prefixChar = new char[]{'-', '.', '_'};

    for (int i = 0; i < prefixChar.length; i++) {
        if (prefixChar[i] == preChar) {
            return true;
        }
    }
    return false;
}

public static boolean isValidPrefix(String emailPrefix) {
    boolean result = false;

    // Check that the prefix is not empty and that its first character is alphanumeric
    if (!emailPrefix.isEmpty() && isAlphanumeric(emailPrefix.charAt(0))) {
        // a prefix made of a single alphanumeric character is already valid
        result = true;

        // this boolean is set to true when the next char has to be alphanumeric
        boolean nextHasToBeAlphaNumeric = false;

        // the for loop starts from 1 because char 0 has already been checked
        for (int i = 1; i < emailPrefix.length(); i++) {
            char character = emailPrefix.charAt(i);
            if (isValidPrefixChar(character)) {
                // the previous char was '.', '_' or '-': two of them cannot be adjacent
                if (nextHasToBeAlphaNumeric) {
                    result = false;
                    break;
                } else {
                    // the next char has to be alphanumeric
                    nextHasToBeAlphaNumeric = true;
                }
            } else if (isAlphanumeric(character)) {
                nextHasToBeAlphaNumeric = false;
            } else {
                // any other character is not allowed in the prefix
                result = false;
                break;
            }
        }

        // a trailing '.', '_' or '-' is not followed by an alphanumeric char
        if (nextHasToBeAlphaNumeric) {
            result = false;
        }
    }
    return result;
}

CodePudding user response:

local-part

FYI, the portion of the address before the COMMERCIAL AT sign (@) is called a local-part.

Avoid char

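Your code using the char type can break when it encounters characters outside the Basic Multilingual Plane (BMP). As a 16-bit value, a char cannot represent most Unicode characters; each of those needs a pair of char values (a surrogate pair).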

Code points

Use code points instead, when working with individual characters. A code point is the number assigned permanently to each of the over 140,000 characters defined in Unicode.

int[] codePoints = localPart.codePoints().toArray() ;
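
To make the difference concrete, here is a small sketch comparing char-based and code-point-based views of the same string. The class name CodePointDemo and the method sample are hypothetical, and the sample character U+1D54F (a mathematical letter outside the BMP) is only an illustrative choice.

```java
public class CodePointDemo {

    // Hypothetical sample: "a" followed by U+1D54F, a letter outside the BMP.
    public static String sample() {
        return "a" + new String(Character.toChars(0x1D54F));
    }

    public static void main(String[] args) {
        String s = sample();

        // The supplementary character occupies two char values (a surrogate pair).
        System.out.println(s.length());              // 3
        System.out.println(s.codePoints().count());  // 2

        // Testing a lone surrogate char gives a misleading answer.
        System.out.println(Character.isLetterOrDigit(s.charAt(1)));      // false
        // Testing the whole code point gives the expected answer.
        System.out.println(Character.isLetterOrDigit(s.codePointAt(1))); // true
    }
}
```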

Define an array, list, or set of your acceptable punctuation characters.

int codePoint = "-".codePointAt( 0 ) ;  // Annoying zero-based index counting. 

To verify that every punctuation character encountered is followed by a letter/digit, first make sure the punctuation mark is not the last character. If it is not the last, look ahead in the array to the following code point and test whether that code point is a letter or digit.

if( Character.isLetterOrDigit( codePoints[ i + 1 ] ) ) { … }
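
Putting these pieces together, a complete check might look like the sketch below. It is one possible arrangement rather than a definitive implementation: the names LocalPartCheck, PUNCTUATION and isValidLocalPart are hypothetical, the allowed punctuation is assumed to be the three marks from the question, and Set.of assumes Java 9 or later.

```java
import java.util.Set;

public class LocalPartCheck {

    // Assumption: the same three punctuation marks as in the question.
    private static final Set<Integer> PUNCTUATION =
            Set.of("-".codePointAt(0), ".".codePointAt(0), "_".codePointAt(0));

    public static boolean isValidLocalPart(String localPart) {
        int[] codePoints = localPart.codePoints().toArray();

        // At least one character, and the first must be a letter or digit.
        if (codePoints.length == 0 || !Character.isLetterOrDigit(codePoints[0])) {
            return false;
        }

        for (int i = 1; i < codePoints.length; i++) {
            if (PUNCTUATION.contains(codePoints[i])) {
                // A punctuation mark must not be last, and the code point
                // after it must be a letter or digit.
                boolean isLast = (i == codePoints.length - 1);
                if (isLast || !Character.isLetterOrDigit(codePoints[i + 1])) {
                    return false;
                }
            } else if (!Character.isLetterOrDigit(codePoints[i])) {
                // Anything else that is not a letter or digit is rejected.
                return false;
            }
        }
        return true;
    }
}
```

For example, isValidLocalPart("abc.def") returns true, while isValidLocalPart("abc..d") and isValidLocalPart("abc-") return false.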