I'm working on a school assignment that validates emails without using Regex. The premise of this exercise is to learn about methods and to practice our critical thinking. I understand that the code can be reduced to fewer lines.
Right now, I need to check all of the conditions for the prefix of an email (the characters before '@'):
- It contains at least one character.
- It contains only alphanumeric characters, underscores(‘_’), periods(‘.’), and dashes(‘-’).
- An underscore, a period, or a dash must always be followed by one or more alphanumeric characters.
- The first character must be alphanumeric.
Examples of valid prefixes are: “abc-d”, “abc.def”, “abc”, “abc_def”.
Examples of invalid prefixes are: “abc-”, “abc..d”, “.abc”, “abc#def”.
I'm having a hard time figuring out the third condition. So far, I have these methods that meet the other conditions.
public static boolean isAlphanumeric(char c) {
return Character.isLetterOrDigit(c);
}
public static boolean isValidPrefixChar(char preChar) {
char[] prefixChar = new char[] {'-', '.', '_'};
for (int i = 0; i < prefixChar.length; i++) {
if (prefixChar[i] == preChar) {
return true;
} else if (isAlphanumeric(preChar)) {
return true;
}
}
return false;
}
public static boolean isValidPrefix(String emailPrefix) {
boolean result = false;
// To check if first character is alphanumeric
if (isAlphanumeric(emailPrefix.charAt(0)) && emailPrefix.length() > 1) {
for (int i = 0; i < emailPrefix.length(); i++) {
// If email prefix respects all conditions, change result to true
if (isValidPrefixChar(emailPrefix.charAt(i))) {
result = true;
} else {
result = false;
break;
}
}
}
return result;
}
CodePudding user response:
Let's look at your list:
- It contains at least one character.
- It contains only alphanumeric characters, underscores(‘_’), periods(‘.’), and dashes(‘-’).
- An underscore, a period, or a dash must always be followed by one or more alphanumeric characters.
- The first character must be alphanumeric.
As you say, 1, 2, and 4 are easy. Here's what I would do. My first line would check the length and return false if it is incorrect. Before the loop, I would set boolean lastWasSpecial = false. I would then iterate over the characters. Inside the loop:
- Check that it's a legal character (condition 2).
- If index == 0, check that it's alphanumeric (condition 4).
- If it's one of the specials:
  - If lastWasSpecial is set, return false.
  - Set lastWasSpecial = true.
- Otherwise, set lastWasSpecial = false again.
After the loop, return false if lastWasSpecial is still set, because the prefix then ends with a special character that nothing follows (condition 3).
Should be about 10 lines of easily-readable code.
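For illustration, here is a minimal sketch of those steps, including a final check that rejects a prefix ending in a special character. The class name PrefixCheck and the helper isSpecial are hypothetical names chosen for this example; treating null as invalid is also an assumption, since the assignment does not mention it.

```java
class PrefixCheck {
    // Hypothetical helper: true for the three permitted punctuation marks.
    static boolean isSpecial(char c) {
        return c == '-' || c == '.' || c == '_';
    }

    static boolean isValidPrefix(String prefix) {
        // Condition 1: at least one character (null is rejected as an assumption).
        if (prefix == null || prefix.isEmpty()) {
            return false;
        }
        boolean lastWasSpecial = false;
        for (int i = 0; i < prefix.length(); i++) {
            char c = prefix.charAt(i);
            boolean special = isSpecial(c);
            // Condition 2: only letters, digits, '-', '.', and '_' are legal.
            if (!special && !Character.isLetterOrDigit(c)) {
                return false;
            }
            // Condition 4: the first character must be alphanumeric.
            if (i == 0 && special) {
                return false;
            }
            // Condition 3, part one: two specials in a row are not allowed.
            if (special && lastWasSpecial) {
                return false;
            }
            lastWasSpecial = special;
        }
        // Condition 3, part two: the prefix must not end with a special.
        return !lastWasSpecial;
    }
}
```

With this sketch, the valid examples from the question ("abc-d", "abc.def", "abc", "abc_def") are accepted and the invalid ones ("abc-", "abc..d", ".abc", "abc#def") are rejected.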
CodePudding user response:
The algorithm can be optimised, but I tried to change just some lines and to use the same code style. I added the explanation in the comments.
public static boolean isAlphanumeric(char c) {
return Character.isLetterOrDigit(c);
}
public static boolean isValidPrefixChar(char preChar) {
char[] prefixChar = new char[]{'-', '.', '_'};
for (int i = 0; i < prefixChar.length; i++) {
if (prefixChar[i] == preChar) {
return true;
}
}
return false;
}
public static boolean isValidPrefix(String emailPrefix) {
boolean result = false;
// Check that the prefix is not empty and that its first character is alphanumeric
if (!emailPrefix.isEmpty() && isAlphanumeric(emailPrefix.charAt(0))) {
// a prefix made of a single alphanumeric character is already valid
result = true;
// this boolean is set to true when the next char has to be alphanumeric
boolean nextHasToBeAlphaNumeric = false;
// the for loop starts from 1 because char 0 has already been checked
for (int i = 1; i < emailPrefix.length(); i++) {
char character = emailPrefix.charAt(i);
if (isValidPrefixChar(character)) {
// if the previous char was '.', '_' or '-', two of them in a row are not allowed
if (nextHasToBeAlphaNumeric) {
result = false;
break;
} else {
// the next char has to be alphanumeric
nextHasToBeAlphaNumeric = true;
}
} else if (isAlphanumeric(character)) {
nextHasToBeAlphaNumeric = false;
} else {
// any other character is not allowed
result = false;
break;
}
}
// a trailing '.', '_' or '-' is not followed by an alphanumeric character
if (nextHasToBeAlphaNumeric) {
result = false;
}
}
return result;
}
CodePudding user response:
local-part
FYI, the portion of the address before the COMMERCIAL AT sign (@) is called the local-part.
Avoid char
Your code using the char type will break when it encounters characters outside the Basic Multilingual Plane (BMP). As a 16-bit value, a char cannot represent most characters.
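As a small illustration of the problem, the sketch below uses U+1D400 (MATHEMATICAL BOLD CAPITAL A), a letter outside the BMP that Java stores as two char values. The class name CodePointDemo and its method names are hypothetical, chosen only for this example.

```java
class CodePointDemo {
    // U+1D400 MATHEMATICAL BOLD CAPITAL A, written as a surrogate pair.
    static final String BOLD_A = "\uD835\uDC00";

    // Looks only at the first 16-bit char, which here is a lone surrogate.
    static boolean firstCharIsLetterOrDigit(String s) {
        return Character.isLetterOrDigit(s.charAt(0));
    }

    // Looks at the first full code point, which here is the whole letter.
    static boolean firstCodePointIsLetterOrDigit(String s) {
        return Character.isLetterOrDigit(s.codePointAt(0));
    }

    // Number of code points, as opposed to s.length(), which counts chars.
    static int codePointCount(String s) {
        return s.codePointCount(0, s.length());
    }
}
```

For BOLD_A, length() reports two chars while codePointCount reports one code point, and only the code point check recognises the character as a letter.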
Code points
Use code points instead, when working with individual characters. A code point is the number assigned permanently to each of the over 140,000 characters defined in Unicode.
int[] codePoints = localPart.codePoints().toArray() ;
Define an array, list, or set of your acceptable punctuation characters.
int codePoint = "-".codePointAt( 0 ) ; // Annoying zero-based index counting.
To verify that every punctuation character encountered is followed by a letter/digit, first make sure the punctuation mark is not the last character. If not, then look ahead on the array for the following code point. Test if that code point is a letter or digit.
if( Character.isLetterOrDigit( codePoints[ i + 1 ] ) ) { … }
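Putting these pieces together, one possible sketch of the whole check over code points could look like the following. The names LocalPartCheck, isPunctuation, and isValidLocalPart are hypothetical, and rejecting null is an assumption. Note that Character.isLetterOrDigit also accepts letters and digits from scripts other than Latin, which may or may not be what the assignment intends.

```java
class LocalPartCheck {
    // Hypothetical helper: the three punctuation marks the question permits.
    static boolean isPunctuation(int codePoint) {
        return codePoint == '-' || codePoint == '.' || codePoint == '_';
    }

    static boolean isValidLocalPart(String localPart) {
        // At least one character is required (null is rejected as an assumption).
        if (localPart == null || localPart.isEmpty()) {
            return false;
        }
        int[] codePoints = localPart.codePoints().toArray();
        // The first character must be a letter or digit.
        if (!Character.isLetterOrDigit(codePoints[0])) {
            return false;
        }
        for (int i = 0; i < codePoints.length; i++) {
            int codePoint = codePoints[i];
            if (isPunctuation(codePoint)) {
                // A punctuation mark must not be the last character...
                if (i == codePoints.length - 1) {
                    return false;
                }
                // ...and the following code point must be a letter or digit.
                if (!Character.isLetterOrDigit(codePoints[i + 1])) {
                    return false;
                }
            } else if (!Character.isLetterOrDigit(codePoint)) {
                // Anything else that is not a letter or digit is illegal.
                return false;
            }
        }
        return true;
    }
}
```

Because the loop walks an int[] of code points rather than char values, a letter outside the BMP counts as a single element and is tested as a whole.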