I had copied and pasted that snippet of code, including the ->, from my Chrome browser into the IntelliJ IDE. I thought the characters were the same.
It is not clear to me why they are not: both versions of -> look the same, and both are 2 characters long.
See screenshot before
java: illegal character: '\u02c2'
When I deleted the -> and typed it in by hand, the compiler accepted it.
See screenshot after:
CodePudding user response:
Welcome to the world of homoglyphs: characters that look the same but are actually different.
There are numerous other examples of homoglyphic characters in Unicode; how alike they look depends on the fonts that your software uses. Other examples include look-alike letters in different languages, minus versus "long hyphen" characters, straight versus sloping quote characters, normal versus wide or non-breaking space, and so on.
In your case, you appear to have copied and pasted the characters
U+02C2 Modifier Letter Left Arrowhead (˂)
U+02C3 Modifier Letter Right Arrowhead (˃)
instead of the standard characters
U+003C Less Than (<)
U+003E Greater Than (>)
Q: What is the difference?
A: The difference is that they mean different things to Java, even if the font that your IDE is using has them looking the same. (They look the same in my browser ...)
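One way to see the difference for yourself is to ask Java for the code point and Unicode name of each character. The following is a minimal sketch; the class name HomoglyphCompare and the helper describe are made up for illustration, and the characters are given as numeric code points so that the example file itself contains only ASCII.

```java
public class HomoglyphCompare {
    // Formats a code point as "U+XXXX NAME" using the JDK's built-in Unicode data.
    // (describe is a hypothetical helper, not part of any library.)
    static String describe(int codePoint) {
        return String.format("U+%04X %s", codePoint, Character.getName(codePoint));
    }

    public static void main(String[] args) {
        // The two look-alike pairs discussed above: arrowheads versus comparison signs.
        int[] codePoints = {0x02C2, 0x003C, 0x02C3, 0x003E};
        for (int cp : codePoints) {
            System.out.println(describe(cp));
        }
        // Visually similar, but never equal as far as the compiler is concerned.
        System.out.println(0x02C2 == '<');   // prints false
    }
}
```

The first two lines printed should be "U+02C2 MODIFIER LETTER LEFT ARROWHEAD" and "U+003C LESS-THAN SIGN", which makes it obvious that the compiler is looking at two unrelated characters.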
Q: How do they / did they get there?
A: In examples like this, they typically get there when some word processing application (or a person such as a copy editor) thinks that the alternate version "looks nicer" in some font than the correct character, and substitutes it.
In other cases, they can be deliberately used to trick people, e.g. by using homoglyphs in URLs to lure people into visiting the wrong website; see "Homoglyph attack detection in email phishing".
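For the URL case, one rough way to expose a look-alike host name is to convert it to its ASCII (Punycode) form with the JDK's java.net.IDN class. This is only a sketch: the class name HostnameCheck and the helper toAsciiForm are hypothetical, and the spoofed name is built by replacing the first letter of example.com with the Cyrillic letter U+0435, which many fonts render just like a Latin "e".

```java
import java.net.IDN;

public class HostnameCheck {
    // Converts a host name to its ASCII-compatible form; labels that contain
    // non-ASCII characters come back with an "xn--" prefix.
    // (toAsciiForm is a hypothetical helper that simply wraps IDN.toASCII.)
    static String toAsciiForm(String host) {
        return IDN.toASCII(host);
    }

    public static void main(String[] args) {
        String genuine = "example.com";
        // Built from a code point so that this source file stays pure ASCII.
        String spoofed = new String(Character.toChars(0x0435)) + "xample.com";

        System.out.println(toAsciiForm(genuine));   // unchanged: already ASCII
        System.out.println(toAsciiForm(spoofed));   // starts with "xn--"
        System.out.println(genuine.equals(spoofed)); // prints false
    }
}
```

An "xn--" prefix does not prove that a host name is malicious, since legitimate internationalized domain names use it too; it only shows that the name is not the plain ASCII one it appears to be.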
The bottom line is that you need to be careful when copying code from documents and web pages that may have been generated by "word processing" software somewhere along the line.
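If you want to check pasted code before compiling it, a small scanner that reports every non-ASCII character is usually enough to spot this kind of problem. The sketch below is one possible approach, not a standard tool; the class name NonAsciiScanner, the method findNonAscii and the sample line of code are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class NonAsciiScanner {
    // Returns one report line for each non-ASCII code point in the given text.
    // Columns are 1-based offsets in UTF-16 code units within the line.
    static List<String> findNonAscii(String source) {
        List<String> findings = new ArrayList<>();
        String[] lines = source.split("\n", -1);
        for (int i = 0; i < lines.length; i++) {
            String line = lines[i];
            int offset = 0;
            while (offset < line.length()) {
                int cp = line.codePointAt(offset);
                if (cp > 0x7F) {
                    findings.add(String.format("line %d, column %d: U+%04X %s",
                            i + 1, offset + 1, cp, Character.getName(cp)));
                }
                // Advance by one or two chars, depending on the code point.
                offset += Character.charCount(cp);
            }
        }
        return findings;
    }

    public static void main(String[] args) {
        // Hypothetical pasted snippet; the arrowheads are built from code points
        // so that this file itself contains only ASCII.
        String left = new String(Character.toChars(0x02C2));
        String right = new String(Character.toChars(0x02C3));
        String pasted = "List" + left + "String" + right + " names;\nint count = 0;";
        findNonAscii(pasted).forEach(System.out::println);
    }
}
```

For the sample above, this reports the two arrowheads on line 1 and nothing on line 2. Note that it also flags legitimate non-ASCII text in string literals and comments, so treat the output as a list of places to look at rather than a list of errors.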