I'm trying to compare two char primitives ch1 and ch2. Both are assigned the value 1 as shown below.

But when I compare them with the "==" operator, the result is false. I don't understand why, or what is happening behind the scenes.

char ch1 = (char)1;
char ch2 = '1';
System.out.println(ch1==ch2); //false

//further comparisons
System.out.println(ch1 == 1);       //true
System.out.println(ch1 == '\u0031'); //false

System.out.println(ch2 == 1);       //false
System.out.println(ch2 == '\u0031'); //true

CodePudding user response：

'1' has the value 49 (31 hexadecimal).

(char)1 has the value 1.

A char is just a 16-bit integer. The notation 'x' means 'the character code for the character x', where the encoding used in Java is Unicode, specifically UTF-16.

The cast (char) does not change the value of the expression to its right, except that it truncates it from a full-size integer to 16 bits (which is no change for values 0 to 65535).
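To make those values visible, here is a minimal sketch that prints the number stored in each char. The class name CharValues and the helper code are made up for illustration; the rest is plain Java.

```java
public class CharValues {
    // Hypothetical helper: widens a char to int so its numeric value can be inspected.
    static int code(char c) {
        return c;
    }

    public static void main(String[] args) {
        System.out.println(code('1'));            // 49, the character code of the digit 1
        System.out.println(code((char) 1));       // 1, the cast leaves the number unchanged
        // 65585 is 65536 + 49, so truncating to 16 bits leaves 49.
        System.out.println(code((char) 65585));   // 49
        System.out.println((char) 65585 == '1');  // true
    }
}
```

The last two lines show the truncation: only values outside 0 to 65535 are changed by the cast.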

CodePudding user response：

Basically, you are casting the number one to a char, so ch1 is now equal to Unicode character 1 (SOH, or Start of Heading).

So when you compare ch1 (SOH) to ch2 ('1'), it returns false. Likewise, comparing ch1 (SOH, \u0001) to '1' (\u0031) returns false.

That is the main reason it returns false: the Unicode value you assigned to ch1 is different from the one you expected.
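If the intent was to turn the number 1 into the character '1', one possible sketch is below. It assumes the input is a single decimal digit; the class DigitToChar and the helper digitChar are hypothetical names, while Character.forDigit and Character.getNumericValue are standard library methods.

```java
public class DigitToChar {
    // Hypothetical helper: maps 0..9 to '0'..'9' by offsetting from the code of '0'.
    static char digitChar(int digit) {
        if (digit < 0 || digit > 9) {
            throw new IllegalArgumentException("not a single decimal digit: " + digit);
        }
        return (char) ('0' + digit);
    }

    public static void main(String[] args) {
        char ch1 = digitChar(1);
        char ch2 = '1';
        System.out.println(ch1 == ch2);                        // true
        System.out.println(Character.forDigit(1, 10) == ch2);  // true, library equivalent
        System.out.println(Character.getNumericValue(ch2));    // 1, the reverse direction
    }
}
```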

CodePudding user response：

Code point

The char type has been essentially broken since Java 2: at 16 bits, it is physically incapable of representing most characters.
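A small sketch of that limitation, using U+1F600 as an arbitrary example of a character outside the 16-bit range. The class name SupplementaryDemo and the helper fromCodePoint are made up for illustration.

```java
public class SupplementaryDemo {
    // Hypothetical helper: builds a String from one code point.
    static String fromCodePoint(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        String face = fromCodePoint(0x1F600);  // a character above U+FFFF
        System.out.println(face.length());                              // 2 char values
        System.out.println(face.codePointCount(0, face.length()));      // 1 character
        System.out.println(Integer.toHexString(face.codePointAt(0)));   // 1f600
        System.out.println(Character.charCount('1'));                   // 1, fits in one char
    }
}
```

A single char can hold only half of such a character, which is why the comparisons here are done on int code points.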

Instead use code point integer numbers. Every character is permanently assigned a specific number, a code point.

int codePoint = "1".codePointAt( 0 ) ;  // Annoying zero-based index counting.

The result is 49 decimal, 31 hexadecimal.

Make a string of that single character per the code point.

String s = Character.toString( codePoint ) ;  // Character.toString( int ) requires Java 11 or later.

Or more specifically:

String latinDigitOneCharacter = Character.toString( 49 ) ;

As others pointed out, your code was mistakenly comparing the character defined as the Latin digit “1” with a code point of 1.

The character assigned to the code point of one is the control code SOH, Start of Heading. This is true in both Unicode and US-ASCII (Unicode is a superset of US-ASCII).
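Putting the pieces together, here is a rough sketch of the original comparison redone with code points. The class name CodePointCompare and the helper codePointOf are hypothetical, and Character.toString( int ) assumes Java 11 or later.

```java
public class CodePointCompare {
    // Hypothetical helper: code point of the first character in a string.
    static int codePointOf(String s) {
        return s.codePointAt(0);
    }

    public static void main(String[] args) {
        int digitOne = codePointOf("1");  // 49, the Latin digit 1
        int soh = 1;                      // code point 1, Start of Heading

        System.out.println(digitOne == soh);                            // false
        System.out.println(Character.isDigit(digitOne));                // true
        System.out.println(Character.isISOControl(soh));                // true
        System.out.println(Character.toString(digitOne).equals("1"));   // true
    }
}
```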