The Delphi Alexandria RTL contains this function:
function ScanChar(const S: string; var Pos: Integer; Ch: Char): Boolean;
var
  C: Char;
begin
  if (Ch = ' ') and ScanBlanks(S, Pos) then
    Exit(True);
  Result := False;
  if Pos <= High(S) then
  begin
    C := S[Pos];
    if C = Ch then
      Result := True
    else if (Ch >= 'a') and (Ch <= 'z') and (C >= 'a') and (C <= 'z') then
      Result := Char(Word(C) xor $0020) = Char(Word(Ch) xor $0020)
    else if Ch.IsLetter and C.IsLetter then
      Result := ToUpper(C) = ToUpper(Ch);
    if Result then
      Inc(Pos);
  end;
end;
I can't understand the purpose of this comparison:
    else if (Ch >= 'a') and (Ch <= 'z') and (C >= 'a') and (C <= 'z') then
      Result := Char(Word(C) xor $0020) = Char(Word(Ch) xor $0020)
It looks like it's the same as doing this:
    else if (Ch >= 'a') and (Ch <= 'z') and (C >= 'a') and (C <= 'z') then
      Result := C = Ch
Is this true?
CodePudding user response:
It is not literally the same expression as C = Ch, but the result is the same: XOR with a constant maps distinct characters to distinct characters, so the two sides are equal after the XOR only if they were equal before it.
The comparison is redundant, IMHO. It uses XOR to convert lowercase ASCII letters to uppercase ASCII letters (the two cases differ in a single bit, $20) and then compares the uppercase letters for equality. But the following branch, which uses IsLetter and ToUpper, does the same thing for any letters, not just ASCII letters.
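To make the point above concrete, here is a small sketch written as a Delphi console program. The names XorBranchEqual and CountFalsePositives are made up for illustration and are not part of the RTL. It replays only the XOR expression from ScanChar over every pair of distinct lowercase ASCII letters, which are the only inputs that can reach that branch, and counts how often the expression yields True.

```pascal
program XorCaseDemo;
{$APPTYPE CONSOLE}

// Hypothetical helper: the XOR expression from ScanChar, in isolation.
function XorBranchEqual(C, Ch: Char): Boolean;
begin
  Result := Char(Word(C) xor $0020) = Char(Word(Ch) xor $0020);
end;

// Hypothetical helper: counts pairs of distinct lowercase ASCII letters
// for which the XOR expression reports a match.
function CountFalsePositives: Integer;
var
  A, B: Char;
begin
  Result := 0;
  for A := 'a' to 'z' do
    for B := 'a' to 'z' do
      if (A <> B) and XorBranchEqual(A, B) then
        Inc(Result);
end;

var
  Lower: Char;
begin
  // 'a' is $61 and 'A' is $41: flipping bit $20 switches the case.
  Lower := 'a';
  Writeln('a xor $20 -> ', Char(Word(Lower) xor $0020)); // prints: a xor $20 -> A
  Writeln('False positives: ', CountFalsePositives);     // prints: False positives: 0
end.
```

Since the earlier C = Ch test has already failed by the time this branch is reached, the count of zero means the branch can only ever produce False for the inputs it accepts.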
CodePudding user response:
    else if (Ch >= 'a') and (Ch <= 'z') and (C >= 'a') and (C <= 'z') then
      Result := Char(Word(C) xor $0020) = Char(Word(Ch) xor $0020)
The purpose of this comparison is optimization: a faster comparison when both characters are plain ASCII letters, avoiding an expensive call to the WinAPI via the ToUpper function, which can handle Unicode characters.
Or at least that is what would happen if the comparison itself were not badly broken.
The comparison checks whether both characters are lowercase, that is, whether both fall into the range between small letter a (ASCII value 97) and small letter z (ASCII value 122). What it should actually check is that both characters fall into the range between capital letter A (ASCII value 65) and small letter z, covering the whole range of ASCII letters regardless of their case. (There are a few non-letter characters in that range, ASCII values 91 to 96, but they are not relevant, as the Result assignment would never yield True for any of them.)
Once that is fixed, we also need to fix the Result assignment expression, because as written it will not properly compare a lowercase letter with an uppercase one: it flips the case bit on both sides, so they still differ. We only need to apply xor to a character if it is a lowercase letter.
Correct code for that part of the ScanChar function would be:
    ...
    else
      if (Ch >= 'A') and (Ch <= 'z') and (C >= 'A') and (C <= 'z') then
      begin
        if Ch >= 'a' then
          Ch := Char(Word(Ch) xor $0020);
        if C >= 'a' then
          C := Char(Word(C) xor $0020);
        Result := Ch = C;
      end
      else
    ...
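As a quick sanity check of the fragment above, the following sketch wraps it in a standalone function inside a Delphi console program. The names AsciiLetterMatch and PunctuationNeverMatches are hypothetical and exist only for this illustration; they are not RTL routines. The program checks mixed-case pairs and confirms the claim that the non-letter characters inside the 'A'..'z' range (ASCII 91 to 96) never match a different character.

```pascal
program AsciiMatchDemo;
{$APPTYPE CONSOLE}

// Hypothetical wrapper around the corrected branch. C and Ch are value
// parameters, so modifying them here does not affect the caller.
function AsciiLetterMatch(C, Ch: Char): Boolean;
begin
  Result := False;
  if (Ch >= 'A') and (Ch <= 'z') and (C >= 'A') and (C <= 'z') then
  begin
    // Clear the case bit only for characters at or above 'a'.
    if Ch >= 'a' then
      Ch := Char(Word(Ch) xor $0020);
    if C >= 'a' then
      C := Char(Word(C) xor $0020);
    Result := Ch = C;
  end;
end;

// Hypothetical check: '[' .. '`' (ASCII 91..96) must not match any
// different character in the 'A'..'z' range.
function PunctuationNeverMatches: Boolean;
var
  P, Q: Char;
begin
  Result := True;
  for P := '[' to '`' do
    for Q := 'A' to 'z' do
      if (P <> Q) and (AsciiLetterMatch(P, Q) or AsciiLetterMatch(Q, P)) then
        Result := False;
end;

begin
  Writeln(AsciiLetterMatch('a', 'A'));   // prints TRUE
  Writeln(AsciiLetterMatch('Z', 'z'));   // prints TRUE
  Writeln(AsciiLetterMatch('a', 'b'));   // prints FALSE
  Writeln(PunctuationNeverMatches);      // prints TRUE
end.
```

In ScanChar itself this branch is only reached after C = Ch has failed, so the wrapper is exercised here mainly with characters that differ.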