Using XOR(^) as MATH operator

Time:10-11

I was migrating some old code from my company's DLL, written in C#, to Python, and as I went along I ran into some logic that I don't understand because of its use of XOR (^).

These lines are the rewritten code, with the same logic and the same behavior as the C# original:

def compute_verif():
    list_content = [1, 2, 210, 224, 97]
    item_verif = 3
    # XOR each item into the running value
    for value in list_content:
        item_verif ^= value
    return item_verif

And item_verif ends up taking the values 3, 2, 0, 210, 50 and 83 in turn. I don't understand where those values come from. As far as I can tell, what happens in the code is:

Assuming that the value of item_verif is 210 and the next value in list_content is 224, the step would be something like:

item_verif = 210 ^ 224

And that results in 50.

What is happening to produce that value?

I searched in many places and didn't find an explanation, so this is my last resort.

Thanks in advance

CodePudding user response:

I'm going out on a limb and assuming that the real question here is "what is this doing, and why?" (I'm making this assumption since the Python seems to give the same output as the most obvious/literal C# translations, so I'm guessing this isn't a code/technical question - the code already seems to have been translated correctly)

All this is doing is using XOR to compute a crude hash by XORing together the bits of all of the operands. We can use that number to check ... well, not equality, but definitely non-equality: if two such hashes are different, then the inputs are definitely different; if two such hashes are the same, we can't say with any real confidence that the inputs are the same, but they could be. Hash collisions are absurdly easy to force with this implementation (reordering the inputs, for example, never changes the result), so: don't use it for anything security related. Usually, such hashes are used only to short-circuit the "false" part of a sequence equality test (on the basis that we'll generally be comparing non-equal sequences, which will usually have different hashes, so this will usually save time), i.e.

  • given two sequences with pre-computed hashes:
    • are the hashes different? return false <==== this is what it allows us to do
    • are the lengths different? return false
    • for each item in turn from both sequences
      • are the corresponding values from the two sequences different? return false
    • return true
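The steps above can be sketched in Python roughly as follows. This is only an illustration: the names xor_hash and sequences_equal are made up for this example, and the seed of 3 is taken from the question's code rather than from any standard.

```python
from functools import reduce
from operator import xor


def xor_hash(values, seed=3):
    """Fold XOR over the values, starting from the seed (hypothetical helper)."""
    return reduce(xor, values, seed)


def sequences_equal(a, hash_a, b, hash_b):
    """Compare two sequences whose XOR hashes were computed ahead of time."""
    if hash_a != hash_b:   # different hashes: the inputs are definitely different
        return False
    if len(a) != len(b):
        return False
    # Equal hashes prove nothing, so every item still has to be compared.
    return all(x == y for x, y in zip(a, b))


original = [1, 2, 210, 224, 97]
reordered = [97, 224, 210, 2, 1]   # same items in a different order
changed = [1, 2, 210, 224, 98]     # last item differs

h_original = xor_hash(original)
h_reordered = xor_hash(reordered)
h_changed = xor_hash(changed)

# The reordered list collides with the original; the changed list does not.
print(h_original, h_reordered, h_changed)
# Rejected immediately by the hash check:
print(sequences_equal(original, h_original, changed, h_changed))
# Same hash, so only the item-by-item loop catches the difference:
print(sequences_equal(original, h_original, reordered, h_reordered))
```

The reordered list is the interesting case: its hash matches, so the shortcut does not fire and the full comparison still has to run.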

Note that this only helps if you already have the hashes for each sequence. You don't want to compute the hash each time if you don't already know it. Short-circuiting in equality tests can also actually be undesirable in some niche scenarios, where it can be used in timing-based attacks - security code should usually take a constant time to say "no" vs "yes".
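For the security-sensitive case, one option in Python is the standard library's hmac.compare_digest, which is documented as avoiding content-based short-circuiting. The sketch below assumes the values fit in single bytes (0-255), as they do in the question; the wrapper name constant_time_equal is invented for this example.

```python
import hmac


def constant_time_equal(a, b):
    """Compare two lists of byte-sized ints without an early exit on content.

    Hypothetical wrapper: converts each list to bytes (values must be 0-255)
    and delegates to hmac.compare_digest from the standard library.
    """
    return hmac.compare_digest(bytes(a), bytes(b))


print(constant_time_equal([1, 2, 210, 224, 97], [1, 2, 210, 224, 97]))
print(constant_time_equal([1, 2, 210, 224, 97], [1, 2, 210, 224, 98]))
```

Note that there is no hash shortcut here at all; that is the point, since the early "no" is exactly what a timing attack would measure.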

But: we can see what is happening in this modified C# version, which prints every step in binary:

using System;
var list_content = new[] { 1, 2, 210, 224, 97 };
var item_verif = 3;
Console.WriteLine($"  {Convert.ToString(item_verif, 2).PadLeft(8, '0')} ({item_verif})");
foreach (var value in list_content)
{
    Console.WriteLine($"^ {Convert.ToString(value, 2).PadLeft(8, '0')} ({value})");
    item_verif ^= value;
    Console.WriteLine($"= {Convert.ToString(item_verif, 2).PadLeft(8, '0')} ({item_verif})");
}

which outputs:

  00000011 (3)
^ 00000001 (1)
= 00000010 (2)
^ 00000010 (2)
= 00000000 (0)
^ 11010010 (210)
= 11010010 (210)
^ 11100000 (224)
= 00110010 (50)
^ 01100001 (97)
= 01010011 (83)
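Since the target of the migration is Python, here is a rough Python equivalent of the same trace. The function name xor_trace is just something I made up for this sketch; the formatting mirrors the C# output above.

```python
def xor_trace(values, start):
    """Return the lines of a step-by-step XOR trace, one string per line."""
    result = start
    lines = [f"  {result:08b} ({result})"]
    for value in values:
        lines.append(f"^ {value:08b} ({value})")
        # Each result bit is 1 only where the two input bits differ.
        result ^= value
        lines.append(f"= {result:08b} ({result})")
    return lines


trace = xor_trace([1, 2, 210, 224, 97], 3)
print("\n".join(trace))
```

Reading the 210 ^ 224 step column by column shows where 50 comes from: 11010010 and 11100000 differ only in the bits worth 32, 16 and 2, and 32 + 16 + 2 = 50.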
