Temporarily index letters in a string to a different binary

Time:10-18

For self education purposes:

So ASCII maps every character to a binary representation, right? A = 65 = 01000001, etc. I was curious though: if you wanted to temporarily switch to a different, shorter encoding to conserve space, is there a simple way to go about doing that? Like if I had a project where I only needed 4 letter pairs, could I store AR as 01, RA as 11, SQ as 00, and QS as 10? That way if I need the data read back it's a hell of a lot faster. I did a little paper math and it would be up to 20 times faster.

Currently I'm using primarily Python, but I have experience in C as well. If anyone has thoughts, I'd appreciate answers using those languages, with built-in functions if they exist. But like I said, I'm mostly just curious. If it needs to be done at close to hardware level, that's cool.

Thanks all!!!

CodePudding user response:

Firstly, it will only make those values take less space in theory; it will not make your code faster or reduce the size of the whole program. Two ASCII characters occupy 16 bits, so a 2-bit code is at best 8 times smaller, not 20. In practice the difference will be negligible, it will break compatibility with standards (ASCII), and Python has no built-in type smaller than a byte, so you have to pack the bits yourself or reach for something like ctypes.

In pure Python, if you make a dictionary that translates "AR" to the binary number 01 (1 in decimal), the value is stored as an int object, which is far larger than 2 bits. In CPython, sys.getsizeof typically reports 28 bytes for a small int on a 64-bit build.
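To see this on your own machine, here is a minimal sketch. The CODES dictionary is a hypothetical name that just spells out the four pairs from the question. The exact sizes depend on the interpreter and platform, so none are shown here.

```python
import sys

# Hypothetical mapping built from the four pairs in the question.
CODES = {"AR": 0b01, "RA": 0b11, "SQ": 0b00, "QS": 0b10}

code = CODES["AR"]

# getsizeof reports the size of the whole Python object, not just its payload.
int_size = sys.getsizeof(code)   # the "2-bit" code, stored as an int object
str_size = sys.getsizeof("AR")   # the original 2-character str object

print(f"int object: {int_size} bytes, str object: {str_size} bytes")
```

Both objects carry per-object overhead, so replacing one with the other does not get you anywhere near 2 bits per value.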

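Also, memory can't be addressed in units smaller than a single byte, so a lone 2-bit value still occupies at least one byte. You can, however, pack multiple values into a single byte, for example using ctypes.c_uint8 or the third-party bitarray package, or with plain shifts and masks on a bytearray.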
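A rough sketch of the shift-and-mask approach in plain Python, assuming the 2-bit codes from the question. The names CODES, PAIRS, pack and unpack are made up for illustration, and four codes go into each byte starting at the lowest bits.

```python
# Assumed 2-bit codes, taken from the question.
CODES = {"SQ": 0b00, "AR": 0b01, "QS": 0b10, "RA": 0b11}
PAIRS = {code: pair for pair, code in CODES.items()}  # reverse lookup


def pack(pairs):
    """Pack a list of letter pairs into bytes, four 2-bit codes per byte."""
    out = bytearray((len(pairs) + 3) // 4)  # round up to whole bytes
    for i, pair in enumerate(pairs):
        shift = 2 * (i % 4)                 # bit position inside the byte
        out[i // 4] |= CODES[pair] << shift
    return bytes(out)


def unpack(data, count):
    """Recover `count` letter pairs from packed bytes."""
    return [PAIRS[(data[i // 4] >> (2 * (i % 4))) & 0b11] for i in range(count)]


pairs = ["AR", "RA", "SQ", "QS", "AR"]
packed = pack(pairs)
print(len(packed), unpack(packed, len(pairs)))
```

The count has to be stored separately, because the unused padding bits in the last byte are 00 and would otherwise decode as extra "SQ" entries.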

From the Stack Overflow question about C, "When is it worthwhile to use bit fields?":

Whilst bit-fields can lead to neat syntax, they're pretty platform-dependent, and therefore non-portable. A more portable, but yet more verbose, approach is to use direct bitwise manipulation, using shifts and bit-masks.

If you use bit-fields for anything other than assembling (or disassembling) structures at some physical interface, performance may suffer. This is because every time you read or write from a bit-field, the compiler will have to generate code to do the masking and shifting, which will burn cycles.

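In Python the overhead is even worse, because every shift and mask is executed by the interpreter instead of compiling down to a few machine instructions.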
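If you want to check the speed claim yourself, something like this could work as a starting point. It is only a sketch: read_plain, read_packed and LOOKUP are hypothetical names, the packed layout (four codes per byte, lowest bits first) is an assumption, and the timings will vary between machines, so no numbers are given.

```python
import timeit

pairs = ["AR", "RA", "SQ", "QS"] * 250   # 1000 pairs as ordinary strings
packed = bytes([0b10001101]) * 250       # the same 1000 pairs, four per byte
LOOKUP = ["SQ", "AR", "QS", "RA"]        # list index is the 2-bit code


def read_plain(i):
    # Direct list indexing, no decoding needed.
    return pairs[i]


def read_packed(i):
    # Find the byte, shift the wanted code down, mask off the rest.
    return LOOKUP[(packed[i // 4] >> (2 * (i % 4))) & 0b11]


t_plain = timeit.timeit(lambda: read_plain(501), number=100_000)
t_packed = timeit.timeit(lambda: read_packed(501), number=100_000)
print(f"plain: {t_plain:.4f}s  packed: {t_packed:.4f}s")
```

The packed version does strictly more work per read (a division, a modulo, a shift, a mask and a lookup), which is why packing tends to buy space rather than speed.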
