Compacting Number values into shorter string

Time:07-10

I have 1296 random values in the range 0-31. To represent them, I could simply concatenate all 1296 values into one string, but that string would be about 2500 characters long. How can I store this information in a shorter string while still being able to recover all 1296 values in the correct order? (I managed to get it down to 648, but wanted to see if someone has an even better way.)

CodePudding user response:

You can easily store 32 unique values in one character, which means your 1296 numbers can fit in a string of 1296 characters.

For example:

import random

numbers = [random.randint(0, 31) for i in range(1296)]

def numbers_to_string(numbers):
    return "".join(chr(ord("0") + number) for number in numbers)

def numbers_from_string(string):
    return [ord(char) - ord("0") for char in string]

numbers_str = numbers_to_string(numbers)
numbers_roundtrip = numbers_from_string(numbers_str)

print(numbers_roundtrip == numbers)

Output:

True

These are the numbers and the characters used to represent them:

 0 0
 1 1
 2 2
 3 3
 4 4
 5 5
 6 6
 7 7
 8 8
 9 9
10 :
11 ;
12 <
13 =
14 >
15 ?
16 @
17 A
18 B
19 C
20 D
21 E
22 F
23 G
24 H
25 I
26 J
27 K
28 L
29 M
30 N
31 O
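
If the punctuation characters in the table above are undesirable, the same one-character-per-value idea should work with any 32-character alphabet. A minimal sketch, assuming a hypothetical alphabet of the digits followed by the letters A-V (the names ALPHABET, DECODE, encode and decode are illustrative and not part of the answer above):

```python
import random
import string

# Hypothetical 32-character alphabet: digits 0-9 followed by letters A-V.
ALPHABET = string.digits + string.ascii_uppercase[:22]
assert len(ALPHABET) == 32

# Reverse lookup table: character -> value.
DECODE = {char: value for value, char in enumerate(ALPHABET)}

def encode(numbers):
    """Map each value in 0-31 to one character of ALPHABET."""
    return "".join(ALPHABET[number] for number in numbers)

def decode(text):
    """Map each character of ALPHABET back to its value."""
    return [DECODE[char] for char in text]

numbers = [random.randint(0, 31) for _ in range(1296)]
encoded = encode(numbers)
assert len(encoded) == 1296
assert decode(encoded) == numbers
```

The string is still 1296 characters long; the only change is that every character is a plain digit or uppercase letter.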

CodePudding user response:

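This works when every number in the input list is in the range 0-31 (inclusive) and the list length is a multiple of 3. Each group of three 5-bit values is packed into a single 15-bit code point, so 1296 values become 432 characters.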

import random

numbers = [random.randint(0, 31) for _ in range(1296)]

def pack(n):
    result = []
    for i in range(0, len(n), 3):
        a = n[i] << 10 | n[i + 1] << 5 | n[i + 2]
        result.append(chr(a))
    return ''.join(result)

def unpack(s):
    result = []
    for c in s:
        o = ord(c)
        for shift in 10, 5, 0:
            result.append(o >> shift & 0x1F)
    return result

packed = pack(numbers)
print(len(packed))
result = unpack(packed)
assert result == numbers

Output:

432

Note:

If the range of numbers were 1-31, this technique (with a minor modification) could be used for any list length, because zero could then serve as a padding indicator.
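
One possible reading of that modification, as a rough sketch only: pad the list with zeros up to a multiple of 3 before packing, and drop the zeros again when unpacking. The function names pack_padded and unpack_padded are hypothetical, and the code assumes every real value is in 1-31 so that a zero can only be padding.

```python
def pack_padded(values):
    """Pack values in 1-31, three per character, padding the tail with zeros."""
    if any(not 1 <= v <= 31 for v in values):
        raise ValueError("values must be in the range 1-31")
    # Pad to a multiple of 3; zero never occurs as a real value.
    padded = list(values) + [0] * (-len(values) % 3)
    return "".join(
        chr(padded[i] << 10 | padded[i + 1] << 5 | padded[i + 2])
        for i in range(0, len(padded), 3)
    )

def unpack_padded(text):
    """Reverse pack_padded, discarding the zero padding."""
    result = []
    for char in text:
        code = ord(char)
        for shift in (10, 5, 0):
            value = code >> shift & 0x1F
            if value:  # a zero can only be padding, so skip it
                result.append(value)
    return result

values = [7, 31, 1, 12]  # length is not a multiple of 3
packed = pack_padded(values)
assert len(packed) == 2
assert unpack_padded(packed) == values
```

The price is losing zero as a usable value, which is why this only applies when the input range is 1-31.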
