How to do 1024-bit operations using arrays of uint64_t

Time:12-03

I am trying to find a way to compute values that are of type uint1024_t (unsigned 1024-bit integer), by defining the 5 basic operations: plus, minus, times, divide, modulus.

The way that I can do that is by creating a structure that will have the following prototype:

typedef struct {
    uint64_t chunk[16];
} uint1024_t; 

Now since it is complicated to wrap my head around such operations with uint64_t as block size, I have first written some code for manipulating uint8_t. Here is what I came up with:

#define UINT8_HI(x) (x >> 4)
#define UINT8_LO(x) (((1 << 4) - 1) & x)

void uint8_add(uint8_t a, uint8_t b, uint8_t *res, int i) {
    uint8_t s0, s1, s2;

    uint8_t x = UINT8_LO(a) + UINT8_LO(b);
    s0 = UINT8_LO(x);

    x = UINT8_HI(a) + UINT8_HI(b) + UINT8_HI(x);
    s1 = UINT8_LO(x);
    s2 = UINT8_HI(x);

    uint8_t result = s0 + (s1 << 4);
    uint8_t carry = s2;
    res[1 + i] = result;
    res[0 + i] = carry;
}

void uint8_multiply(uint8_t a, uint8_t b, uint8_t *res, int i) {
    uint8_t s0, s1, s2, s3;

    uint8_t x = UINT8_LO(a) * UINT8_LO(b);
    s0 = UINT8_LO(x);

    x = UINT8_HI(a) * UINT8_LO(b) + UINT8_HI(x);
    s1 = UINT8_LO(x);
    s2 = UINT8_HI(x);

    x = s1 + UINT8_LO(a) * UINT8_HI(b);
    s1 = UINT8_LO(x);

    x = s2 + UINT8_HI(a) * UINT8_HI(b) + UINT8_HI(x);
    s2 = UINT8_LO(x);
    s3 = UINT8_HI(x);

    uint8_t result = s1 << 4 | s0;
    uint8_t carry = s3 << 4 | s2;
    res[1 + i] = result;
    res[0 + i] = carry;
}

And it seems to work just fine, however I am unable to define the same operations for division, subtraction and modulus...

Furthermore, I just can't see how to apply the same principle to my custom uint1024_t structure, even though it should be pretty much identical, with a few more lines of code to manage overflows.

I would really appreciate some help in implementing the 5 basic operations for my structure.

CodePudding user response:

There are some problems with your implementation for uint8_t arrays:

  • you did not parenthesize the macro arguments in the expansion. This is very error prone as it may cause unexpected operator precedence problems if the arguments are expressions. You should write:

    #define UINT8_HI(x) ((x) >> 4)
    #define UINT8_LO(x) (((1 << 4) - 1) & (x))
    
  • storing the array elements with the most significant part first is counterintuitive. Multi-precision arithmetic usually represents large values as arrays with the least significant part first.

  • for a small type such as uint8_t, there is no need to split it into halves as larger types are available. Furthermore, you must propagate the carry from the previous addition. Here is a much simpler implementation for the addition:

    void uint8_add(uint8_t a, uint8_t b, uint8_t *res, int i) {
        uint16_t result = a + b + res[i + 0]; // add previous carry
        res[i + 0] = (uint8_t)result;
        res[i + 1] = (uint8_t)(result >> 8); // assuming res has at least i + 2 elements and is initialized to 0
    }
    
  • for the multiplication, you must add the result of multiplying each part of each number to the appropriately chosen parts of the result number, propagating the carry to the higher parts.
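
As an illustration of the multiplication rule in the last bullet, here is a minimal sketch for arrays of uint8_t stored least significant byte first. The name bytes_multiply and its parameters are made up for this example, and it assumes res has room for 2 * n bytes and does not overlap a or b.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the question): multiply the n-byte numbers
   a and b, both least significant byte first, into the 2*n-byte array res. */
void bytes_multiply(uint8_t *res, const uint8_t *a, const uint8_t *b, size_t n) {
    for (size_t i = 0; i < 2 * n; i++)
        res[i] = 0;
    for (size_t i = 0; i < n; i++) {
        uint16_t carry = 0;
        for (size_t j = 0; j < n; j++) {
            /* 255 * 255 + 255 + 255 == 65535, so t always fits in 16 bits */
            uint16_t t = (uint16_t)(a[i] * b[j] + res[i + j] + carry);
            res[i + j] = (uint8_t)t;    /* low byte stays in this position */
            carry = t >> 8;             /* high byte moves to the next one */
        }
        res[i + n] = (uint8_t)carry;    /* not yet touched by earlier rows */
    }
}
```

Each partial product a[i] * b[j] lands at position i + j, exactly as in pencil-and-paper multiplication, and the 16-bit intermediate plays the role that a 128-bit type plays for uint64_t chunks.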

Division is much more difficult to implement. I recommend you study an open source multiprecision package such as QuickJS' libbf.c.

To transpose this to arrays of uint64_t, you can use an unsigned 128-bit integer type if one is available on your platform (gcc and clang provide __uint128_t on 64-bit targets; MSVC has no built-in 128-bit integer type).

Here is a simple implementation for the addition and multiplication:

#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define NB_CHUNK  16

typedef __uint128_t uint128_t;

typedef struct {
    uint64_t chunk[NB_CHUNK];
} uint1024_t;

void uint1024_add(uint1024_t *dest, const uint1024_t *a, const uint1024_t *b) {
    uint128_t result = 0;
    for (size_t i = 0; i < NB_CHUNK; i++) {
        result += (uint128_t)a->chunk[i] + b->chunk[i];
        dest->chunk[i] = (uint64_t)result;
        result >>= CHAR_BIT * sizeof(uint64_t);
    }
}

void uint1024_multiply(uint1024_t *dest, const uint1024_t *a, const uint1024_t *b) {
    for (size_t i = 0; i < NB_CHUNK; i++)
        dest->chunk[i] = 0;
    for (size_t i = 0; i < NB_CHUNK; i++) {
        uint128_t result = 0;
        for (size_t j = 0, k = i; k < NB_CHUNK; j++, k++) {
            result += (uint128_t)a->chunk[i] * b->chunk[j] + dest->chunk[k];
            dest->chunk[k] = (uint64_t)result;
            result >>= CHAR_BIT * sizeof(uint64_t);
        }
    }
}
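
Subtraction follows the same pattern as addition, with a borrow instead of a carry. The following is only a sketch under the same layout (least significant chunk first); the name uint1024_sub and its return value are assumptions for illustration, and it deliberately avoids the 128-bit type so that it also works where that type is unavailable.

```c
#include <stddef.h>
#include <stdint.h>

#define NB_CHUNK  16

typedef struct {
    uint64_t chunk[NB_CHUNK];   /* least significant chunk first */
} uint1024_t;

/* Hypothetical helper: dest = a - b modulo 2^1024.
   Returns the final borrow: 1 if a < b, 0 otherwise. */
unsigned uint1024_sub(uint1024_t *dest, const uint1024_t *a, const uint1024_t *b) {
    unsigned borrow = 0;
    for (size_t i = 0; i < NB_CHUNK; i++) {
        uint64_t x = a->chunk[i];
        uint64_t y = b->chunk[i];
        /* unsigned arithmetic wraps around, which is what we want here */
        dest->chunk[i] = x - y - borrow;
        /* a borrow is needed when x < y + borrow, tested without overflow */
        borrow = (x < y) || (x == y && borrow);
    }
    return borrow;
}
```

Because x and y are read before dest->chunk[i] is written, dest may point to the same object as a or b. The returned borrow doubles as a cheap a < b test.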

CodePudding user response:

find a way to compute ... the 5 basic operations: plus, minus, times, divide, modulus.

If uint1024_t used uint32_t, it would be easier.

I would recommend 1) half the width of the widest type uintmax_t, or 2) unsigned, whichever is smaller. E.g. 32-bit.

(Also consider something other than uintN_t to avoid collisions with future versions of C.)

#include <stdint.h>
#include <string.h>   // for memset() used below

typedef struct {
    uint32_t chunk[1024/32];
} u1024;

Example of some untested code to give OP an idea of how using uint32_t simplifies the task.

void u1024_mult(u1024 *product, const u1024 *a, const u1024 *b) {
  memset(product, 0, sizeof product[0]);
  unsigned n = sizeof product->chunk / sizeof product->chunk[0];
  for (unsigned ai = 0; ai < n; ai++) {
    uint64_t acc = 0;
    uint32_t m = a->chunk[ai];
    for (unsigned bi = 0; ai + bi < n; bi++) {
      acc += (uint64_t) m * b->chunk[bi] + product->chunk[ai + bi];
      product->chunk[ai + bi] = (uint32_t) acc;
      acc >>= 32;
    }
  }
}

+, - are quite similar to the above.

/, % could be combined into one routine that computes the quotient and remainder together.
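
For readers who want something to check their own attempt against, here is a minimal sketch of such a combined routine. It uses the slow but simple bit-by-bit shift-and-subtract method rather than chunk-wise grade-school division, and the names u1024_divmod, u1024_ge, u1024_sub_in_place and U1024_N are invented for this illustration.

```c
#include <stdint.h>
#include <string.h>

#define U1024_N (1024 / 32)

typedef struct {
    uint32_t chunk[U1024_N];    /* least significant chunk first */
} u1024;

/* Returns nonzero if x >= y. */
static int u1024_ge(const u1024 *x, const u1024 *y) {
    for (unsigned i = U1024_N; i-- > 0;) {
        if (x->chunk[i] != y->chunk[i])
            return x->chunk[i] > y->chunk[i];
    }
    return 1;
}

/* x -= y modulo 2^1024. */
static void u1024_sub_in_place(u1024 *x, const u1024 *y) {
    uint64_t borrow = 0;
    for (unsigned i = 0; i < U1024_N; i++) {
        uint64_t t = (uint64_t)x->chunk[i] - y->chunk[i] - borrow;
        x->chunk[i] = (uint32_t)t;
        borrow = (t >> 32) & 1;     /* bit 32 is set when t wrapped around */
    }
}

/* Hypothetical routine: quot = num / den, rem = num % den.
   Returns 0 on success, -1 if den is zero. */
int u1024_divmod(u1024 *quot, u1024 *rem, const u1024 *num, const u1024 *den) {
    u1024 q, r;
    int den_is_zero = 1;
    for (unsigned i = 0; i < U1024_N; i++)
        if (den->chunk[i] != 0)
            den_is_zero = 0;
    if (den_is_zero)
        return -1;
    memset(&q, 0, sizeof q);
    memset(&r, 0, sizeof r);
    for (unsigned bit = 1024; bit-- > 0;) {
        /* r = (r << 1) | (current bit of num), most significant bit first */
        uint32_t in = (num->chunk[bit / 32] >> (bit % 32)) & 1;
        for (unsigned i = 0; i < U1024_N; i++) {
            uint32_t out = r.chunk[i] >> 31;
            r.chunk[i] = (r.chunk[i] << 1) | in;
            in = out;
        }
        /* 'in' now holds the bit shifted out of the top of r; if it is set,
           the true value exceeds 2^1024 and is certainly >= den */
        if (in || u1024_ge(&r, den)) {
            u1024_sub_in_place(&r, den);
            q.chunk[bit / 32] |= (uint32_t)1 << (bit % 32);
        }
    }
    *quot = q;      /* local copies let the outputs alias the inputs */
    *rem = r;
    return 0;
}
```

This performs 1024 shift/compare/subtract rounds regardless of the operands, so it is far slower than a chunk-wise algorithm, but it is short enough to serve as a reference when testing a faster version.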

It is not that hard to post those functions here, as it really is the same as grade-school math, but in base 2^32 instead of base 10. I am against posting them though, as it is a fun exercise to do oneself.

I hope the * sample code above inspires rather than answers.
