how can I get the first 3 digits of a given ***unsigned long long*** without converting it to string-CodePudding

how can I get the first 3 digits of a given unsigned long long without converting it to string and there is now the constant length of the number.

without using the naive approach of dividing with / 10.

like this

int first_3_digits(unsigned long long number)
{
    unsigned long long n = number;
    while(n >= 1000){ 
        n = n/10;
    }

    return n;
}

I had an interview and they said that this solution wasn't good enough the interviewer said the solution resembles a binary search. I know how binary search work but I don't know how to connect it to this problem.

CodePudding user response：

Modern processors are able to compute the log2 of an integer in only few cycles using specific low-level instructions (eg. bsr on mainstream x86-64 processors). Based on

One can see that this solution is the fastest. The one of Yakov Khodorkovski and qqNade give wrong results for big values due to floating point rounding. The one of qqNade is not presented in the benchmark for sake of clarity as it is >10 time slower than the original one.

The reason why the solution of Gerhardh is so fast with Clang is that the compiler is able to partially generate fast conditional moves instead of slow conditional branches. This optimization is insanely clever since this is only possible on 32-bit integers (and also if the optimization about the division by a constant is performed before), but Clang is able to know that n is small enough after the 2 first conditions! That being said, this optimization is fragile since few changes in the code often appears to break it.

One can note that the original code is surprisingly quite fast (especially on GCC). This is due to branch prediction. Modern processors execute many iterations without checking they should be executed (and roll back if needed). Each iteration is very fast since the division by a constant is optimized: it only takes 2 cycle/iterations on my machine. On modern x86-64 Intel processors a branch misprediction takes 14 cycles while a well-predicted branch takes only 1-2 cycles (similar on AMD Zen processors). The average number of iteration is ~9 and only the last iteration is expensive. The solution of Gerhardh results in far less instructions executed, but it can result in up to 4 miss-predicted branch with GCC and up to 2 with Clang. The proposed solution results in only 1 indirect miss-predicted branch (that processors can less easily execute efficiently though). Since the optimized implementations runs in only ~10 cycles in average on Quickbench, the effect of branch misprediction is huge.

Note that using other input distribution have an impact on results though the overall trend remains the same. Here are results for a uniform distribution: GCC and Clang. The original algorithm is significantly slower since the average number of digits is twice bigger (17~18 instead of ~9) so is the number of iterations. The speed of the other algorithms is not very different compared to the previous distribution and the trend is left unchanged overall for them.

Conclusion

To conclude, the solution of Gerhardh is relatively portable, simple and pretty fast. The new proposed solution is more complex, but it is the fastest on both GCC and Clang. Thus, the solution of Gerhardh should be preferred unless the performance of this function is very important.

CodePudding user response：

Finding the "best" solution often depends on the criteria you define to matter. If you want smallest or simplest solution, your approach is not bad.

If you want fasted solution (and got the hint about "binary search" from the interviewer), then you might try something like this (not tested):

int first_3_digits(unsigned long long number)
{
    unsigned long long n = number;
    unsigned long long chopped;

    // n has up to 20 digits. We need to chop up to 17 digits.

    // Chop 8 digits
    chopped = n / 100000000ull;
    if (chopped >= 100)
    {
        n = chopped;
    }
    // chopped has up to 12 digits,
    // If we use old n we have up to 11 digits
    // 9 more to go...

    // Chop 4 digits
    chopped = n / 10000ull;
    if (chopped >= 100)
    {
        n = chopped;
    }
    // chopped has up to 8 digits,
    // If we use old n we have up to 7 digits
    // 5 more to go...

    // Chop 2 digits
    chopped = n / 100ull;
    if (chopped >= 100)
    {
        n = chopped;
    }
    // chopped has up to 6 digits,
    // If we use old n we have up to 5 digits
    // 3 more to go...

    // Chop 2 digits again
    chopped = n / 100ull;
    if (chopped >= 100)
    {
        n = chopped;
    }
    // chopped has up to 4 digits,
    // If we use old n we have up to 3 digits
    // 1 more to go...

    // Chop last digit if required.
    if (n >= 1000)
    {
        n /= 10;
    }

    return n;
}

For 64 bit values, maximum number is 18446744073709551615, i.e. 20 digits. We have to remove at most 17 digits. As this is not a perfect number to use powers of 2 for number of digits to chop, we repeat the step to chop 2 digits.

That solution might be a bit faster but is likely to take more code.

CodePudding user response：

When the interviewer says:

the solution resembles a binary search

that's evidence that the interviewer has in mind a particular distribution of inputs for this function, and that distribution is not a uniform distribution over the range of unsigned long long values. At that point, it's necessary to ask the interviewer what input distribution they expect, since it's not possible to optimise algorithms like this without knowing that.

In particular, if the inputs were selected from a uniform sample of the range of 64-bit unsigned values, then the following simple function would be close to optimal:

/* See Note 1 for this constant */
#define MAX_POWER (10ULL * 1000 * 1000 * 1000 * 1000 * 1000)

int first_3_digits(unsigned long long n) {
  if (n >= 1000) { /* Almost always true but needed for correctness */
    while (n < MAX_POWER * 100) n *= 10;
    if (n >= MAX_POWER * 1000) n /= 10;
    n /= MAX_POWER;
  }
  return n;
}

I hope this demonstrates how much difference the expected input distribution makes. The above solution optimises for the case where the input is close to the maximum number of digits, which is almost always the case with a uniform distribution. In that case, the while loop will hardly ever execute (more precisely, it will execute about 1/1844 times).

That solution also has the advantage that the while loop, even if it does execute, only does multiplications by 10, not divisions by 10. Both GCC and Clang understand how to optimise divisions by constants into multiplication and shift, but a multiplication is still faster, and multiplication by 10 is particularly easy to optimise. [Note 2]

Notes

Note that the constant MAX_POWER was precomputed for the case where unsigned long long is a 64-bit value. That's not guaranteed, although it's common, and it's also the minimum possible size of an unsigned long long. Computing the correct value with the C preprocessor is possible, at least up to a certain point, but it's tedious, so I left it out. The value needed is the largest power of 10 no greater than ULLONG_MAX / 1000. (Since unsigned long long is always a power of 2, it cannot be a power of 10 and the test could equally be for the largest power of 10 less than ULLONG_MAX / 1000, if that were more convenient.)
In the benchmark provided by Jérôme Richard, whose sample inputs are chosen so that their logarithms are roughly uniform --a very different input distribution--, this comes out a bit slower than his optimised solution, although it's within the margin of error of the benchmark tool. (On the other hand, the code is quite a bit simpler.) On Jérôme's second benchmark, with a uniform sample, it comes out a lot faster.

CodePudding user response：

The hint for a binary search implies the last few tests and divides should be as below, using powers of 10: ..., 8, 4, 2, 1

// Pseudo code
if (big enough)
  divide by 100,000,000
if (big enough)
  divide by 10,000 
if (big enough)
  divide by 100 
if (big enough)
  divide by 10

Working this backwards we need up to 5 divisions for a 64-bit unsigned long long.
As unsigned long long may be wider than 64, add tests for that.
Provide not only a solution, but a test harness to demo the correctness.

Example:

#include <errno.h>
#include <inttypes.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

unsigned first3(unsigned long long x) {
  static const unsigned long long ten18 = 1000000ull * 1000000 * 1000000;
  static const unsigned long long ten16 = 10000ull * 1000000 * 1000000;
  static const unsigned long long ten10 = 10000ull * 1000000;
  static const uint32_t ten8 = 100ul * 1000000;
  static const uint32_t ten6 = 1000000u;
  static const uint32_t ten4 = 10000u;
  static const uint32_t ten3 = 1000u;
  static const uint32_t ten2 = 100u;

  // while loop used in case unsigned long long is more than 64-bit
  // We could use macro-magic to make this an `if` for common 64-bit.
  while (x >= ten18) {
    x /= ten16;
  }

  if (x >= ten10) {
    x /= ten8;
  }
  if (x >= ten6) {
    x /= ten4;
  }
  uint32_t x32 = (uint32_t) x;  // Let us switch to narrow math
  if (x32 >= ten4) {
    x32 /= ten2;
  }
  if (x32 >= ten3) {
    x32 /= 10;
  }
  return (unsigned) x32;
}

Test code

int test_first3(void) {
  char buf[sizeof(unsigned long long) * CHAR_BIT];
  for (size_t i = 1;; i  ) {
    for (int dig = '0'; dig <= '9'; dig  = 9) {
      memset(buf, dig, i);
      if (dig == '0') {
        buf[0]  ;
      }
      buf[i] = '\0';
      errno = 0;
      unsigned long long x = strtoull(buf, 0, 10);
      if (errno) {
        puts("Success!");
        return 0;
      }
      unsigned f3 = first3(x);
      char buf3[sizeof(unsigned) * CHAR_BIT];
      int len = sprintf(buf3, "%u", f3);
      printf("%2zu <%s> <%s>\n", i, buf3, buf);
      if (len > 3) {
        printf("i:%zu dig:%c\n", i, dig);
        return -1;
      }
      if (strncmp(buf, buf3, 3) != 0) {
        return -1;
      }
    }
  }
}

int main(void) {
  test_first3();
}

Output

 1 <1> <1>
 1 <9> <9>
 2 <10> <10>
 2 <99> <99>
 3 <100> <100>
 3 <999> <999>
 4 <100> <1000>
 4 <999> <9999>
 5 <100> <10000>
 5 <999> <99999>
 ...
17 <100> <10000000000000000>
17 <999> <99999999999999999>
18 <100> <100000000000000000>
18 <999> <999999999999999999>
19 <100> <1000000000000000000>
19 <999> <9999999999999999999>
20 <100> <10000000000000000000>
Success!

CodePudding user response：

here a short solution using log10 function :

int first_n_digits(unsigned long long number){
    return number < 1000 ? number =  (int)number : number = (int)(number/pow(10,(int)(log10(number) 1)-3));
}

CodePudding user response：

You can calculate the length of a number (its power) using the decimal logarithm, then use the decimal exponent to get a divisor less than three orders of magnitude and integer divide the number by it:

#include <assert.h>
#include <math.h>

int first_3_digits(unsigned long long number)
{
     if(number < 1000)
         return number;
     int number_length = int(floor(log10l(number))) 1;
     assert(number_length > 3);
     unsigned long long divider = exp10l(number_length - 3);
     return number/divider;
}

int main()
{
     assert(first_3_digits(0)==0);
     assert(first_3_digits(999)==999);
     assert(first_3_digits(1234)==123);
     assert(first_3_digits(9876543210123456789ull)==987);
     return 0;
}