I need to create a decimal to binary program that can receive input of up to 100,000,000 and output

Time:10-16

As you've read, I created a decimal to binary program and it works well, but it cannot handle user input equal to 100,000,000. My solution is to print each character as it goes, but I do not know what the appropriate loop to use is, and I am also not that great with the math so the main formula to be used is unclear to me. Arrays are not allowed. Any advice is appreciated. Thank you.

#include <stdio.h>

unsigned long long int input,inp,rem=0,ans=0,place_value=1,ans;
int main()
{
    printf("\nYou have chosen Decimal to Binary and Octal Conversion!\n");
    printf("Enter a decimal number:\n");  
    scanf("%llu", &input);
    inp=input;
    while(input){
        rem=input%2;
        input=input/2;
        ans=ans+(rem*place_value);
        place_value=place_value*10;
    }
    printf("%llu in Decimal is %llu in Binary Form.\n", inp,ans);
    return 0;
}

CodePudding user response:

OP's code suffers from overflow in place_value*10.

One way to respect the no-array rule and still avoid the range limitation is to use recursion.

Perhaps beyond where OP is now.

#include <stdio.h>

void print_lsbit(unsigned long long x) {
  if (x > 1) {
    print_lsbit(x / 2); // Print more significant digits first
  }
  putchar(x % 2 + '0'); // Print the LSBit
}

int main(void) {
  printf("\nYou have chosen Decimal to Binary and Octal Conversion!\n");
  printf("Enter a decimal number:\n");
  //scanf("%llu", &input);
  unsigned long long input = 100000000;
  printf("%llu in Decimal is ", input);
  print_lsbit(input);
  printf(" in Binary Form.\n");
  return 0;
}

Output

You have chosen Decimal to Binary and Octal Conversion!
Enter a decimal number:
100000000 in Decimal is 101111101011110000100000000 in Binary Form.

CodePudding user response:

So as the comments have explained, the decimal number 100000000 has the 27-bit binary representation 101111101011110000100000000. We can therefore store that in a 32-bit int with no problem. But if we were to try to store the decimal number 101111101011110000100000000, which just happens to look like a binary number, well, that would require 87 bits, so it won't even fit into a 64-bit long long integer.

And the code in this question does try to compute its result, ans, as a decimal number which just happens to look like a binary number. And for that reason this code can't work for numbers larger than 1048575 (assuming a 64-bit unsigned long long int).
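
To see where that 1048575 limit comes from, here is a small sketch. The helper names are mine, not from the question, and it assumes unsigned long long is exactly 64 bits. It counts how many binary digits a value needs and compares that with the 20 decimal digits that are always safe in such a type.

```c
/* Hypothetical helper: number of binary digits needed to write n
   (1 for n == 0). */
int binary_digit_count(unsigned long long n)
{
    int count = 1;
    while (n > 1) {
        n /= 2;
        count++;
    }
    return count;
}

/* Hypothetical helper: nonzero if the "decimal number that looks binary"
   for n fits in a 64-bit unsigned long long.
   Assumption: the maximum value is 18446744073709551615 (20 digits), so
   every 20-digit number made only of 0s and 1s fits, and no 21-digit
   number does. */
int fits_as_decimal_lookalike(unsigned long long n)
{
    return binary_digit_count(n) <= 20;
}
```

1048575 needs exactly 20 binary digits, so it is the last value that fits; 100000000 needs 27.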

And this is one reason that "decimal to binary" conversion (or, for that matter, conversion to any base) should normally not be done to a result variable that's an integer. Normally, the result of such a conversion — to any base — should either be done to a result variable that's a string, or it should be printed out immediately. (The moral here is that the base only matters when a number is printed out for a human to read, which implies either a string, and/or something printed to, say, stdout.)

However, in C a string is of course an array. So asking someone to do base conversion without using arrays is a perverse, pointless exercise.

If you print the digits out immediately, you don't have to store them in an array. But the standard algorithm — repeated division by 2 (or whatever the base is) generates digits in reverse order, from least-significant to most-significant, which ends up being right-to-left, which is the wrong order to just print them out. Conventional convert-to-digits code usually stores the computed digits into an array, and then reverses the array — but if there's a prohibition against using arrays, this strategy is (again pointlessly) denied to us.

The other way to get the digits out in the other order is to use a recursive algorithm, as @chux has demonstrated in his answer.
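
For completeness, there is also a plain-loop option that neither answer uses: find the largest power of two that fits in the number, then walk down from it, printing one digit per step. The following is only a sketch under that idea, and the function names are made up for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: largest power of two that is <= n
   (returns 1 for n == 0 so that a single '0' gets printed). */
unsigned long long highest_power_of_two(unsigned long long n)
{
    unsigned long long mask = 1;
    while (n / 2 >= mask)   /* comparing against n / 2 avoids overflowing mask */
        mask *= 2;
    return mask;
}

/* Prints n in binary, most significant digit first, using no arrays
   and no recursion. Returns the number of digits printed. */
int print_binary(unsigned long long n)
{
    int digits = 0;
    unsigned long long mask;
    for (mask = highest_power_of_two(n); mask > 0; mask /= 2) {
        putchar((n / mask) % 2 ? '1' : '0');
        digits++;
    }
    return digits;
}
```

Calling print_binary(100000000) prints the same 27 digits as the recursive version, since the digits come out left to right and nothing has to be stored.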

But just to be perverse in my own way, I'm going to show another way to do it.

Even though it's generally a horrible idea, constructing the digits into an integer, that's in base 10 but looks like it's in base 2, is at least one way to store things up and get the answer back out with the digits in the right order. The only problem is that, as we've seen, the number can get outrageously big, especially for base 2. (The other problem, not that it matters here, is that this approach won't work for bases greater than 10, since there's obviously no way to construct a decimal number that just happens to look like it's in, say, base 16.)

The question is, how can we represent integers that might be as big as 87 bits? And my answer is, we can use what's called "multiple precision arithmetic". For example, if we use a pair of 64-bit unsigned long long int variables, we can theoretically represent numbers up to 128 bits in size, or 340282366920938463463374607431768211455!

Multiple precision arithmetic is an advanced but fascinating and instructive topic. Normally it uses arrays, too, but if we limit ourselves to just two "halves" of our big numbers, and make certain other simplifications, we can do it pretty simply, and achieve something just powerful enough to solve the problem in the question.

So, to repeat, we're going to represent a 128-bit number as a "high half" and a "low half". Actually, to keep things simpler, it's not going to be a 128-bit number: the "high half" is going to be the first 18 digits of a 36-digit decimal number, and the "low half" is going to be the other 18 digits. This will give us the equivalent of only about 120 bits, but it will still be plenty for our purposes.

So how do we do arithmetic on 36-digit numbers represented as "high" and "low" halves? Actually, it ends up being the same way we learned to do pencil-and-paper arithmetic on numbers represented as digits, with each half playing the role of one very large digit.

If I have one of these "big" numbers, in its two halves:

  high1  low1

and if I have a second one, also in two halves:

  high2  low2

and if I want to compute the sum

  high1  low1
+ high2  low2
  -----------
  high3  low3

the way I do it is to add low1 and low2 to get the low half of the sum, low3. If low3 is less than 1000000000000000000 — that is, if it has 18 digits or less — I'm okay, but if it's bigger than that, I have a carry into the next column. And then to get the high half of the sum, high3, I just add high1 plus high2 plus the carry, if any.

Multiplication is harder, but it turns out for this problem we're never going to have to compute a full 36-digit × 36-digit product. We're only ever going to have to multiply one of our big numbers by a small number, like 2 or 10. The problem will look like this:

  high1  low1
×         fac
  -----------
  high3  low3

So, again by the rules of paper-and-pencil arithmetic we learned long ago, low3 is going to be low1 × fac, and high3 is going to be high1 × fac, again with a possible carry.

The next question is how we're going to carry these low and high halves around. As I said, normally we'd use an array, but we can't here. The second choice might be a struct, but you may not have learned about those yet, and if your crazy instructor won't let you use arrays, it seems that using structures might well be out of bounds, also. So we'll just write a few functions that accept high and low halves as separate arguments.
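
For comparison only, here is roughly what the struct alternative mentioned above might look like. This is a sketch, not the approach used in the rest of this answer; the type name big36, the function big36_add and the HALF_BASE macro are all invented for illustration.

```c
#include <assert.h>

#define HALF_BASE 1000000000000000000ULL  /* 10^18: each half holds 18 decimal digits */

/* Hypothetical type: a 36-digit decimal number stored as two 18-digit halves. */
struct big36 {
    unsigned long long hi;
    unsigned long long lo;
};

/* Returns a + b, assuming both low halves are below HALF_BASE and the
   true sum still fits in 36 digits. */
struct big36 big36_add(struct big36 a, struct big36 b)
{
    struct big36 sum;
    assert(a.lo < HALF_BASE && b.lo < HALF_BASE);
    sum.lo = a.lo + b.lo;                       /* at most 2 * 10^18 - 2, fits in 64 bits */
    sum.hi = a.hi + b.hi + sum.lo / HALF_BASE;  /* the carry is 0 or 1 */
    sum.lo %= HALF_BASE;
    return sum;
}
```

Returning the struct by value means no pointers are needed, which is the main thing the struct version buys.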

Here's our first function, to add two 36-digit numbers. It's actually pretty simple:

void long_add(unsigned long long int *hi, unsigned long long int *lo,
                unsigned long long int addhi, unsigned long long int addlo)
{
    *hi += addhi;
    *lo += addlo;
}

The way I've written it, it doesn't compute c = a + b; it's more like a += b. That is, it takes addhi and addlo and adds them in to hi and lo, modifying hi and lo in the process. So hi and lo are passed in as pointers, so that the pointed-to values can be modified. The high half is *hi, and we add in the high half of the number to be added in, addhi. And then we do the same thing with the low half. And then (whoops) what about the carry? That's not too hard, but to keep things nice and simple, I'm going to defer it to a separate function. So my final long_add function looks like:

void long_add(unsigned long long int *hi, unsigned long long int *lo,
                unsigned long long int addhi, unsigned long long int addlo)
{
    *hi += addhi;
    *lo += addlo;
    check_carry(hi, lo);
}

And then check_carry is simple, too. It looks like this:

void check_carry(unsigned long long int *hi, unsigned long long int *lo)
{
    if(*lo >= 1000000000000000000ULL) {
        int carry = *lo / 1000000000000000000ULL;
        *lo %= 1000000000000000000ULL;
        *hi += carry;
    }
}

Again, it accepts pointers to lo and hi, so that it can modify them.

The low half is *lo, which is supposed to be at most an 18-digit number, but if it's got 19 (that is, if it's greater than or equal to 1000000000000000000), that means it has overflowed, and we have to do the carry thing. The carry is the extent by which *lo exceeds 18 digits: it's actually just the top 19th (and any greater) digit(s). If you're not super-comfortable with this kind of math, it may not be immediately obvious that taking *lo, and dividing it by that big number (it's literally 1 with eighteen 0's) will give you the top 19th digit, or that using % will give you the low 18 digits, but that's exactly what / and % do, and this is a good way to learn that.

In any case, having computed the carry, we add it in to *hi, and we're done.
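
If the / and % step still feels abstract, a tiny worked example may help. The helper names and the HALF_BASE macro below are made up for this illustration, and the sample values are arbitrary 19-digit numbers.

```c
#define HALF_BASE 1000000000000000000ULL  /* 1 followed by eighteen 0's */

/* Hypothetical helper: the part of lo that spills past 18 digits. */
unsigned long long carry_part(unsigned long long lo)
{
    return lo / HALF_BASE;
}

/* Hypothetical helper: the low 18 digits that stay in the low half. */
unsigned long long kept_part(unsigned long long lo)
{
    return lo % HALF_BASE;
}
```

For the 19-digit value 1234567890123456789, carry_part gives 1 and kept_part gives 234567890123456789: the leading digit moves to the high half and the other 18 stay put.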

So now we're done with addition, and we can tackle multiplication. For our purposes, it's just about as easy:

void long_multiply(unsigned long long int *hi, unsigned long long int *lo,
                                            unsigned int fac)
{
    *hi *= fac;
    *lo *= fac;
    check_carry(hi, lo);
}

It looks eerily similar to the addition case, but it's just what our pencil-and-paper analysis said we were going to have to do. (Again, this is a simplified version.) We can re-use the same check_carry function, and that's why I chose to break it out as a separate function.

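With these functions in hand, we can now rewrite the decimal-to-binary program so that it will work with these even bigger numbers. (When you put it all together, check_carry has to be declared before long_add and long_multiply, and you still need #include <stdio.h> at the top.)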

int main()
{
    unsigned int inp, input;
    unsigned long long int anslo = 0, anshi = 0;
    unsigned long long int place_value_lo = 1, place_value_hi = 0;
    
    printf("Enter a decimal number:\n");
    scanf("%u", &input);
    inp = input;

    while(input){
        int rem = input % 2;
        input = input / 2;

        // ans=ans+(rem*place_value);
        unsigned long long int tmplo = place_value_lo;
        unsigned long long int tmphi = place_value_hi;
        long_multiply(&tmphi, &tmplo, rem);
        long_add(&anshi, &anslo, tmphi, tmplo);

        // place_value=place_value*10;
        long_multiply(&place_value_hi, &place_value_lo, 10);
    }

    printf("%u in Decimal is ", inp);
    if(anshi == 0)
         printf("%llu", anslo);
    else printf("%llu%018llu", anshi, anslo);
    printf(" in Binary Form.\n");
}

This is basically the same program as in the question, with these changes:

  • The ans and place_value variables have to be greater than 64 bits, so they now exist as _hi and _lo halves.
  • We're calling our new functions to do addition and multiplication on big numbers.
  • We need a tmp variable (actually tmphi and tmplo) to hold the intermediate result in what used to be the simple expression ans = ans + (rem * place_value);.
  • There's no need for the user's input variable to be big, so I've reduced it to a plain unsigned int.

There's also some mild trickiness involved in printing the two halves of the final answer, anshi and anslo, back out. But if you compile and run this program, I think you'll find it now works for any input numbers you can give it. (It should theoretically work for inputs up to 68719476735 or so, which is bigger than will fit in a 32-bit input inp.)
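
The trickiness is that when the high half is nonzero, the low half has to be printed as exactly 18 digits, leading zeros included, or digits silently go missing. Here is a sketch of that printing step pulled out into its own function; the name print_big is my own, and it assumes lo is always below 10^18.

```c
#include <stdio.h>

/* Prints a number held as two 18-digit decimal halves.
   Returns the number of characters written (printf's return value).
   Assumption: lo < 10^18, as maintained by check_carry. */
int print_big(unsigned long long hi, unsigned long long lo)
{
    if (hi == 0)
        return printf("%llu", lo);         /* no padding, so no leading zeros */
    return printf("%llu%018llu", hi, lo);  /* pad lo to exactly 18 digits */
}
```

For an input of 100000000, the two halves end up as 101111101 and 11110000100000000. The low half has only 17 significant digits, so without the %018llu padding the output would be one 0 short.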


Also, for those still with me, I have to add a few disclaimers. The only reason I could get away with writing long_add and long_multiply functions that looked so small and simple was that they are simple, and work only for "easy" problems, without undue overflow. I chose 18 digits as the maximum for the "high" and "low" halves because a 64-bit unsigned long long int can actually hold numbers up to the equivalent of 19 digits, and that means that I can detect overflow (of up to one digit) simply, with that >= 1000000000000000000ULL test. If any intermediate result ever overflowed by two digits, I'd have been in real trouble. But for simple additions, there's only ever a single-digit carry. And since I'm only ever doing tiny multiplications, I could cheat and assume (that is, get away with) a single-digit carry there, too.
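
To put a number on "tiny multiplications", this sketch checks whether a given factor can ever overflow a half. The function name and the HALF_BASE macro are invented for illustration, and the comment about 18 assumes a 64-bit unsigned long long.

```c
#include <limits.h>

#define HALF_BASE 1000000000000000000ULL  /* 10^18 */

/* Nonzero if multiplying any valid half (a value below HALF_BASE) by fac
   cannot overflow unsigned long long.
   Assumption: with a 64-bit unsigned long long the largest safe factor
   works out to 18, so the factors 2 and 10 used above are fine. */
int factor_is_safe(unsigned long long fac)
{
    return fac <= ULLONG_MAX / (HALF_BASE - 1);
}
```

Anything larger than that would need the more general techniques described next.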

If you're trying to do multiprecision arithmetic in full generality, for multiplication you have to consider partial products that have up to twice as many digits/bits as their inputs. So you either need to use an output type that's twice as wide as the inputs, or you have to split the inputs into halves ("sub-halves"), and work with them individually, basically doing a little 2×2 problem, with various carries, for each "digit".
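
As a rough illustration of the "sub-halves" idea, here is a sketch that multiplies two 64-bit values into a 128-bit result by splitting each input into 32-bit pieces. Note that it works in binary (base 2^32) rather than with the 18-digit decimal halves used above, and the function name mul_64x64 is made up.

```c
#include <stdint.h>

/* Computes the full 128-bit product a * b as two 64-bit halves,
   using only 64-bit arithmetic on 32-bit "sub-halves". */
void mul_64x64(uint64_t a, uint64_t b, uint64_t *hi, uint64_t *lo)
{
    const uint64_t mask = 0xFFFFFFFFULL;
    uint64_t a_lo = a & mask, a_hi = a >> 32;
    uint64_t b_lo = b & mask, b_hi = b >> 32;

    /* The four partial products of the little 2x2 problem. */
    uint64_t p0 = a_lo * b_lo;
    uint64_t p1 = a_lo * b_hi;
    uint64_t p2 = a_hi * b_lo;
    uint64_t p3 = a_hi * b_hi;

    /* Middle column: three values below 2^32 each, so the sum cannot overflow. */
    uint64_t mid = (p0 >> 32) + (p1 & mask) + (p2 & mask);

    *lo = (p0 & mask) | (mid << 32);
    *hi = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);  /* carries flow upward */
}
```

Each partial product fits in 64 bits because its inputs are below 2^32, which is the whole reason for the split.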

Another problem with multiplication is that the "obvious" algorithm, the one based on the pencil-and-paper technique everybody learned in elementary school, can be unacceptably inefficient for really big problems, since it's basically O(N²) in the number of digits. People who do this stuff for a living have lots of more-sophisticated techniques they've worked out, for things like detecting overflow and for doing multiplication more efficiently. And then if you want some real fun (or a real nightmare, full of bad flashbacks to elementary school), there's long division...
