I have written the following very simple code which I am experimenting with in godbolt's compiler explorer:
#include <cstdint>
uint64_t func(uint64_t num, uint64_t den)
{
    return num / den;
}
GCC produces the following output, which I would expect:
func(unsigned long, unsigned long):
        mov     rax, rdi
        xor     edx, edx
        div     rsi
        ret
However, Clang 13.0.0 produces the following output, which even involves shifts and a jump:
func(unsigned long, unsigned long): # @func(unsigned long, unsigned long)
        mov     rax, rdi
        mov     rcx, rdi
        or      rcx, rsi
        shr     rcx, 32
        je      .LBB0_1
        xor     edx, edx
        div     rsi
        ret
.LBB0_1:
        xor     edx, edx
        div     esi
        ret
When using uint32_t, Clang's output is once again "simple" and what I would expect.
It seems this might be some sort of optimization, since Clang 10.0.1 produces the same output as GCC; however, I cannot understand what is happening. Why is Clang producing this longer assembly?
CodePudding user response:
The assembly seems to be checking whether either num or den is too large to fit in 32 bits: it ORs the two operands together, shifts the result right by 32 bits, and tests whether the result is zero. Depending on the outcome, either a 64-bit division (div rsi) or a 32-bit division (div esi) is performed.
Presumably this code is generated because the compiler writers consider the cost of an unnecessary 64-bit division to outweigh the cost of the additional check and potential branch.
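As a rough illustration, here is a C++ sketch of what the branchy assembly is equivalent to. This is not code Clang generates or uses internally, and the name func_like_clang is made up for this example:

#include <cstdint>

// Sketch of the transformation Clang's output corresponds to:
// take a cheaper 32-bit divide when both operands fit in 32 bits,
// otherwise fall back to the full 64-bit divide.
uint64_t func_like_clang(uint64_t num, uint64_t den)
{
    if (((num | den) >> 32) == 0) {
        // Both high halves are zero, so a 32-bit divide
        // produces the same quotient.
        return static_cast<uint32_t>(num) / static_cast<uint32_t>(den);
    }
    // At least one operand needs the full-width 64-bit divide.
    return num / den;
}

On many x86-64 microarchitectures a 64-bit div is noticeably slower than a 32-bit div, so the extra or/shr/je instructions can pay off whenever the operands happen to be small.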
CodePudding user response:
If I understand correctly, it just checks whether either operand is wider than 32 bits and uses a different div instruction: the 32-bit form when both operands fit in 32 bits, and the 64-bit form otherwise.
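For comparison, with 32-bit operand types the compiler already knows statically that both values fit in 32 bits, so no runtime check is needed. A hypothetical variant matching what the question describes (the name func32 is invented here):

#include <cstdint>

// The types guarantee both operands fit in 32 bits, so Clang can
// emit a single 32-bit div with no check or branch.
uint32_t func32(uint32_t num, uint32_t den)
{
    return num / den;
}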