Set XMM register via address location for X86-64

Time:11-28

I have a float value at some address in memory, and I want to set an XMM register to that value by using the address. I'm using asmjit.

This code works in a 32-bit build and sets the XMM register v to the value *f:

using namespace asmjit;
using namespace x86;

void setXmmVarViaAddressLocation(X86Compiler& cc, X86Xmm& v, const float* f)
{
   cc.movq(v, X86Mem(reinterpret_cast<std::uintptr_t>(f)));
}

When I compile in 64 bits, though, I get a segfault when trying to use the register. Why is that?

(And yes, I am not very strong in assembly... Be kind... I've been on this for a day now...)

CodePudding user response:

The simplest solution is to avoid passing an absolute address to ptr(). In 64-bit mode a memory operand can only carry a 32-bit displacement, so an arbitrary user address cannot always be encoded. The displacement is either the address itself, sign-extended from 32 bits, or the distance between the target address and the current instruction pointer; if that value does not fit in a signed 32-bit integer, the instruction is not encodable (this is an architecture constraint). In 32-bit mode every address fits in the 32-bit displacement, which is why the original code works there.
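
To make the constraint concrete, here is a minimal sketch of the range checks involved. The helper names (fitsInt32, fitsAbsDisp32, fitsRel32) are hypothetical and are not part of asmjit; they only illustrate the arithmetic, assuming the relative form is measured from the address of the next instruction.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: does `v` fit in a signed 32-bit integer?
constexpr bool fitsInt32(std::int64_t v) {
    return v >= std::numeric_limits<std::int32_t>::min() &&
           v <= std::numeric_limits<std::int32_t>::max();
}

// Hypothetical helper: can `address` be encoded as an absolute 32-bit
// displacement that the CPU sign-extends to 64 bits?
constexpr bool fitsAbsDisp32(std::uint64_t address) {
    return fitsInt32(static_cast<std::int64_t>(address));
}

// Hypothetical helper: can `target` be reached from `nextInstruction`
// (the address just past the instruction) with a signed 32-bit offset?
constexpr bool fitsRel32(std::uint64_t nextInstruction, std::uint64_t target) {
    return fitsInt32(static_cast<std::int64_t>(target - nextInstruction));
}
```

For an address such as 0x00007F0012345678, both checks fail when the generated code sits near 0x401000, so the address has to be loaded into a general-purpose register instead.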

Example code that loads the address into a temporary register first:

using namespace asmjit;

void setXmmVarViaAddressLocation(x86::Compiler& cc, x86::Xmm& v, const float* f)
{
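    // Load the full pointer into a virtual register, then address through it.
    x86::Gp tmpPtr = cc.newIntPtr("tmpPtr");
    cc.mov(tmpPtr, reinterpret_cast<std::uintptr_t>(f));
    // Note: movq reads 8 bytes from memory; movss would read exactly one float.
    cc.movq(v, x86::ptr(tmpPtr));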
}

If you want to optimize this code for 32-bit mode, which doesn't have the problem, you would have to check the target architecture first, something like:

using namespace asmjit;

void setXmmVarViaAddressLocation(x86::Compiler& cc, x86::Xmm& v, const float* f)
{
    // Ideally, abstract this out so the code doesn't repeat.
    x86::Mem m;
    if (cc.is32Bit()) {
        // Any 32-bit address fits in the displacement, so use it directly.
        m = x86::ptr(reinterpret_cast<std::uintptr_t>(f));
    }
    else {
        // 64-bit: load the pointer into a register and address through it.
        x86::Gp tmpPtr = cc.newIntPtr("tmpPtr");
        cc.mov(tmpPtr, reinterpret_cast<std::uintptr_t>(f));
        m = x86::ptr(tmpPtr);
    }

    // Do the move, now the content of `m` depends on target arch.
    cc.movq(v, m);
}

This saves one register in 32-bit mode, where registers are always scarce.
