Multiplication of complex numbers using AVX2 FMA3

Time:05-24

I have found some solutions where each AVX2 register holds both the real and imaginary parts of the complex numbers. I am interested in a solution where each AVX2 register holds either the real parts or the imaginary parts.
Assuming we have 4 AVX2 registers: R1, I1, R2, I2
Registers R1, I1 form 4 complex numbers. Same applies for the remaining two registers. Now I want to multiply the 4 complex numbers of R1, I1 with the 4 complex numbers of R2, I2. What would be the most efficient way to do this? Besides AVX2, FMA3 can be used as well.

CodePudding user response:

You wrote that you have AVX2. All Intel and AMD processors with AVX2 also support FMA3, so I would do it like this:

#include <immintrin.h>

// 4 FP64 complex numbers stored in 2 AVX vectors,
// de-interleaved into real and imaginary vectors
struct Complex4
{
    __m256d r, i;
};

// Multiply 4 complex numbers by another 4 complex numbers, lane by lane
Complex4 mul4( Complex4 a, Complex4 b )
{
    Complex4 prod;
    // Start with the products that involve the real parts of a
    prod.r = _mm256_mul_pd( a.r, b.r );
    prod.i = _mm256_mul_pd( a.r, b.i );
    // prod.r = prod.r - a.i * b.i
    prod.r = _mm256_fnmadd_pd( a.i, b.i, prod.r );
    // prod.i = prod.i + a.i * b.r
    prod.i = _mm256_fmadd_pd( a.i, b.r, prod.i );
    return prod;
}
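
For reference, the four intrinsics above follow the component form of the complex product. The symbols below are just shorthand for this note: a_r, a_i, b_r, b_i stand for one lane of a.r, a.i, b.r, b.i.

```latex
\begin{aligned}
(a_r + i\,a_i)(b_r + i\,b_i)
  &= a_r b_r + i\,a_r b_i + i\,a_i b_r + i^2\,a_i b_i \\
  &= (a_r b_r - a_i b_i) + i\,(a_r b_i + a_i b_r)
\end{aligned}
```

The real part is one multiply followed by a negated multiply-add (fnmadd computes -(x*y) + z), and the imaginary part is one multiply followed by a regular multiply-add, so each product costs 2 multiplies and 2 FMAs per vector of 4 lanes.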

Or, if you are targeting that one VIA processor which doesn’t have FMA, replace the two FMA intrinsics with the following lines:

prod.r = _mm256_sub_pd( prod.r, _mm256_mul_pd( a.i, b.i ) );
prod.i = _mm256_add_pd( prod.i, _mm256_mul_pd( a.i, b.r ) );
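
A minimal sketch of how the FMA version might be applied to whole arrays, in case the data lives in memory rather than in registers. The function name mul_arrays, its parameters, and the requirement that n is a multiple of 4 are assumptions made for this illustration and are not part of the answer above. The target attribute is a GCC/Clang extension that enables AVX2 and FMA for this one function; with other compilers, drop it and enable the instruction sets through compiler flags instead.

```cpp
#include <immintrin.h>
#include <cstddef>

// Hypothetical helper: multiplies n complex numbers a[k] * b[k], where the
// real and imaginary parts are stored in separate arrays.
// Assumption: n is a multiple of 4, so no scalar tail loop is needed.
__attribute__((target("avx2,fma")))
void mul_arrays( const double* a_re, const double* a_im,
                 const double* b_re, const double* b_im,
                 double* out_re, double* out_im, std::size_t n )
{
    for( std::size_t k = 0; k < n; k += 4 )
    {
        // Unaligned loads, so the arrays need no special alignment
        const __m256d ar = _mm256_loadu_pd( a_re + k );
        const __m256d ai = _mm256_loadu_pd( a_im + k );
        const __m256d br = _mm256_loadu_pd( b_re + k );
        const __m256d bi = _mm256_loadu_pd( b_im + k );

        // Same 4 instructions as in mul4
        __m256d re = _mm256_mul_pd( ar, br );
        __m256d im = _mm256_mul_pd( ar, bi );
        re = _mm256_fnmadd_pd( ai, bi, re );   // re - ai * bi
        im = _mm256_fmadd_pd( ai, br, im );    // im + ai * br

        _mm256_storeu_pd( out_re + k, re );
        _mm256_storeu_pd( out_im + k, im );
    }
}
```

For example, with a = { 1+2i, i, 2+3i, -1+4i } and b = { 3+4i, 5-2i, 1, 2+i }, the outputs should be real parts { -5, 2, 2, -6 } and imaginary parts { 10, 5, 3, 7 }. The code must run on a CPU that actually supports AVX2 and FMA3.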