I have gcc 11.3.0 installed using Homebrew on a MacBook Air with Apple Silicon M1 CPU. The binary is the aarch64 native version, not Rosetta emulated. The installed OS is macOS Monterey 12.3.
I'm having an issue compiling a program which uses the ARMv8.2-A SHA-3 extension instructions, which are supported by the M1 CPU. This is a minimal reproducible example:
#include <arm_neon.h>

int main() {
    uint64x2_t a = {0}, b = {0}, c = {0};
    veor3q_u64(a, b, c);
    return 0;
}
This code compiles just fine with the Apple-supplied clang compiler.
I compiled it using the following command line for gcc 11:
gcc-11 -o test test.c -march=armv8-a+sha3
This results in the following error:
In file included from test.c:1:
test.c: In function 'main':
/opt/homebrew/Cellar/gcc/11.3.0/lib/gcc/11/gcc/aarch64-apple-darwin21/11/include/arm_neon.h:32320:1: error: inlining failed in call to 'always_inline' 'veor3q_u64': target specific option mismatch
32320 | veor3q_u64 (uint64x2_t __a, uint64x2_t __b, uint64x2_t __c)
| ^~~~~~~~~~
test.c:5:5: note: called from here
5 | veor3q_u64(a, b, c);
| ^~~~~~~~~~~~~~~~~~~
Is this a bug in this particular hardware/software combination, or is there some command-line option I can pass to gcc to make this particular program compile?
Answer:
Solved the problem. It turns out that gcc requires -march=armv8.2-a+sha3 rather than just -march=armv8-a+sha3 to compile this intrinsic. Indeed, in gcc's version of arm_neon.h, one can find this right before the block of intrinsics which includes veor3q_u64:
#pragma GCC target ("arch=armv8.2-a+sha3")