Home > OS >  Which linux OS supports AVX-512 VNNI (Vector Neural Network Instruction)?
Which linux OS supports AVX-512 VNNI (Vector Neural Network Instruction)?

Time:06-18

I need to deploy an EC2 instance where VNNI (Vector Neural Network Instruction) is supported. There are some EC2 instance types that can support the same.

From AWS:

Intel Deep Learning Boost (Intel DL Boost): A new set of built-in processor technologies designed to accelerate AI deep learning use cases. The 2nd Gen Intel Xeon Scalable processors extend Intel AVX-512 with a new Vector Neural Network Instruction (VNNI/INT8) that significantly increases deep learning inference performance over previous generation Intel Xeon Scalable processors (with FP32), for image recognition/segmentation, object detection, speech recognition, language translation, recommendation systems, reinforcement learning and others. VNNI may not be compatible with all Linux distributions. Please check documentation before using.

It is mentioned that VNNI may not be compatible with all Linux distributions. So, which Linux distribution supports VNNI? I am also not sure as to which documentation this statement refers to.

CodePudding user response:

In AWS, the instance type and OS combination that worked for me:

  • EC2 instance type: m5n.large (m5n instance family supports AVX-512 VNNI)
  • OS: Amazon Linux 2 (other Linux distributions should work as well, as explained by @BasileStarynkevitch and @PeterCordes).

For curious minds: What Linux distribution is the Amazon Linux AMI based on?

CodePudding user response:

No kernel support is needed beyond that for AVX-512 (i.e. context switch handling of the new AVX-512 zmm and k registers). AVX-512VNNI instructions just operate on those registers, so there's no new architectural state to save/restore on context switch. https://en.wikichip.org/wiki/x86/avx512_vnni / https://en.wikipedia.org/wiki/AVX-512#VNNI

(Unlike AMX (Advanced Matrix Extensions), new in Sapphire Rapids; that does introduce large new "2D tile" registers, 8x 1KiB, that context-switches need to handle1.)


The other relevant thing for distros are compilers versions, like GCC or clang. https://godbolt.org/z/668rvhWPx shows GCC 8.1 and clang 7.0 (both released in 2018) compiling AVX-512VNNI _mm512_dpbusd_epi32 with -march=icelake-server or -march=icelake-client. Versions before that fail, so those are the minimum versions. (Or clang6.0 for -mavx512vnni, but that doesn't enable other things an IceLake CPU supports, or set tuning options.)

So if you want to use the latest hotness, you need a compiler that's at least somewhat up to date. It's generally a good idea to use a compiler newer than the CPU you're using, so compiler devs have had a chance to tweak tuning settings for it. And code-gen from intrinsics, especially newish instruction-sets like AVX-512, has generally improved over compiler versions, so if you care about performance of the generated code, you typically want a newer compiler version. (Regressions happen for some releases for some loops/functions, and thus for some programs, but on average newer compilers make faster code than old ones. That's a big part of what compiler devs spend time improving.)

You can install a new compiler on an old distro via backport packages or manually. Or you can just use a distro release that isn't old and crusty.


Footnote 1: See also a phoronix article re: non-empty AMX register state keeping the CPU from doing a deep sleep. Normally CPUs fully power down the core in deeper sleep states, stashing registers somewhere that stays powered. I'm guessing that they didn't provide space for AMX tiles to do that, so having state there prevents sleep. So if you're using AMX, you'll want Linux kernel at least 5.19.

  • Related