C++: how to set an environment variable so OpenBLAS runs multithreaded

Time:06-18

The author recommends the following: https://github.com/xianyi/OpenBLAS

Setting the number of threads using environment variables

Environment variables are used to specify a maximum number of threads. For example,

export OPENBLAS_NUM_THREADS=4
export GOTO_NUM_THREADS=4
export OMP_NUM_THREADS=4

The priorities are OPENBLAS_NUM_THREADS > GOTO_NUM_THREADS > OMP_NUM_THREADS.

If you compile this library with USE_OPENMP=1, you should set the OMP_NUM_THREADS environment variable; OpenBLAS ignores OPENBLAS_NUM_THREADS and GOTO_NUM_THREADS when compiled with USE_OPENMP=1.

When I use "export OPENBLAS_NUM_THREADS=16" in my main.cpp, I get an error about templates.

So, I changed my CMakeLists.txt file to include:

set($ENV{OPENBLAS_NUM_THREADS} 16)

This seemed to have no effect on the threading of my application. I only see 1 CPU core at 100%.

CodePudding user response:

> When I use "export OPENBLAS_NUM_THREADS=16" in my main.cpp, I get an error about templates.

OPENBLAS_NUM_THREADS is read at run time, so it should not affect the build of an application unless the build scripts explicitly use it, which is very unusual and a very bad idea (the compile-time environment can differ from the run-time one).

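Note that export OPENBLAS_NUM_THREADS=16 is a shell (bash) command, not something to put in a C++ source file. The compiler parses export as the C++ keyword of the same name, which is probably why the error mentions templates. The purpose of the command is to set the environment variable OPENBLAS_NUM_THREADS so that OpenBLAS can read it at run time, when your application calls BLAS functions. You should do something like: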

# Build part
cmake  # with the correct parameters
make

# Running part
export OPENBLAS_NUM_THREADS=4
./your_application  # with the correct parameters

# Alternative solution:
# OPENBLAS_NUM_THREADS=4 ./your_application

> So, I changed my CMakeLists.txt file to include:

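Indeed, this should have no effect, because the variable is not used at build time. In addition, setting an environment variable from CMake with set(ENV{...} ...) only changes the environment of the CMake process itself, not that of the application you run later.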

> I only see 1 CPU core at 100%

Note that setting OPENBLAS_NUM_THREADS may not be enough to use multiple threads in practice. If your matrices are small, there may be too little work to spread across several threads; in that case, consider reading this very recent post about how OpenBLAS works with multiple threads.
