What are the strictly required (bare minimum) libraries/packages to run machine learning with GPU in

Time:08-29

In a 'new computer' with Ubuntu 20.04 (using docker and pulling ubuntu:20.04), if I install miniconda3 and just run:

conda install -c anaconda tensorflow-gpu

Everything is ready to use the GPU for machine learning, because I can run:

import tensorflow as tf  
print('Num GPUs Available: ', len(tf.config.list_physical_devices('GPU')))
Num GPUs Available:  1

This is okay.

But the 'problem' is that anaconda installs a lot of packages when I run conda install -c anaconda tensorflow-gpu

Before running the command conda install -c anaconda tensorflow-gpu, if I run conda list I get:

# Name Version Build Channel
_libgcc_mutex 0.1 main
_openmp_mutex 4.5 1_gnu
brotlipy 0.7.0 py39h27cfd23_1003
ca-certificates 2022.3.29 h06a4308_1
certifi 2021.10.8 py39h06a4308_2
cffi 1.15.0 py39hd667e15_1
charset-normalizer 2.0.4 pyhd3eb1b0_0
colorama 0.4.4 pyhd3eb1b0_0
conda 4.12.0 py39h06a4308_0
conda-content-trust 0.1.1 pyhd3eb1b0_0
conda-package-handling 1.8.1 py39h7f8727e_0
cryptography 36.0.0 py39h9ce1e76_0
idna 3.3 pyhd3eb1b0_0
ld_impl_linux-64 2.35.1 h7274673_9
libffi 3.3 he6710b0_2
libgcc-ng 9.3.0 h5101ec6_17
libgomp 9.3.0 h5101ec6_17
libstdcxx-ng 9.3.0 hd4cf53a_17
ncurses 6.3 h7f8727e_2
openssl 1.1.1n h7f8727e_0
pip 21.2.4 py39h06a4308_0
pycosat 0.6.3 py39h27cfd23_0
pycparser 2.21 pyhd3eb1b0_0
pyopenssl 22.0.0 pyhd3eb1b0_0
pysocks 1.7.1 py39h06a4308_0
python 3.9.12 h12debd9_0
readline 8.1.2 h7f8727e_1
requests 2.27.1 pyhd3eb1b0_0
ruamel_yaml 0.15.100 py39h27cfd23_0
setuptools 61.2.0 py39h06a4308_0
six 1.16.0 pyhd3eb1b0_1
sqlite 3.38.2 hc218d9a_0
tk 8.6.11 h1ccaba5_0
tqdm 4.63.0 pyhd3eb1b0_0
tzdata 2022a hda174b7_0
urllib3 1.26.8 pyhd3eb1b0_0
wheel 0.37.1 pyhd3eb1b0_0
xz 5.2.5 h7b6447c_0
yaml 0.2.5 h7b6447c_0
zlib 1.2.12 h7f8727e_1

After running the command conda install -c anaconda tensorflow-gpu, if I run conda list I get:

# Name Version Build Channel
_libgcc_mutex 0.1 main
_openmp_mutex 4.5 1_gnu
_tflow_select 2.1.0 gpu anaconda
absl-py 0.15.0 pyhd3eb1b0_0 anaconda
aiohttp 3.8.1 py39h7f8727e_1 anaconda
aiosignal 1.2.0 pyhd3eb1b0_0 anaconda
astor 0.8.1 py39h06a4308_0 anaconda
astunparse 1.6.3 py_0 anaconda
async-timeout 4.0.1 pyhd3eb1b0_0 anaconda
attrs 21.4.0 pyhd3eb1b0_0 anaconda
blas 1.0 mkl anaconda
blinker 1.4 py39h06a4308_0 anaconda
brotlipy 0.7.0 py39h27cfd23_1003
c-ares 1.18.1 h7f8727e_0 anaconda
ca-certificates 2022.07.19 h06a4308_0 anaconda
cachetools 4.2.2 pyhd3eb1b0_0 anaconda
certifi 2022.6.15 py39h06a4308_0 anaconda
cffi 1.15.0 py39hd667e15_1
charset-normalizer 2.0.4 pyhd3eb1b0_0
click 8.0.4 py39h06a4308_0 anaconda
colorama 0.4.4 pyhd3eb1b0_0
conda 4.13.0 py39h06a4308_0 anaconda
conda-content-trust 0.1.1 pyhd3eb1b0_0
conda-package-handling 1.8.1 py39h7f8727e_0
cryptography 36.0.0 py39h9ce1e76_0
cudatoolkit 10.1.243 h6bb024c_0 anaconda
cudnn 7.6.5 cuda10.1_0 anaconda
cupti 10.1.168 0 anaconda
dataclasses 0.8 pyh6d0b6a4_7 anaconda
frozenlist 1.2.0 py39h7f8727e_0 anaconda
gast 0.4.0 pyhd3eb1b0_0 anaconda
google-auth 2.6.0 pyhd3eb1b0_0 anaconda
google-auth-oauthlib 0.4.4 pyhd3eb1b0_0 anaconda
google-pasta 0.2.0 pyhd3eb1b0_0 anaconda
grpcio 1.42.0 py39hce63b2e_0 anaconda
h5py 2.10.0 py39hec9cf62_0 anaconda
hdf5 1.10.6 hb1b8bf9_0 anaconda
idna 3.3 pyhd3eb1b0_0
importlib-metadata 4.11.3 py39h06a4308_0 anaconda
intel-openmp 2021.4.0 h06a4308_3561 anaconda
keras-preprocessing 1.1.2 pyhd3eb1b0_0 anaconda
ld_impl_linux-64 2.35.1 h7274673_9
libffi 3.3 he6710b0_2
libgcc-ng 9.3.0 h5101ec6_17
libgfortran-ng 7.5.0 ha8ba4b0_17 anaconda
libgfortran4 7.5.0 ha8ba4b0_17 anaconda
libgomp 9.3.0 h5101ec6_17
libprotobuf 3.20.1 h4ff587b_0 anaconda
libstdcxx-ng 9.3.0 hd4cf53a_17
markdown 3.3.4 py39h06a4308_0 anaconda
mkl 2021.4.0 h06a4308_640 anaconda
mkl-service 2.4.0 py39h7f8727e_0 anaconda
mkl_fft 1.3.1 py39hd3c417c_0 anaconda
mkl_random 1.2.2 py39h51133e4_0 anaconda
multidict 5.2.0 py39h7f8727e_2 anaconda
ncurses 6.3 h7f8727e_2
numpy 1.22.3 py39he7a7128_0 anaconda
numpy-base 1.22.3 py39hf524024_0 anaconda
oauthlib 3.1.0 py_0 anaconda
openssl 1.1.1q h7f8727e_0 anaconda
opt_einsum 3.3.0 pyhd3eb1b0_1 anaconda
pip 21.2.4 py39h06a4308_0
protobuf 3.20.1 py39h295c915_0 anaconda
pyasn1 0.4.8 pyhd3eb1b0_0 anaconda
pyasn1-modules 0.2.8 py_0 anaconda
pycosat 0.6.3 py39h27cfd23_0
pycparser 2.21 pyhd3eb1b0_0
pyjwt 2.4.0 py39h06a4308_0 anaconda
pyopenssl 22.0.0 pyhd3eb1b0_0
pysocks 1.7.1 py39h06a4308_0
python 3.9.12 h12debd9_0
python-flatbuffers 2.0 pyhd3eb1b0_0 anaconda
readline 8.1.2 h7f8727e_1
requests 2.27.1 pyhd3eb1b0_0
requests-oauthlib 1.3.0 py_0 anaconda
rsa 4.7.2 pyhd3eb1b0_1 anaconda
ruamel_yaml 0.15.100 py39h27cfd23_0
scipy 1.7.3 py39hc147768_0 anaconda
setuptools 61.2.0 py39h06a4308_0
six 1.16.0 pyhd3eb1b0_1
sqlite 3.38.2 hc218d9a_0
tensorboard 2.8.0 py39h06a4308_0 anaconda
tensorboard-data-server 0.6.0 py39hca6d32c_0 anaconda
tensorboard-plugin-wit 1.8.1 py39h06a4308_0 anaconda
tensorflow 2.4.1 gpu_py39h8236f22_0 anaconda
tensorflow-base 2.4.1 gpu_py39h29c2da4_0 anaconda
tensorflow-estimator 2.6.0 pyh7b7c402_0 anaconda
tensorflow-gpu 2.4.1 h30adc30_0 anaconda
termcolor 1.1.0 py39h06a4308_1 anaconda
tk 8.6.11 h1ccaba5_0
tqdm 4.63.0 pyhd3eb1b0_0
typing-extensions 4.3.0 py39h06a4308_0 anaconda
typing_extensions 4.3.0 py39h06a4308_0 anaconda
tzdata 2022a hda174b7_0
urllib3 1.26.8 pyhd3eb1b0_0
werkzeug 2.0.3 pyhd3eb1b0_0 anaconda
wheel 0.37.1 pyhd3eb1b0_0
wrapt 1.13.3 py39h7f8727e_2 anaconda
xz 5.2.5 h7b6447c_0
yaml 0.2.5 h7b6447c_0
yarl 1.6.3 py39h27cfd23_0 anaconda
zipp 3.8.0 py39h06a4308_0 anaconda
zlib 1.2.12 h7f8727e_1

I know that the following packages are needed:

cudnn
tensorflow-gpu

So is anything else needed to run these commands?

import tensorflow as tf  
print('Num GPUs Available: ', len(tf.config.list_physical_devices('GPU')))
Num GPUs Available:  1

Or are all the packages installed by conda install -c anaconda tensorflow-gpu necessary?

As the title says, I would like to know which libraries/packages are strictly required (the bare minimum) to run this.

Thanks in advance

CodePudding user response:

Yes, it needs a lot more.

Typically, a large and complicated package like tensorflow has a whole tree of dependencies. If I take your list of packages after the install and remove the packages that were already present before the install, the following remain:

'_tflow_select', 'absl-py', 'aiohttp', 'aiosignal', 'astor',
'astunparse', 'async-timeout', 'attrs', 'blas', 'blinker',
'c-ares', 'cachetools', 'click', 'cudatoolkit', 'cudnn',
'cupti', 'dataclasses', 'frozenlist', 'gast', 'google-auth',
'google-auth-oauthlib', 'google-pasta', 'grpcio', 'h5py',
'hdf5', 'importlib-metadata', 'intel-openmp', 
'keras-preprocessing', 'libgfortran-ng', 'libgfortran4',
'libprotobuf', 'markdown', 'mkl', 'mkl-service', 'mkl_fft',
'mkl_random', 'multidict', 'numpy', 'numpy-base', 'oauthlib',
'opt_einsum', 'protobuf', 'pyasn1', 'pyasn1-modules', 'pyjwt',
'python-flatbuffers', 'requests-oauthlib', 'rsa', 'scipy',
'tensorboard', 'tensorboard-data-server', 
'tensorboard-plugin-wit', 'tensorflow', 'tensorflow-base',
'tensorflow-estimator', 'tensorflow-gpu', 'termcolor', 
'typing-extensions', 'typing_extensions', 'werkzeug',
'wrapt', 'yarl', 'zipp'
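
The subtraction above can be reproduced with a few lines of Python. This is only a minimal sketch: the helper names package_names and newly_installed are made up for illustration, and the two listings are abbreviated excerpts of the conda list output in the question, not the full lists.

```python
def package_names(conda_list_output):
    """Return the set of package names found in the text printed by `conda list`."""
    names = set()
    for line in conda_list_output.splitlines():
        line = line.strip()
        # Skip blank lines and the "# Name Version Build Channel" header.
        if not line or line.startswith("#"):
            continue
        # The package name is the first whitespace-separated column.
        names.add(line.split()[0])
    return names


def newly_installed(before, after):
    """Names present after the install but not before, sorted alphabetically."""
    return sorted(package_names(after) - package_names(before))


# Abbreviated excerpts of the two listings from the question.
before = """\
# Name Version Build Channel
_libgcc_mutex 0.1 main
certifi 2021.10.8 py39h06a4308_2
python 3.9.12 h12debd9_0
"""
after = """\
# Name Version Build Channel
_libgcc_mutex 0.1 main
certifi 2022.6.15 py39h06a4308_0 anaconda
cudatoolkit 10.1.243 h6bb024c_0 anaconda
cudnn 7.6.5 cuda10.1_0 anaconda
python 3.9.12 h12debd9_0
tensorflow-gpu 2.4.1 h30adc30_0 anaconda
"""

print(newly_installed(before, after))
# ['cudatoolkit', 'cudnn', 'tensorflow-gpu']
```

Because only names are compared, packages that were merely upgraded (certifi in this excerpt) do not show up in the result, which matches the list above.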

TensorFlow depends on a number of both Python and C/C++ libraries. Each of those may have dependencies of its own. For example, tensorflow requires keras, which requires hdf5. And tensorflow requires numpy, which requires a BLAS library (in this case mkl), which requires the Fortran runtime.
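
To illustrate how such chains add up, here is a small sketch that walks a dependency graph and collects everything reachable from one package. The graph is hypothetical and heavily simplified: it encodes only the two chains mentioned in the paragraph above, with libgfortran standing in for the Fortran runtime. The real conda metadata is much larger, and transitive_dependencies is just an illustrative helper name.

```python
# Hypothetical, simplified graph: only the two chains named in the answer.
DEPENDS_ON = {
    "tensorflow": ["keras", "numpy"],
    "keras": ["hdf5"],
    "numpy": ["mkl"],
    "mkl": ["libgfortran"],  # stand-in for the Fortran runtime
    "hdf5": [],
    "libgfortran": [],
}


def transitive_dependencies(package, graph):
    """Return every package reachable from `package` by following dependencies."""
    seen = set()
    stack = list(graph.get(package, []))
    while stack:
        dep = stack.pop()
        if dep in seen:
            continue  # already visited through another path
        seen.add(dep)
        stack.extend(graph.get(dep, []))
    return seen


print(sorted(transitive_dependencies("tensorflow", DEPENDS_ON)))
# ['hdf5', 'keras', 'libgfortran', 'mkl', 'numpy']
```

Even in this toy graph, two direct dependencies pull in five packages in total; the same effect across the real tree produces the long list above.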

Now, it may be that some of those dependencies are optional, but at first glance I don't see any that are.

Trying to pare down the dependencies is a significant task; you would basically have to build the whole dependency tree from source, checking for every dependency which of its own dependencies are optional and whether you want to do without them.

Personally, I would not bother in this case.
