I'm writing a module that will use CuPy (an implementation of the NumPy and SciPy APIs using CUDA) for fast analysis of data my workplace works with, but I want the functions to be usable on computers that don't have access to CuPy. I considered just doing something like:
try:
    import cupy as np
    import cupyx.scipy as scipy
except ModuleNotFoundError:
    import numpy as np
    import scipy
But this doesn't really work, for a few reasons. First, it only covers the case where CuPy isn't installed, not the case where CuPy is installed but CUDA isn't working or the GPU can't be detected for whatever reason. Also, CuPy isn't a perfect drop-in replacement for NumPy. For example, when passing a CuPy array to a non-CuPy function that expects a NumPy array, you need to call .get(), because CuPy raises an error instead of implicitly converting to NumPy.
Is the best way just to add a "use_gpu" parameter to every function, and then have if statements checking that parameter and doing the numpy or cupy specific code where necessary? So for example:
import numpy  # Sometimes we want to use a numpy function even if CuPy is available
try:
    import cupy as np
except ModuleNotFoundError:
    import numpy as np
    print("CuPy not available, only use_gpu=False will work")
import pandas as pd

def some_function(arg1, arg2, use_gpu=True):
    """Just some example function doing something arbitrary."""
    x = np.func1(arg1)
    y = np.func2(arg2)
    res = np.hstack((x, y))  # hstack takes a single sequence of arrays
    if use_gpu:
        return pd.DataFrame(res.get())
    else:
        return pd.DataFrame(res)
Is there a more elegant way to do this?
CodePudding user response:
I do something similar using importlib.util.find_spec:
import importlib.util
# I have numpy installed, this returns a ModuleSpec object
importlib.util.find_spec('numpy')
Out[4]: ModuleSpec(name='numpy', loader=<_frozen_importlib_external.SourceFileLoader object at 0x7f7949e22ee0>, origin='/home/davidparks21/opt/anaconda3/lib/python3.8/site-packages/numpy/__init__.py', submodule_search_locations=['/home/davidparks21/opt/anaconda3/lib/python3.8/site-packages/numpy'])
# I don't have cupy installed, this returns None
importlib.util.find_spec('cupy')
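Note that find_spec only tells you whether the package is installed. To also cover the case from the question where CuPy is installed but CUDA isn't working, the check could be extended roughly as follows. This is a minimal sketch: the function name gpu_backend_available is made up for illustration, and it assumes that querying the device count through cupy.cuda.runtime.getDeviceCount() is a good enough test of a working GPU for your setup.

```python
import importlib
import importlib.util


def gpu_backend_available() -> bool:
    """Return True only if CuPy is installed AND a CUDA device can be queried."""
    # Step 1: cheap check that does not import the package.
    if importlib.util.find_spec("cupy") is None:
        return False
    try:
        # Step 2: the import itself can fail if CUDA libraries are missing.
        cupy = importlib.import_module("cupy")
        # Step 3: ask the CUDA runtime how many devices it can see.
        return cupy.cuda.runtime.getDeviceCount() > 0
    except Exception:
        # Deliberately broad: driver and runtime problems surface as
        # several different exception types.
        return False


def load_array_module():
    """Return the cupy module if the GPU is usable, otherwise numpy."""
    name = "cupy" if gpu_backend_available() else "numpy"
    return importlib.import_module(name)
```

With this, something like np = load_array_module() at the top of the module would pick the backend once, based on whether the GPU is actually usable rather than only on whether the package is present.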
CodePudding user response:
Update
As long as you don't allow user overrides (because np in your code is either cupy or numpy depending on whether cupy is present), a single global flag is enough:
import numpy  # Sometimes we want to use a numpy function even if CuPy is available
try:
    import cupy as np
    _USE_GPU = True
except ModuleNotFoundError:
    import numpy as np
    print("CuPy not available, falling back to NumPy")
    _USE_GPU = False
import pandas as pd

# do_gpu and do_no_gpu are placeholders for the two implementations
def some_function(arg1, arg2):
    if _USE_GPU:
        return do_gpu()
    else:
        return do_no_gpu()
Old answer
The following pattern is more user-friendly:
import numpy  # Sometimes we want to use a numpy function even if CuPy is available
USE_GPU = None  # Global preference; None means "use the default"
try:
    import cupy as np
    USE_GPU_DEFAULT = True
except ModuleNotFoundError:
    import numpy as np
    print("CuPy not available, only use_gpu=False will work")
    USE_GPU_DEFAULT = USE_GPU = False
import pandas as pd

# Note: the "bool | None" annotation requires Python 3.10+
def _should_use_gpu(local_flag: bool | None) -> bool:
    # Precedence: per-call flag, then global preference, then default
    if local_flag is not None:
        return local_flag
    if USE_GPU is not None:
        return USE_GPU  # an explicit USE_GPU = False is respected
    return USE_GPU_DEFAULT

# I'd rather make use_gpu kw-only to avoid confusion
def some_function(arg1, arg2, *, use_gpu=None):
    if _should_use_gpu(use_gpu):
        return do_gpu()
    else:
        return do_no_gpu()
Now the user can set a global preference and override it on a per-call basis:
import your_module
your_module.USE_GPU = True # Set global preference
your_module.some_function(1, 2)
your_module.some_function(1, 2, use_gpu=False)  # override once
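For the .get() problem mentioned in the question, a small conversion helper can remove the need to branch on the flag at every return statement. This is only a sketch: to_host is a hypothetical name, and it assumes that any object whose type lives in the cupy package exposes .get() the way cupy.ndarray does.

```python
def to_host(arr):
    """Return arr as host data, copying it off the GPU only when needed."""
    # Assumption: anything whose type is defined in the cupy package is a
    # device array exposing .get(), as cupy.ndarray does.
    if type(arr).__module__.split(".")[0] == "cupy":
        return arr.get()
    # NumPy arrays (and anything else) pass through unchanged.
    return arr
```

The end of a function can then be written as return pd.DataFrame(to_host(res)) regardless of which backend produced res. When CuPy is importable, cupy.asnumpy should serve the same purpose, but the helper above also works on machines where CuPy is not installed.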