What is the best way to write a module that uses CuPy if available, and falls back to Numpy/Scipy ot

Time:07-21

I'm writing a module that will use CuPy (an implementation of the Numpy and Scipy APIs using CUDA) for fast analysis of data my workplace works with, but I want the functions to be usable on computers that don't have access to CuPy. I was considering just doing something like:

try:
    import cupy as np
    import cupyx.scipy as scipy
except ModuleNotFoundError:
    import numpy as np
    import scipy

But this doesn't really work, for a few reasons. First, it only covers the case where cupy isn't installed, not the case where cupy is installed but CUDA isn't working or the GPU can't be detected for whatever reason. Second, cupy isn't a perfect drop-in replacement for numpy: for example, when passing a cupy array to a non-cupy function that expects a numpy array, you need to call .get(), because cupy raises an error instead of implicitly converting to numpy.

Is the best way just to add a "use_gpu" parameter to every function, and then have if statements checking that parameter and doing the numpy or cupy specific code where necessary? So for example:

import numpy # Sometimes we want to use a numpy function even if CuPy is available
try:
    import cupy as np
except ModuleNotFoundError:
    import numpy as np
    print("CuPy not available, only use_gpu=False will work")
import pandas as pd

def some_function(arg1, arg2, use_gpu=True):
    """Just some example function doing something arbitrary."""
    x = np.func1(arg1)
    y = np.func2(arg2)
    res = np.hstack((x, y))  # hstack takes a single sequence of arrays
    if use_gpu:
        return pd.DataFrame(res.get())
    else:
        return pd.DataFrame(res)

Is there a more elegant way to do this?

CodePudding user response:

I do something similar using find_spec:

import importlib.util  # the util submodule has to be imported explicitly

# I have numpy installed, this returns a ModuleSpec object
importlib.util.find_spec('numpy')
Out[4]: ModuleSpec(name='numpy', loader=<_frozen_importlib_external.SourceFileLoader object at 0x7f7949e22ee0>, origin='/home/davidparks21/opt/anaconda3/lib/python3.8/site-packages/numpy/__init__.py', submodule_search_locations=['/home/davidparks21/opt/anaconda3/lib/python3.8/site-packages/numpy'])

# I don't have cupy installed, this returns None
importlib.util.find_spec('cupy')
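
Note that find_spec only reports whether the package can be located; it does not import it or check that CUDA actually works. A minimal sketch that combines it with a device probe might look like the following. The helper names module_installed and gpu_usable are hypothetical, and the broad "except Exception" is an assumption, since the exact exception raised by a broken CUDA setup can differ between installations.

```python
import importlib
import importlib.util


def module_installed(name: str) -> bool:
    """True if a top-level package can be located (it is not imported)."""
    return importlib.util.find_spec(name) is not None


def gpu_usable() -> bool:
    """True only if CuPy imports cleanly and reports at least one CUDA device."""
    if not module_installed("cupy"):
        return False
    try:
        cupy = importlib.import_module("cupy")
        # Asks the CUDA runtime how many devices are visible.
        return cupy.cuda.runtime.getDeviceCount() > 0
    except Exception:
        # Assumption: any failure here (missing driver, broken CUDA
        # libraries, no visible device) means the GPU path is unusable.
        return False
```

The result of gpu_usable() can then decide whether the module does "import cupy as np" or "import numpy as np", which also covers the case where cupy is installed but no GPU can be detected.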

CodePudding user response:

Update

As long as you don't allow user overrides (because np in your code is either cupy or numpy, depending on whether cupy is present), a single global flag is enough:

import numpy # Sometimes we want to use a numpy function even if CuPy is available

try:
    import cupy as np
    _USE_GPU = True
except ModuleNotFoundError:
    import numpy as np
    print("CuPy not available, falling back to NumPy")
    _USE_GPU = False

import pandas as pd

def some_function(arg1, arg2):
    if _USE_GPU:
        return do_gpu()
    else:
        return do_no_gpu()
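
When the two branches differ only in the final conversion back to numpy, a small helper can replace the if/else. This is a sketch rather than part of the answer above: to_host is a hypothetical name, and it assumes CuPy's array type lives in a module whose path starts with "cupy", so it can be recognised without importing CuPy.

```python
def to_host(array):
    """Return host (NumPy) data for a CuPy array; pass other objects through."""
    # Assumption: CuPy's array class is defined inside the `cupy` package,
    # so the first component of its module path identifies it.
    if type(array).__module__.split(".")[0] == "cupy":
        return array.get()  # cupy.ndarray.get() copies device memory to the host
    return array
```

With this, a function can end in pd.DataFrame(to_host(res)) regardless of which backend produced res. When CuPy is importable, cupy.asnumpy(res) serves the same purpose, since it also accepts plain numpy arrays.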

Old answer

The following pattern is more user-friendly:

import numpy # Sometimes we want to use a numpy function even if CuPy is available

USE_GPU = None
try:
    import cupy as np
    USE_GPU_DEFAULT = True
except ModuleNotFoundError:
    import numpy as np
    print("CuPy not available, only use_gpu=False will work")
    USE_GPU_DEFAULT = USE_GPU = False

import pandas as pd

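def _should_use_gpu(local_flag: bool | None) -> bool:
    # Per-call flag wins, then the global preference, then the default.
    if local_flag is not None:
        return local_flag
    if USE_GPU is not None:  # "or" would ignore an explicit USE_GPU = False
        return USE_GPU
    return USE_GPU_DEFAULT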

# I'd rather make use_gpu kw-only to avoid confusion
def some_function(arg1, arg2, *, use_gpu=None):
    if _should_use_gpu(use_gpu):
        return do_gpu()
    else:
        return do_no_gpu()

Now the user can choose a preferred method globally and override it on a per-call basis:

import your_module
your_module.USE_GPU = True  # Set global preference

your_module.some_function(1, 2)
your_module.some_function(1, 2, use_gpu=False)  # override for this call only
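
One gap in this pattern is that a caller can request the GPU on a machine where it isn't usable. A sketch of a resolver that fails early in that case is shown below. The names GPU_AVAILABLE and resolve_use_gpu are hypothetical, and GPU_AVAILABLE is hard-coded here only for illustration; in a real module it would be set by the import or device check.

```python
from typing import Optional

GPU_AVAILABLE = False            # hypothetical: set by the import/device check
USE_GPU: Optional[bool] = None   # global preference; None means "no preference"


def resolve_use_gpu(local_flag: Optional[bool] = None) -> bool:
    """Per-call flag wins, then the global preference, then availability."""
    if local_flag is not None:
        choice = local_flag
    elif USE_GPU is not None:
        choice = USE_GPU
    else:
        choice = GPU_AVAILABLE
    if choice and not GPU_AVAILABLE:
        # Fail with a clear message instead of a NameError or CUDA error later.
        raise RuntimeError("use_gpu=True was requested, but CuPy/CUDA is not usable")
    return choice
```

Raising here is a design choice; silently falling back to numpy with a warning would also be reasonable if callers prefer results over strictness.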