How to constrain a Union so that input and output types match?

Time:10-05

I have a testcase.py:

import pathlib
import typing as tp


# Not under my control

PathType = tp.Union[str, pathlib.Path]
def foreign(filename: PathType) -> PathType:
    return filename


# Under my control

T = tp.TypeVar('T', str, pathlib.Path)

def my_func(filename: T) -> T:
    return foreign(filename)


if __name__ == '__main__':
    path1: str = '/abc/efg/string.py'
    san_path1: str = my_func(path1)
    print(san_path1, type(san_path1))

    path2: pathlib.Path = pathlib.Path('/abc/efg/pathlib.py')
    san_path2: pathlib.Path = my_func(path2)
    print(san_path2, type(san_path2))

There are two sections in this file. The section "Not under my control" simulates a function from a module that I cannot change; the real function is defined exactly as shown here.

In the section "Under my control" I am trying to enforce that calling my_func with a str returns a str and calling it with a pathlib.Path returns a pathlib.Path, and to rule out the case where I pass a str and get back a pathlib.Path, or vice versa.

The code works well. If I run it, the output is:

$ python testcase.py 
/abc/efg/string.py <class 'str'>
/abc/efg/pathlib.py <class 'pathlib.PosixPath'>

But mypy complains:

$ mypy testcase.py 
testcase.py:17: error: Incompatible return value type (got "Union[str, Path]", expected "str")
testcase.py:17: error: Incompatible return value type (got "Union[str, Path]", expected "Path")
Found 2 errors in 1 file (checked 1 source file)

Line 17 is return foreign(filename). How can I satisfy mypy?

CodePudding user response:

foreign doesn't guarantee that it will return a str for str input, or a Path for Path input. Unless its documentation says otherwise, it only guarantees what's in its type annotations: that it will return one of the two types, regardless of input. Since it's outside your control, the authors could change the implementation so that, e.g., it always returns a Path, thus breaking your code.
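To make that concrete, here is a minimal sketch of such a change. The name foreign_v2 is hypothetical and only stands in for a possible future version of foreign; it keeps the same annotations, so mypy would accept it, yet it no longer hands a str back for str input.

```python
import pathlib
import typing as tp

PathType = tp.Union[str, pathlib.Path]


# Hypothetical future version of foreign (an assumption for illustration).
# It still matches the declared signature, but always returns a Path.
def foreign_v2(filename: PathType) -> PathType:
    return pathlib.Path(filename)


result = foreign_v2('/abc/efg/string.py')
print(isinstance(result, str))  # False: a str went in, a Path came out
```

Any wrapper that assumes "same type out as in" would break silently under such a change, which is why mypy refuses to infer it from the Union signature.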

If you can't be sure how foreign works, you could convert as necessary:

def my_func(filename: T) -> T:
    res = foreign(filename)
    
    return str(res) if isinstance(filename, str) else pathlib.Path(res)

Or, if you have tests that check that foreign returns the same type it's given, you can just assert the types in my_func:

def my_func(filename: T) -> T:
    res = foreign(filename)

    if isinstance(filename, str):
        assert isinstance(res, str)
        return res
    else:
        assert isinstance(res, pathlib.Path)
        return res

or, even better, raise an exception on failure (sys.exit raises SystemExit):

import sys

def my_func(filename: T) -> T:
    res = foreign(filename)

    if isinstance(filename, str) and isinstance(res, str):
        return res
    elif isinstance(filename, pathlib.Path) and isinstance(res, pathlib.Path):
        return res
        
    sys.exit("Fatal error")  # we exit if our code's broken

or, if you don't want the runtime cost of these checks and you're happy to rely on your tests to verify how foreign behaves, just add a # type: ignore comment to the return statement in my_func.
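A minimal sketch of that last option, with foreign re-declared as a stand-in so the snippet runs on its own. Scoping the comment to mypy's return-value error code is optional; a bare # type: ignore also works but hides any other error on that line.

```python
import pathlib
import typing as tp

PathType = tp.Union[str, pathlib.Path]


# Stand-in for the external function, copied from the question.
def foreign(filename: PathType) -> PathType:
    return filename


T = tp.TypeVar('T', str, pathlib.Path)


def my_func(filename: T) -> T:
    # Suppresses only the "Incompatible return value type" error here.
    # Nothing is checked at runtime, so tests must cover foreign's behaviour.
    return foreign(filename)  # type: ignore[return-value]
```

The trade-off is that the annotation on my_func becomes a promise you make on foreign's behalf: if foreign ever changes, neither mypy nor the interpreter will tell you.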

CodePudding user response:

You could use typing.overload instead of TypeVar. In essence, this allows you to describe the valid combinations of parameter and return types:

# Under my control

@tp.overload
def my_func(filename: str) -> str: ...
@tp.overload
def my_func(filename: pathlib.Path) -> pathlib.Path: ...

def my_func(filename: PathType) -> PathType:
    return foreign(filename)
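
For completeness, here is a self-contained sketch that combines the overloads with the calls from the question (foreign is again a stand-in copied from the question). With this version mypy should see a str result for the first call and a pathlib.Path result for the second.

```python
import pathlib
import typing as tp

PathType = tp.Union[str, pathlib.Path]


# Stand-in for the external function, copied from the question.
def foreign(filename: PathType) -> PathType:
    return filename


@tp.overload
def my_func(filename: str) -> str: ...
@tp.overload
def my_func(filename: pathlib.Path) -> pathlib.Path: ...

# The implementation keeps the wide Union signature; callers only see
# the two overloads above.
def my_func(filename: PathType) -> PathType:
    return foreign(filename)


san_path1: str = my_func('/abc/efg/string.py')
san_path2: pathlib.Path = my_func(pathlib.Path('/abc/efg/pathlib.py'))
```

Note that the overloads are only a declaration. mypy checks that the implementation signature is compatible with them, but it does not verify that a str argument really produces a str at runtime, so this still relies on foreign behaving as expected.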