I read that Unit tests run fast. If they don’t run fast, they aren’t unit tests. A test is not a unit test if 1. It talks to a database. 2. It communicates across a network. 3. It touches the file system. 4. You have to do special things to your environment (such as editing configuration files) to run it.
in Working Effectively with legacy code (book)
.
I have a function that is downloading the zip from the internet and then converting it into a python object for a particular class.
import typing as t
def get_book_objects(date: str) -> t.List[Book]:
# download the zip with the date from the endpoint
res = requests.get(f"HTTP-URL-{date}")
# code to read the response content in BytesIO and then use the ZipFile module
# to extract data.
# parse the data and return a list of Book object
return books
let's say I want to write a unit test for the function get_book_objects
. Then how am I supposed to write a unit test without making a network request? I mean I prefer file system read-over a network request because it will be way faster than making a request to the network although it is written that a good unit test also not touches the file system I will be fine with that.
So even if I want to write a unit test where I can provide a local zip file I have to modify the existing function to open the file from the local file system or I have to add some additional parameter to the function so I can send a zip file path from unit test function.
What will you do to write a good unit test in this kind of situation?
CodePudding user response:
What will you do to write a good unit test in this kind of situation?
In the TDD world, the usual answer would be to delegate the work to a more easily tested component.
Consider:
def get_book_objects(date: str) -> t.List[Book]:
# This is the piece that makes get_book_objects hard
# to isolate
http_get = requests.get
# download the zip with the date from the endpoint
res = http_get(f"HTTP-URL-{date}")
# code to read the response content in BytesIO and then use the ZipFile module
# to extract data.
# parse the data and return a list of Book object
return books
which might then become something like
def get_book_objects(date: str) -> t.List[Book]:
# This is the piece that makes get_book_objects hard
# to isolate
http_get = requests.get
return get_book_objects_v2(http_get, date)
def get_book_objects_v2(http_get, date: str) -> t.List[Book]
# download the zip with the date from the endpoint
res = http_get(f"HTTP-URL-{date}")
# code to read the response content in BytesIO and then use the ZipFile module
# to extract data.
# parse the data and return a list of Book object
return books
get_book_objects
is still hard to test, but it is also "so simple that there are obviously no deficiencies". On the other hand, get_book_objects_v2
is easy to test, because your test can control what callable is passed to the subject, and can use any reasonable substitute you like.
What we've done is shift most of the complexity/risk into a "unit" that is easier to test. For the function that is still hard to test, we'll use other techniques.
When authors talk about tests "driving" the design, this is one example - we're treating "complicated code needs to be easy to test" as a constraint on our design.
You've already identified the correct reference (Working Effectively with Legacy Code). The material you want is the discussion of seams.
A seam is a place where you can alter behavior in your program without editing in that place.
(In my edition of the book, the discussion begins in Chapter 4).