Should you reuse system functionality in tests, or be explicit?


When writing tests, is it acceptable (or even advisable) to use functionality from elsewhere in the application to assist in a test?

As an example, the application I am writing tests for uses the CQRS pattern. A lot of the existing tests make use of its commands, queries, and handlers when performing the arrange part of a test. Those commands, queries, and handlers all have their own test cases, so I should be able to accept that they function as expected.

I am curious, though, whether this is best practice, or whether I should be performing the setup during the arrange step manually (without using other application functionality). If one of the commands, queries, or handlers breaks, then my 'unrelated' test breaks too. Is that good or bad?

CodePudding user response:

When writing tests, is it acceptable (or even advisable) to use functionality from elsewhere in the application to assist in a test?

There are absolutely circumstances where using functionality from elsewhere in the application is going to have good trade-offs.


In my experience, it is useful to think about an automated check as consisting of two parts - a measurement that produces a value, and a validation that evaluates whether that value satisfies some specification.

Measurement actual = measurement(args)
assert specification.isSatisfiedBy(actual)

In the specification part, re-using code is commonplace. Consider

String actual = measurement(args)
assert specification.expected.equals(actual)

So here we have introduced a dependency on String::equals, and that's fine: we have a great deal of confidence that String::equals is correct, thanks to the robust distributed test program of everybody in the world using it.

Foo actual = measurement(args)
assert specification.expected.equals(actual)

Same idea here, except that instead of some general-purpose type we are using our own bespoke equality check. If the bespoke equality check is well tested, then you can be confident that any assertion failure indicates a problem in the measurement. (If not, well, then at least the check signals that measurement and specification are in disagreement, and you can investigate why.)
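
To make that concrete, here is a minimal sketch, with Foo and parseFoo invented for illustration (a record's generated equals stands in for the bespoke check): the equality gets its own test, and the behavioural test then leans on it.

import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.api.Test;

class FooTest {

    // A bespoke value type; its equals/hashCode define what "the same result" means.
    record Foo(String id, int amount) {}

    // The measurement under test, invented here for the example.
    static Foo parseFoo(String input) {
        String[] parts = input.split(":");
        return new Foo(parts[0], Integer.parseInt(parts[1]));
    }

    @Test
    void equalityIsValueBased() {
        // The equality check itself is covered, so a failure elsewhere points at the measurement.
        assertEquals(new Foo("a", 7), new Foo("a", 7));
    }

    @Test
    void parsesIdAndAmount() {
        Foo actual = parseFoo("a:7");
        assertEquals(new Foo("a", 7), actual);
    }
}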


Sometimes, you'll want to have an explicit dependency on other parts of the system, because that's a better description of the actual requirements. For example, compare

int actual = foo("a")
assert 7 == actual

with

assert 7 == bar(0) // This check might be in a different test
assert bar(0) == foo("a")

At a fixed point in time, these spellings are essentially equivalent; but for tests that are expected to evaluate many generations of an evolving system, what each one verifies is somewhat different:

// Future foo should return the same thing as today's foo
assert 7 == foo("a")

// Future foo should return the same thing as future bar
assert bar(0) == foo("a")
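
As a runnable sketch of that difference (foo and bar are stand-ins invented for this example, not anything from your codebase):

import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.api.Test;

class DerivedRequirementTest {

    // Invented functions: bar defines a baseline value, and foo is specified in terms of it.
    static int bar(int n) { return n + 7; }
    static int foo(String s) { return bar(0) + s.length() - 1; }

    @Test
    void fooReturnsTodaysValue() {
        // Pins foo to the value it happens to return today.
        assertEquals(7, foo("a"));
    }

    @Test
    void fooAgreesWithBar() {
        // Pins foo to whatever bar returns, even as both evolve together.
        assertEquals(bar(0), foo("a"));
    }
}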

Within measurements, the trade-offs are a bit different, but because you mentioned CQRS I'll offer one specific observation: measurements are about reads.

(Sometimes what we read is "how many times did we crash?" or "what messages did we send?", but, explicitly or implicitly, we are evaluating the information that comes out of our system.)

That means that including a read invocation in your measurement is going to be common, even in designs where you have decoupled reads from writes.
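
In a CQRS-shaped design, that tends to look something like the sketch below (all of the types and handlers are invented for illustration): the arrange and act go through the write side, and the measurement itself is a query.

import static org.junit.jupiter.api.Assertions.assertEquals;

import java.util.HashMap;
import java.util.Map;

import org.junit.jupiter.api.Test;

class CounterTest {

    // A minimal, invented write/read split.
    record Increment(String counterId, int by) {}
    record CounterTotal(String counterId) {}

    static class CounterStore {
        final Map<String, Integer> totals = new HashMap<>();
    }

    static class IncrementHandler {
        private final CounterStore store;
        IncrementHandler(CounterStore store) { this.store = store; }
        void handle(Increment command) {
            store.totals.merge(command.counterId(), command.by(), Integer::sum);
        }
    }

    static class CounterTotalHandler {
        private final CounterStore store;
        CounterTotalHandler(CounterStore store) { this.store = store; }
        int handle(CounterTotal query) {
            return store.totals.getOrDefault(query.counterId(), 0);
        }
    }

    @Test
    void totalReflectsIncrements() {
        CounterStore store = new CounterStore();
        IncrementHandler write = new IncrementHandler(store);
        CounterTotalHandler read = new CounterTotalHandler(store);

        write.handle(new Increment("c1", 2)); // arrange via the write side
        write.handle(new Increment("c1", 3)); // act

        // The measurement is a read: the query handler is part of the check.
        assertEquals(5, read.handle(new CounterTotal("c1")));
    }
}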


A lot of the existing tests make use of its commands, queries, and handlers when performing the arrange part of a test.

Yup, and the answer is the same: we're still talking about trade-offs. Does the test detect the problems you want it to? How expensive is it to track down a fault once it is detected? How common are false positives (where the "fault" is in the test itself, not in the test subject)? How much future work are you signing up for just to "maintain" the test during its useful lifetime (which depends, in part, on how "stable" its dependencies are)?
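
As a small illustration of those trade-offs (again with invented names), here is the same assertion arranged two ways: the first test also fails if RegisterUserHandler breaks, but its starting state is exactly what production code would have written; the second is independent of the handler, but can silently drift if the real registration logic ever stores something different.

import static org.junit.jupiter.api.Assertions.assertEquals;

import java.util.HashMap;
import java.util.Map;

import org.junit.jupiter.api.Test;

class ArrangeTradeoffTest {

    record RegisterUser(String id, String email) {}

    static class UserStore {
        final Map<String, String> emailsById = new HashMap<>();
    }

    static class RegisterUserHandler {
        private final UserStore store;
        RegisterUserHandler(UserStore store) { this.store = store; }
        void handle(RegisterUser command) {
            store.emailsById.put(command.id(), command.email().toLowerCase());
        }
    }

    // The behaviour actually under test: looking up a user's email.
    static String emailOf(UserStore store, String id) {
        return store.emailsById.get(id);
    }

    @Test
    void arrangeByReusingTheCommandHandler() {
        UserStore store = new UserStore();
        new RegisterUserHandler(store).handle(new RegisterUser("u1", "Ada@Example.com"));

        assertEquals("ada@example.com", emailOf(store, "u1"));
    }

    @Test
    void arrangeExplicitly() {
        UserStore store = new UserStore();
        store.emailsById.put("u1", "ada@example.com"); // hand-written starting state

        assertEquals("ada@example.com", emailOf(store, "u1"));
    }
}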
