Designing software tests that actually help — Part 1.

Ifeora Okechukwu
10 min read · Jan 17, 2021
[Image: Twitter conversation about testing]

Writing tests isn’t something many software developers associate with good use of their time (the argument usually runs along the developers vs. testers line — developers want little to do with writing tests 😂). Many developers believe the effort-to-benefit ratio doesn’t justify this crucial step in creating maintainable software, because testing supposedly doesn’t expose edge cases well enough and/or there isn’t enough time for it. However, most agree on its necessity to keep things in check (just in case). In other words, they treat it as optional. In reality, it mostly depends on the client’s ability to make provisions for it in the software project budget (in terms of time and money).

Nevertheless, in scenarios where we can write tests and have the time to, we should. It shouldn’t be optional. In this article, I am going to talk about designing tests (it doesn’t matter whether it’s TDD or ATDD or whatever you want to name your approach to testing) that actually help to certify software stability with a very high level of certainty (I won’t be talking about test coverage — sorry), and about which testing techniques better serve this objective. I will take us through 2 powerful testing techniques: Decision Table Testing and Equivalence Partition Testing. These techniques can ultimately reduce the number of test cases/suites you need to come up with (via sampling subsets) and still cover salient edge cases and expose unhandled error scenarios. I personally believe it is more important to focus on the quality of your test cases than to be fixated on test coverage scores. Don’t get me wrong: test coverage is important, but it is second on the scale of importance.

Now, it’s common knowledge that writing tests (TDD, BDD, or ATDD — whichever you prefer) costs time (usually time we don’t have). But here’s the funny thing: the same (or an even greater) amount of time we refuse to put into writing tests early on, we spend later on firefighting, hot-fixing, and squashing bugs till late at night! Error after error, bug after bug, we become consumed with getting things to “just work” rather than work reliably. So, just as one bug is getting fixed, we discover another just around the corner, usually the product of an unchecked precondition or an unexpected/unhandled state transition in the code.

This brings us back to the question on everyone’s mind.

What is the aim of testing code anyway if the tests I write don’t inform me about invalid state transitions and edge cases beforehand?

The answer to this question is to utilize techniques for coming up with tests that increase test coverage easily and quickly while improving test quality. I mentioned 2 of them earlier in this article, and I will discuss them in a bit.

The next question after this is usually

So, how do I know what to instructively test for? How do I come up with quality test cases?

This question is at the heart of designing tests. I will be answering this question in Part 2. For now, let’s continue…

Why we have bugs: 2 reasons

The core of the correctness (adherence to the requirement specifications) of any running software hinges on 2 things:

  • code execution’s flow (concurrent or asynchronous)
  • code state flow/transition (valid or invalid)

In any running software, if these 2 don’t align with each other, we get a bug! Bugs are the bane of any software in continual development. Beyond writing tests to ensure software correctness, we write tests to expose bugs, check exception handling, and check state changes in a part or the whole of the software we create. When there’s a misalignment in code execution flow, we get indeterminism (e.g. race conditions) leading to unpredictable behavior (think: timing issues, e.g. TOCTOU race conditions). And when there’s a misalignment in code state flow/transition, we get inconsistent shared state leading to undefined behavior (think: side-effect issues, e.g. the leaky state syndrome, which can be solved by making objects immutable, applying defensive copying, or applying non-destructive updates to mutable objects).
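
As a quick illustration of the leaky state syndrome, here is a minimal TypeScript sketch (the cart classes are made up for illustration): returning a live reference to internal state lets callers mutate an object behind its back, while a defensive copy does not.

```typescript
// Hypothetical example: a cart that leaks its internal state.
class LeakyCart {
  private items: string[] = [];

  add(item: string): void {
    this.items.push(item);
  }

  // Bug: callers receive a reference to the internal array,
  // so they can mutate the cart's state behind its back.
  getItems(): string[] {
    return this.items;
  }
}

class SafeCart {
  private items: string[] = [];

  add(item: string): void {
    this.items.push(item);
  }

  // Defensive copy: callers get a snapshot, not the live array.
  getItems(): string[] {
    return [...this.items];
  }
}

const leaky = new LeakyCart();
leaky.add("book");
leaky.getItems().pop(); // silently empties the cart!

const safe = new SafeCart();
safe.add("book");
safe.getItems().pop(); // the cart still holds "book"
```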

Before we move on, let’s get something straight: Errors aren’t Bugs and Bugs aren’t Errors. Bugs can cause errors and vice versa, but they are 2 different things.

Errors are caused by failures (or defects) in the execution of code logic, where the software, at runtime, is put into an incorrect and/or irreconcilable state due to a coding mistake (errors of omission and/or commission). These are mostly caused by breaking the rules of the programming language. Bugs, on the other hand, are faults in the logic (executed serially or in parallel — think “concurrency”) of the software, reflected in incorrect (or sometimes unpredictable) output/behavior for specific inputs under specific conditions. In other words, bugs cause software to give out unintended and unanticipated output (an error may or may not accompany a bug), while errors arise from an incorrect runtime state.
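
Here is a tiny TypeScript sketch of the difference (both functions are made up for illustration): the first produces an error (the runtime’s rules are broken and it throws), while the second is a bug (it runs fine but gives the wrong output).

```typescript
// An ERROR: the code breaks the runtime's rules and throws.
function firstWordLength(sentence?: string): number {
  // @ts-expect-error: deliberately unsafe, for illustration only
  return sentence.split(" ")[0].length; // TypeError if sentence is undefined
}

// A BUG: no error is raised, but the logic is wrong.
// Intended to average two numbers; operator precedence is off.
function average(a: number, b: number): number {
  return a + b / 2; // should be (a + b) / 2
}

average(2, 4);       // returns 4 instead of the expected 3, silently
// firstWordLength(); // would throw a TypeError at runtime
```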

We know that type-unsafe logic is one of the leading causes of errors/bugs in any piece of software, followed by the limitations of shared-state concurrency (message-passing concurrency seems a better option — a way to overcome the problem of shared mutable state and ensure thread-safe operations), and finally by incorrect handling of resources like heap memory allocations and/or I/O handles (e.g. file descriptors). There are also bugs that don’t impact overall software behavior or don’t matter to the development team. Such bugs aren’t registered and do not concern a tester, so they don’t count as bugs in the context of that software project (this can also apply where a bug is taken as a feature). See this apt yet funny distinction on Quora.
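
As a minimal, made-up sketch of type-unsafe logic: plain JavaScript silently coerces a string into a concatenation, while TypeScript rejects the same mistake at compile time.

```typescript
// In plain JavaScript, `+` silently coerces and concatenates:
//   "10" + 5  // "105", not 15
// TypeScript surfaces the same mistake at compile time:
function addToTotal(total: number, amount: number): number {
  return total + amount;
}

const fromInput: string = "10"; // e.g. a value read from a form field
// addToTotal(fromInput, 5);    // compile error: string is not assignable to number
addToTotal(Number(fromInput), 5); // explicit conversion first: 15
```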

We have two broad categories of testing:

  1. Black-box Testing
  2. White-box Testing

Black-box Testing: Black-box testing is a method of software testing that examines the functionality of software (without knowledge of its internal workings) against the requirement specifications. It is focused on testing the logical output of parts or the whole of the software. It tests invalid (exceptional) cases, representative cases, and boundary conditions. This category of testing doesn’t concern itself with non-functional issues like performance or security.

This method of testing can be applied at every level of software testing: unit, integration, system, and acceptance testing.
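
For example, here is a minimal black-box style sketch using Jest-style syntax (the isValidPassword rule of 8–64 characters is a made-up requirement): it checks boundary conditions purely through inputs and outputs, without peeking at the implementation.

```typescript
// Hypothetical requirement: passwords must be 8 to 64 characters long.
function isValidPassword(password: string): boolean {
  return password.length >= 8 && password.length <= 64;
}

describe("isValidPassword (black-box, boundary conditions)", () => {
  it("rejects a 7-character password (just below the lower bound)", () => {
    expect(isValidPassword("a".repeat(7))).toBe(false);
  });
  it("accepts an 8-character password (lower bound)", () => {
    expect(isValidPassword("a".repeat(8))).toBe(true);
  });
  it("accepts a 64-character password (upper bound)", () => {
    expect(isValidPassword("a".repeat(64))).toBe(true);
  });
  it("rejects a 65-character password (just above the upper bound)", () => {
    expect(isValidPassword("a".repeat(65))).toBe(false);
  });
});
```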

White-box Testing: White-box testing is a method of software testing that looks at the logical behavior of software based on how the requirement specifications are implemented (non-functional concerns) as well as the functionality of the software. This category of testing focuses heavily on introspection and code implementation optimization, including things like software performance.

This method of testing can also be applied at every level of software testing: unit, integration, system (e2e), and acceptance testing.

We aren’t going to talk about White-box testing: apart from being extremely cumbersome and expensive, it is rarely used in practice and scales badly with large codebases/applications. We will look at Black-box testing instead, as it is easier, faster, and scales better.

2 reasons why coming up with test cases can seem difficult

Now, let's talk about test design and how to come up with test suites/cases using the techniques under Black-box testing. Testing is about being specific in each case and leaving the frivolities out! For instance, some software engineers try to test aspects of a software artifact/unit that are a waste of time to test. There’s no need to test the length of arguments passed to an object’s method, or to test that an object’s setters and getters work. These are things that can be handled by invariants. You also don’t have to connect your unit tests to actual live databases. Apart from making your unit tests (running in and as part of a CI/CD pipeline) slower over time (because they have to connect to live databases and run live queries), it doesn’t validate your tests any more than using a mock database would. There’s a reason why testing frameworks come with function stubs and mocks. There is, however, one rule of thumb in stubbing/mocking anything:

Don’t mock anything you don’t utilise fully or own and control!
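
In practice, that rule means wrapping a third-party dependency in a thin module you own, and mocking that wrapper in your tests. Here is a minimal Jest-style sketch (the paymentGateway module and chargeCard function are hypothetical):

```typescript
// paymentGateway.ts: a thin wrapper we own around a third-party payment SDK.
// (Module and function names here are hypothetical.)
export async function chargeCard(amountCents: number): Promise<boolean> {
  // ...would call the third-party SDK internally...
  return true;
}

// checkout.test.ts: we mock OUR wrapper, not the vendor's SDK directly.
import { chargeCard } from "./paymentGateway";

jest.mock("./paymentGateway");
const mockedChargeCard = chargeCard as jest.MockedFunction<typeof chargeCard>;

it("treats checkout as successful when the charge succeeds", async () => {
  // Condition: the gateway approves the charge.
  mockedChargeCard.mockResolvedValue(true);

  // Output: whatever checkout logic depends on chargeCard sees success.
  expect(await chargeCard(1999)).toBe(true);
});
```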

In order to fully test that all the requirements of a software application are met, there must be at least two test cases for each requirement: one positive test and one negative test (think: Boolean). Also, the link between a requirement specification and a test suite (and all its test cases) is usually tracked in a document called a Requirements Traceability Matrix. Below is a sample of what it looks like:

[Image: sample Requirements Traceability Matrix — courtesy of Software Testing: Introduction to Testing by Eze Sunday Eze]
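
In outline, a simplified (entirely made-up) matrix looks like this — the requirement IDs, test case IDs, and statuses are illustrative only:

FR ID  | Functional requirement                              | Test case(s)   | Status
FR-001 | User can log in with a valid email and password     | TC-001, TC-002 | Pass
FR-002 | Login button stays disabled until inputs are valid  | TC-003         | Pass
FR-003 | An error message shows for invalid credentials      | TC-004, TC-005 | Fail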

To be able to come up with the above document, the functional requirements have to be clear and precise (no ambiguity). Also, this document need not be produced before you start writing actual tests (whether you practice TDD or not). It mostly serves as a sign-off step certifying that the software built works correctly, and it can be useful for tracking bugs from when they are reported in production to when they are squashed!

Now, even though test design is a manual process, it is a logical one too. The same goes for unit/integration tests.

Don’t focus on the behavior and properties of the single software artifact/unit you’ve just built (a.k.a. don’t focus on implementation details). You should rarely have to change your tests when you change implementation details. Instead, focus on the output under given conditions that match the requirements.

Let me break it down a bit more:

For each top-level functional requirement, you are focused on the output under given condition(s). This is why we use the term “Assertion” in testing.

Assertions are the result of combining condition(s) with output. They are the lifeblood of tests. Below is the structure of a test case in a diagram:

[Image: diagram of a test case]

The reason I put up the figure above (the diagram of a test case) ☝🏾 is that it is the pattern you need to fit a React component, a web route action method, or any software artifact/unit into. For example, a ReactJS component receives input and gives output; however, its output is dependent not only on its input but also on the conditions the component is subjected to.

The output is supposed to change (or remain the same) under given conditions. Remember, I said to focus on output and conditions (a condition could be the data type/value of the input, or passing no input at all to the software artifact/unit). Assertions are carried out on the output to determine if a test case passes or fails.
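
To make that concrete, here is a minimal sketch in TypeScript using Jest-style syntax (the formatPrice function is a made-up unit under test): each test case sets up a condition and asserts on the output.

```typescript
// Made-up unit under test.
function formatPrice(amount: number, currency: string = "USD"): string {
  if (Number.isNaN(amount)) return "N/A";
  return `${currency} ${amount.toFixed(2)}`;
}

describe("formatPrice", () => {
  // Condition: a normal numeric input. Assertion: formatted output.
  it("formats a valid amount", () => {
    expect(formatPrice(9.5)).toBe("USD 9.50");
  });

  // Condition: an invalid (NaN) input. Assertion: fallback output.
  it("falls back to N/A for NaN input", () => {
    expect(formatPrice(Number.NaN)).toBe("N/A");
  });
});
```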

Let’s take a real-life example: how would we go about designing a test (suite) for a login page, based on the functional requirements for a fictitious software project (assume this page is built as a React/Vue component)?

Firstly, we need to create a decision table to flesh out the details of test case assertion(s), conditions, and output. Below is the decision table:

[Image: Decision table — Login page]
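
In outline (outputs abbreviated, and reconstructed from the three input conditions discussed below, so treat them as illustrative), the full truth-table layout looks like this:

Condition                  | R1 | R2 | R3 | R4 | R5 | R6 | R7 | R8
Email entered              | F  | F  | F  | F  | T  | T  | T  | T
Password entered           | F  | F  | T  | T  | F  | F  | T  | T
Password passes validation | F  | T  | F  | T  | F  | T  | F  | T
Output                     | E  | X  | E  | E  | E  | X  | E  | OK

E = “Show error message, Login button disabled”; X = impossible combination; OK = login proceeds (button enabled).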

One thing about decision tables is that they easily persuade you to look at every possible combination of input conditions and output. This makes it easy to see both edge cases and impossible cases that won’t be obvious at first glance or while writing the code to implement the login page. The downside is that with each input condition added, the number of combinations to inspect grows exponentially (2^n, where n = number of input conditions). The upside is that it helps with test coverage for both simple and complex software in terms of business logic and requirements. Decision table testing is a form of Black-box testing (as is Equivalence partition testing, which I will discuss in Part 2 of this article).

You might ask: how do we deal with the exploding number of combinations (and the proportional number of assertions) as we add more input conditions? In the decision table above, we have 5 assertions (that’s because I already took away 3). Now, because the decision table can be treated as a good old truth table (meaning the table above ought to have 8 assertions — 2^3, with n = 3), we can have impossible-case assertions as well as edge-case assertions. To begin refining the assertions, we remove the redundant-case and impossible-case assertions (but keep the edge-case assertions) based on the combination of input conditions for each assertion.

For instance, “Email entered” and “Password entered” cannot both be FALSE while “Password passes validation” is TRUE. That’s outright impossible, so we prune that assertion out (meaning no test case will test for it). Also, it is important to spot a condition (or set of conditions) that doesn’t directly or indirectly affect the output across the input conditions of each assertion. For instance, whether “Email entered” is TRUE or FALSE, as long as “Password entered” is FALSE, the output will always be “Show error message, Login button disabled”. It won’t matter whether “Password passes validation” is TRUE or FALSE. So we get rid of Assertion 1, as it is a redundant-case assertion.
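
If you like, this pruning can even be scripted. Here is a small TypeScript sketch (the condition names mirror the table above) that enumerates all 2^n combinations and filters out the impossible ones:

```typescript
// Each rule (column) of the decision table as a combination of conditions.
type Rule = {
  emailEntered: boolean;
  passwordEntered: boolean;
  passwordValid: boolean;
};

// Enumerate all 2^3 = 8 combinations of the three input conditions.
const allRules: Rule[] = [];
for (const emailEntered of [false, true]) {
  for (const passwordEntered of [false, true]) {
    for (const passwordValid of [false, true]) {
      allRules.push({ emailEntered, passwordEntered, passwordValid });
    }
  }
}

// Prune impossible combinations: a password that was never entered
// cannot pass validation.
const feasible = allRules.filter(
  (rule) => !(rule.passwordValid && !rule.passwordEntered)
);

console.log(allRules.length, feasible.length); // 8 6
// Redundant combinations (where a condition can't change the output)
// are then collapsed by inspection, as done above for Assertion 1.
```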

Now, we are down to 4 assertions from 8! How cool is that? Yippee!! 🎉

Meanwhile, we can go ahead and split each of the remaining 4 assertions into test cases inside a test suite, like so:
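
Here is a minimal sketch using Jest-style syntax (the loginFormState helper is a made-up stand-in for the component’s view logic, and the 8-character rule is an assumed validation), with one test case per surviving assertion:

```typescript
// Made-up stand-in for the login component's view logic.
type LoginView = { buttonEnabled: boolean; errorShown: boolean };

function loginFormState(email: string, password: string): LoginView {
  const emailEntered = email.length > 0;
  const passwordEntered = password.length > 0;
  const passwordValid = passwordEntered && password.length >= 8; // assumed rule

  const allConditionsPass = emailEntered && passwordEntered && passwordValid;
  return { buttonEnabled: allConditionsPass, errorShown: !allConditionsPass };
}

describe("Login page: decision table test suite", () => {
  it("shows an error and disables the button when no password is entered", () => {
    expect(loginFormState("user@example.com", "")).toEqual({
      buttonEnabled: false,
      errorShown: true,
    });
  });

  it("shows an error and disables the button when no email is entered", () => {
    expect(loginFormState("", "s3cretpass")).toEqual({
      buttonEnabled: false,
      errorShown: true,
    });
  });

  it("shows an error and disables the button when the password fails validation", () => {
    expect(loginFormState("user@example.com", "short")).toEqual({
      buttonEnabled: false,
      errorShown: true,
    });
  });

  it("enables the button when all conditions pass", () => {
    expect(loginFormState("user@example.com", "s3cretpass")).toEqual({
      buttonEnabled: true,
      errorShown: false,
    });
  });
});
```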

Finally, using Black-box testing doesn’t stop you from doing Ad-hoc testing (manual testing by humans, which introduces a little entropy and improvisation). But using Ad-hoc testing alone is not a great testing strategy and lacks any design. It is good to realize that testing isn’t about eliminating bugs completely but about mitigating them, so don’t overdo it. Another thing: having a software product map (perhaps a mind map) is also crucial to enabling great test design and speedy test case development. See you in the next article — Part 2!
