Unit testing approaches

Published on: 2025-11-02
Reading time: 11 minutes

Some thoughts about how to structure tests and code to make testing easier.

By Tom Hulton-Harrop

Motivation

Writing tests is crucial, but doing so without the right approach can lead to maintenance headaches and hard to diagnose failures. In the first part of this post, we’ll look at what makes a good unit test and why, and in the second part, we’ll see how we can make our types easier to test.

Part 1 - Writing unit tests

This section covers some basic patterns for writing and structuring unit tests.

Follow the Given, When, Then approach

All tests (unit or otherwise) should have a similar structure to make reading and understanding them as easy as possible (it’s a good idea to always keep in mind the future developer who’ll be trying to understand your test the next time it fails). For trivial unit tests that require no setup, this approach might be overkill (e.g. a test to validate a mathematical function), but in nearly all other cases this approach can help give tests structure.

The Given, When, Then approach divides a test into three main parts. The setup (Given scenario X), the action (When Y happens), and the result (Then Z is observed). It’s an incredibly simple, yet effective approach, and can dramatically simplify a test body to help the reader focus on what’s important.

TEST(VectorTest, Iterator_increment_advances_to_next_element)
{
    // Given
    auto elements = std::vector<int>{1, 2, 3, 4, 5};
    auto it = elements.begin();
    // When
    it++;
    // Then
    EXPECT_THAT(*it, Eq(elements[1]));
}

The example above is purely illustrative, but gives an idea of the shape you can come to expect from tests. The setup (Given) part can get more complex, in which case it’s often a good idea to move some of that logic to a function or separate fixture, but the action and result sections should ideally be quite small (see the next part for why).

Test one thing at a time

Each unit or integration test you write should ideally only have one reason to fail. If a test can fail for a myriad of reasons, it becomes much more difficult for the maintainer to understand the source of the failure and what exactly is broken. Striving for one assertion/expectation per test is a good idea, though it’s by no means a hard rule (in cases where related post conditions can’t be easily verified by one assertion, it’s fine to have two or three, but they should all be logically related to the specific behaviour under test).

TEST(VectorTest, Iterator_addition_advances_by_offset_amount)
{
    // Given
    auto elements = std::vector<int>{1, 2, 3, 4, 5};
    auto it = elements.begin();
    // When
    it += 3;
    // Then
    EXPECT_THAT(*it, Eq(elements[3]));
}

Notice above that we’re reusing the vector example from the initial snippet, and have created a separate test to independently verify that after initializing it to elements.begin(), adding 3 to it returns an iterator to the 4th element in elements. It might have been tempting to throw this behaviour and assertion into the former test, but that’s a code smell, as we’d then be testing two things at once. It’s true the first test will often fail if the second does, but not necessarily. By structuring our tests in this way, we’re giving the maximum amount of information to the reader and helping verify the independent behaviours of our type.

Test behaviours, not functions

When writing unit tests, it’s easy to fall into the trap of mapping a test directly to a function. This is occasionally appropriate (math types/functions might fit this category), but nearly all other types (even types such as collections) benefit from testing behaviours themselves. By testing behaviours, we’re focusing more on the functionality of the type and not the implementation. It’s rare that we use a function in isolation; we usually need to call multiple functions to achieve the result we’d like.

TEST(VectorTest, Vector_size_increases_by_one_when_an_element_is_added)
{
    // Given
    auto elements = std::vector<int>{1, 2, 3, 4, 5};
    const auto size_before = elements.size();
    // When
    elements.push_back(6);
    // Then
    EXPECT_THAT(elements.size() - size_before, Eq(1));
}

In the above example, we could have tried to write a single test called Vector_size, and included lots of calls to add/remove elements with assertions scattered throughout, but that test would then have more than one reason to fail, and wouldn’t be fun to debug. The test might fail for any one of several reasons, none of which we’d be able to ascertain from the test name alone.

See it fail

This isn’t so much related to code, but is a worthwhile habit to get into. It is always a good idea to actually see a test fail when first implementing it. When using an approach such as TDD, this is more or less guaranteed, but with tests written after the fact, it’s surprisingly common to have a test that passes even after the code is modified to intentionally make it fail. Before wrapping up work on any unit test, try to modify the test, or the code under test (comment out the function body, for example, or return a placeholder value), and ensure the test does in fact fail and catch the regression. It’s scary how often this isn’t the case.

Hamcrest (Matchers) notation for readability and consistency

This is a bonus point and not as important as the first four, but worth a brief mention. Adopting Hamcrest notation (Hamcrest is an anagram of Matchers) can cut out a lot of test boilerplate and give test expectations a consistent feel. There may of course be times where EXPECT_TRUE is a better fit, but it’s good practice not to mix them in a single test to help with consistency and readability.

For more information on Matchers, see this great reference about Google Test, or check out this series (1, 2, 3) on Google Test Matchers from the O3DE blog.

Part 2 - Making code testable

This section covers approaches for how to make types easier to test.

Understand the different schools of testing

Before looking at two concrete approaches to make types and functions easier to test, let us review the two main schools of thought when it comes to testing. In the testing community, these two camps are referred to as the Classical School, and the London School.

The Classical School favours querying the state of an object after an action has been performed. Think of how you might test std::vector. You perform an operation on the type, and then query the state afterwards (e.g. you push a value, and then verify that the size of the vector increased by one). The main advantage of this approach is that it prevents tests from becoming too coupled to the implementation of the type. They rely on the public interface, and so should not break when internal details change. A drawback is that internal state can leak out of the object, exposed only for the benefit of a test (see the later section, Break down types, for one possible solution to this).

The London School follows the ‘Tell, don’t Ask’ object-oriented mentality, and relies on mock objects and fakes to ‘sense’ side effects. An example of this might be calling a particular API function to delete all files in a directory. The system call to remove a file is replaced with an interface that can be mocked. In the test, we know how many files are in our virtual directory, and therefore know how many times ‘Remove’ should be called. We can configure our mock object to expect ‘Remove’ to be called the requisite number of times. Something like this can be hard to test with the traditional Classical approach. One downside to this technique is that tests can become too coupled to a type’s implementation, resulting in tests that must change with each update to the implementation. There are also situations where a prevalence of mocks can lead to significant complexity and maintenance headaches.

Both approaches have their place, and considering which to use ahead of time can help simplify writing tests.

Break down types (aka Extract Method/Extract Class)

Trying to figure out how to test a specific piece of functionality can be tricky. Oftentimes the thing you want to verify is locked away inside a type and is not easily accessible. When faced with this situation, there are a number of options (some good, and some not so good… cough #define private public cough). One technique that has multiple benefits is to simply break down the type you wish to test into smaller pieces of functionality, and test each of those in isolation. The functionality can then be reintroduced and used via composition in the enclosing type (in the literature, this is sometimes referred to as Extract Method or Extract Class, depending on whether a function or type is used).

Let’s take the example of a simple Breakout clone. Breakout is composed of several simple elements (a ball, a paddle, blocks and walls). In the initial version of the game, a Breakout type is created that itself handles creating and destroying blocks. This all happens internally, so it is difficult to test. To make things more testable, one option is to add more public functions to the Breakout game. This, however, has the downside of harming encapsulation. As far as the Breakout game is concerned, the only public interface is giving the player access to move the paddle left and right. To make it easier to verify how blocks are created and destroyed, a new type can be created called Board (Extract Class). This type can provide a rich API for interfacing with blocks and so can be tested and verified independently. It can then be added as a private member to Breakout, where it is used through its public API.

This approach has the added benefit of reducing the overall responsibility of Breakout by limiting what it is responsible for. We can also reuse Board in other contexts in future if we wish. This technique can even be applied recursively to further decompose types and can be used to help tame the complexity of large monolithic systems.

Add customization points

Adding customization points (also known in some contexts as dependency injection, or ‘seams’) is a great way to make a type easier to test. One simple example might be providing an interface to print to the console that is used instead of a raw call to printf. In production code, we provide an implementation that simply calls printf, but in the test we create a new implementation that writes to an in-memory representation that is trivial to query and verify. This need not even be a full interface; it could just be a std::function that is bound at construction (this massively reduces the boilerplate of creating a base class and overriding the interface in a derived implementation). If the overhead of the indirection of a virtual function call is a concern, it is possible to override such an interface at link time by carefully setting up the library functions to have a separate implementation at test time (this does of course introduce added setup and complexity, but may be warranted in certain situations). There is also the option in C++ of providing the function as a template argument.

Keeping these interfaces small by limiting how much any one interface is responsible for, or simply using a single function, makes providing overrides much simpler (large interfaces can be tedious and time consuming to override). When writing code that depends on any other concrete type (collaborator), consider whether it might be worth introducing a level of indirection to break the dependency, so that the type can be instantiated and used in a test.

Deliberation

Testing is hard, and takes time to get right. There’s no silver bullet, but with these principles and techniques in mind, we can save our future selves both time and headaches trying to debug hard to diagnose test failures. Any test is better than no test at all, so if things get a bit ugly, don’t sweat it. We’re all learning and trying to improve how we test, so there are certain to be missteps along the way, but hopefully we can all continue refining our approach and learn from one another along the way.

Further reading

If you’re keen to learn more about testing, these resources are well worth a look.
