Defining ‘integration’ tests

What is an integration test? Everyone uses the term, but few seem to agree on a definition. Does the awkward filling of the testing pyramid still have value today, or is it too murky to be useful?



The testing pyramid, with the integration slice breaking out.

I’ve been working on a project with this TODO item as part of the PR checklist:

- [ ] Write integration tests.

I’ve never once seen it ticked. I’m convinced it’s because nobody knows what it means.

With the possible exception of the Jamstack, I’ve never seen a term so rife with conflicting definitions. Not one engineer, QA or product owner I’ve spoken to seems to agree — at least in concrete terms — and yet we all collectively use it.

Definitions in the wild

Here are some definitions I’ve come across while reading. There are common threads, but arguably no clear consensus.

Martin Fowler & the integration split

Martin Fowler splits modern integration tests into two camps:

  • Narrow integration tests — Tests which exercise code that talks to an external service (e.g. an API), but that use test-doubles to simulate dependencies. These can be either in-memory or out-of-process.
  • Broad integration tests — Tests which use real/live services instead of test doubles, and which exercise entire code paths throughout the system.

By this definition, and Martin’s own admission, ‘broad’ integration tests are effectively end-to-end tests, and depending on scope may even be closer to acceptance or ‘smoke’ tests:

All this is why I’m wary with “integration test”. When I read it, I look for more context so I know which kind the author really means. If I talk about broad integration tests, I prefer to use “system test” or “end-to-end test”.

Taking the narrow definition, a test of an API client against a stub server is an integration test. What about using a real database to test a repository? The dependency is ‘real’, so the test perhaps fits the broader definition, but it doesn’t ‘exercise code paths through all services’ or ‘require live versions’ of everything.
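
To make the ‘narrow’ case concrete, here’s a minimal sketch of that kind of test: a hypothetical UserClient exercised against an in-process stub server (Node’s http module standing in for the real API). The client and endpoint are invented for illustration.

```typescript
import http from 'node:http';

// Hypothetical client under test: a thin wrapper over a remote user service.
class UserClient {
  constructor(private readonly baseUrl: string) {}

  async fetchName(id: string): Promise<string> {
    const res = await fetch(`${this.baseUrl}/users/${id}`);
    if (!res.ok) throw new Error(`Unexpected status: ${res.status}`);
    const body = (await res.json()) as { name: string };
    return body.name;
  }
}

describe('UserClient (narrow integration)', () => {
  let server: http.Server;
  let baseUrl: string;

  beforeAll(async () => {
    // The external service is simulated by an in-process stub server.
    server = http.createServer((_req, res) => {
      res.setHeader('Content-Type', 'application/json');
      res.end(JSON.stringify({ name: 'Jay' }));
    });
    await new Promise<void>((resolve) => server.listen(0, resolve));
    const { port } = server.address() as { port: number };
    baseUrl = `http://localhost:${port}`;
  });

  afterAll(() => {
    server.close();
  });

  it('fetches a user name over HTTP', async () => {
    const sut = new UserClient(baseUrl);

    await expect(sut.fetchName('1')).resolves.toEqual('Jay');
  });
});
```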

David Farley, Jez Humble & the twice-run test

Dave and Jez touch on integration testing in Continuous Delivery, noting that:

We use the term integration testing to refer to tests which ensure that each independent part of your application works correctly with the services it depends on. Normally, integration tests should run in two contexts: firstly with the system under test running against the real external systems it depends on, or against their replicas controlled by the service provider, and secondly against a test harness which you create as part of your codebase.

Continuous Delivery: Chapter 4, Implementing a Testing Strategy

The run against a test harness could theoretically be part of your check-in suite, while the run against the real external systems (or their provider-controlled replicas) would serve the role of an acceptance or smoke test after deployment into a production-like environment. This feels like a combination of Martin’s narrow and broad definitions.
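
Read that way, the same suite can simply be pointed at two different base URLs: a local test harness during check-in, and the provider’s real (or replica) endpoint after deployment. A rough sketch of the idea, with a hypothetical PaymentsClient and PAYMENTS_URL variable rather than anything from the book:

```typescript
// Hypothetical client that talks to an external payments provider.
class PaymentsClient {
  constructor(private readonly baseUrl: string) {}

  async ping(): Promise<boolean> {
    const res = await fetch(`${this.baseUrl}/health`);
    return res.ok;
  }
}

// The same assertions run in two contexts: against an in-repo test harness
// (as part of the check-in suite) and against the real or replica service
// (as a post-deployment smoke test), selected by an environment variable.
const baseUrl = process.env.PAYMENTS_URL ?? 'http://localhost:4010';

describe(`PaymentsClient against ${baseUrl}`, () => {
  it('reaches the service it depends on', async () => {
    const sut = new PaymentsClient(baseUrl);

    await expect(sut.ping()).resolves.toBe(true);
  });
});
```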

Vladimir Khorikov & the single unit

Vladimir claims that integration tests are any which fail to meet the criteria for unit tests (meaning that anything higher than ‘unit’ on the pyramid is effectively an integration test). A unit test is any which:

  • Verifies a single unit of behavior.
  • Does it quickly.
  • Does it in isolation from other tests.

Tests which interact with an out-of-process dependency (such as a database) are therefore integration tests, since tests will likely share the same resource.
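
As a sketch of that isolation point (connection details and schema are assumed, using the pg client): both tests below share one real database, so state has to be reset between them and they can’t naively run in parallel, which is exactly what pushes them out of the ‘unit’ bucket.

```typescript
import { Client } from 'pg';

describe('users table (shared out-of-process state)', () => {
  // A single real Postgres instance, shared by every test in the suite.
  const client = new Client({ connectionString: process.env.DATABASE_URL });

  beforeAll(async () => {
    await client.connect();
    await client.query(
      'CREATE TABLE IF NOT EXISTS users (id text PRIMARY KEY, name text NOT NULL)'
    );
  });

  beforeEach(async () => {
    // Shared state has to be reset by hand between tests.
    await client.query('TRUNCATE TABLE users');
  });

  afterAll(() => client.end());

  it('saves a user', async () => {
    await client.query('INSERT INTO users (id, name) VALUES ($1, $2)', ['1', 'Jay']);

    const { rows } = await client.query('SELECT count(*)::int AS count FROM users');
    expect(rows[0].count).toEqual(1);
  });

  it('starts from a clean slate', async () => {
    // Without the TRUNCATE above, the row from the previous test would leak in here.
    const { rows } = await client.query('SELECT count(*)::int AS count FROM users');
    expect(rows[0].count).toEqual(0);
  });
});
```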

Where things get murky is the definition of a ‘single unit of behaviour’, particularly when comparing the London and Detroit schools of thought:

The London school considers any test that uses a real collaborator object an integration test. Most of the tests written in the classical style would be deemed integration tests by the London school proponents.

Unit Testing Principles, Practices, and Patterns: 2.4

Taking a mockist (or ‘London’) perspective, a test for an Order class which is passed a real User (rather than a test double) is a form of integration test, since it’s using a real collaborator:

```typescript
// SUT
class Order {
  #user: User;

  constructor(user: User) {
    this.#user = user;
  }

  print(): string {
    return `Order for: ${this.#user.name()}`;
  }
}

// Abstract collaborator
interface User {
  name(): string;
}

// Concrete collaborator
class ConcreteUser implements User {
  #name: string;

  constructor(name: string) {
    this.#name = name;
  }

  name(): string {
    return this.#name;
  }
}

describe('Order', () => {
  // Integration test
  it('prints the order information', () => {
    // Instantiated with a real collaborator
    const sut = new Order(new ConcreteUser('Jay'));

    expect(sut.print()).toEqual('Order for: Jay');
  });

  // Unit test
  it(`prints the user's name`, () => {
    // Instantiated with a test double 🙈
    const sut = new Order({ name: () => 'Jay' });

    expect(sut.print()).toEqual(expect.stringContaining('Jay'));
  });
});
```

Steve Freeman, Nat Pryce & the us/them divide

In Growing Object-Oriented Software, Steve and Nat also touch on the subject:

We use the term integration tests to refer to the tests that check how some of our code works with code from outside the team that we can’t change. It might be a public framework, such as a persistence mapper, or a library from another team within our organization. The distinction is that integration tests make sure that any abstractions we build over third-party code work as we expect.

Growing Object-Oriented Software: Chapter 1

A test of an API client wrapper, then, is a form of integration test, as is a test for an ORM-driven repository. Certainly most out-of-process dependencies qualify on some level (although does the filesystem count?), since they are ‘someone else’s code’, though the definition here is broader in one respect: it also encompasses in-memory libraries and frameworks.
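
The filesystem question is a nice test of this definition. In that spirit, here’s a sketch of an abstraction over node:fs exercised against the real filesystem, checking that the wrapper behaves the way we expect someone else’s code to behave (JsonStore is invented for illustration):

```typescript
import { mkdtemp, readFile, writeFile } from 'node:fs/promises';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

// Hypothetical abstraction over code we don't own (node:fs).
class JsonStore {
  constructor(private readonly dir: string) {}

  async put(key: string, value: unknown): Promise<void> {
    await writeFile(join(this.dir, `${key}.json`), JSON.stringify(value), 'utf8');
  }

  async get<T>(key: string): Promise<T> {
    return JSON.parse(await readFile(join(this.dir, `${key}.json`), 'utf8')) as T;
  }
}

describe('JsonStore (integration in the Freeman & Pryce sense)', () => {
  it('round-trips a value through the real filesystem', async () => {
    const dir = await mkdtemp(join(tmpdir(), 'json-store-'));
    const sut = new JsonStore(dir);

    await sut.put('user', { name: 'Jay' });

    await expect(sut.get('user')).resolves.toEqual({ name: 'Jay' });
  });
});
```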

Joe Rainsberger & the integra[tion|ted] scam

Screenshot of J.B.’s talk

J.B.'s infamous Integration tests are a scam talk suggests preferring contract and collaboration tests (both subsets of unit tests in this context) to integration tests, which test a collection of modules together:

I use the term […] to mean any test whose result (pass or fail) depends on the correctness of the implementation of more than one piece of non-trivial behavior.

J.B. suggests that the only place integration tests are needed are the points where the system interacts with external dependencies (the infrastructure layer of hexagonal architecture), such as a database or external service.
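
Roughly: a collaboration test asks ‘do we call our collaborator correctly?’, while a contract test asks ‘does every implementation of that collaborator honour the same expectations?’. A minimal sketch with an invented Mailer port (not J.B.’s own example):

```typescript
// Invented port for illustration.
interface Mailer {
  send(to: string, body: string): Promise<void>;
}

class WelcomeService {
  constructor(private readonly mailer: Mailer) {}

  async welcome(email: string): Promise<void> {
    await this.mailer.send(email, 'Welcome aboard!');
  }
}

// Collaboration test: verifies we ask the collaborator the right question.
describe('WelcomeService', () => {
  it('sends a welcome email', async () => {
    const sent: Array<[string, string]> = [];
    const mailer: Mailer = { send: async (to, body) => { sent.push([to, body]); } };

    await new WelcomeService(mailer).welcome('jay@example.com');

    expect(sent).toEqual([['jay@example.com', 'Welcome aboard!']]);
  });
});

// Contract test: the same expectations applied to every Mailer implementation,
// so the doubles used elsewhere can't drift from the real thing.
function mailerContract(name: string, makeMailer: () => Mailer) {
  describe(`${name} honours the Mailer contract`, () => {
    it('resolves when given a recipient and a body', async () => {
      await expect(makeMailer().send('jay@example.com', 'hi')).resolves.toBeUndefined();
    });
  });
}

mailerContract('in-memory mailer', () => ({ send: async () => {} }));
// A real SMTP-backed Mailer would register itself here too; by this reasoning,
// that edge-of-the-system contract test is the only place real infrastructure is needed.
```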

However, to complicate matters, J.B. renamed the talk and the associated post to use the term integrated tests, rather than integration tests, citing the following confusion:

…you will notice me change from “integration tests” to “integrated tests”, because I believe the latter term better fits the meaning I intend to convey as well as avoids confusion with what everyone else means by “integration tests”. I agree to reserve the term “integration tests” for tests that focus on checking the integration points between subsystems, systems, or any nontrivial client/supplier relationship. Integration tests might be integrated tests, and might be collaboration tests.

We can assume that J.B. now sees integration tests as close to Martin’s ‘broad’ definition, and was previously using the term in the mockist sense. However, given that the original talk places the (now) ‘integrated’ tests in the middle of the pyramid, and that J.B. suggests that these tests only belong at the edges of the system, it feels like both are somewhat conflated.

Retiring the term

For my money, the easiest definition is:

Any test which interacts with an out-of-process dependency.

This covers file system tests and repository tests (i.e. anything which hits a real database). Whether it covers API client wrappers (tested against stub servers) or wrappers around your favourite framework remains up for debate.

However, I’m not convinced it’s a meaningful distinction. Many tests do need to interact with an out-of-process dependency, but remain fast enough to be part of your check-in suite. Moreover, it’s frequently sensible to test your system against a real out-of-process dependency, rather than a test double.

Are these integration tests? Probably, but containerised databases are fast, and test isolation (and therefore parallelism) isn’t the challenge it once was. If you’re truly testing the integration of two distinct systems (such as micro-services in a service-orientated architecture), you’re probably in the realm of E2E testing.
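
For what it’s worth, here’s a sketch of the kind of test I have in mind, assuming the @testcontainers/postgresql and pg packages (the schema is invented). Each suite gets its own throwaway Postgres, so isolation and parallelism come more or less for free:

```typescript
import { PostgreSqlContainer, StartedPostgreSqlContainer } from '@testcontainers/postgresql';
import { Client } from 'pg';

describe('users table (against a real, containerised Postgres)', () => {
  let container: StartedPostgreSqlContainer;
  let client: Client;

  beforeAll(async () => {
    // A throwaway database per suite: no shared state with any other test run.
    container = await new PostgreSqlContainer().start();
    client = new Client({ connectionString: container.getConnectionUri() });
    await client.connect();
    await client.query('CREATE TABLE users (id text PRIMARY KEY, name text NOT NULL)');
  }, 60_000); // allow time for the container image to pull and start

  afterAll(async () => {
    await client.end();
    await container.stop();
  });

  it('round-trips a user', async () => {
    await client.query('INSERT INTO users (id, name) VALUES ($1, $2)', ['1', 'Jay']);

    const { rows } = await client.query('SELECT name FROM users WHERE id = $1', ['1']);
    expect(rows[0].name).toEqual('Jay');
  });
});
```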

Perhaps it’s time to delete the checkbox.

- [ ] Write integration tests.
