Beyond unit tests

All the projects I have worked on were covered by various types of tests to ensure that the developed code functioned as expected. Interestingly, almost every project had a slightly different combination of test types. I have also noticed that each company named and structured those tests differently. Because the definitions are somewhat blurred, I thought it would be a good idea to take a closer look and describe how the tests were constructed, what their purpose was, and what it was like to work with them from a developer's perspective.

Different test types

Below, I have listed the most memorable types I have seen:

  • unit tests: Definitely the most common and best-known test type, well defined by Martin Fowler in his UnitTest bliki article. We used them to test pure business logic in isolation from external dependencies such as file system access, the database, the network, etc. They are the fastest tests, as their scope is very small and all external dependencies are mocked.
  • integration tests: I should probably say application-internal integration tests, as we used them to test all classes responsible for communication with external dependencies (database, file system, etc.) within a developed service or application. They have the same scope as unit tests, but they are much slower.
  • automated GUI acceptance tests: We used those to test the GUI of a desktop or web application with automation tools like Selenium or QTP. In one project, they were used to verify business scenarios of a desktop application deployed and configured in a testing environment, so the tests were heavy and slow, as their scope was the whole application. In another project, the tests covered only the thin presentation layer of a web application, with other parts, such as back-end services, isolated.
  • service acceptance tests: We used those tests in various projects and companies to verify the behavior of services (HTTP or message based). They were specific to user scenarios, usually defined by the Product Owner and/or Quality Assurance. Their scope was a single service or a few services forming a logically autonomous component.
  • end-to-end tests: Those tests were used in projects focusing on bigger systems, especially ones with an SOA architecture, and we used them to ensure that all services or components worked together properly. Like acceptance tests, they were scenario based, but their scope was basically the whole system.
  • manual user acceptance tests: Those tests were executed manually by QA to ensure that the application worked as expected. Depending on the nature of the software, they were similar to one of the previous three test types.

In fact, the list was much longer (regression, smoke, load and performance tests, etc.), but I have decided to omit those, as their structure, scope and the way they work are similar to the tests mentioned above. The only difference is the reason they are created, their function, or just their name.

Beyond unit tests

Unit tests are the most obvious and best-known test type, and they can be spotted in most projects. Throughout my career those tests were always present (except perhaps in the very first projects I worked on). The presence of other test types was not as obvious. I would say that in projects started before the Agile era, most of the tests were manual. They were either defined very loosely (for example: check that the application starts and that it is possible to do X), or structured as a list of steps to execute together with the expected results. I will skip manual tests in the rest of this post, as they were usually performed by a separate team of testers and involved no programming. I would just like to mention two things about them:

  • manual user acceptance tests were the basis for writing their automated versions (later referred to as acceptance tests),
  • the formalized version of manual tests was written as steps and expectations, so in reality it was a precursor of BDD-like tests.

The integration tests were definitely not unit tests, because they tested integration with external dependencies such as databases. Where present, we used them to test all classes interacting directly with externals, such as those following the repository pattern. In order to run them, we needed a real database to connect to (where possible, we used an in-memory or file-based database such as SQLite to make the tests easier to run) or sample input files to work with. Besides that, those tests were not much different from unit tests. Because of this strong similarity, I will omit them from now on.
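To make the in-memory database idea concrete, here is a minimal sketch in Python using the standard sqlite3 module. The UserRepository class, its table and its methods are hypothetical names invented for this illustration; they do not come from any of the projects described here.

```python
import sqlite3
from typing import Optional


class UserRepository:
    """Hypothetical repository: the only class that talks to the database."""

    def __init__(self, connection: sqlite3.Connection) -> None:
        self._connection = connection
        self._connection.execute(
            "CREATE TABLE IF NOT EXISTS users "
            "(id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
        )

    def add(self, name: str) -> int:
        cursor = self._connection.execute(
            "INSERT INTO users (name) VALUES (?)", (name,)
        )
        self._connection.commit()
        return cursor.lastrowid

    def find_name(self, user_id: int) -> Optional[str]:
        row = self._connection.execute(
            "SELECT name FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        return row[0] if row else None


def test_repository_round_trip() -> None:
    # ":memory:" gives each test a fresh, private database with no setup.
    connection = sqlite3.connect(":memory:")
    try:
        repository = UserRepository(connection)
        user_id = repository.add("alice")
        assert repository.find_name(user_id) == "alice"
        assert repository.find_name(user_id + 1) is None
    finally:
        connection.close()
```

The test has the shape of a unit test, but it runs real SQL against a real engine, which is exactly why it catches mistakes that a mocked connection would hide.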

Now, if we take a look at GUI tests, it is easy to spot that they are really similar to service tests. The only difference is that GUI tests use the GUI as an interface, while service tests use HTTP or messages to communicate with the tested service or application. We used those tests to check the application's behavior. Usually we followed an approach where the tested application was installed and run in the same way as it would be after the final release, so successfully executed tests gave us proof that the application would behave the same way when installed in production. The assumption that the tested component has to be the same as in production means that during testing we did not alter any program code with mocks. We also tried to use only public and official APIs to run the tests (GUI, HTTP interfaces, messages, input files, etc.) and avoided direct alterations of internal component state, such as manually changing data in the database. Of course there were cases where we decided to violate this rule, but usually this was dictated either by poorly defined interfaces, when the tests were being created for existing software, or by a significant test speedup. It is worth mentioning that tests written in this form are more high-level and much slower than unit tests. They are also more behavior specific, focusing on the result of an action, not on how it is achieved.

The last kind, end-to-end tests, was used in projects consisting of multiple autonomous components. As with acceptance tests, we deployed all components in the form in which they would be deployed to production. Naturally, those tests were the slowest ones, because every tested component had to perform its part for the whole test to succeed; nothing was mocked there.

I found it interesting how Martin Fowler has identified those tests by their function:

  • Acceptance tests, covering a list of scenarios that define behavior of a specific feature (like login, shop basket, etc.),
  • User journey tests, covering all actions that have to be taken from the user perspective in order to achieve a specific goal,

and their scope:

Test characteristics

In comparison to unit tests, which are low-level, focused on a small part of code and fast, those tests:

  • are high-level, business scenario / behavior based,
  • refer to a wide part of the code, covering one or multiple components, and hence
  • are usually much slower to execute.

There are a few interesting consequences of these characteristics. First of all, those are high-level tests, focusing on the behavior of the tested component or the whole system. They implement scenarios, often written in BDD form:

Given an opened login window,
when user enters valid credentials,
and user clicks the login button,
then the login window should close,
and user should successfully log in to the application,
and user account details should be displayed on the screen.


Given a sample wav file present in input folder,
when an EncodeFileMessage is sent to Encoding Service with sample file path and MP3 output format specified,
then the Encoding Service should publish a FileEncodedEvent,
and that published FileEncodedEvent should have a path to encoded file in MP3 format.

Those scenarios focus on what happens in the system, not on how it is done, so they usually use the application's public API to trigger an action and later to query or validate its outcome. The scenarios refer to a business feature or a whole user journey, which means that the scope of those tests is much wider than that of unit tests: it may cover part of a component, a whole component, a few components or even the whole system.
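As an illustration only, the encoding scenario above could map onto test code roughly as follows. FakeEncodingService is a hypothetical in-process stand-in that I use here just to keep the sketch runnable; in a real acceptance test the deployed Encoding Service would sit behind the same calls, and nothing would be faked. The message and event classes reuse the names from the scenario, but their fields are my assumptions.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import List


@dataclass
class EncodeFileMessage:
    path: str
    output_format: str


@dataclass
class FileEncodedEvent:
    path: str


class FakeEncodingService:
    """Hypothetical stand-in for the real, deployed Encoding Service."""

    def __init__(self) -> None:
        self.published: List[FileEncodedEvent] = []

    def send(self, message: EncodeFileMessage) -> None:
        source = Path(message.path)
        target = source.with_suffix("." + message.output_format.lower())
        target.write_bytes(source.read_bytes())  # pretend to encode
        self.published.append(FileEncodedEvent(path=str(target)))


def test_wav_file_is_encoded_to_mp3() -> None:
    with tempfile.TemporaryDirectory() as input_folder:
        # Given a sample wav file present in the input folder
        sample = Path(input_folder) / "sample.wav"
        sample.write_bytes(b"RIFF")
        service = FakeEncodingService()

        # When an EncodeFileMessage is sent with the sample path and MP3 format
        service.send(EncodeFileMessage(path=str(sample), output_format="MP3"))

        # Then a FileEncodedEvent is published, pointing at an MP3 file
        assert len(service.published) == 1
        event = service.published[0]
        assert event.path.endswith(".mp3")
        assert Path(event.path).exists()
```

Note that the test never looks inside the service: it triggers the action through a message and validates only the published event and the resulting file.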

The test scope has a big influence on how those components are tested. If a test refers to only one component, the component:

  • may be started directly from a process that performs a test, or
  • may be deployed into a dedicated testing environment and accessed remotely by the test.
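Both options can share the same test code if the test only depends on a base URL. The sketch below, which uses nothing but the Python standard library, starts a tiny HTTP component from the test process itself. HealthHandler, the /health path and the helper names are all hypothetical, chosen only for this example.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    """Hypothetical component: a tiny HTTP service with a single endpoint."""

    def do_GET(self) -> None:
        body = json.dumps({"status": "ok"}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args) -> None:
        pass  # keep the test output quiet


def start_component() -> HTTPServer:
    # Port 0 asks the OS for any free port, so parallel runs do not clash.
    server = HTTPServer(("127.0.0.1", 0), HealthHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch_status(base_url: str) -> dict:
    # The base URL could equally point at a remotely deployed instance.
    with urllib.request.urlopen(base_url + "/health", timeout=5) as response:
        return json.loads(response.read())


def test_component_reports_healthy() -> None:
    server = start_component()
    try:
        host, port = server.server_address
        assert fetch_status(f"http://{host}:{port}") == {"status": "ok"}
    finally:
        server.shutdown()
        server.server_close()
```

Switching to the second option would mean skipping start_component() and passing the address of the testing environment to fetch_status() instead.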

If the scope covers multiple components, it usually means that all of them have to be deployed into a testing environment and configured to communicate with each other. If a component has to be deployed before testing, a dedicated test environment must be available in order to run such tests. It also implies a time overhead for component installation, configuration and start-up.

In the most common testing approach, the tested component runs in a separate process from the test, so the test code communicates with it asynchronously.

The test scope and asynchronous communication have a big impact on test execution time. Those tests are slow. Of course, the execution speed depends on the project, the type of tests and their structure: it may vary from less than a second for a service test to more than a few minutes for an end-to-end test.

The way assertions are defined in such tests plays a huge role in execution speed. They are based on the component's public API, which means that they usually check things like:

  • the requested information has been displayed on the screen,
  • a message X has been received,
  • a resource Y has become available over HTTP, or
  • a file has appeared on an FTP server.

Those assertions are time based. They have to check the specified condition repeatedly, up to a defined timeout, because the tests are asynchronous and components need time to process requests before those criteria are fulfilled. When something goes wrong, this type of assertion consumes the full timeout before it fails. This can lead to situations where successful tests take a few minutes to execute, while tests executed against a faulty system take a few hours until they all fail (this is a real example). It is worth mentioning that the biggest time killers are assertions checking that a specific condition did not happen, as they always use the whole timeout to succeed. Since such tests are never conclusive (it is always possible that the tested condition would occur just after the assertion finishes), we always tried to eliminate or limit them where possible; usually the same scenarios could easily be covered with unit tests.
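A time-based assertion of this kind can be sketched in a few lines of Python. The wait_until and elapsed helpers are hypothetical names of mine, not part of any framework mentioned in this post, and the timeouts are shortened so the example runs quickly.

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool], timeout: float,
               interval: float = 0.01) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def elapsed(action: Callable[[], object]) -> float:
    """Return how many seconds `action` took to run."""
    start = time.monotonic()
    action()
    return time.monotonic() - start


# Healthy system: the condition becomes true after about 50 ms,
# so the assertion returns long before the 1 s timeout.
ready_at = time.monotonic() + 0.05
fast = elapsed(lambda: wait_until(lambda: time.monotonic() >= ready_at, timeout=1.0))

# Faulty system: the condition never holds, so the full timeout
# is spent before the failure can be reported.
slow = elapsed(lambda: wait_until(lambda: False, timeout=0.3))

# "Did not happen" check: it can only pass by waiting out the whole timeout.
negative = elapsed(lambda: not wait_until(lambda: False, timeout=0.3))
```

The same arithmetic explains the hours-long failing runs. With, say, 200 assertions and a 30-second timeout (both numbers invented for illustration), a system where every assertion fails spends 200 × 30 s = 6000 s, or 100 minutes, in timeouts alone.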

To summarize, the nature of acceptance and end-to-end tests makes them significantly different from unit tests. In the next post, I will describe how we came up with the expectations for a testing framework that would allow us to write acceptance tests easily.

One Comment

  • witek

    hello 🙂
    First of all, I love the Unit Test definition. Next, I will share my 5 cents …
    I think that we only need business and performance testing:
    the first to prove that our software works as expected from the business perspective, the second to prove that it is fast enough not to cause an outage during rush hours.

    Unit testing is for me a kind of extension of business testing, where I have two questions to be answered … the first one:
    are my tests slow? If so, is there a way to write the business test in isolation so that it runs faster?

    Cheers Wojtek!