2015
03.04

Beyond unit tests

All the projects I have worked on were covered by various types of tests to ensure that the developed code functions as expected. Interestingly, almost every project had a slightly different combination of test types. I have also noticed that each company named and structured those tests differently. Because all of those definitions are a bit blurred, I thought it would be a good idea to take a closer look and describe how the tests were constructed, what their purpose was, and what the working experience with them was from a developer's perspective.

Different test types

Below, I have enumerated the most memorable types I have seen:

  • unit tests – Definitely the most common and well-known test type, well defined by Martin Fowler in his UnitTest bliki article. We used them to test pure business logic in isolation from external dependencies like file system access, database, network, etc. They are the fastest tests, as their scope is very small and all external dependencies are mocked.
  • integration tests – I should probably say: application-internal integration tests, as we used them to test all classes responsible for communication with external dependencies (database, file system, etc.) within the developed service or application. They have the same scope as unit tests, but they are much slower.
  • automated GUI acceptance tests – We used those to test desktop or web application GUIs with automation tools like Selenium or QTP. In one project, they verified business scenarios of a desktop application deployed and configured in a testing environment, so the tests were heavy and slow, as their scope covered the whole application. In another project, the tests covered only the thin presentation layer of a web application, with other parts, such as back-end services, isolated.
  • service acceptance tests – We used those tests in various projects and companies to verify the behavior of services (HTTP or message based). They were user-scenario specific, usually defined by the Product Owner and/or Quality Assurance. Their scope was a single service, or a few services making up a logically autonomous component.
  • end to end tests – Those tests were used in projects focusing on bigger systems, especially with SOA architecture, and we used them to ensure that all services or components worked together properly. Like acceptance tests, they were scenario based, but their scope was basically the whole system.
  • manual user acceptance tests – Those tests were executed manually by QA to ensure that the application works as expected. Depending on the nature of the software, they resembled one of the three previous test types.

In fact, the full list was much longer (including regression, smoke, load and performance tests, etc.), but I have decided to omit the rest, as their structure, scope and the way they work are similar to the tests mentioned above; the only differences are the reason they are created, their function, or simply their name.

Beyond unit tests

Apparently, unit tests are the most obvious and well-known test type that can be spotted in various projects. Throughout my work history, those tests were always present (except maybe in the very first projects I worked on). The presence of other test types was not as obvious. I would say that in projects started before the Agile era, most of the tests were manual. The tests were defined very loosely (like: check that the application starts and it is possible to do X), or they were structured as a list of steps to execute and expectations for the execution results. I will skip manual tests in the rest of this post, as they were usually performed by a separate team of testers and had nothing to do with programming itself. I would just like to mention two things about them:

  • manual user acceptance tests were the basis for writing their automated versions (later referred to as acceptance tests),
  • the formalized version of manual tests was written as steps and expectations, so in reality it was a precursor of BDD-style tests.

The integration tests were definitely not unit tests, because they tested integration with external dependencies such as databases. If present, we used them to test all classes interacting directly with externals, like the ones following the repository pattern. In order to run them, we had to have a real database to connect to (where possible, we used an in-memory / file-based database like SQLite to make the tests easier to run) or sample input files to play with. Besides that, those tests were not much different from unit tests. Because of this strong similarity, I will omit them from now on.
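
To illustrate that similarity, here is a minimal sketch of such an integration test, using System.Data.SQLite with an in-memory database. The ProductRepository class and its schema are hypothetical, made up for this example:

    using System.Data.SQLite;
    using NUnit.Framework;

    [TestFixture]
    public class ProductRepositoryTests
    {
        private SQLiteConnection _connection;

        [SetUp]
        public void SetUp()
        {
            // ":memory:" creates a fresh database that lives only as long as
            // this connection, so every test starts with a clean state.
            _connection = new SQLiteConnection("Data Source=:memory:;Version=3;");
            _connection.Open();
            using (var cmd = _connection.CreateCommand())
            {
                cmd.CommandText = "CREATE TABLE Products (Id INTEGER PRIMARY KEY, Name TEXT)";
                cmd.ExecuteNonQuery();
            }
        }

        [TearDown]
        public void TearDown()
        {
            _connection.Dispose();
        }

        [Test]
        public void Should_store_and_retrieve_a_product()
        {
            // Structurally this is just a unit test; the only difference is
            // that the class under test talks to a real (in-memory) database.
            var repository = new ProductRepository(_connection); // hypothetical class under test
            repository.Add(1, "Book");
            Assert.AreEqual("Book", repository.GetNameById(1));
        }
    }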

Now, if we take a look at GUI tests, it is easy to spot that they are really similar to service tests. The only difference is that GUI tests use the GUI as an interface, while service tests use HTTP or messages to communicate with the tested service/application. We used those tests to check the application's behavior. Usually we followed an approach where the tested application was installed and run the same way it would be run after the final release, so successfully executed tests gave us confidence that the application would behave the same way when installed in production. The assumption that the tested component has to be the same as in production meant that during testing we did not alter any program code with mocks. We also tried to use only public, official APIs to run the tests (i.e. GUI, HTTP interfaces, messages, input files, etc.) and to avoid directly altering the component's internal state, like manually modifying data in the database. Of course, there were cases where we decided to violate this rule, but usually it was dictated by poorly defined interfaces (when tests were being created for existing software) or by a significant test speedup. It is worth mentioning that tests written in this form are more high-level and much slower than unit tests. They are also more behavior specific, focusing on the result of an action, not on how it is achieved.
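
As an illustration, here is a minimal sketch of such a GUI acceptance test using Selenium WebDriver. The URL and element IDs are hypothetical, and a real test would also need explicit waits, which ties into the timing discussion later in this post:

    using NUnit.Framework;
    using OpenQA.Selenium;
    using OpenQA.Selenium.Firefox;

    [TestFixture]
    public class LoginGuiTests
    {
        private IWebDriver _driver;

        [SetUp]
        public void SetUp()
        {
            _driver = new FirefoxDriver();
        }

        [TearDown]
        public void TearDown()
        {
            _driver.Quit();
        }

        [Test]
        public void User_with_valid_credentials_should_see_account_details()
        {
            // The application is driven only through its public GUI,
            // exactly as a user of the released product would interact with it.
            _driver.Navigate().GoToUrl("https://app.example.com/login"); // hypothetical URL
            _driver.FindElement(By.Id("username")).SendKeys("jsmith");
            _driver.FindElement(By.Id("password")).SendKeys("secret");
            _driver.FindElement(By.Id("login")).Click();

            // Assert on the visible outcome, not on the internal state.
            Assert.IsTrue(_driver.FindElement(By.Id("account-details")).Displayed);
        }
    }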

The last kind, end to end tests, were used in projects consisting of multiple autonomous components. Similarly to acceptance tests, we deployed all components in the form in which they would be deployed to production. Obviously, those tests were the slowest ones, because all the tested components had to perform a specific action for the whole test to succeed – nothing was mocked there.

I have found interesting the way Martin Fowler identified those tests by their function:

  • Acceptance tests, covering a list of scenarios that define the behavior of a specific feature (like login, shop basket, etc.),
  • User journey tests, covering all actions that have to be taken from the user perspective in order to achieve a specific goal,

and their scope:

Test characteristics

In comparison to unit tests, which are low-level, focused on a small part of code and fast, those tests:

  • are high-level, business scenario / behavior based,
  • refer to a wide part of the code, covering one or multiple components, and hence
  • are usually much slower to execute.

There are a few interesting consequences of these characteristics. First of all, those are high-level tests, focusing on the behavior of the tested component or of the whole system. They implement scenarios, often written in BDD form:

  1. Given an opened login window,
  2. when user enters valid credentials,
  3. and user clicks the login button,
  4. then the login window should close,
  5. and user should successfully log in to the application,
  6. and user account details should be displayed on the screen.

or

  1. Given a sample wav file present in input folder,
  2. when an EncodeFileMessage is sent to Encoding Service with sample file path and MP3 output format specified,
  3. then the Encoding Service should publish a FileEncodedEvent,
  4. and that published FileEncodedEvent should have a path to encoded file in MP3 format.

Those scenarios focus on what is happening in the system, not on how it is done, so they usually use the public API of the application to trigger an action and later query / validate its outcome. The scenarios refer to a business feature or a whole user journey, which means that the scope of those tests is much wider than that of unit tests, covering a part of a component, a whole one, a few components or even a whole system.
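
As an illustration of this style, here is a sketch of how the encoding scenario above could be automated. Only the message names (EncodeFileMessage, FileEncodedEvent) come from the scenario itself; the IBusClient abstraction and everything else is a hypothetical stand-in for a project's actual messaging transport:

    using System;
    using System.IO;
    using NUnit.Framework;

    // Hypothetical messaging abstraction standing in for the real transport.
    public interface IBusClient
    {
        void Send(object message);
        T WaitForPublished<T>(TimeSpan timeout);
    }

    [TestFixture]
    public class EncodingServiceScenarios
    {
        private IBusClient _bus; // initialized in [SetUp] against the deployed service (omitted)

        [Test]
        public void Encoding_a_wav_file_should_publish_FileEncodedEvent()
        {
            // given: a sample wav file present in the input folder
            var inputPath = Path.Combine("input", "sample.wav");
            File.Copy(Path.Combine("TestData", "sample.wav"), inputPath, true);

            // when: an EncodeFileMessage is sent with the sample path and MP3 output format
            _bus.Send(new EncodeFileMessage(inputPath, "mp3")); // hypothetical message type

            // then: the service publishes a FileEncodedEvent pointing at an .mp3 file
            var evt = _bus.WaitForPublished<FileEncodedEvent>(TimeSpan.FromSeconds(30));
            StringAssert.EndsWith(".mp3", evt.EncodedFilePath);
        }
    }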

The test scope has a big influence on how those components are tested. If a test refers to only one component, the component:

  • may be started directly from a process that performs a test, or
  • may be deployed into a dedicated testing environment and accessed remotely by the test.

If the scope covers multiple components, it usually means that all of them have to be deployed into a testing environment and configured to communicate with each other. If a component has to be deployed before testing, a dedicated test environment has to be present in order to run such tests. It also implies a time overhead related to component installation, configuration and start-up.

With the most common testing approach, the tested component runs in a separate process from the test, so the test code communicates with it in an asynchronous manner.

The test scope and asynchronous communication have a big impact on test execution time. Those tests are slow; the exact execution speed depends on the project, the type of tests and their structure, and may vary from under a second for a service test to more than a few minutes for an end-to-end test.

A huge factor in execution speed is the way assertions are defined in such tests. They are based on the component's public API, which means that they usually check things like:

  • the requested information has been displayed on the screen,
  • a message X has been received,
  • a resource Y became available over HTTP, or
  • a file appeared on an FTP server.

Those assertions are time based. They have to repeatedly check the specified condition until a defined timeout elapses, because the tests are asynchronous and the components need time to process requests in order to fulfill those criteria. When something goes wrong, this type of assertion consumes the full timeout before it fails. It can lead to situations where successful tests take a few minutes to execute, while the same tests executed on a faulty system can take a few hours until they all fail (a real example). It is worth mentioning that the biggest time killers are assertions checking that a specific condition did not happen, as they always use up the whole timeout to succeed. As such tests are never good enough (it is always possible that the tested condition would occur just after the assertion finishes), we have always tried to eliminate or limit them where possible – usually the same scenarios can easily be covered with unit tests.
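
A minimal sketch of such a time-based assertion helper follows; the names are made up rather than taken from any specific library, but most projects end up with some variant of this:

    using System;
    using System.Threading;

    public static class Eventually
    {
        // Repeatedly evaluates the condition until it becomes true or the
        // timeout elapses; on success it returns as soon as the condition
        // holds, on failure it burns the whole timeout before throwing.
        public static void Assert(Func<bool> condition, TimeSpan timeout)
        {
            var deadline = DateTime.UtcNow + timeout;
            while (DateTime.UtcNow < deadline)
            {
                if (condition())
                    return;
                Thread.Sleep(TimeSpan.FromMilliseconds(250)); // polling interval
            }
            throw new TimeoutException("Condition not met within " + timeout);
        }
    }

    // usage: passes as soon as the file shows up, fails only after 30 seconds
    // Eventually.Assert(() => File.Exists(expectedMp3Path), TimeSpan.FromSeconds(30));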

To summarize, the nature of acceptance and end-to-end tests makes them significantly distinct from unit tests. In the next post, I will describe how we came up with the expectations for a testing framework that would allow us to write acceptance tests in an easy manner.

2015
01.04

This time I have decided to make a slightly different post – a short presentation about LightBDD itself.

For more details, please see the project page or the wiki page.

So here we go:

2014
12.17

I have always preferred writing software to writing about it; however, during my work for various companies I have noticed that all the teams were facing similar kinds of problems. I thought it would be a good idea to share my observations, as well as some information about the tools I have written for myself and for others who would like to use them.

My first post is about a topic that becomes relevant to every company at some point – writing acceptance tests. I would like to share my remarks on dealing with those tests, and describe why, in the end, I decided to create my own framework.

First steps with ATDD and FitNesse

A few years ago I was working in a team developing a new, modern version of an application for our customer. We tried a new approach to testing our software. Until then, our projects had been tested in such a way that the development team wrote unit tests, while later a dedicated test team executed automated UI tests as well as a long list of manual tests. In our project, we wanted to follow the ATDD model and introduce automated acceptance tests. That is why we started using the FitNesse framework for .NET. It served a web application with an editor, where our BA/SME could enter acceptance criteria in a tabular format. It also allowed us to execute those tests against the services that we were developing. Below, there is an example of a TableFixture usage, taken from the FitNesse UserGuide:
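
(The example in the original post was an embedded screenshot, which is not reproduced here. To still illustrate how a wiki table maps to fixture code, below is the classic fit Division example in its .NET ColumnFixture form – a simpler fixture type than TableFixture, but the table-to-code mapping principle it shows is the same. Treat it as an illustration, not the UserGuide original.)

    !|Division|
    |numerator|denominator|quotient?|
    |10|2|5|
    |12.6|3|4.2|

    // The .NET fixture class the table above binds to: column headers map to
    // public fields, and headers ending with '?' map to methods whose return
    // value is compared against the cell content.
    public class Division : fit.ColumnFixture
    {
        public double numerator;
        public double denominator;

        public double quotient()
        {
            return numerator / denominator;
        }
    }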

We thought that the idea behind this approach was very good and definitely much better than what we had had before. First of all, the BA was able to write tests and check immediately whether a given scenario was supported by the application. Secondly, we could finally get some clear requirements and scenarios for our project and verify them much more quickly against the written code.
After a few months, we noticed that reality was a bit different. All scenarios entered in the editor had to be mapped to the underlying code in order to execute the tests. Unfortunately, the editor offered neither hints for the test syntax, nor any list of implemented methods that could be used to write test scenarios. This meant that it was not possible to write scenarios without constantly checking how other tests were written, or looking directly at the code to see what was possible, which became quite problematic when our project grew a bit. Maybe because of that, or maybe because our BA/SMEs were too busy doing their regular work, they never entered any test in this editor. It meant that we had been left alone with this new tool…
So, seen from a developer's perspective, we had to learn a new syntax for writing tests. It sounds easy to learn how to use a table fixture, a row fixture and a few other constructs, but we had to spend quite a long time learning how to cope with all the special cases, like dealing with expected/actual value comparison, properly escaping all special characters, etc. It also took us a while to organize our tests and mappings properly.
Talking about reorganization… There was basically no such thing as refactoring, which meant that we had to make all changes twice – first in the code, then in the web editor, typing everything manually. The bigger the changes, the more painful they were.
In the end, we managed to finish this project successfully, and the testing process we applied was remarkably better than before; however, writing acceptance tests with this tool was not a pleasure. After a break from the project, it was difficult to jump back into those tests, re-learn the way they worked and recall all the syntax.

A story about BDD and SpecFlow

A few years later, in a different country, company and project, we started using SpecFlow to express business requirements as testable scenarios, written in the BDD way. Below, there is an example taken from Wikipedia:
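
(The original example was an embedded image, not reproduced here. Below is an illustrative Gherkin scenario in the spirit of the classic SpecFlow calculator example, together with the C# step bindings that SpecFlow matches to it; the scenario text and names are mine, not the Wikipedia original.)

    Feature: Calculator
      Scenario: Add two numbers
        Given I have entered 50 into the calculator
        And I have entered 70 into the calculator
        When I press add
        Then the result should be 120 on the screen

    // Step bindings: SpecFlow locates these methods via reflection by matching
    // the scenario text against the regular expressions in the attributes.
    using System.Collections.Generic;
    using System.Linq;
    using NUnit.Framework;
    using TechTalk.SpecFlow;

    [Binding]
    public class CalculatorSteps
    {
        private readonly List<int> _numbers = new List<int>();
        private int _result;

        [Given(@"I have entered (\d+) into the calculator")]
        public void GivenIHaveEntered(int number)
        {
            _numbers.Add(number);
        }

        [When(@"I press add")]
        public void WhenIPressAdd()
        {
            _result = _numbers.Sum();
        }

        [Then(@"the result should be (\d+) on the screen")]
        public void ThenTheResultShouldBe(int expected)
        {
            Assert.AreEqual(expected, _result);
        }
    }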

This time, our PO and QA were actively working with us on writing and validating requirements, so it made sense to use this tool, because it allowed them to write scenarios without knowing any programming language. After a few months of work, however, we started having the same maintainability issues with those tests as before.
SpecFlow is much better integrated with Visual Studio than FitNesse. It is possible to write and execute tests directly from the IDE, and the SpecFlow plugin offers some help while writing them. What it has in common with FitNesse is that it is also based on the concept of writing scenarios in plain text, which is later mapped to the underlying code. Like FitNesse, it has custom conventions and mechanisms for those mappings. When our project was small, we did not notice any problems, but when it grew a bit (together with the test code base), our tests became difficult to maintain in this form.
Like before, any refactoring applied to the code had to be manually applied to the feature text files as well. The IDE was also constantly showing that none of the underlying scenario methods were used (because of the reflection-based mapping), so it was difficult to determine what could really be cleaned up and what was used during test execution.
While the entire framework's look & feel is very similar to standard testing frameworks like NUnit, SpecFlow follows different rules for executing tests, which we were not aware of at the beginning. A good example is that all methods with the [BeforeScenario] attribute are called for each scenario, no matter whether they belong to the same class as the executed scenario steps or not. What we expected was the same behavior as NUnit's [SetUp] attribute. It was a big surprise when we discovered that, and we had to spend a lot of time sorting out how to write test initialization code properly.
The second significant difference was that the binding rules allowed steps belonging to one scenario to be mapped to multiple instances of classes containing step methods.
We encountered cases where given methods (that were supposed to set up the scenario data) were executed on a different class instance than the when methods (that were supposed to act on the previously prepared data). This problem, as well as the issues related to the [BeforeScenario] behavior, forced us to start using a ScenarioContext to share data between steps. The ScenarioContext is basically a global dictionary that allows putting and getting data objects identified by string literals, so its usage made our tests even less readable.
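For illustration, here is a hypothetical pair of steps sharing state through ScenarioContext; the step texts and the key are made up, but the pattern is what our tests ended up full of:

    [Binding]
    public class SharedStateSteps
    {
        [Given(@"a sample wav file present in input folder")]
        public void GivenASampleWavFile()
        {
            // Magic string key; nothing but convention ties it to the reader below.
            ScenarioContext.Current["inputPath"] = @"input\sample.wav";
        }

        [When(@"the file is encoded")]
        public void WhenTheFileIsEncoded()
        {
            // The key and the cast must match exactly, or the test fails at runtime.
            var inputPath = (string)ScenarioContext.Current["inputPath"];
            // ... use inputPath to trigger the encoding ...
        }
    }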

The beginnings of LightBDD

So finally, after spending yet another day trying to understand how our tests worked and how the test context was being shared between steps, I started working on a simple wrapper on top of NUnit tests that would allow us to:

  • write tests with testing tools that everybody knows how to use and knows what to expect from, and
  • use all the standard refactoring methods and IDE help to maintain those tests, but
  • keep our test definitions as clear as possible, so they would be readable and editable by people who do not know C#.

This was the beginning of LightBDD.
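
To give a taste of the result: a LightBDD scenario is a plain NUnit test whose steps are ordinary methods, so all the standard refactoring and navigation tools see them, while the underscored names get rendered into readable scenario output. The sketch below is illustrative (the step names are made up; see the project page for the actual API):

    [TestFixture]
    public partial class Login_feature : FeatureFixture
    {
        [Test]
        public void Successful_login()
        {
            Runner.RunScenario(
                Given_the_login_page_is_opened,
                When_the_user_enters_valid_credentials,
                When_the_user_clicks_the_login_button,
                Then_the_user_should_be_logged_in);
        }
    }

    // Steps are regular methods (typically kept in a partial class),
    // so the IDE and refactoring tools track all their usages.
    public partial class Login_feature
    {
        private void Given_the_login_page_is_opened() { /* ... */ }
        private void When_the_user_enters_valid_credentials() { /* ... */ }
        private void When_the_user_clicks_the_login_button() { /* ... */ }
        private void Then_the_user_should_be_logged_in() { /* ... */ }
    }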
