The data challenge of regression testing

Regression testing, for those not already familiar with the term, means re-running old tests to ensure that existing functionality in your OSS isn’t broken by new code releases.

With agile development methodologies becoming more widespread, and smaller, more regular releases occurring, it’s important to have a robust regression testing regime in place. There are a number of tools on the market that make automated regression testing more readily achievable. An automated regression testing framework allows you to run the same test case(s) as scripts at scheduled times to quickly determine whether any new code is creating unintended side-effects. It effectively increases the volume of testing without requiring additional manual test effort.

For an automated regression test framework to do its job, the following factors need considering:

  • The test harness needs to be in place and set up to accept and run a series of test cases
  • The test cases need to be created (and have a baseline of expected results to compare against)
  • The new code-base needs to be deployed so that the old tests can be run against it
  • A set of data needs to be available to run each test against
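As a rough illustration of how those four pieces fit together, here is a minimal Python sketch. It is not taken from any particular testing tool: `RegressionCase`, `run_regression`, `count_hops` and the sample circuit record are all hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class RegressionCase:
    """One test case plus the baseline result it is compared against."""
    name: str
    run: Callable[[Dict[str, Any]], Any]  # takes test data, returns a result
    expected: Any                          # baseline recorded on an earlier release


def run_regression(cases: List[RegressionCase], data: Dict[str, Any]) -> Dict[str, bool]:
    """The harness: run every case against the same data and report pass/fail."""
    return {case.name: case.run(data) == case.expected for case in cases}


# Hypothetical "new code-base" under test: counts the hops in a circuit record.
def count_hops(data: Dict[str, Any]) -> int:
    return len(data["circuit"]["hops"])


# Hypothetical test data: a single end-to-end circuit.
data = {"circuit": {"id": "CCT-001", "hops": ["A", "B", "C"]}}

cases = [RegressionCase("hop_count", count_hops, expected=3)]
report = run_regression(cases, data)
print(report)
```

Note that the data is passed in separately from the test cases. If either the data or the baseline changes between runs, a failure no longer tells you anything about the new code.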

In the context of OSS, the fourth dot-point might consist of setting up an end-to-end circuit and running a test case against it with each iteration of regression testing. The objective is to get the same test result with each iteration (noting that some aspects of the result, such as timestamps, will legitimately change and need to be accounted for when doing a direct comparison).
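One way to handle fields like timestamps is to strip them from both the new result and the baseline before comparing. The sketch below assumes results are available as nested dictionaries and lists; the field names in `VOLATILE_KEYS` and the sample records are made up for illustration and would need to match whatever your own OSS returns.

```python
from typing import Any, Set

# Assumed names of fields that legitimately differ between runs.
VOLATILE_KEYS = {"timestamp", "run_id"}


def strip_volatile(value: Any, volatile: Set[str] = VOLATILE_KEYS) -> Any:
    """Return a copy of the result with volatile fields removed at every level."""
    if isinstance(value, dict):
        return {
            key: strip_volatile(item, volatile)
            for key, item in value.items()
            if key not in volatile
        }
    if isinstance(value, list):
        return [strip_volatile(item, volatile) for item in value]
    return value


def matches_baseline(result: Any, baseline: Any) -> bool:
    """Compare a result with its baseline, ignoring the volatile fields."""
    return strip_volatile(result) == strip_volatile(baseline)


baseline = {"circuit": "CCT-001", "status": "up", "timestamp": "2020-01-01T00:00:00Z"}
latest = {"circuit": "CCT-001", "status": "up", "timestamp": "2020-06-01T09:30:00Z"}
broken = {"circuit": "CCT-001", "status": "down", "timestamp": "2020-06-01T09:30:00Z"}

print(matches_baseline(latest, baseline))  # only the timestamp differs
print(matches_baseline(broken, baseline))  # the status differs too
```

Keeping the list of ignored fields short and explicit matters here: every field you exclude is a field the regression test can no longer protect.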

However, one aspect that is often lost in the fog is the need to retain (and curate) test data over long periods (e.g. months or years). Databases are often modified during those periods, and test / dev databases are often not maintained with the same rigour as production databases. If data overwrites, migrations or schema updates are made, there is a risk that the regression data changes, producing test results that no longer match the baseline.
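A simple guard against this kind of silent drift is to store a fingerprint of the test data alongside the baseline results and check it before each regression run. The following is a minimal sketch, assuming the test data can be exported as a list of JSON-serialisable records; the `fingerprint` function and the sample records are hypothetical.

```python
import hashlib
import json
from typing import Any, Dict, List


def fingerprint(records: List[Dict[str, Any]]) -> str:
    """Hash the records so that the result does not depend on row order.

    Field names are part of each serialised record, so a renamed or
    dropped column changes the fingerprint as well as a changed value.
    """
    serialised = sorted(json.dumps(record, sort_keys=True) for record in records)
    joined = "\n".join(serialised)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


# Fingerprint taken when the baseline results were recorded.
original = [
    {"circuit": "CCT-001", "a_end": "SiteA", "z_end": "SiteB"},
    {"circuit": "CCT-002", "a_end": "SiteB", "z_end": "SiteC"},
]
stored = fingerprint(original)

# Same rows in a different order: the data has not really changed.
reordered = list(reversed(original))

# One value overwritten, e.g. by a refresh of the test database.
overwritten = [dict(original[0], z_end="SiteD"), original[1]]

print(fingerprint(reordered) == stored)
print(fingerprint(overwritten) == stored)
```

If the fingerprint no longer matches, a failed regression test more likely points to the data than to the new code, and the baseline needs to be restored or deliberately re-recorded.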

When building an automated OSS regression test framework, keep in mind that you’ll also need to commit to maintaining a database that keeps test data for long periods. If you lose the data baseline, you lose the ability to run regression tests because the baseline results become misaligned.
