Acceptance Criteria, Acceptance Tests and Experience-Based Practices

Writing Acceptance Criteria

Specifying acceptance criteria is an important acceptance testing task. It helps to refine requirements or user stories and provides the basis for acceptance tests. Business analysts and testers should collaborate closely on the specification of these criteria. This collaboration ensures high business value from the acceptance testing phase and increases the chance of a successful iteration or product release. 

Writing acceptance criteria forces business analysts and testers to think about functionality, performance, and other characteristics from a stakeholder or usage perspective. This supports early verification and validation of the related requirement or user story and provides a better chance of detecting inconsistencies, contradictions, missing information or other problems. 

The following good practices should be considered when writing acceptance criteria:

  • Well-written acceptance criteria are precise, measurable and concise. Each criterion must be written in a way that enables the tester to measure whether or not the test object complies with the acceptance criterion.
  • Well-written acceptance criteria do not include technical solution details. They concentrate on the question “What shall be achieved?” rather than on the question “How shall it achieved?”.
  • Acceptance criteria should address non-functional requirements (quality characteristics) as well as functional requirements.

As with requirements and user stories, acceptance criteria should be reviewed through walkthroughs, technical reviews, iteration planning meetings or other methods (if necessary).

Designing Acceptance Tests

This section addresses the test techniques and approaches frequently used for acceptance testing.

Test Techniques for Acceptance Testing

In a requirements-based approach to acceptance testing, the tester derives test cases from the acceptance criteria related to each requirement or user story using black-box techniques such as equivalence partitioning or boundary value analysis.

Acceptance testing may be augmented with other test techniques or approaches:

  • Business process-based testing, possibly combined with decision table testing, validates business processes and rules.
  • Experience-based testing leverages the tester’s experience, knowledge and intuition.
  • Risk-based testing is based on risk types and levels. Prioritisation and thoroughness of testing depends on previously identified product risks.
  • Model-based testing uses graphical (or textual) models to obtain acceptance tests.

Acceptance criteria should be verified by acceptance tests and traceability between the requirements / user story and related test cases should be managed.

Using the Gherkin Language to Write Test Cases

In ATDD and BDD, acceptance tests are often formulated in a structured language, referred to as the Gherkin language. Using the Gherkin language, test cases are phrased declaratively using a standardised pattern:

  • Given [a situation]
  • When [an action on the system]
  • Then [the expected result]

The pattern allows business analysts, testers and developers to write test cases in a way that is easily shared with stakeholders and may be translated into automated tests. 

The “Given” block aims to put the test object in a state before performing test actions in the “When” block. The “Then” block specifies the consequences that can be observed from the actions defined in the “When” block. Test cases written in Gherkin do not refer to user interface elements but rather to user actions on the system. They are structured natural language test cases that can be understood by all relevant stakeholders. 

In addition, the structure “Given – When – Then” can be parsed in an automated way. This allows automated test script creation using a keyword-driven testing approach. 

Initially, Gherkin was specific to some software tools supporting BDD, but it is now synonymous with the “Given – When – Then” acceptance test design pattern. 

Experience-based Approaches for Acceptance Testing

All experience-based test techniques described in are relevant for acceptance testing. This section is focused on how exploratory testing can be used for acceptance tests, and on beta testing as a source of feedback on system usage. 

Exploratory Testing

Exploratory testing is an experience-based test technique that is not based on detailed predefined test procedures. In exploratory testing, all activities are carried out within an uninterrupted period of time called a session. The testers are domain experts. They are familiar with user needs, requirements and business processes, but they are not necessarily familiar with the product under test. 

During an exploratory testing session, the tester accomplishes the following:

  • Learns how to work with the product
  • Designs the tests
  • Performs the tests
  • Interprets the results

It is a good practice in exploratory testing to use a test charter. The test charter is prepared prior to the testing session (possibly jointly by the business analyst and the tester) and is used by the person in charge of the exploratory session (either a business analyst, tester or another stakeholder). It includes information about the purpose, target, and scope of the exploratory session, the test setup, the duration of the session, and possibly some tactics to be used during the session (such as the type of user that shall be simulated during the exploratory session). Time-boxed sessions help to control the time and effort dedicated to the exploratory session. It is also good practice to perform exploratory testing in pairs or as team work. 

In Agile development, exploratory test sessions can be conducted during an iteration by the product owner and/or the testers for acceptance testing of user stories assigned to the iteration. 

Exploratory testing should be used to complement other more formal techniques in acceptance testing. For example, it may be used to provide rapid feedback on new features before methodical testing is applied. 

Beta Testing

Beta testing is a form of acceptance testing that is often used for Commercial Off-the-Shelf Software (COTS) or for Software as a Service (SaaS) platforms. It is conducted to obtain feedback from the market after development and in-house testing are completed. 

Unlike other acceptance testing forms, beta testing is performed by potential or existing users at their own location. Beta tests neither impose predefined test procedures nor a test charter. Apart from the observed findings, the test activities are usually not documented at all. 

Because the product is tested in various realistic configurations by actual users in their business process context, beta testing may discover defects that escaped during the development process and previous test levels. Resolving issues found by beta tests helps organisations avoid costly hot-fixes or product recalls on a larger scale. 

Acceptance testing should not be limited to beta testing. Beta testing is not systematic or measurable. There is no guarantee that all requirements or user stories are covered by the tests. Moreover, beta testing is performed late in the development process whereas tests based on acceptance criteria support the “Early Testing” principle. 

Agile Testing Methods, Techniques, and Tools

Agile Testing Methods

There are certain testing practices that can be followed in every development project (agile or not) to produce quality products. These include writing tests in advance to express proper behaviour, focusing on early defect prevention, detection, and removal, and ensuring that the right test types are run at the right time and as part of the right test level. Agile practitioners aim to introduce these practices early. Testers in Agile projects play a key role in guiding the use of these testing practices throughout the lifecycle. 

Test-Driven Development, Acceptance Test-Driven Development, and Behaviour-Driven Development

Test-driven development, acceptance test-driven development, and behaviour-driven development are three complementary techniques in use among Agile teams to carry out testing across the various test levels. Each technique is an example of a fundamental principle of testing, the benefit of early testing and QA activities, since the tests are defined before the code is written. 

Test-Driven Development

Test-driven development (TDD) is used to develop code guided by automated test cases. The process for test-driven development is:

  • Add a test that captures the programmer’s concept of the desired functioning of a small piece of code
  • Run the test, which should fail since the code doesn’t exist
  • Write the code and run the test in a tight loop until the test passes
  • Refactor the code after the test is passed, re-running the test to ensure it continues to pass against the refactored code
  • Repeat this process for the next small piece of code, running the previous tests as well as the added tests

The tests written are primarily unit level and are code-focused, though tests may also be written at the integration or system levels. Test-driven development gained its popularity through Extreme Programming, but is also used in other Agile methodologies and sometimes in sequential lifecycles. It helps developers focus on clearly defined expected results. The tests are automated and are used in continuous integration.

Acceptance Test-Driven Development

Acceptance test-driven development defines acceptance criteria and tests during the creation of user stories. Acceptance test-driven development is a collaborative approach that allows every stakeholder to understand how the software component has to behave and what the developers, testers, and business representatives need to ensure this behaviour.

Acceptance test-driven development creates reusable tests for regression testing. Specific tools support creation and execution of such tests, often within the continuous integration process. These tools can connect to data and service layers of the application, which allows tests to be executed at the system or acceptance level. Acceptance test-driven development allows quick resolution of defects and validation of feature behaviour. It helps determine if the acceptance criteria are met for the feature.

Behaviour-Driven Development

Behaviour-driven development allows a developer to focus on testing the code based on the expected behaviour of the software. Because the tests are based on the exhibited behaviour from the software, the tests are generally easier for other team members and stakeholders to understand.

Specific behaviour-driven development frameworks can be used to define acceptance criteria based on the given/when/then format:

Given some initial context,

When an event occurs,

Then ensure some outcomes. 

From these requirements, the behaviour-driven development framework generates code that can be used by developers to create test cases. Behaviour-driven development helps the developer collaborate with other stakeholders, including testers, to define accurate unit tests focused on business needs. 

The Test Pyramid

A software system may be tested at different levels. Typical test levels are, from the base of the pyramid to the top, unit, integration, system, and acceptance. The test pyramid emphasises having a large number of tests at the lower levels (bottom of the pyramid) and, as development moves to the upper levels, the number of tests decreases (top of the pyramid). Usually unit and integration level tests are automated and are created using API-based tools. At the system and acceptance levels, the automated tests are created using GUI-based tools. The test pyramid concept is based on the testing principle of early QA and testing (i.e., eliminating defects as early as possible in the lifecycle). 

Testing Quadrants, Test Levels, and Testing Types

Testing quadrants, align the test levels with the appropriate test types in the Agile methodology. The testing quadrants model, and its variants, helps to ensure that all important test types and test levels are included in the development lifecycle. This model also provides a way to differentiate and describe the types of tests to all stakeholders, including developers, testers, and business representatives. 

In the testing quadrants, tests can be business (user) or technology (developer) facing. Some tests support the work done by the Agile team and confirm software behaviour. Other tests can verify the product. Tests can be fully manual, fully automated, a combination of manual and automated, or manual but supported by tools. The four quadrants are as follows: 

  • Quadrant Q1 is unit level, technology facing, and supports the developers. This quadrant contains unit tests. These tests should be automated and included in the continuous integration process.
  • Quadrant Q2 is system level, business facing, and confirms product behaviour. This quadrant contains functional tests, examples, story tests, user experience prototypes, and simulations. These tests check the acceptance criteria and can be manual or automated. They are often created during the user story development and thus improve the quality of the stories. They are useful when creating automated regression test suites.
  • Quadrant Q3 is system or user acceptance level, business facing, and contains tests that critique the product, using realistic scenarios and data. This quadrant contains exploratory testing, scenarios, process flows, usability testing, user acceptance testing, alpha testing, and beta testing. These tests are often manual and are user-oriented.
  • Quadrant Q4 is system or operational acceptance level, technology facing, and contains tests that critique the product. This quadrant contains performance, load, stress, and scalability tests, security tests, maintainability, memory management, compatibility and interoperability, data migration, infrastructure, and recovery testing. These tests are often automated.

During any given iteration, tests from any or all quadrants may be required. The testing quadrants apply to dynamic testing rather than static testing.

The Role of a Tester

Throughout this article, general reference has been made to Agile methods and techniques, and the role of a tester within various Agile lifecycles. This subsection looks specifically at the role of a tester in a project following a Scrum lifecycle. 

Teamwork 

Teamwork is a fundamental principle in Agile development. Agile emphasises the whole-team approach consisting of developers, testers, and business representatives working together. The following are organisational and behavioural best practices in Scrum teams:

  • Cross-functional: Each team member brings a different set of skills to the team. The team works together on test strategy, test planning, test specification, test execution, test evaluation, and test results reporting.
  • Self-organising: The team may consist only of developers, but, as noted before, ideally there would be one or more testers.
  • Co-located: Testers sit together with the developers and the product owner.
  • Collaborative: Testers collaborate with their team members, other teams, the stakeholders, the product owner, and the Scrum Master.
  • Empowered: Technical decisions regarding design and testing are made by the team as a whole (developers, testers, and Scrum Master), in collaboration with the product owner and other teams if needed.
  • Committed: The tester is committed to question and evaluate the product’s behaviour and characteristics with respect to the expectations and needs of the customers and users.
  • Transparent: Development and testing progress is visible on the Agile task board.
  • Credible: The tester must ensure the credibility of the strategy for testing, its implementation, and execution, otherwise the stakeholders will not trust the test results. This is often done by providing information to the stakeholders about the testing process.
  • Open to feedback: Feedback is an important aspect of being successful in any project, especially in Agile projects. Retrospectives allow teams to learn from successes and from failures.
  • Resilient: Testing must be able to respond to change, like all other activities in Agile projects.

These best practices maximise the likelihood of successful testing in Scrum projects.

Sprint Zero 

Sprint zero is the first iteration of the project where many preparation activities take place. The tester collaborates with the team on the following activities during this iteration:

  • Identify the scope of the project (i.e., the product backlog)
  • Create an initial system architecture and high-level prototypes
  • Plan, acquire, and install needed tools (e.g., for test management, defect management, test automation, and continuous integration)
  • Create an initial test strategy for all test levels, addressing (among other topics) test scope, technical risks, test types, and coverage goals
  • Perform an initial quality risk analysis
  • Define test metrics to measure the test process, the progress of testing in the project, and product quality
  • Specify the definition of “done”
  • Create the task board
  • Define when to continue or stop testing before delivering the system to the customer

Sprint zero sets the direction for what testing needs to achieve and how testing needs to achieve it throughout the sprints.

Integration 

In Agile projects, the objective is to deliver customer value on a continuous basis (preferably in every sprint). To enable this, the integration strategy should consider both design and testing. To enable a continuous testing strategy for the delivered functionality and characteristics, it is important to identify all dependencies between underlying functions and features.

Test Planning

Since testing is fully integrated into the Agile team, test planning should start during the release planning session and be updated during each sprint. Test planning for the release and each sprint should address the issues.

Sprint planning results in a set of tasks to put on the task board, where each task should have a length of one or two days of work. In addition, any testing issues should be tracked to keep a steady flow of testing.

Agile Testing Practices

Many practices may be useful for testers in a scrum team, some of which include: 

  • Pairing: Two team members (e.g., a tester and a developer, two testers, or a tester and a product owner) sit together at one workstation to perform a testing or other sprint task.
  • Incremental test design: Test cases and charters are gradually built from user stories and other test bases, starting with simple tests and moving toward more complex ones.
  • Mind mapping: Mind mapping is a useful tool when testing. For example, testers can use mind mapping to identify which test sessions to perform, to show test strategies, and to describe test data.

These practices are in addition to other practices discussed in this article and previous articles on the basics pages.

Assessing Quality Risks and Estimating Test Effort

A typical objective of testing in all projects, Agile or traditional, is to reduce the risk of product quality problems to an acceptable level prior to release. Testers in Agile projects can use the same types of techniques used in traditional projects to identify quality risks (or product risks), assess the associated level of risk, estimate the effort required to reduce those risks sufficiently, and then mitigate those risks through test design, implementation, and execution. However, given the short iterations and rate of change in Agile projects, some adaptations of those techniques are required.

Assessing Quality Risks in Agile Projects

One of the many challenges in testing is the proper selection, allocation, and prioritisation of test conditions. This includes determining the appropriate amount of effort to allocate in order to cover each condition with tests, and sequencing the resulting tests in a way that optimises the effectiveness and efficiency of the testing work to be done. Risk identification, analysis, and risk mitigation strategies can be used by the testers in Agile teams to help determine an acceptable number of test cases to execute, although many interacting constraints and variables may require compromises.

Risk is the possibility of a negative or undesirable outcome or event. The level of risk is found by assessing the likelihood of occurrence of the risk and the impact of the risk. When the primary effect of the potential problem is on product quality, potential problems are referred to as quality risks or product risks. When the primary effect of the potential problem is on project success, potential problems are referred to as project risks or planning risks.

In Agile projects, quality risk analysis takes place at two places.

  • Release planning: business representatives who know the features in the release provide a high-level overview of the risks, and the whole team, including the tester(s), may assist in the risk identification and assessment.
  • Iteration planning: the whole team identifies and assesses the quality risks.

Examples of quality risks for a system include:

  • Incorrect calculations in reports (a functional risk related to accuracy)
  • Slow response to user input (a non-functional risk related to efficiency and response time)
  • Difficulty in understanding screens and fields (a non-functional risk related to usability and understandability)

As mentioned earlier, an iteration starts with iteration planning, which culminates in estimated tasks on a task board. These tasks can be prioritised in part based on the level of quality risk associated with them. Tasks associated with higher risks should start earlier and involve more testing effort. Tasks associated with lower risks should start later and involve less testing effort.

An example of how the quality risk analysis process in an Agile project may be carried out during iteration planning is outlined in the following steps:

  1. Gather the Agile team members together, including the tester(s).
  2. List all the backlog items for the current iteration (e.g., on a task board).
  3. Identify the quality risks associated with each item, considering all relevant quality
    characteristics.
  4. Assess each identified risk, which includes two activities: categorising the risk and determining its level of risk based on the impact and the likelihood of defects.
  5. Determine the extent of testing proportional to the level of risk.
  6. Select the appropriate test technique(s) to mitigate each risk, based on the risk, the level of risk, and the relevant quality characteristic.

The tester then designs, implements, and executes tests to mitigate the risks. This includes the totality of features, behaviours, quality characteristics, and attributes that affect customer, user, and stakeholder satisfaction. 

Throughout the project, the team should remain aware of additional information that may change the set of risks and/or the level of risk associated with known quality risks. Periodic adjustment of the quality risk analysis, which results in adjustments to the tests, should occur. Adjustments include identifying new risks, re-assessing the level of existing risks, and evaluating the effectiveness of risk mitigation activities.

Quality risks can also be mitigated before test execution starts. For example, if problems with the user stories are found during risk identification, the project team can thoroughly review user stories as a mitigating strategy.

Estimating Testing Effort Based on Content and Risk

During release planning, the Agile team estimates the effort required to complete the release. The estimate addresses the testing effort as well. A common estimation technique used in Agile projects is planning poker, a consensus-based technique. The product owner or customer reads a user story to the estimators. Each estimator has a deck of cards with values similar to the Fibonacci sequence (i.e., 0, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, …), or any other progression of choice (e.g., shirt sizes ranging from extra-small to extra-extra-large). The values represent the number of story points, effort days, or other units in which the team estimates. The Fibonacci sequence is recommended because the numbers in the sequence reflect that uncertainty grows proportionally with the size of the story. A high estimate usually means that the story is not well understood or should be broken down into multiple smaller stories. 

The estimators discuss the feature, and ask questions of the product owner as needed. Aspects such as development and testing effort, complexity of the story, and scope of testing play a role in the estimation. Therefore, it is advisable to include the risk level of a backlog item, in addition to the priority specified by the product owner, before the planning poker session is initiated. When the feature has been fully discussed, each estimator privately selects one card to represent his or her estimate. All cards are then revealed at the same time. If all estimators selected the same value, that becomes the estimate. If not, the estimators discuss the differences in estimates after which the poker round is repeated until agreement is reached, either by consensus or by applying rules (e.g., use the median, use the highest score) to limit the number of poker rounds. These discussions ensure a reliable estimate of the effort needed to complete product backlog items requested by the product owner and help improve collective knowledge of what has to be done. 

Techniques in Agile Projects

Many of the test techniques and testing levels that apply to traditional projects can also be applied to Agile projects. However, for Agile projects, there are some specific considerations and variances in test techniques, terminologies, and documentation that should be considered. 

Acceptance Criteria, Adequate Coverage, and Other Information for Testing

Agile projects outline initial requirements as user stories in a prioritised backlog at the start of the project. Initial requirements are short and usually follow a predefined format. Non-functional requirements, such as usability and performance, are also important and can be specified as unique user stories or connected to other functional user stories. Non-functional requirements may follow a predefined format or standard, such as [ISO25000], or an industry specific standard.

The user stories serve as an important test basis. Other possible test bases include:

  • Experience from previous projects
  • Existing functions, features, and quality characteristics of the system
  • Code, architecture, and design
  • User profiles (context, system configurations, and user behaviour)
  • Information on defects from existing and previous projects
  • A categorisation of defects in a defect taxonomy
  • Applicable standards (e.g., [DO-178B] for avionics software)
  • Quality risks

During each iteration, developers create code which implements the functions and features described in the user stories, with the relevant quality characteristics, and this code is verified and validated via acceptance testing. To be testable, acceptance criteria should address the following topics where relevant:

  • Functional behaviour: The externally observable behaviour with user actions as input operating under certain configurations.
  • Quality characteristics: How the system performs the specified behaviour. The characteristics may also be referred to as quality attributes or non-functional requirements. Common quality characteristics are performance, reliability, usability, etc.
  • Scenarios (use cases): A sequence of actions between an external actor (often a user) and the system, in order to accomplish a specific goal or business task.
  • Business rules: Activities that can only be performed in the system under certain conditions defined by outside procedures and constraints (e.g., the procedures used by an insurance company to handle insurance claims).
  • External interfaces: Descriptions of the connections between the system to be developed and the outside world. External interfaces can be divided into different types (user interface, interface to other systems, etc.).
  • Constraints: Any design and implementation constraint that will restrict the options for the developer. Devices with embedded software must often respect physical constraints such as size, weight, and interface connections.
  • Data definitions: The customer may describe the format, data type, allowed values, and default values for a data item in the composition of a complex business data structure (e.g., the ZIP code in a US mail address).

In addition to the user stories and their associated acceptance criteria, other information is relevant for the tester, including:

  • How the system is supposed to work and be used
  • The system interfaces that can be used/accessed to test the system
  • Whether current tool support is sufficient
  • Whether the tester has enough knowledge and skill to perform the necessary tests

Testers will often discover the need for additional information (e.g., code coverage) throughout the iterations and should work collaboratively with the rest of the Agile team members to obtain that information. Relevant information plays a part in determining whether a particular activity can be considered done. This concept of the definition of done is critical in Agile projects and applies in a number of different ways as discussed in the following sub-subsections. 

Test Levels

Each test level has its own definition of done. The following list gives examples that may be relevant for the different test levels.

  • Unit testing
    • 100% decision coverage where possible, with careful reviews of any infeasible paths
    • Static analysis performed on all code
    • No unresolved major defects (ranked based on priority and severity)
    • No known unacceptable technical debt remaining in the design and the code
    • All code, unit tests, and unit test results reviewed
    • All unit tests automated
    • Important characteristics are within agreed limits (e.g., performance)
  • Integration testing
    • All functional requirements tested, including both positive and negative tests, with the number of tests based on size, complexity, and risks
    • All interfaces between units tested
    • All quality risks covered according to the agreed extent of testing
    • No unresolved major defects (prioritised according to risk and importance)
    • All defects found are reported
    • All regression tests automated, where possible, with all automated tests stored in a common repository
  • System testing
    • End-to-end tests of user stories, features, and functions
    • All user personas covered
    • The most important quality characteristics of the system covered (e.g., performance, robustness, reliability)
    • Testing done in a production-like environment(s), including all hardware and software for all supported configurations, to the extent possible
    • All quality risks covered according to the agreed extent of testing
    • All regression tests automated, where possible, with all automated tests stored in a common repository
    • All defects found are reported and possibly fixed
    • No unresolved major defects (prioritised according to risk and importance)

User Story

The definition of done for user stories may be determined by the following criteria: 

  • The user stories selected for the iteration are complete, understood by the team, and have detailed, testable acceptance criteria
  • All the elements of the user story are specified and reviewed, including the user story acceptance tests, have been completed
  • Tasks necessary to implement and test the selected user stories have been identified and estimated by the team

Feature

The definition of done for features, which may span multiple user stories or epics, may include:

  • All constituent user stories, with acceptance criteria, are defined and approved by the customer
  • The design is complete, with no known technical debt
  • The code is complete, with no known technical debt or unfinished refactoring
  • Unit tests have been performed and have achieved the defined level of coverage
  • Integration tests and system tests for the feature have been performed according to the defined coverage criteria
  • No major defects remain to be corrected
  • Feature documentation is complete, which may include release notes, user manuals, and on-line help functions

Iteration 

The definition of done for the iteration may include the following:

  • All features for the iteration are ready and individually tested according to the feature level criteria
  • Any non-critical defects that cannot be fixed within the constraints of the iteration added to the product backlog and prioritised 
  • Integration of all features for the iteration completed and tested 
  • Documentation written, reviewed, and approved 

At this point, the software is potentially releasable because the iteration has been successfully completed, but not all iterations result in a release. 

Release

The definition of done for a release, which may span multiple iterations, may include the following areas:

  • Coverage: All relevant test basis elements for all contents of the release have been covered by testing. The adequacy of the coverage is determined by what is new or changed, its complexity and size, and the associated risks of failure.
  • Quality: The defect intensity (e.g., how many defects are found per day or per transaction), the defect density (e.g., the number of defects found compared to the number of user stories, effort, and/or quality attributes), estimated number of remaining defects are within acceptable limits, the consequences of unresolved and remaining defects (e.g., the severity and priority) are understood and acceptable, the residual level of risk associated with each identified quality risk is understood and acceptable.
  • Time: If the pre-determined delivery date has been reached, the business considerations associated with releasing and not releasing need to be considered.
  • Cost: The estimated lifecycle cost should be used to calculate the return on investment for the delivered system (i.e., the calculated development and maintenance cost should be considerably lower than the expected total sales of the product). The main part of the lifecycle cost often comes from maintenance after the product has been released, due to the number of defects escaping to production. 

Applying Acceptance Test-Driven Development

Acceptance test-driven development is a test-first approach. Test cases are created prior to implementing the user story. The test cases are created by the Agile team, including the developer, the tester, and the business representatives and may be manual or automated. The first step is a specification workshop where the user story is analysed, discussed, and written by developers, testers, and business representatives. Any incompleteness, ambiguities, or errors in the user story are fixed during this process. 

The next step is to create the tests. This can be done by the team together or by the tester individually. In any case, an independent person such as a business representative validates the tests. The tests are examples that describe the specific characteristics of the user story. These examples will help the team implement the user story correctly. Since examples and tests are the same, these terms are often used interchangeably. The work starts with basic examples and open questions. 

Typically, the first tests are the positive tests, confirming the correct behaviour without exception or error conditions, comprising the sequence of activities executed if everything goes as expected. After the positive path tests are done, the team should write negative path tests and cover non-functional attributes as well (e.g., performance, usability). Tests are expressed in a way that every stakeholder is able to understand, containing sentences in natural language involving the necessary preconditions, if any, the inputs, and the related outputs. 

The examples must cover all the characteristics of the user story and should not add to the story. This means that an example should not exist which describes an aspect of the user story not documented in the story itself. In addition, no two examples should describe the same characteristics of the user story. 

Functional and Non-Functional Black Box Test Design

In Agile testing, many tests are created by testers concurrently with the developers’ programming activities. Just as the developers are programming based on the user stories and acceptance criteria, so are the testers creating tests based on user stories and their acceptance criteria. (Some tests, such as exploratory tests and some other experience-based tests, are created later, during test execution) Testers can apply traditional black box test design techniques such as equivalence partitioning, boundary value analysis, decision tables, and state transition testing to create these tests. For example, boundary value analysis could be used to select test values when a customer is limited in the number of items they may select for purchase. 

In many situations, non-functional requirements can be documented as user stories. Black box test design techniques (such as boundary value analysis) can also be used to create tests for non-functional quality characteristics. The user story might contain performance or reliability requirements. For example, a given execution cannot exceed a time limit or a number of operations may fail less than a certain number of times. 

Exploratory Testing and Agile Testing

Exploratory testing is important in Agile projects due to the limited time available for test analysis and the limited details of the user stories. In order to achieve the best results, exploratory testing should be combined with other experience-based techniques as part of a reactive testing strategy, blended with other testing strategies such as analytical risk-based testing, analytical requirements-based testing, model-based testing, and regression-averse testing. Test strategies and test strategy blending is discussed in the basics Level pages. 

In exploratory testing, test design and test execution occur at the same time, guided by a prepared test charter. A test charter provides the test conditions to cover during a time-boxed testing session. During exploratory testing, the results of the most recent tests guide the next test. The same white box and black box techniques can be used to design the tests as when performing pre-designed testing. 

A test charter may include the following information: 

  • Actor: intended user of the system
  • Purpose: the theme of the charter including what particular objective the actor wants to achieve, i.e., the test conditions
  • Setup: what needs to be in place in order to start the test execution
  • Priority: relative importance of this charter, based on the priority of the associated user story or the risk level
  • Reference: specifications (e.g., user story), risks, or other information sources
  • Data: whatever data is needed to carry out the charter
  • Activities: a list of ideas of what the actor may want to do with the system (e.g., “Log on to the system as a super user”) and what would be interesting to test (both positive and negative tests)
  • Oracle notes: how to evaluate the product to determine correct results (e.g., to capture what happens on the screen and compare to what is written in the user’s manual)
  • Variations: alternative actions and evaluations to complement the ideas described under activities

To manage exploratory testing, a method called session-based test management can be used. A session is defined as an uninterrupted period of testing which could last from 60 to 120 minutes. Test sessions include the following:

  • Survey session (to learn how it works)
  • Analysis session (evaluation of the functionality or characteristics)
  • Deep coverage (corner cases, scenarios, interactions)

The quality of the tests depends on the testers’ ability to ask relevant questions about what to test. Examples include the following:

  • What is most important to find out about the system?
  • In what way may the system fail?
  • What happens if…..?
  • What should happen when…..?
  • Are customer needs, requirements, and expectations fulfilled?
  • Is the system possible to install (and remove if necessary) in all supported upgrade paths?

During test execution, the tester uses creativity, intuition, cognition, and skill to find possible problems with the product. The tester also needs to have good knowledge and understanding of the software under test, the business domain, how the software is used, and how to determine when the system fails.

A set of heuristics can be applied when testing. A heuristic can guide the tester in how to perform the testing and to evaluate the results [Hendrickson]. Examples include:

  • Boundaries
  • CRUD (Create, Read, Update, Delete)
  • Configuration variations
  • Interruptions (e.g., log off, shut down, or reboot)

It is important for the tester to document the process as much as possible. Otherwise, it would be difficult to go back and see how a problem in the system was discovered. The following list provides examples of information that may be useful to document:

  • Test coverage: what input data have been used, how much has been covered, and how much remains to be tested
  • Evaluation notes: observations during testing, do the system and feature under test seem to be stable, were any defects found, what is planned as the next step according to the current observations, and any other list of ideas
  • Risk/strategy list: which risks have been covered and which ones remain among the most important ones, will the initial strategy be followed, does it need any changes
  • Issues, questions, and anomalies: any unexpected behaviour, any questions regarding the efficiency of the approach, any concerns about the ideas/test attempts, test environment, test data, misunderstanding of the function, test script or the system under test
  • Actual behaviour: recording of actual behaviour of the system that needs to be saved (e.g., video, screen captures, output data files)

The information logged should be captured and/or summarised into some form of status management tools (e.g., test management tools, task management tools, the task board), in a way that makes it easy for stakeholders to understand the current status for all testing that was performed.

Tools in Agile Projects

Tools described in the basics Level pages are relevant and used by testers on Agile teams. Not all tools are used the same way and some tools have more relevance for Agile projects than they have in traditional projects. For example, although the test management tools, requirements management tools, and incident management tools (defect tracking tools) can be used by Agile teams, some Agile teams opt for an all-inclusive tool (e.g., application lifecycle management or task management) that provides features relevant to Agile development, such as task boards, burn-down charts, and user stories. Configuration management tools are important to testers in Agile teams due to the high number of automated tests at all levels and the need to store and manage the associated automated test artefacts.

In addition to the tools described in the basic Level pages, testers on Agile projects may also utilise the tools described in the following subsections. These tools are used by the whole team to ensure team collaboration and information sharing, which are key to Agile practices.

Task Management and Tracking Tools

In some cases, Agile teams use physical story/task boards (e.g., whiteboard, cork-board) to manage and track user stories, tests, and other tasks throughout each sprint. Other teams will use application lifecycle management and task management software, including electronic task boards. These tools serve the following purposes:

  • Record stories and their relevant development and test tasks, to ensure that nothing gets lost during a sprint
  • Capture team members’ estimates on their tasks and automatically calculate the effort required to implement a story, to support efficient iteration planning sessions
  • Associate development tasks and test tasks with the same story, to provide a complete picture of the team’s effort required to implement the story
  • Aggregate developer and tester updates to the task status as they complete their work, automatically providing a current calculated snapshot of the status of each story, the iteration, and the overall release
  • Provide a visual representation (via metrics, charts, and dashboards) of the current state of each user story, the iteration, and the release, allowing all stakeholders, including people on geographically distributed teams, to quickly check status
  • Integrate with configuration management tools, which can allow automated recording of code check-ins and builds against tasks, and, in some cases, automated status updates for tasks

Communication and Information Sharing Tools

In addition to e-mail, documents, and spoken communication, Agile teams often use three additional types of tools to support communication and information sharing: wikis, instant messaging, and desktop sharing.

Wikis allow teams to build and share an online knowledge base on various aspects of the project, including the following:

  • Product feature diagrams, feature discussions, prototype diagrams, photos of whiteboard discussions, and other information
  • Tools and/or techniques for developing and testing found to be useful by other members of the team
  • Metrics, charts, and dashboards on product status, which is especially useful when the wiki is integrated with other tools such as the build server and task management system, since the tool can update product status automatically
  • Conversations between team members, similar to instant messaging and email, but in a way that is shared with everyone else on the team

Instant messaging, audio teleconferencing, and video chat tools provide the following benefits:

  • Allow real time direct communication between team members, especially distributed teams
  • Involve distributed teams in standup meetings
  • Reduce telephone bills by use of voice-over-IP technology, removing cost constraints that could reduce team member communication in distributed settings

Desktop sharing and capturing tools provide the following benefits:

  • In distributed teams, product demonstrations, code reviews, and even pairing can occur
  • Capturing product demonstrations at the end of each iteration, which can be posted to the team’s wiki

These tools should be used to complement and extend, not replace, face-to-face communication in Agile teams.

Software Build and Distribution Tools

As discussed earlier in this article, daily build and deployment of software is a key practice in Agile teams. This requires the use of continuous integration tools and build distribution tools. The uses, benefits, and risks of these tools was described earlier on the basics of agile page. 

Configuration Management Tools

On Agile teams, configuration management tools may be used not only to store source code and automated tests, but manual tests and other test work products are often stored in the same repository as the product source code. This provides traceability between which versions of the software were tested with which particular versions of the tests, and allows for rapid change without losing historical information. The main types of version control systems include centralised source control systems and distributed version control systems. The team size, structure, location, and requirements to integrate with other tools will determine which version control system is right for a particular Agile project.

Test Design, Implementation, and Execution Tools

Some tools are useful to Agile testers at specific points in the software testing process. While most of these tools are not new or specific to Agile, they provide important capabilities given the rapid change of Agile projects.

  • Test design tools: Use of tools such as mind maps have become more popular to quickly design and define tests for a new feature.
  • Test case management tools: The type of test case management tools used in Agile may be part of the whole team’s application lifecycle management or task management tool.
  • Test data preparation and generation tools: Tools that generate data to populate an application’s database are very beneficial when a lot of data and combinations of data are necessary to test the application. These tools can also help re-define the database structure as the product undergoes changes during an Agile project and refactor the scripts to generate the data. This allows quick updating of test data as changes occur. Some test data preparation tools use production data sources as a raw material and use scripts to remove or anonymise sensitive data. Other test data preparation tools can help with validating large data inputs or outputs.
  • Test data load tools: After data has been generated for testing, it needs to be loaded into the application. Manual data entry is often time consuming and error prone, but data load tools are available to make the process reliable and efficient. In fact, many of the data generator tools include an integrated data load component. In other cases, bulk-loading using the database management systems is also possible.
  • Automated test execution tools: There are test execution tools which are more aligned to Agile testing. Specific tools are available via both commercial and open source avenues to support test first approaches, such as behaviour-driven development, test-driven development, and acceptance test-driven development. These tools allow testers and business staff to express the expected system behaviour in tables or natural language using keywords.
  • Exploratory test tools: Tools that capture and log activities performed on an application during an exploratory test session are beneficial to the tester and developer, as they record the actions taken. This is useful when a defect is found, as the actions taken before the failure occurred have been captured and can be used to report the defect to the developers. Logging steps performed in an exploratory test session may prove to be beneficial if the test is ultimately included in the automated regression test suite.

Cloud Computing and Virtualisation Tools

Virtualisation allows a single physical resource (server) to operate as many separate, smaller resources. When virtual machines or cloud instances are used, teams have a greater number of servers available to them for development and testing. This can help to avoid delays associated with waiting for physical servers. Provisioning a new server or restoring a server is more efficient with snapshot capabilities built into most virtualisation tools. Some test management tools now utilise virtualisation technologies to snapshot servers at the point when a fault is detected, allowing testers to share the snapshot with the developers investigating the fault.

Test Techniques

Categories of Test Techniques 

The purpose of a test technique, including those discussed in this section, is to help in identifying test conditions, test cases, and test data.

The choice of which test techniques to use depends on a number of factors, including: 

  • Component or system complexity 
  • Regulatory standards 
  • Customer or contractual requirements 
  • Risk levels and types 
  • Available documentation 
  • Tester knowledge and skills 
  • Available tools 
  • Time and budget 
  • Software development lifecycle model 
  • The types of defects expected in the component or system 

Some techniques are more applicable to certain situations and test levels; others are applicable to all test levels. When creating test cases, testers generally use a combination of test techniques to achieve the best results from the test effort.

The use of test techniques in the test analysis, test design, and test implementation activities can range from very informal (little to no documentation) to very formal. The appropriate level of formality depends on the context of testing, including the maturity of test and development processes, time constraints, safety or regulatory requirements, the knowledge and skills of the people involved, and the software development lifecycle model being followed. 

Categories of Test Techniques and Their Characteristics

In this article , test techniques are classified as black-box, white-box, or experience-based. 

Black-box test techniques (also called behavioural or behaviour-based techniques) are based on an analysis of the appropriate test basis (e.g., formal requirements documents, specifications, use cases, user stories, or business processes). These techniques are applicable to both functional and non-functional testing. Black-box test techniques concentrate on the inputs and outputs of the test object without reference to its internal structure. 

White-box test techniques (also called structural or structure-based techniques) are based on an analysis of the architecture, detailed design, internal structure, or the code of the test object. Unlike black-box test techniques, white-box test techniques concentrate on the structure and processing within the test object. 

Experience-based test techniques leverage the experience of developers, testers and users to design, implement, and execute tests. These techniques are often combined with black-box and white-box test techniques.

Common characteristics of black-box test techniques include the following: 

  • Test conditions, test cases, and test data are derived from a test basis that may include software requirements, specifications, use cases, and user stories
  • Test cases may be used to detect gaps between the requirements and the implementation of the requirements, as well as deviations from the requirements 
  • Coverage is measured based on the items tested in the test basis and the technique applied to the test basis

Common characteristics of white-box test techniques include:

  • Test conditions, test cases, and test data are derived from a test basis that may include code, software architecture, detailed design, or any other source of information regarding the structure of the software
  • Coverage is measured based on the items tested within a selected structure (e.g., the code or interfaces) and the technique applied to the test basis

Common characteristics of experience-based test techniques include:

  • Test conditions, test cases, and test data are derived from a test basis that may include knowledge and experience of testers, developers, users and other stakeholders 

This knowledge and experience includes expected use of the software, its environment, likely defects, and the distribution of those defects.

Black-box Test Techniques

Equivalence Partitioning 

Equivalence partitioning divides data into partitions (also known as equivalence classes) in such a way that all the members of a given partition are expected to be processed in the same way. There are equivalence partitions for both valid and invalid values. 

  • Valid values are values that should be accepted by the component or system. An equivalence partition containing valid values is called a “valid equivalence partition.” 
  • Invalid values are values that should be rejected by the component or system. An equivalence partition containing invalid values is called an “invalid equivalence partition.” 
  • Partitions can be identified for any data element related to the test object, including inputs, outputs, internal values, time-related values (e.g., before or after an event) and for interface parameters (e.g., integrated components being tested during integration testing). 
  • Any partition may be divided into sub partitions if required. 
  • Each value must belong to one and only one equivalence partition.
  • When invalid equivalence partitions are used in test cases, they should be tested individually, i.e., not combined with other invalid equivalence partitions, to ensure that failures are not masked. Failures can be masked when several failures occur at the same time but only one is visible, causing the other failures to be undetected. 

To achieve 100% coverage with this technique, test cases must cover all identified partitions (including invalid partitions) by using a minimum of one value from each partition. Coverage is measured as the number of equivalence partitions tested by at least one value, divided by the total number of identified equivalence partitions, normally expressed as a percentage. Equivalence partitioning is applicable at all test levels.

Boundary Value Analysis

Boundary value analysis (BVA) is an extension of equivalence partitioning, but can only be used when the partition is ordered, consisting of numeric or sequential data. The minimum and maximum values (or first and last values) of a partition are its boundary values. 

For example, let suppose an input field accepts a single integer value as an input, using a keypad to limit inputs so that non-integer inputs are impossible. The valid range is from 1 to 5, inclusive. So, there are three equivalence partitions: invalid (too low); valid; invalid (too high). For the valid equivalence partition, the boundary values are 1 and 5. For the invalid (too high) partition, the boundary value is 6. For the invalid (too low) partition, there is only one boundary value, 0, because this is a partition with only one member. 

In the example above, we identify two boundary values per boundary. The boundary between invalid (too low) and valid gives the test values 0 and 1. The boundary between valid and invalid (too high) gives the test values 5 and 6. Some variations of this technique identify three boundary values per boundary: the values before, at, and just over the boundary. In the previous example, using three-point boundary values, the lower boundary test values are 0, 1, and 2, and the upper boundary test values are 4, 5, and 6. 

Behaviour at the boundaries of equivalence partitions is more likely to be incorrect than behaviour within the partitions. It is important to remember that both specified and implemented boundaries may be displaced to positions above or below their intended positions, may be omitted altogether, or may be supplemented with unwanted additional boundaries. Boundary value analysis and testing will reveal almost all such defects by forcing the software to show behaviours from a partition other than the one to which the boundary value should belong. 

Boundary value analysis can be applied at all test levels. This technique is generally used to test requirements that call for a range of numbers (including dates and times). Boundary coverage for a partition is measured as the number of boundary values tested, divided by the total number of identified boundary test values, normally expressed as a percentage.

Decision Table Testing

Decision tables are a good way to record complex business rules that a system must implement. When creating decision tables, the tester identifies conditions (often inputs) and the resulting actions (often outputs) of the system. These form the rows of the table, usually with the conditions at the top and the actions at the bottom. Each column corresponds to a decision rule that defines a unique combination of conditions which results in the execution of the actions associated with that rule. The values of the conditions and actions are usually shown as Boolean values (true or false) or discrete values (e.g., red, green, blue), but can also be numbers or ranges of numbers. These different types of conditions and actions might be found together in the same table.

The common notation in decision tables is as follows:

For conditions:

  • Y means the condition is true (may also be shown as T or 1) 
  • N means the condition is false (may also be shown as F or 0) 
  • — means the value of the condition doesn’t matter (may also be shown as N/A)

For actions: 

  • X means the action should occur (may also be shown as Y or T or 1) 
  • Blank means the action should not occur (may also be shown as – or N or F or 0)

A full decision table has enough columns (test cases) to cover every combination of conditions. By deleting columns that do not affect the outcome, the number of test cases can decrease considerably. For example by removing impossible combinations of conditions.

The common minimum coverage standard for decision table testing is to have at least one test case per decision rule in the table. This typically involves covering all combinations of conditions. Coverage is measured as the number of decision rules tested by at least one test case, divided by the total number of decision rules, normally expressed as a percentage.

The strength of decision table testing is that it helps to identify all the important combinations of conditions, some of which might otherwise be overlooked. It also helps in finding any gaps in the requirements. It may be applied to all situations in which the behaviour of the software depends on a combination of conditions, at any test level.

State Transition Testing

Components or systems may respond differently to an event depending on current conditions or previous history (e.g., the events that have occurred since the system was initialised). The previous history can be summarised using the concept of states. A state transition diagram shows the possible software states, as well as how the software enters, exits, and transitions between states. A transition is initiated by an event (e.g., user input of a value into a field). The event results in a transition. The same event can result in two or more different transitions from the same state. The state change may result in the software taking an action (e.g., outputting a calculation or error message). 

A state transition table shows all valid transitions and potentially invalid transitions between states, as well as the events, and resulting actions for valid transitions. State transition diagrams normally show only the valid transitions and exclude the invalid transitions. 

Tests can be designed to cover a typical sequence of states, to exercise all states, to exercise every transition, to exercise specific sequences of transitions, or to test invalid transitions. 

State transition testing is used for menu-based applications and is widely used within the embedded software industry. The technique is also suitable for modelling a business scenario having specific states or for testing screen navigation. The concept of a state is abstract — it may represent a few lines of code or an entire business process. 

Coverage is commonly measured as the number of identified states or transitions tested, divided by the total number of identified states or transitions in the test object, normally expressed as a percentage. For more information on coverage criteria for state transition testing.

Use Case Testing 

Tests can be derived from use cases, which are a specific way of designing interactions with software items. They incorporate requirements for the software functions. Use cases are associated with actors (human users, external hardware, or other components or systems) and subjects (the component or system to which the use case is applied).

Each use case specifies some behaviour that a subject can perform in collaboration with one or more actors. A use case can be described by interactions and activities, as well as preconditions, postconditions and natural language where appropriate. Interactions between the actors and the subject may result in changes to the state of the subject. Interactions may be represented graphically by work flows, activity diagrams, or business process models.

A use case can include possible variations of its basic behaviour, including exceptional behaviour and error handling (system response and recovery from programming, application and communication errors, e.g., resulting in an error message). Tests are designed to exercise the defined behaviours (basic, exceptional or alternative, and error handling). Coverage can be measured by the number of use case behaviours tested divided by the total number of use case behaviours, normally expressed as a percentage.

White-box Test Techniques 

White-box testing is based on the internal structure of the test object. White-box test techniques can be used at all test levels, but the two code-related techniques discussed in this section are most commonly used at the component test level. There are more advanced techniques that are used in some safety-critical, mission-critical, or high integrity environments to achieve more thorough coverage, but those are not discussed here.

Statement Testing and Coverage 

Statement testing exercises the potential executable statements in the code. Coverage is measured as the number of statements executed by the tests divided by the total number of executable statements in the test object, normally expressed as a percentage. 

Decision Testing and Coverage

Decision testing exercises the decisions in the code and tests the code that is executed based on the decision outcomes. To do this, the test cases follow the control flows that occur from a decision point (e.g., for an IF statement, one for the true outcome and one for the false outcome; for a CASE statement, test cases would be required for all the possible outcomes, including the default outcome). 

Coverage is measured as the number of decision outcomes executed by the tests divided by the total number of decision outcomes in the test object, normally expressed as a percentage.

The Value of Statement and Decision Testing

When 100% statement coverage is achieved, it ensures that all executable statements in the code have been tested at least once, but it does not ensure that all decision logic has been tested. Of the two white-box techniques discussed in this syllabus, statement testing may provide less coverage than decision testing. 

When 100% decision coverage is achieved, it executes all decision outcomes, which includes testing the true outcome and also the false outcome, even when there is no explicit false statement (e.g., in the case of an IF statement without an else in the code). Statement coverage helps to find defects in code that was not exercised by other tests. Decision coverage helps to find defects in code where other tests have not taken both true and false outcomes. 

Achieving 100% decision coverage guarantees 100% statement coverage (but not vice versa).

Experience-based Test Techniques

When applying experience-based test techniques, the test cases are derived from the tester’s skill and intuition, and their experience with similar applications and technologies. These techniques can be helpful in identifying tests that were not easily identified by other more systematic techniques. Depending on the tester’s approach and experience, these techniques may achieve widely varying degrees of coverage and effectiveness. Coverage can be difficult to assess and may not be measurable with these techniques. 

Commonly used experience-based techniques are discussed in the following sections.

Error Guessing 

Error guessing is a technique used to anticipate the occurrence of errors, defects, and failures, based on the tester’s knowledge, including: 

  • How the application has worked in the past 
  • What kind of errors tend to be made 
  • Failures that have occurred in other applications

A methodical approach to the error guessing technique is to create a list of possible errors, defects, and failures, and design tests that will expose those failures and the defects that caused them. These error, defect, failure lists can be built based on experience, defect and failure data, or from common knowledge about why software fails.

Exploratory Testing

In exploratory testing, informal (not pre-defined) tests are designed, executed, logged, and evaluated dynamically during test execution. The test results are used to learn more about the component or system, and to create tests for the areas that may need more testing. 

Exploratory testing is sometimes conducted using session-based testing to structure the activity. In session-based testing, exploratory testing is conducted within a defined time-box, and the tester uses a test charter containing test objectives to guide the testing. The tester may use test session sheets to document the steps followed and the discoveries made. 

Exploratory testing is most useful when there are few or inadequate specifications or significant time pressure on testing. Exploratory testing is also useful to complement other more formal testing techniques. 

Exploratory testing is strongly associated with reactive test strategies. Exploratory testing can incorporate the use of other black-box, white-box, and experience-based techniques.

Checklist-based Testing

In checklist-based testing, testers design, implement, and execute tests to cover test conditions found in a checklist. As part of analysis, testers create a new checklist or expand an existing checklist, but testers may also use an existing checklist without modification. Such checklists can be built based onexperience, knowledge about what is important for the user, or an understanding of why and how software fails. 

Checklists can be created to support various test types, including functional and non-functional testing. In the absence of detailed test cases, checklist-based testing can provide guidelines and a degree of consistency. As these are high-level lists, some variability in the actual testing is likely to occur, resulting in potentially greater coverage but less repeatability.