Acceptance Criteria, Acceptance Tests and Experience-Based Practices

Writing Acceptance Criteria

Specifying acceptance criteria is an important acceptance testing task. It helps to refine requirements or user stories and provides the basis for acceptance tests. Business analysts and testers should collaborate closely on the specification of these criteria. This collaboration ensures high business value from the acceptance testing phase and increases the chance of a successful iteration or product release. 

Writing acceptance criteria forces business analysts and testers to think about functionality, performance, and other characteristics from a stakeholder or usage perspective. This supports early verification and validation of the related requirement or user story and provides a better chance of detecting inconsistencies, contradictions, missing information or other problems. 

The following good practices should be considered when writing acceptance criteria:

  • Well-written acceptance criteria are precise, measurable and concise. Each criterion must be written in a way that enables the tester to measure whether or not the test object complies with the acceptance criterion.
  • Well-written acceptance criteria do not include technical solution details. They concentrate on the question “What shall be achieved?” rather than on the question “How shall it be achieved?”.
  • Acceptance criteria should address non-functional requirements (quality characteristics) as well as functional requirements.

As with requirements and user stories, acceptance criteria should be reviewed as necessary, for example through walkthroughs, technical reviews, iteration planning meetings or other methods.

Designing Acceptance Tests

This section addresses the test techniques and approaches frequently used for acceptance testing.

Test Techniques for Acceptance Testing

In a requirements-based approach to acceptance testing, the tester derives test cases from the acceptance criteria related to each requirement or user story using black-box techniques such as equivalence partitioning or boundary value analysis.

Acceptance testing may be augmented with other test techniques or approaches:

  • Business process-based testing, possibly combined with decision table testing, validates business processes and rules.
  • Experience-based testing leverages the tester’s experience, knowledge and intuition.
  • Risk-based testing is based on risk types and levels. Prioritisation and thoroughness of testing depend on previously identified product risks.
  • Model-based testing uses graphical (or textual) models to obtain acceptance tests.

Acceptance criteria should be verified by acceptance tests, and traceability between the requirements or user stories and the related test cases should be managed.

Using the Gherkin Language to Write Test Cases

In ATDD and BDD, acceptance tests are often formulated in a structured language, referred to as the Gherkin language. Using the Gherkin language, test cases are phrased declaratively using a standardised pattern:

  • Given [a situation]
  • When [an action on the system]
  • Then [the expected result]

The pattern allows business analysts, testers and developers to write test cases in a way that is easily shared with stakeholders and may be translated into automated tests. 

The “Given” block aims to put the test object in a state before performing test actions in the “When” block. The “Then” block specifies the consequences that can be observed from the actions defined in the “When” block. Test cases written in Gherkin do not refer to user interface elements but rather to user actions on the system. They are structured natural language test cases that can be understood by all relevant stakeholders. 
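
As an illustration, the placeholders in the pattern can be filled in to form a complete acceptance test. The following scenario (the feature, amounts and account details are invented for this example) follows the Gherkin conventions, where “And” continues the preceding block:

    Feature: Cash withdrawal
      Scenario: Successful withdrawal within the available balance
        Given the account balance is 100.00
        When the account holder withdraws 40.00
        Then the withdrawal is accepted
        And the new account balance is 60.00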

In addition, the structure “Given – When – Then” can be parsed in an automated way. This allows automated test script creation using a keyword-driven testing approach. 
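
As a minimal sketch of such automation (plain Python, no particular BDD framework assumed; the step phrases and functions are hypothetical), each step phrase can be mapped to a reusable keyword, i.e., a function acting on the test object:

    # Minimal keyword-driven sketch: each Gherkin step phrase is mapped to a
    # reusable "keyword" (a Python function) that acts on the test object.

    ACCOUNT = {"balance": 0.0}          # stand-in for the real test object

    def set_balance(amount):            # keyword behind the "Given" step
        ACCOUNT["balance"] = float(amount)

    def withdraw(amount):               # keyword behind the "When" step
        ACCOUNT["balance"] -= float(amount)

    def assert_balance(expected):       # keyword behind the "Then" step
        assert ACCOUNT["balance"] == float(expected)

    # Step phrases (invented for the example) mapped to their keywords.
    STEPS = {
        "the account balance is": set_balance,
        "the account holder withdraws": withdraw,
        "the new account balance is": assert_balance,
    }

    def run_scenario(lines):
        """Parse Given/When/Then/And lines and dispatch to the matching keyword."""
        for line in lines:
            text = line.strip()
            for prefix in ("Given ", "When ", "Then ", "And "):
                if text.startswith(prefix):
                    text = text[len(prefix):]
                    break
            for phrase, keyword in STEPS.items():
                if text.startswith(phrase):
                    keyword(text[len(phrase):].strip())
                    break

    run_scenario([
        "Given the account balance is 100.00",
        "When the account holder withdraws 40.00",
        "Then the new account balance is 60.00",
    ])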

Initially, Gherkin was specific to some software tools supporting BDD, but it is now synonymous with the “Given – When – Then” acceptance test design pattern. 

Experience-based Approaches for Acceptance Testing

All experience-based test techniques described later in this article are relevant for acceptance testing. This section focuses on how exploratory testing can be used for acceptance tests, and on beta testing as a source of feedback on system usage. 

Exploratory Testing

Exploratory testing is an experience-based test technique that is not based on detailed predefined test procedures. In exploratory testing, all activities are carried out within an uninterrupted period of time called a session. The testers are domain experts. They are familiar with user needs, requirements and business processes, but they are not necessarily familiar with the product under test. 

During an exploratory testing session, the tester accomplishes the following:

  • Learns how to work with the product
  • Designs the tests
  • Performs the tests
  • Interprets the results

It is good practice in exploratory testing to use a test charter. The test charter is prepared prior to the testing session (possibly jointly by the business analyst and the tester) and is used by the person in charge of the exploratory session (a business analyst, a tester or another stakeholder). It includes information about the purpose, target and scope of the exploratory session, the test setup, the duration of the session, and possibly some tactics to be used during the session (such as the type of user to be simulated). Time-boxed sessions help to control the time and effort dedicated to the exploratory session. It is also good practice to perform exploratory testing in pairs or as teamwork. 

In Agile development, exploratory test sessions can be conducted during an iteration by the product owner and/or the testers for acceptance testing of user stories assigned to the iteration. 

Exploratory testing should be used to complement other more formal techniques in acceptance testing. For example, it may be used to provide rapid feedback on new features before methodical testing is applied. 

Beta Testing

Beta testing is a form of acceptance testing that is often used for commercial off-the-shelf (COTS) software or for Software as a Service (SaaS) platforms. It is conducted to obtain feedback from the market after development and in-house testing are completed. 

Unlike other forms of acceptance testing, beta testing is performed by potential or existing users at their own locations. Beta tests impose neither predefined test procedures nor a test charter. Apart from the observed findings, the test activities are usually not documented at all. 

Because the product is tested in various realistic configurations by actual users in their business process context, beta testing may discover defects that escaped detection during the development process and previous test levels. Resolving issues found in beta tests helps organisations avoid costly hot-fixes or large-scale product recalls. 

Acceptance testing should not be limited to beta testing. Beta testing is not systematic or measurable. There is no guarantee that all requirements or user stories are covered by the tests. Moreover, beta testing is performed late in the development process whereas tests based on acceptance criteria support the “Early Testing” principle. 

Test Techniques

Categories of Test Techniques 

The purpose of test techniques, including those discussed in this section, is to help in identifying test conditions, test cases, and test data.

The choice of which test techniques to use depends on a number of factors, including: 

  • Component or system complexity 
  • Regulatory standards 
  • Customer or contractual requirements 
  • Risk levels and types 
  • Available documentation 
  • Tester knowledge and skills 
  • Available tools 
  • Time and budget 
  • Software development lifecycle model 
  • The types of defects expected in the component or system 

Some techniques are more applicable to certain situations and test levels; others are applicable to all test levels. When creating test cases, testers generally use a combination of test techniques to achieve the best results from the test effort.

The use of test techniques in the test analysis, test design, and test implementation activities can range from very informal (little to no documentation) to very formal. The appropriate level of formality depends on the context of testing, including the maturity of test and development processes, time constraints, safety or regulatory requirements, the knowledge and skills of the people involved, and the software development lifecycle model being followed. 

Categories of Test Techniques and Their Characteristics

In this article, test techniques are classified as black-box, white-box, or experience-based. 

Black-box test techniques (also called behavioural or behaviour-based techniques) are based on an analysis of the appropriate test basis (e.g., formal requirements documents, specifications, use cases, user stories, or business processes). These techniques are applicable to both functional and non-functional testing. Black-box test techniques concentrate on the inputs and outputs of the test object without reference to its internal structure. 

White-box test techniques (also called structural or structure-based techniques) are based on an analysis of the architecture, detailed design, internal structure, or the code of the test object. Unlike black-box test techniques, white-box test techniques concentrate on the structure and processing within the test object. 

Experience-based test techniques leverage the experience of developers, testers and users to design, implement, and execute tests. These techniques are often combined with black-box and white-box test techniques.

Common characteristics of black-box test techniques include the following: 

  • Test conditions, test cases, and test data are derived from a test basis that may include software requirements, specifications, use cases, and user stories
  • Test cases may be used to detect gaps between the requirements and the implementation of the requirements, as well as deviations from the requirements 
  • Coverage is measured based on the items tested in the test basis and the technique applied to the test basis

Common characteristics of white-box test techniques include:

  • Test conditions, test cases, and test data are derived from a test basis that may include code, software architecture, detailed design, or any other source of information regarding the structure of the software
  • Coverage is measured based on the items tested within a selected structure (e.g., the code or interfaces) and the technique applied to the test basis

Common characteristics of experience-based test techniques include:

  • Test conditions, test cases, and test data are derived from a test basis that may include knowledge and experience of testers, developers, users and other stakeholders 

This knowledge and experience includes expected use of the software, its environment, likely defects, and the distribution of those defects.

Black-box Test Techniques

Equivalence Partitioning 

Equivalence partitioning divides data into partitions (also known as equivalence classes) in such a way that all the members of a given partition are expected to be processed in the same way. There are equivalence partitions for both valid and invalid values. 

  • Valid values are values that should be accepted by the component or system. An equivalence partition containing valid values is called a “valid equivalence partition.” 
  • Invalid values are values that should be rejected by the component or system. An equivalence partition containing invalid values is called an “invalid equivalence partition.” 
  • Partitions can be identified for any data element related to the test object, including inputs, outputs, internal values, time-related values (e.g., before or after an event) and for interface parameters (e.g., integrated components being tested during integration testing). 
  • Any partition may be divided into sub partitions if required. 
  • Each value must belong to one and only one equivalence partition.
  • When invalid equivalence partitions are used in test cases, they should be tested individually, i.e., not combined with other invalid equivalence partitions, to ensure that failures are not masked. Failures can be masked when several failures occur at the same time but only one is visible, causing the other failures to be undetected. 

To achieve 100% coverage with this technique, test cases must cover all identified partitions (including invalid partitions) by using a minimum of one value from each partition. Coverage is measured as the number of equivalence partitions tested by at least one value, divided by the total number of identified equivalence partitions, normally expressed as a percentage. Equivalence partitioning is applicable at all test levels.
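
As a sketch (the validation rule and function name are invented for this example), suppose an age field accepts whole years from 18 to 65 inclusive; a minimal equivalence-partitioning test set uses one value per partition:

    def is_valid_age(age):
        """Hypothetical rule: valid ages are 18 to 65 inclusive."""
        return 18 <= age <= 65

    # One representative value per equivalence partition:
    #   invalid (too low): ...17 | valid: 18...65 | invalid (too high): 66...
    representatives = {
        10: False,   # invalid partition (too low)
        40: True,    # valid partition
        80: False,   # invalid partition (too high)
    }

    for value, expected in representatives.items():
        assert is_valid_age(value) == expected

    # Coverage: 3 partitions tested / 3 partitions identified = 100%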

Boundary Value Analysis

Boundary value analysis (BVA) is an extension of equivalence partitioning, but can only be used when the partition is ordered, consisting of numeric or sequential data. The minimum and maximum values (or first and last values) of a partition are its boundary values. 

For example, suppose an input field accepts a single integer value, using a keypad to limit inputs so that non-integer inputs are impossible. The valid range is from 1 to 5, inclusive. There are therefore three equivalence partitions: invalid (too low), valid, and invalid (too high). For the valid equivalence partition, the boundary values are 1 and 5. For the invalid (too high) partition, the boundary value is 6. For the invalid (too low) partition, there is only one boundary value, 0, because this is a partition with only one member. 

In the example above, we identify two boundary values per boundary. The boundary between invalid (too low) and valid gives the test values 0 and 1. The boundary between valid and invalid (too high) gives the test values 5 and 6. Some variations of this technique identify three boundary values per boundary: the values before, at, and just over the boundary. In the previous example, using three-point boundary values, the lower boundary test values are 0, 1, and 2, and the upper boundary test values are 4, 5, and 6. 
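
A minimal sketch of the example above (the function name is invented), using the two-point boundary values as test inputs:

    def accept(value):
        """Hypothetical check for the example: valid integers are 1 to 5 inclusive."""
        return 1 <= value <= 5

    # Two-point boundary values: 0 and 1 (lower boundary), 5 and 6 (upper boundary).
    # The three-point variation would add 2 and 4 (just inside each boundary).
    boundary_tests = {0: False, 1: True, 5: True, 6: False}

    for value, expected in boundary_tests.items():
        assert accept(value) == expected

    # Boundary coverage: 4 boundary values tested / 4 identified = 100%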

Behaviour at the boundaries of equivalence partitions is more likely to be incorrect than behaviour within the partitions. It is important to remember that both specified and implemented boundaries may be displaced to positions above or below their intended positions, may be omitted altogether, or may be supplemented with unwanted additional boundaries. Boundary value analysis and testing will reveal almost all such defects by forcing the software to show behaviours from a partition other than the one to which the boundary value should belong. 

Boundary value analysis can be applied at all test levels. This technique is generally used to test requirements that call for a range of numbers (including dates and times). Boundary coverage for a partition is measured as the number of boundary values tested, divided by the total number of identified boundary test values, normally expressed as a percentage.

Decision Table Testing

Decision tables are a good way to record complex business rules that a system must implement. When creating decision tables, the tester identifies conditions (often inputs) and the resulting actions (often outputs) of the system. These form the rows of the table, usually with the conditions at the top and the actions at the bottom. Each column corresponds to a decision rule that defines a unique combination of conditions which results in the execution of the actions associated with that rule. The values of the conditions and actions are usually shown as Boolean values (true or false) or discrete values (e.g., red, green, blue), but can also be numbers or ranges of numbers. These different types of conditions and actions might be found together in the same table.

The common notation in decision tables is as follows:

For conditions:

  • Y means the condition is true (may also be shown as T or 1) 
  • N means the condition is false (may also be shown as F or 0) 
  • — means the value of the condition doesn’t matter (may also be shown as N/A)

For actions: 

  • X means the action should occur (may also be shown as Y or T or 1) 
  • Blank means the action should not occur (may also be shown as – or N or F or 0)

A full decision table has enough columns (test cases) to cover every combination of conditions. By deleting columns that do not affect the outcome, for example columns representing impossible combinations of conditions, the number of test cases can be reduced considerably.

The common minimum coverage standard for decision table testing is to have at least one test case per decision rule in the table. This typically involves covering all combinations of conditions. Coverage is measured as the number of decision rules tested by at least one test case, divided by the total number of decision rules, normally expressed as a percentage.
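
A minimal sketch (the business rule, conditions and function are invented for this example): a discount is granted only to registered customers whose order total is at least 100, giving four decision rules when the two Boolean conditions are combined, and one test case per rule:

    def discount_applies(is_registered, total_at_least_100):
        """Hypothetical rule: discount only for registered customers with total >= 100."""
        return is_registered and total_at_least_100

    # Decision table: one rule (column) per combination of conditions.
    # Conditions: (registered?, total >= 100?) -> Action: discount applies?
    decision_rules = [
        (True,  True,  True),    # Rule 1: Y Y -> X
        (True,  False, False),   # Rule 2: Y N -> blank
        (False, True,  False),   # Rule 3: N Y -> blank
        (False, False, False),   # Rule 4: N N -> blank
    ]

    for registered, big_order, expected in decision_rules:
        assert discount_applies(registered, big_order) == expected

    # Coverage: 4 decision rules tested / 4 rules in the table = 100%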

The strength of decision table testing is that it helps to identify all the important combinations of conditions, some of which might otherwise be overlooked. It also helps in finding any gaps in the requirements. It may be applied to all situations in which the behaviour of the software depends on a combination of conditions, at any test level.

State Transition Testing

Components or systems may respond differently to an event depending on current conditions or previous history (e.g., the events that have occurred since the system was initialised). The previous history can be summarised using the concept of states. A state transition diagram shows the possible software states, as well as how the software enters, exits, and transitions between states. A transition is initiated by an event (e.g., user input of a value into a field). The same event can result in two or more different transitions from the same state. The state change may result in the software taking an action (e.g., outputting a calculation or error message). 

A state transition table shows all valid transitions and potentially invalid transitions between states, as well as the events, and resulting actions for valid transitions. State transition diagrams normally show only the valid transitions and exclude the invalid transitions. 

Tests can be designed to cover a typical sequence of states, to exercise all states, to exercise every transition, to exercise specific sequences of transitions, or to test invalid transitions. 
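
A minimal sketch (the states, events and table layout are invented for this example) of a transition table for a simple login flow, with tests that exercise every valid transition once and one invalid transition:

    # Valid transitions of a hypothetical login flow:
    # (current state, event) -> next state
    TRANSITIONS = {
        ("logged_out", "valid_login"):   "logged_in",
        ("logged_out", "invalid_login"): "logged_out",
        ("logged_in",  "log_out"):       "logged_out",
    }

    def next_state(state, event):
        """Return the next state, or reject an invalid transition."""
        if (state, event) not in TRANSITIONS:
            raise ValueError(f"invalid transition: event '{event}' in state '{state}'")
        return TRANSITIONS[(state, event)]

    # Exercise every valid transition once (all-transitions coverage: 3/3 = 100%).
    for (state, event), expected in TRANSITIONS.items():
        assert next_state(state, event) == expected

    # One test for an invalid transition (not present in the transition table).
    try:
        next_state("logged_in", "valid_login")
        assert False, "invalid transition should be rejected"
    except ValueError:
        pass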

State transition testing is used for menu-based applications and is widely used within the embedded software industry. The technique is also suitable for modelling a business scenario having specific states or for testing screen navigation. The concept of a state is abstract — it may represent a few lines of code or an entire business process. 

Coverage is commonly measured as the number of identified states or transitions tested, divided by the total number of identified states or transitions in the test object, normally expressed as a percentage.

Use Case Testing 

Tests can be derived from use cases, which are a specific way of designing interactions with software items. They incorporate requirements for the software functions. Use cases are associated with actors (human users, external hardware, or other components or systems) and subjects (the component or system to which the use case is applied).

Each use case specifies some behaviour that a subject can perform in collaboration with one or more actors. A use case can be described by interactions and activities, as well as preconditions, postconditions and natural language where appropriate. Interactions between the actors and the subject may result in changes to the state of the subject. Interactions may be represented graphically by workflows, activity diagrams, or business process models.

A use case can include possible variations of its basic behaviour, including exceptional behaviour and error handling (system response and recovery from programming, application and communication errors, e.g., resulting in an error message). Tests are designed to exercise the defined behaviours (basic, exceptional or alternative, and error handling). Coverage can be measured by the number of use case behaviours tested divided by the total number of use case behaviours, normally expressed as a percentage.
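
A minimal sketch (the use case, behaviours and function are invented for this example): a “withdraw cash” use case with a basic behaviour, an exceptional behaviour (insufficient balance) and error handling (non-positive amount), with one test per defined behaviour:

    def withdraw(balance, amount):
        """Hypothetical subject behaviour for a 'withdraw cash' use case."""
        if amount <= 0:
            raise ValueError("amount must be positive")    # error handling
        if amount > balance:
            return balance, "insufficient funds"           # exceptional behaviour
        return balance - amount, "dispensed"               # basic behaviour

    # Basic behaviour
    assert withdraw(100, 40) == (60, "dispensed")
    # Exceptional behaviour
    assert withdraw(30, 40) == (30, "insufficient funds")
    # Error handling
    try:
        withdraw(100, 0)
        assert False, "a non-positive amount should be rejected"
    except ValueError:
        pass

    # Coverage: 3 use case behaviours tested / 3 defined behaviours = 100%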

White-box Test Techniques 

White-box testing is based on the internal structure of the test object. White-box test techniques can be used at all test levels, but the two code-related techniques discussed in this section are most commonly used at the component test level. There are more advanced techniques that are used in some safety-critical, mission-critical, or high integrity environments to achieve more thorough coverage, but those are not discussed here.

Statement Testing and Coverage 

Statement testing exercises the executable statements in the code. Coverage is measured as the number of statements executed by the tests divided by the total number of executable statements in the test object, normally expressed as a percentage. 

Decision Testing and Coverage

Decision testing exercises the decisions in the code and tests the code that is executed based on the decision outcomes. To do this, the test cases follow the control flows that occur from a decision point (e.g., for an IF statement, one for the true outcome and one for the false outcome; for a CASE statement, test cases would be required for all the possible outcomes, including the default outcome). 

Coverage is measured as the number of decision outcomes executed by the tests divided by the total number of decision outcomes in the test object, normally expressed as a percentage.

The Value of Statement and Decision Testing

When 100% statement coverage is achieved, it ensures that all executable statements in the code have been tested at least once, but it does not ensure that all decision logic has been tested. Of the two white-box techniques discussed in this article, statement testing may provide less coverage than decision testing. 

When 100% decision coverage is achieved, it executes all decision outcomes, which includes testing the true outcome and also the false outcome, even when there is no explicit false statement (e.g., in the case of an IF statement without an else in the code). Statement coverage helps to find defects in code that was not exercised by other tests. Decision coverage helps to find defects in code where other tests have not taken both true and false outcomes. 

Achieving 100% decision coverage guarantees 100% statement coverage (but not vice versa).
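
A minimal sketch (the function and tests are invented for this example) of the difference: a single test that reaches the body of an IF without an else achieves 100% statement coverage, while decision coverage also requires a test for the false outcome:

    def apply_discount(total):
        """Hypothetical code under test: an IF statement without an else."""
        if total > 100:            # decision with a true and a false outcome
            total = total * 0.9    # only statement inside the decision
        return total

    # Test 1: total = 200 executes every statement -> 100% statement coverage,
    # but only the true outcome of the decision -> 50% decision coverage.
    assert apply_discount(200) == 180.0

    # Test 2: total = 50 adds the false outcome -> 100% decision coverage.
    assert apply_discount(50) == 50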

Experience-based Test Techniques

When applying experience-based test techniques, the test cases are derived from the tester’s skill and intuition, and their experience with similar applications and technologies. These techniques can be helpful in identifying tests that were not easily identified by other more systematic techniques. Depending on the tester’s approach and experience, these techniques may achieve widely varying degrees of coverage and effectiveness. Coverage can be difficult to assess and may not be measurable with these techniques. 

Commonly used experience-based techniques are discussed in the following sections.

Error Guessing 

Error guessing is a technique used to anticipate the occurrence of errors, defects, and failures, based on the tester’s knowledge, including: 

  • How the application has worked in the past 
  • What kind of errors tend to be made 
  • Failures that have occurred in other applications

A methodical approach to the error guessing technique is to create a list of possible errors, defects, and failures, and to design tests that will expose those failures and the defects that caused them. These error, defect and failure lists can be built based on experience, defect and failure data, or common knowledge about why software fails.
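
A minimal sketch (the guessed inputs and target function are invented for this example): a short list of error-prone inputs for a hypothetical parsing function, each turned into a test:

    def parse_quantity(text):
        """Hypothetical function under test: parse a quantity entered as text."""
        value = int(text.strip())
        if value < 0:
            raise ValueError("quantity must not be negative")
        return value

    # Inputs guessed from experience with defects in similar parsing code.
    guessed_inputs = ["", "   ", "abc", "-1", "007", " 42 "]

    for text in guessed_inputs:
        try:
            result = parse_quantity(text)
            assert result >= 0            # accepted inputs must yield a sane value
        except (ValueError, TypeError):
            pass                          # rejected inputs must fail with a clear error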

Exploratory Testing

In exploratory testing, informal (not pre-defined) tests are designed, executed, logged, and evaluated dynamically during test execution. The test results are used to learn more about the component or system, and to create tests for the areas that may need more testing. 

Exploratory testing is sometimes conducted using session-based testing to structure the activity. In session-based testing, exploratory testing is conducted within a defined time-box, and the tester uses a test charter containing test objectives to guide the testing. The tester may use test session sheets to document the steps followed and the discoveries made. 

Exploratory testing is most useful when there are few or inadequate specifications or significant time pressure on testing. Exploratory testing is also useful to complement other more formal testing techniques. 

Exploratory testing is strongly associated with reactive test strategies. Exploratory testing can incorporate the use of other black-box, white-box, and experience-based techniques.

Checklist-based Testing

In checklist-based testing, testers design, implement, and execute tests to cover test conditions found in a checklist. As part of analysis, testers create a new checklist or expand an existing checklist, but testers may also use an existing checklist without modification. Such checklists can be built based on experience, knowledge about what is important for the user, or an understanding of why and how software fails. 

Checklists can be created to support various test types, including functional and non-functional testing. In the absence of detailed test cases, checklist-based testing can provide guidelines and a degree of consistency. As these are high-level lists, some variability in the actual testing is likely to occur, resulting in potentially greater coverage but less repeatability.
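
A minimal sketch (the checklist items and result flags are invented for this example) of how a high-level checklist can be tracked as test conditions during a session:

    # Hypothetical high-level checklist for a login screen; each item is a test
    # condition that the tester covers and marks as passed or failed.
    checklist = {
        "All mandatory fields are validated before submission": True,
        "Error messages are understandable to the user": True,
        "The password field masks the entered characters": True,
        "The session times out after the configured idle period": False,  # finding
    }

    findings = [item for item, passed in checklist.items() if not passed]

    print(f"Checklist items covered: {len(checklist)}, findings: {len(findings)}")
    for item in findings:
        print("Finding:", item)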