Pass/Fail for run-logs. Is it useful info?
Submitted by Ainars Galvans on Mon, 05/12/2005 - 09:25.
metrics | test management | test management tools
A pass/fail flag in run-logs is common in test management tools such as TestDirector. From what I have seen, the most typical definition of pass/fail is: a test is failed if any defect is encountered during execution; otherwise it is passed.
Moreover, I’ve seen publications that consider the percentage of failed test cases to be the best indication of product quality. I find this measure even weaker than defect trends, which are themselves criticized as inadequate. For example, it doesn’t differentiate between a feature that is not implemented and a typo on a user screen.
In my projects I use a different approach, and it gives me a really useful measure. Please be patient and read the whole story, which is backed up by the IEEE standard.
Issue:
According to IEEE (IEEE Std 829-1998), pass/fail criteria are specified only at the test plan and test design specification level. However, for procedure results it says “Record the successful or unsuccessful execution of the test”, which allows for different interpretations of test execution success. In automated test suites such as JUnit, success is defined as an exact match of the expected results with the actual results. By analogy, manual test success should be defined as the system response matching all validation steps. But what about defects that are found by chance? What about defects that no step requires us to validate exactly, such as syntax defects or mismatches with common sense? Not to mention exploratory testing, which is supposed to be a part of any test execution, even if test cases exist.
A nice attempt to avoid this issue is the following definition of pass/fail criteria: a run is failed if the test case or test procedure execution has detected any defect (maybe an already known one). This one is great, but what about the list of defects that are postponed to the next release? If the list is long, we have too many test cases failed due to defects that management has already accepted for shipping. Fail records then make little sense, and the fail % provides no information.
Moreover, if I have a test system that supports linking test cases to defects, I could collect this information automatically, so what is the reason for setting a pass/fail value? IEEE requires run-logs to indicate the success of execution. If a test execution has found a defect, do we call this successful or unsuccessful testing? Since the goal of testing is to detect defects, I wouldn’t agree that this is an unsuccessful execution. On the contrary, finding a new defect is the most successful execution, although it indicates a failure of other processes (e.g. design or quality assurance).
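To make the problem concrete, here is a minimal sketch in Python of how the "any defect" verdict could be derived from test-to-defect links and what the resulting fail % looks like. The class names, fields and sample data are all hypothetical and are not taken from any real test management tool.

```python
from dataclasses import dataclass, field


@dataclass
class Defect:
    summary: str
    severity: str           # hypothetical labels, e.g. "critical" or "cosmetic"
    deferred: bool = False  # True if management postponed the fix to a later release


@dataclass
class RunLog:
    name: str
    defects: list = field(default_factory=list)  # defects linked to this run


def failed_by_any_defect(run):
    """The 'any defect' rule: a run fails if at least one defect is linked to it."""
    return len(run.defects) > 0


def fail_percentage(runs):
    """Share of runs marked failed under the 'any defect' rule, in percent."""
    if not runs:
        return 0.0
    return 100.0 * sum(failed_by_any_defect(r) for r in runs) / len(runs)


# Hypothetical sample data: two deferred cosmetic defects, one missing feature
runs = [
    RunLog("login", [Defect("typo on screen", "cosmetic", deferred=True)]),
    RunLog("export", [Defect("feature not implemented", "critical")]),
    RunLog("search", []),
    RunLog("print", [Defect("misaligned label", "cosmetic", deferred=True)]),
]

print(fail_percentage(runs))  # 75.0
```

In this made-up data set, three of four runs count as failed, yet two of those failures come from cosmetic defects that are already accepted for shipping, and the number treats them exactly like the missing feature. The verdict is also fully computable from the links, so a manually set flag with this meaning adds nothing.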
My (new) Approach.
As a tester I found that there is no way to evaluate or report the risk of blinking defects (see my post Defect severity and priority), and at the same time we have a superfluous pass/fail field for each test case or procedure. Guess what? I started to use pass/fail to describe the risk of blinking defects. If there are too many workarounds I have to do and many defects I have to ignore, then I press Failed. I press Passed even if there is a critical defect, as long as it is a single one and one that I can naturally ignore.
I also press Failed when a defect prevents me from fully executing the test procedure (e.g. by blocking several steps in it). This also includes cases when the reason is the low "quality of test case and requirements itself". At the moment of test execution we may not know what the reason is and where the fix will be applied: in the code, the requirements, or the test case. We should simply evaluate whether the tests as described are executable and mark pass/fail accordingly.
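The rule described above could be sketched roughly as follows in Python. This is only an illustration: in practice the decision is a tester's judgment, the post gives no numbers for "too many", and the names, the combined "burden" count and the threshold below are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class RunObservation:
    workarounds: int      # workarounds the tester needed to keep going
    ignored_defects: int  # defects the tester had to ignore to continue
    blocked_steps: int    # steps that could not be executed at all


# Hypothetical threshold: the post only says "too many", so this value is an assumption
MAX_BURDEN = 3


def verdict(obs):
    """Return "fail" if the procedure could not be fully executed or the
    tester had to work around or ignore too much; otherwise "pass"."""
    if obs.blocked_steps > 0:
        return "fail"  # the test as described was not fully executable
    burden = obs.workarounds + obs.ignored_defects  # assumed way to combine the two
    if burden > MAX_BURDEN:
        return "fail"
    return "pass"


# A single defect that can be ignored naturally still passes, whatever its severity
print(verdict(RunObservation(workarounds=0, ignored_defects=1, blocked_steps=0)))  # pass
# Many workarounds plus several ignored defects fail the run
print(verdict(RunObservation(workarounds=3, ignored_defects=2, blocked_steps=0)))  # fail
# Blocked steps fail the run regardless of where the fix will be applied
print(verdict(RunObservation(workarounds=0, ignored_defects=0, blocked_steps=2)))  # fail
```

Note that defect severity does not appear in the sketch at all: severity lives in the defect report, while the flag only records how much the tester had to look past to finish the run.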
My approach – considerations.
Don’t you feel troubled by this approach? Most testers do, at least the first time. I don’t! Know why? Let’s remember what pass/fail means outside IT/QA. For example, how do you fail an exam? If your test has more than a certain number of mistakes, you have failed, haven’t you? A single mistake will not fail you, as long as it lets you continue correctly with the next questions. Let’s look at what the Merriam-Webster Dictionary says about the word fail: Etymology: …alteration of Latin fallere to deceive, disappoint. So a failed test means that as a tester I’m disappointed with the functionality, not only that I detected a defect. It is not just a single failure by the developer; it’s as if he deceived me.
It could happen that a "particular project might have its own business objectives ..." that don't fit the objectives described in this paper. Still, the only case that could harm my approach is when the business requires pass/fail statistics according to the "old" approach. In that case I would try to keep double statistics: the old one for the business and the new one for internal use. But I would do my best to convince the business that my (new) approach is better even for them, though that is possible only if they listen to me :).
