Risk-based performance testing. A different practice?
Submitted by Ainars Galvans on Wed, 25/10/2006 - 12:02.
performance testing | performance testing patterns
With this post I want to continue my attack on performance testing whose only goal is to validate performance requirements. To me that is happy-path testing only. Some publications and even tutorials claim the problem is solved by adding risk analysis while gathering performance requirements. I’m afraid this is not my practice, nor, as far as I can judge, the practice of many performance testers.
What’s wrong there?
They say Risk = probability * impact, so it is enough to analyze the most probable scenarios and loads and to define impact in terms of acceptable response times, right? In the best case they realize that the probability analysis should also include an analysis of the architecture and technology (provided there is an architect or lead developer who understands performance).
Two issues I want to point out:
1) A performance bug is anything that bugs any stakeholder, not only a deviation from requirements or specifications.
2) Each executed performance test changes the risks, because we gather more information about probability and sometimes even about impact.
Let me give an example.
It will be an abstract one, although it is based on experience: observations of the performance attributes of different systems. Let it be an accountancy application with three types of users: accountants, managers and employees. There are two scenarios for employees: submitting a new expense, and viewing the status of previously submitted expenses and optionally updating them. There is one scenario for accountants, picking up a pending request and either approving it or returning it for updates, and one for managers, either approving or declining a request. So you’ve got requirements that describe in detail the average load for each user scenario (e.g. throughput of requests, how often employees view status, etc.). And you’ve got response time requirements in seconds:
Login: 10 seconds; any employee action: 30 seconds (this performance is not critical, as employees can alt+tab away from the web screen and return later); any manager action: 10 seconds; any accountant action: 5 seconds (this is critical, as slowness makes the accountants’ work inefficient).
Let’s forget for the moment that there are not only average and maximum response times but other values as well, and assume we look at averages only. Suppose you run the tests and get either result A or result B. Which one do you think passes the tests? And what about the risks of A and B? I will write the results in the form Transaction Name (required response time) [case A response time] [case B response time].
Login (10) [5] [12]
Submit New Expense (30) [25] [5]
Search Pending Expenses (30) [15] [35]
Update Expense (30) [28] [3]
Accountant Process Expense (5) [4] [2]
Manager Process Expense (10) [6] [2]
The correct answer: I need more information! Although case A meets the requirements, it looks like the system is overloaded (it has reached its capacity limit in terms of load), and should the load increase a little or some noise occur in production, all transaction times could turn out slower than required. In case B, by contrast, certain features are slow (login and searching pending expenses), but this slowness could probably be fixed easily, or maybe we could negotiate new requirements for login? For example, if you asked an accountant whether they prefer a 12-second login and 2 seconds to process each business item, or a 5-second login but 4 seconds to process each item, I bet they would prefer the first.
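To make the comparison concrete, here is a minimal Python sketch that computes the headroom each transaction leaves against its requirement, using the numbers from the example. The 25% threshold for calling a result "tight" and the names `headroom` and `summarize` are my own assumptions for illustration, not part of any requirement.

```python
# Required and measured average response times in seconds, copied from the
# example above: name -> (required, case A, case B).
RESULTS = {
    "Login": (10, 5, 12),
    "Submit New Expense": (30, 25, 5),
    "Search Pending Expenses": (30, 15, 35),
    "Update Expense": (30, 28, 3),
    "Accountant Process Expense": (5, 4, 2),
    "Manager Process Expense": (10, 6, 2),
}

# Assumption for illustration only: less than 25% headroom counts as "tight".
TIGHT_HEADROOM = 0.25


def headroom(required, measured):
    """Fraction of the requirement left unused; negative means a violation."""
    return (required - measured) / required


def summarize(case_index):
    """Return (violations, tight) transaction names for case 0 (A) or 1 (B)."""
    violations, tight = [], []
    for name, row in RESULTS.items():
        required, measured = row[0], row[1 + case_index]
        margin = headroom(required, measured)
        if margin < 0:
            violations.append(name)
        elif margin < TIGHT_HEADROOM:
            tight.append(name)
    return violations, tight


for label, index in (("A", 0), ("B", 1)):
    violations, tight = summarize(index)
    print(f"Case {label}: violations={violations}, tight={tight}")
```

Under these assumptions case A shows no violations but three transactions with little headroom, while case B shows two violations and plenty of headroom everywhere else. A plain pass/fail check hides exactly this difference.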
Testing goal: Analyze risks
Performance testing can’t be pre-planned (at least not its full scope). Each test you run (at least in the first phase of performance testing) gives you additional information about performance risks. This is nothing new: Scott Barber wrote Investigation vs. Validation, where he claims that investigation takes the majority of the time spent in performance testing. To understand this better, consider how many unknown risks there are: third-party tools, DB structure, missing or unnecessary indices, the impact of network latency on the technologies used, etc. Each investigation step or test typically aims to evaluate a risk and to mitigate it by fixing the issues you find. Sometimes you will mitigate risks by redesigning a feature, dropping a feature, or even (I hope not, but better sooner than later) canceling the project.
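The idea that every test changes the risk picture can be sketched as a small risk list that is re-ranked after each investigation. This is only an illustration: the `Risk` class, the risk names, and all probability and impact numbers are hypothetical, invented to show the mechanics of Risk = probability * impact being re-evaluated as information arrives.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    probability: float  # estimated chance the problem exists, 0.0 to 1.0
    impact: int         # relative cost if it does, 1 (minor) to 10 (severe)

    @property
    def exposure(self) -> float:
        """Risk = probability * impact."""
        return self.probability * self.impact


def next_to_investigate(risks):
    """Pick the risk with the highest current exposure."""
    return max(risks, key=lambda r: r.exposure)


# Hypothetical starting estimates, invented purely for illustration.
risks = [
    Risk("missing DB index", 0.5, 6),
    Risk("third-party tool bottleneck", 0.3, 8),
    Risk("network latency", 0.2, 5),
]

first = next_to_investigate(risks)

# Suppose an investigative test shows the queries are well indexed:
# the probability estimate for that risk drops, so the ranking changes.
first.probability = 0.05
second = next_to_investigate(risks)

print(f"Investigate first: {first.name}; then: {second.name}")
```

The point of the sketch is that the plan for the second test could not have been fixed before the first one was run.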
