
Risk-based performance testing. A different practice?

With this post I want to continue my attack on performance testing whose only goal is to validate performance requirements. To me that is testing the happy path only. Some publications and even tutorials claim this is solved by adding risk analysis during performance requirements gathering. I’m afraid this is not my practice, nor the practice of many performance testers, as far as I can judge.

What’s wrong there?
They say Risk = probability * impact, so it is enough to analyze the most probable scenarios/loads and define impact in terms of acceptable response times, right? In the best case they realize that probability analysis should also include architecture/technology analysis (given there is an architect or lead developer who understands performance).
Two issues I want to point out:
1) A performance bug is anything that bugs any stakeholder, not only a deviation from requirements or specifications.
2) Each executed performance test changes the risks, because we gather more information about probability and sometimes even about impact.
Let me give an example.
It will be an abstract one, although it is actually based on experience: observations of the performance attributes of different systems. Let it be an accountancy application with 3 types of users: accountants, managers and employees. There are 2 scenarios for employees: submit a new expense, and view the status of older submitted expenses and optionally update them. There is 1 scenario for accountants, to pick up one pending request and either approve it or return it for updates, and 1 for the manager, to either approve or decline. So you’ve got requirements that describe in detail the average load in terms of each user scenario (e.g. throughput of requests, how frequently employees view status, etc.). And you’ve got response time requirements in seconds:
Login: 10; any employee action: 30 (this performance is not critical, as they can alt+tab away from the web screen and return later); any manager action: 10; any accountant action: 5 (this is critical, as slowness makes the accountant's work ineffective).

Let’s forget for the moment that there are not only average and maximum response times, but other values as well. Let’s assume it is the average only. Suppose you run the tests and get either results A or results B. Which one do you think passes the tests? And how about the risks of A and B? I will write them down in the form Transaction Name (required response time) [case A response time] [case B response time].
Login (10) [5] [12]
Submit New Expense (30) [25] [5]
Search Pending Expenses (30) [15] [35]
Update Expense (30) [28] [3]
Accountant Process Expense (5) [4] [2]
Manager Process Expense (10) [6] [2]

Correct answer: I need more information! Although case A accords with the requirements, it looks like the system is overloaded (it has reached its capacity limits in terms of load), and should the load increase a little or some noise happen in production, all transaction times could turn out slower than required. In case B, on the other hand, certain features work somewhat slowly, but probably this slowness could be easily fixed, or maybe we could negotiate new requirements for login? For example, if you asked an accountant which they prefer, a 12-second login and 2 seconds to process a business item, or a 5-second login but 4 seconds to process each item, I bet the first would be the preferred one.
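To make the comparison concrete, here is a minimal Python sketch that looks at the same numbers in two ways: which transactions fail the requirement, and which ones pass but with little headroom. The response times are the ones listed above; the 25% "tight headroom" threshold and all function names are my own assumptions for illustration, not part of any requirement.

```python
# Average response times in seconds, copied from the list above:
# name -> (required, case A, case B)
RESULTS = {
    "Login": (10, 5, 12),
    "Submit New Expense": (30, 25, 5),
    "Search Pending Expenses": (30, 15, 35),
    "Update Expense": (30, 28, 3),
    "Accountant Process Expense": (5, 4, 2),
    "Manager Process Expense": (10, 6, 2),
}
CASE_INDEX = {"A": 1, "B": 2}

# Hypothetical threshold: a passing transaction with less than 25% headroom
# is flagged as "tight". Pick a value that fits your own system.
TIGHT_HEADROOM = 0.25


def headroom(required, measured):
    """Fraction of the required time left unused (negative means failing)."""
    return (required - measured) / required


def failures(case):
    """Transactions whose measured time exceeds the requirement."""
    i = CASE_INDEX[case]
    return [name for name, row in RESULTS.items() if row[i] > row[0]]


def tight(case, threshold=TIGHT_HEADROOM):
    """Transactions that pass, but with less headroom than the threshold."""
    i = CASE_INDEX[case]
    return [
        name
        for name, row in RESULTS.items()
        if 0 <= headroom(row[0], row[i]) < threshold
    ]


if __name__ == "__main__":
    for case in ("A", "B"):
        print(f"Case {case}: failing={failures(case)} tight={tight(case)}")
```

With these numbers case A fails nothing but has three tight transactions, including the critical accountant one, while case B fails two transactions and has plenty of headroom everywhere else. Neither view alone tells you which result is "better"; it only shows where to ask the next question.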

Testing goal: Analyze risks
Performance testing can’t be pre-planned (at least not its full scope). Each test you run (at least in the first phase of performance testing) will give you additional information about performance risks. That’s nothing new: Scott Barber wrote Investigation vs. Validation, where he claims that investigation takes the majority of the time spent in performance testing. To better understand this, consider that there are too many unknown risks: 3rd party tools, DB structure, missing or unnecessary indices, the impact of network latency on the technologies used, etc. Each investigation step/test is typically aimed at evaluating a risk and mitigating it by fixing the issues you find. Sometimes you will mitigate them even by redesigning a feature, dropping a feature, or (I hope not, but better sooner than later) canceling the project.

Terminology simplifies experience...

I remember hearing James Bach once talk about a failure to come up with definitions of terms that would include quality. His decision then was that even badly performed exploratory testing should still be called exploratory testing.
I'm afraid I made the same mistake in this blog: I wanted to say that good risk-based performance testing is something more than doing good risk mitigation. Good risk mitigation is something more than validating the most typical scenarios. Even good validation is something more than risk mitigation.
You know what, you helped me realize at last what experience reports are so powerful at. And I am considering starting work on “Lessons learned in performance testing”; I only hope there is no copyright on the writing style of the book.

Not sure that I follow the discussion here

I completely agree that performance testing is a way to mitigate risk. When we test something and it is fine, we remove the risk associated with that area. So yes, in a way we re-evaluate risk with each new result. But that sounds rather philosophical to me, a good point to start a thick handbook about performance testing. I am missing what the practical point is here.

Speaking about risk-based performance testing I'd think about choosing subjects for performance testing based on risk (based on some kind of our knowledge) vs. choosing subjects for performance testing based on usage ("typical" behavior). For example, we add some infrequent scenarios (use cases, user story or whatever – I use these terms as synonyms here) if we believe that there are some specific concerns about them.

Although when choosing "typical" behavior to test we really base our choice on risk too: "typical" behaviors are more risky because they will happen more frequently. In that sense all performance testing is risk-based. So I feel that I am missing what risk-based performance testing is and how it is different from other kinds of performance testing. Any simple example to illustrate the difference? I can't say that the definition from Scott's terminology page ("Risk-Based Testing - Any testing organized to explore specific product risks. - James Bach") helps me much.

Another subject mentioned in the discussion is performance requirements. Well, user happiness is the ultimate criterion. But you need to have the finished system to determine that, while I believe that you need some kind of performance requirements before you start to build the system. You should know what you are going to build.

While I agree that validation of performance requirements isn't the only goal of performance testing (but still much better than nothing), I strongly believe that we need to specify performance requirements at the beginning for design and development (and, of course, re-use them during performance testing).

Well, "customer to work with system simultaneously with load being carried on it. Asking customer what is the main concern now – lack of functionality or performance issues" is a good idea. It is also a very good way to verify your scripts/harness/or whatever you use to create load: if you see that your automated results match manual results, you are more confident in the way you create load. But again, you need the working system for that. That can be a way to elaborate performance requirements, but not to create them.

Roland's approach with building a prototype to verify performance requirements makes more sense to me for important projects. Another way may be to adopt response time goals/requirements from analogous systems with happy customers.

Validation=risk mitigation???

I have an issue with your Validation=risk mitigation statement. Validating or testing does not automatically mitigate the risk. Acting upon the information collected by the testing process does.

BTW, I do agree with the spirit of your ideas and I think we are on the same page. Now it is just a question of finding the right language to express the ideas in.

Roland

Validation=risk mitigation. Investigation = risk evaluation

Roland, you are mostly right. However, what I want to stress is that instead of: analyze risks -> mitigate them by testing -> update risks -> repeat, we need to do: evaluate risks by testing -> analyze risks -> mitigate them (test-fix) -> repeat.
It is hard to write good functional requirements so that everyone understands them: developer, tester and customer. But it is twice as hard to write performance requirements. It is even worse with risks. That’s why in performance testing we need to pay much more attention to evaluating risks and less to mitigating the risks already identified.

Event versus Approach

Ainars,

A risk-based testing approach is not characterized by the one time event of a risk analysis upfront, but by continuous reviewing and updating risks based on your observations from the tests.

The problem you are describing has nothing to do with performance testing as such but more with the application of a risk-based way of working. This is very common behavior in software development.

Roland Stens

Real users: any practice?

Yes, yes, yes. In my WOPR7 diary you can read, among other things, about my surprise that the theme "Agile Performance Testing" produced no experience reports about iteratively having the "customer to work with system simultaneously with load being carried on it. Asking customer what is the main concern now – lack of functionality or performance issues". How about the Agile Manifesto value "Customer collaboration over contract negotiation"?
Unfortunately, due to the specifics of my job I can't try this approach at the moment, but I am looking forward to doing it.

Here's the core problem...

I've finally struggled past the paradigm of collecting and quantifying requirements early and realized that I was attacking the problem backwards all along. I kept trying to start by quantifying user happiness, but now I realize that the quantification is actually irrelevant until they ARE happy!

Just start testing and collecting performance data, building a history and trends. With every release put real users in front of it. When they switch from "frustrated" to "content" or vice versa - check your trend data and you'll realize that you *HAVE* your upper and lower bounds for "content vs. frustrated" already.
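As a rough illustration of reading those bounds out of trend data, here is a small Python sketch. It assumes you record, per release, one measured response time and the verdict real users gave; the record layout, the function name and the sample numbers are all invented for illustration and do not come from any real project.

```python
def content_frustrated_bounds(history):
    """Return (slowest time users were still content with,
    fastest time users were already frustrated with).

    history: iterable of (release, response_time_seconds, verdict),
    where verdict is "content" or "frustrated".
    Either value is None if that verdict never occurred.
    """
    content = [t for _, t, verdict in history if verdict == "content"]
    frustrated = [t for _, t, verdict in history if verdict == "frustrated"]
    slowest_content = max(content) if content else None
    fastest_frustrated = min(frustrated) if frustrated else None
    return slowest_content, fastest_frustrated


if __name__ == "__main__":
    # Made-up sample history, for illustration only.
    sample = [
        ("1.0", 9.0, "frustrated"),
        ("1.1", 6.5, "frustrated"),
        ("1.2", 4.0, "content"),
        ("1.3", 3.2, "content"),
        ("1.4", 5.1, "content"),
    ]
    low, high = content_frustrated_bounds(sample)
    print(f"Users were content up to {low}s and frustrated from {high}s")
```

For the sample above the switch between "content" and "frustrated" lies somewhere between 5.1 and 6.5 seconds. If the two values ever cross (users content at a slower time than one that frustrated them), that is a hint that response time alone does not explain their mood.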

I continue to contend that there is only one performance requirement... users that are not frustrated by poor performance (unless you are building to a contract that stipulates additional requirements). There certainly may be additional scalability, capacity or stability requirements, but I think for the most part all we've done as an industry in the last 5 or so years is complicate performance requirements instead of simplifying them.


--
Scott Barber
Chief Technologist, PerfTestPlus
Executive Director, Association for Software Testing
sbarber@perftestplus.com
