Re: Performance testing and coverage
Submitted by Alexander Podelko on Tue, 11/03/2008 - 07:38.
I can't leave Ainars Galvans' posting unanswered. It touches on very important issues I have been fighting with for a long time, so this post goes beyond just a comment.
It is very interesting that I completely agree with Ainars on most items except the final conclusions, which I completely disagree with. I suspect that it is mostly a difference in terminology. So let's start with what I disagree with.
Ainars writes:
For years I appose those advocates defining performance requirements as the best practice.
I’ve red some people saying that you must create environment as close as possible, you must run scenarios as realistic as possible. That’s bu**^#&@ and I’m not the only one who thinks so.
Let's leave Mike Kelly alone for the moment – I don't see why his post confirms the above statements. Actually I don't see anything in Ainars' post confirming them.
Ainars writes: Let’s assume that we have requirements for 200 user load and identified 5 different scenarios – each equally likely to be used, so we have to run 40 users of each scenario to emulate load.
Let's look at this statement. It starts from a performance requirement: we have requirements for 200 user load. A little strange for a person opposing them… I wonder what Ainars means when he speaks about performance requirements – but it is definitely something different from what I mean.
Documenting performance requirements IS collaboration. What exactly are you going to "document" without collaboration? And why should you surrender them if they are the real requirements? Of course, if you "document" somebody's random guess you may discard it later – but I believe that it is the responsibility of the performance tester/engineer/architect to help separate the real requirements from nonsense, not to write that nonsense down. Actually it should be done before system design – you need to understand what you are going to design.
Ainars writes:
The first thing I will do is to run each scenario standalone with all 200 users doing this one scenario. Realistic – of course no!
Of course yes – you have a realistic scenario, one of the 5 you identified. You are doing a stress test for one of the realistic scenarios. I probably wouldn't do that, at least not first – I'd rather run 1, 5, and 40 users – but it may be a way to find some problems in the system.
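To make the arithmetic concrete, here is a minimal sketch of how the 200-user mix and the 1, 5, 40 progression could be written down as a plan. The function names, the scenario names, and the equal weights are assumptions made for illustration only; nothing here is tied to a particular load testing tool.

```python
def users_per_scenario(total_users, scenario_weights):
    """Split a total user load across scenarios in proportion to their weights."""
    total_weight = sum(scenario_weights.values())
    return {
        name: round(total_users * weight / total_weight)
        for name, weight in scenario_weights.items()
    }


def ramp_steps(target_users, steps=(1, 5)):
    """Return increasing user counts to try for one scenario, ending at its target."""
    return sorted({s for s in steps if s < target_users} | {target_users})


# Hypothetical scenario names; equal weights mean "each equally likely to be used".
weights = {f"scenario_{i}": 1 for i in range(1, 6)}

mix = users_per_scenario(200, weights)
plan = {name: ramp_steps(users) for name, users in mix.items()}

for name, steps in plan.items():
    print(name, steps)
```

With equal weights each scenario gets 40 users, and each one is first run alone at 1, 5, and 40 users before anything is combined.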
I am not sure what is wrong with creating an environment as close as possible to the real one – the further your test environment is from what you will use, the greater the chance that you find false problems and miss real ones.
Other than that, I agree with Ainars on everything.
Yes, performance testing isn't an exact science. Yes, it is a way to decrease the risk, not to eliminate it completely. Results are only as meaningful as the test and environment you created – how close they are to reality is a separate question (in some cases you base your test design / performance requirements on pretty reliable data, in some cases it is a pure guess). Yes, tiny functional coverage. Yes, no emulation of unexpected events. Yes, in many cases an incomplete environment and data (still not a reason not to try to make them as close as possible – you will get some differences even in this case). It is the absolute truth whether you tell this to your customers or not.
I don't see much sense in stressing this part too much – it is still much better than doing nothing. Explaining that you decrease the risk rather than eliminate it completely, and that results are only as good as your scenarios, data, and environment, looks like enough for the initial education.
As soon as I started to think about it, I came to believe that performance testing is agile by definition. I am pretty sure that the "waterfall" model doesn't work for performance testing (except in some trivial cases). In the best case you will get back to investigation / troubleshooting, but lose a lot of time. I am very upset to see that pre-production validation with a "waterfall" approach (develop all the scripts you get from a random person charged with this chore, run them all together, then formally compare the results with requirements provided by another random person, and ignore errors if you can) is what is used in most cases. The only good side here is that if you have good people in charge (and it is amazing how many good performance testers are around – I work with many large corporations and in almost every one I see a few very smart and experienced people), you get some value out of it. In many cases people do it in an agile way and just present it as "waterfall" – some kind of guerrilla approach. Actually I do it all the time – in most cases you need to present a waterfall-like plan to some kind of project manager, and then you are free to do whatever is necessary within this timeframe.
When I get a script ready (whatever it may be – it definitely is not limited to a load testing tool script), I run it with one, a few, and many users (how many depends on the system), analyze the results (including, of course, system monitoring), and try to sort out any errors. The source of errors can be quite different – a script error, a functional error, or a consequence of a performance bottleneck. It doesn't make much sense to add load / scripts until you figure out what is going on. Even with one script you can find many system problems and make an iteration of system tuning. Running scripts separately allows you to build some kind of "model" of the system. I don't mean any kind of formal model – rather something like: workload A creates noticeable load on components X and Y, and it could well be that CPU would be a bottleneck, while component Z is hardly touched. As you run more and more complex tests, you verify the results you get against your "model", your understanding of how the system behaves – and if they don't match, you need to figure out what is wrong.
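The informal "model" can be as simple as a note of which components each workload is expected to load, checked against what monitoring actually shows. Here is a minimal sketch of that check, assuming utilization is available as a fraction between 0 and 1 per component; the names MODEL and check_against_model, the 0.5 threshold, and the sample numbers are all invented for illustration.

```python
# Informal expectations: which components each workload should load noticeably.
MODEL = {
    "workload_A": {"X", "Y"},  # component Z is expected to be hardly touched
}


def loaded_components(utilization, threshold=0.5):
    """Components whose observed utilization is at or above the threshold."""
    return {name for name, used in utilization.items() if used >= threshold}


def check_against_model(workload, utilization, model=MODEL, threshold=0.5):
    """Compare observed load with expectations and report any mismatches."""
    expected = model[workload]
    observed = loaded_components(utilization, threshold)
    return {
        "unexpected": sorted(observed - expected),  # loaded, but not expected to be
        "missing": sorted(expected - observed),     # expected to be loaded, but idle
    }


# Made-up monitoring numbers for two runs of the same workload.
matches = check_against_model("workload_A", {"X": 0.8, "Y": 0.65, "Z": 0.05})
surprise = check_against_model("workload_A", {"X": 0.8, "Y": 0.1, "Z": 0.7})

print(matches)
print(surprise)
```

An empty report means the run fits the current understanding of the system. Anything listed under "unexpected" or "missing" is a prompt to investigate before adding more load or more scripts.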
