Performance testing - still in the post-hoc ghetto of a test-driven world?
Submitted by Antony Marcano on Mon, 17/03/2008 - 10:53.
performance testing | test driven development
As some of you may recall, a couple of years ago I ran a workshop on Agile Performance testing.
You can find out more about this by searching for WOPR7.
The general theme of the workshop, with the exception of one experience report, focused on post-hoc performance testing. A lot of work has been done over the last decade to advance functional test-driven development. When I say 'functional' I don't just mean 'application level tests' (a.k.a. acceptance tests); I mean tests at the unit level, the acceptance level and every level in between, wherever those tests are concerned with functional behaviours (of the unit or the application respectively).
The only experience report that looked at driving development with both functional and performance tests was from Antony Gorman. He'd implemented a .NET equivalent to JUnitPerf so that he could incorporate performance requirements into his tests. Feedback about performance was still important, so post-hoc performance testing will still be required (just as post-hoc testing is useful for providing feedback and identifying new tests worth automating).
Since then, with the exception of the release of the FIT Decorator, all has been relatively quiet. Even so, I think that the FIT Decorator doesn't quite address the complexities of multi-path user experiences.
Despite these (limited) advances, performance testing still seems to live in a post-hoc ghetto on many test-driven teams, which often build up enormous amounts of technical debt by leaving performance concerns until very near the end of a release.
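To make the idea of folding a performance requirement into an ordinary test concrete, here is a minimal Python sketch in the spirit of a JUnitPerf-style timed test. It is not JUnitPerf's API, nor the .NET implementation mentioned above; the `timed` decorator, the `find_user` function and the 0.5 second budget are all hypothetical names and numbers chosen for illustration.

```python
import functools
import time
import unittest


def timed(max_seconds: float):
    """Fail the wrapped test if it takes longer than max_seconds (hypothetical helper)."""
    def decorate(test_func):
        @functools.wraps(test_func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            result = test_func(*args, **kwargs)
            elapsed = time.perf_counter() - start
            if elapsed > max_seconds:
                raise AssertionError(
                    f"{test_func.__name__} took {elapsed:.3f}s, budget was {max_seconds}s"
                )
            return result
        return wrapper
    return decorate


def find_user(users, name):
    """Stand-in for the code under test (an assumption, not from the post)."""
    return name in users


class FindUserTest(unittest.TestCase):
    @timed(max_seconds=0.5)
    def test_finds_existing_user(self):
        # The functional assertion and the time budget live in the same test.
        users = {f"user{i}" for i in range(10_000)}
        self.assertTrue(find_user(users, "user42"))
```

Saved as a module, this can be run with `python -m unittest`. The functional check and the time budget fail independently, so a slow but correct implementation still turns the test red.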
So, what's holding us back? I think that the community is still trying to get its collective head around functional acceptance-test-driven development (ATDD) and simply hasn't had the head-space to deal with performance too.
I feel that there is an increased interest in the idea of 'agile performance testing' on the horizon. Soon, we will need to start looking at how to make it as much a part of the work we do in each iteration as functional testing. Jamie Dobson (not linked as I can't find his blog) was also at WOPR7 and presented an experience of how he ensured that an independent performance team got to provide feedback after each iteration, but I see this as an intermediate step.
One of the problems we'll face - as we did with functional test tools - is that the existing performance test tools were designed for post-hoc testing, not for any kind of test-driven development.
As I pondered this issue today, I had a vision of the future of performance testing in a test-driven world.
I imagined iteration 1 - we have a story for a home page and another to simply log in. The team write application level functional tests for them. The business plan (or business case) says that we need to support 20,000 visitors per day for this new service... half of whom we expect to log in between 9am and 10am.
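As a back-of-the-envelope check, those business figures can be turned into a target rate that a performance test could assert against. The sketch below uses only the numbers from the story; the assumption that logins are spread evenly across the hour is mine, and real arrivals would be burstier, so treat the result as a floor rather than a peak.

```python
# Figures from the business plan in the story above.
DAILY_VISITORS = 20_000
PEAK_LOGIN_SHARE = 0.5          # half of the visitors log in during the peak hour
PEAK_WINDOW_SECONDS = 60 * 60   # 9am to 10am


def average_peak_login_rate(visitors: int, share: float, window_seconds: int) -> float:
    """Average logins per second, assuming arrivals are spread evenly over the window."""
    return visitors * share / window_seconds


peak_logins = int(DAILY_VISITORS * PEAK_LOGIN_SHARE)
rate = average_peak_login_rate(DAILY_VISITORS, PEAK_LOGIN_SHARE, PEAK_WINDOW_SECONDS)
print(f"{peak_logins} logins in the peak hour, about {rate:.2f} per second on average")
```

That works out to 10,000 logins in the hour, or a little under three per second on average.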
The homepage and the login screen paths are represented with a simple UCML diagram. The names of the actions conform to actions also used by the functional tests. Behind our functional tests is a fixture and some corresponding 'fittings' that glue the functional test to our application. The 'fittings' make up a DSL, which in this iteration comprises two 'Command' classes: a VisitHomePage class and a Login class.
The (yet to exist) tool that helps us to represent user-community usage of the application with UCML allows us to specify the distribution of users across the end-to-end paths: visiting the homepage and logging in, or visiting the homepage and abandoning the session. It uses the 'Command' classes to interact with the application but decorates them with multi-threading capabilities, wait-time distributions, input data for the user-names and passwords and many more characteristics - not least performance assertions.
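Since the tool doesn't exist, the following is only a rough Python sketch of the shape it might take. The `VisitHomePage` and `Login` names come from the story above; everything else (the `Timed` decorator, the 50/50 path weights, the one second budget, the think-time range, the in-memory stand-ins for real requests) is an assumption made for illustration.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Dict


class VisitHomePage:
    def execute(self, session: Dict) -> None:
        session["page"] = "home"  # stand-in for a real request to the application


class Login:
    def execute(self, session: Dict) -> None:
        session["user"] = session["username"]  # stand-in for submitting credentials


class Timed:
    """Decorates a command with a performance assertion (hypothetical helper)."""

    def __init__(self, command, max_seconds: float):
        self.command = command
        self.max_seconds = max_seconds

    def execute(self, session: Dict) -> None:
        start = time.perf_counter()
        self.command.execute(session)
        elapsed = time.perf_counter() - start
        if elapsed > self.max_seconds:
            name = type(self.command).__name__
            raise AssertionError(f"{name} took {elapsed:.3f}s, budget {self.max_seconds}s")


# The two user paths from the story, with an assumed distribution of users.
PATHS = {
    "visit_and_login": [Timed(VisitHomePage(), 1.0), Timed(Login(), 1.0)],
    "visit_and_abandon": [Timed(VisitHomePage(), 1.0)],
}
WEIGHTS = {"visit_and_login": 0.5, "visit_and_abandon": 0.5}


def run_user(user_id: int) -> str:
    """Simulate one user: pick a path by weight, run its commands with think time."""
    rng = random.Random(user_id)  # seeded per user so runs are repeatable
    names = list(PATHS)
    path = rng.choices(names, weights=[WEIGHTS[n] for n in names])[0]
    session = {"username": f"user{user_id}"}  # input data for this virtual user
    for command in PATHS[path]:
        command.execute(session)
        time.sleep(rng.uniform(0.0, 0.01))  # wait-time distribution
    return path


def run_load(users: int, workers: int = 8) -> Dict[str, int]:
    """Run the virtual users concurrently and count how many took each path."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(run_user, range(users)))
    return {name: results.count(name) for name in PATHS}


print(run_load(users=40))
```

The point of the sketch is that the command classes know nothing about threads, timing or load: those concerns are layered on from outside, so the same commands can still sit behind the functional tests.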
Exploratory performance testing happens after the code is implemented (still within the iteration) and provides feedback on different load-profiles that we also need to include in our tests. These get added to the performance tests so that different load-profiles are accounted for. It also identifies that we need to account for some of the users getting the login wrong. The feedback also highlights that a brute-force attack on the login-page would bring the application to its knees - this is added to the backlog as a new story.
In iteration 2, we want to add a feature to find a book... We repeat the above, altering the UCML view of the application to accommodate the new story. Feedback finds new load-profiles and user-flows to consider.
This process repeats, iterating over the UCML representations as our understanding of the performance requirements grows and incrementally adding to the UCML representations as our understanding of the functionality grows. In both cases, the tests drive the design of the code to meet both functional and performance requirements.
What's important is that there is huge potential for re-use of code across both the functional and performance tests. There are numerous considerations to be factored in at the protocol level, but there is nothing stopping me from writing both an HTTP engine and a full browser engine (say - using Selenium) that can be used behind each command class (using polymorphism to instantiate the appropriate 'engine').
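A minimal Python sketch of that arrangement might look like the following. The `Engine` interface, the two engine classes and the `make_engine` factory are hypothetical, and the engines only return descriptive strings here rather than talking to a real HTTP client or to Selenium.

```python
from abc import ABC, abstractmethod


class Engine(ABC):
    """What a command needs from whatever drives the application (assumed interface)."""

    @abstractmethod
    def open(self, path: str) -> str:
        ...


class HttpEngine(Engine):
    """Protocol-level stand-in: cheap enough to run many virtual users."""

    def open(self, path: str) -> str:
        return f"HTTP GET {path}"


class BrowserEngine(Engine):
    """Stand-in for a full browser driver, such as one built on Selenium."""

    def open(self, path: str) -> str:
        return f"browser navigates to {path}"


class VisitHomePage:
    """The same command class serves both kinds of test; only the engine differs."""

    def __init__(self, engine: Engine):
        self.engine = engine

    def execute(self) -> str:
        return self.engine.open("/")


def make_engine(purpose: str) -> Engine:
    """Pick the engine by purpose; the mapping is an illustrative assumption."""
    engines = {"functional": BrowserEngine, "performance": HttpEngine}
    return engines[purpose]()


for purpose in ("functional", "performance"):
    print(purpose, "->", VisitHomePage(make_engine(purpose)).execute())
```

The command never checks which engine it was given, which is what lets a functional test and a load test share it unchanged.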
Unfortunately, the tools for this don't exist yet but all the building blocks are there. For example, we might take some inspiration from Brian Marick's work on visualising functional tests. We might take another piece of inspiration from FitDecorator... and so, I feel that the type of project I describe above isn't that far away - it just needs a project for which performance is really important in order that the time and people are made available to make it a reality...
Or... a visionary tools vendor/well-funded open source project to take an interest...
If you have the inclination and resources to make this a reality - let me know - I'd love to be a part of it.
