Corey Goldberg's blog
Human Context Switching
Submitted by Corey Goldberg on Wed, 28/06/2006 - 01:27. programming languages
I'm not talking about CPU context switching. I'm talking about the similar thing your brain does when it switches what it is focused on and holding in memory.
Say you are working on more than one project at a time... Every time you have to shift focus to something different, your brain does something analogous to a context switch. Sometimes you have all of the details stored in your memory and you can quickly switch to many sets of tasks and be able to recall full details quickly (like a cache hit). Other times you have to go back to some research or previous notes (like a disk read) before your brain's processing fully begins.
The reason I am writing about this is within the context of programming languages. Each language you work with has its own syntactic rules, style, standard library, best practices, etc. In the past, I had a hard time keeping more than one language in my head at any given time. It would generally be my comfort language, and writing code in any other language was a very lengthy context switch where I had to dive back into old code, use online resources, and re-read books.
Lately I have found myself in the precarious situation where I am using four different languages regularly (as in extended time writing code in each, most days of the week): Java, C#, Python, and Perl. As to why, that's outside the scope of this post :)
I have finally gotten to the point where I can context switch immediately from working in one language to another within this set.
Which raises the question: how much can your brain handle? If you think about the numerous tools, frameworks, libraries, APIs, shells, utilities, design patterns, markup languages, meta languages, programming languages, etc. that an average developer keeps on tap, it is pretty immense!
How big is your cache?
-Corey Goldberg
Python - log parsing, statistical analysis, and performance graphs
Submitted by Corey Goldberg on Sun, 04/06/2006 - 19:52. design & development | python
Over the past few years, Python has become my favorite language to program in. I do tools development and work regularly on a variety of platforms, so I try to stay versatile with respect to languages. I've written lots of code in a variety of languages over the years (C, C++, Scheme, Pascal, Java, C#, Perl, Python, etc.). For a long time, Perl and Java were my go-to languages. For any quick script or heavy text slinging, I would reach for Perl. For any larger project that needed an organized class structure, I would reach for Java. Then a few years back I started banging around in Python and learned what a fantastic programming language it can be for _anything_.
The past few days I have been working on a tool for analyzing performance data that an application logs during load tests.
I needed a working version quickly that I could show as a proof of concept. One of the requirements is that the tool must integrate well with a .NET environment, and be maintainable and extendable by people in the .NET shop. Many people claim Python is a great prototyping language. It has clean syntax and structure and is great for quickly building class libraries. I started out thinking I would create something quick in Python and then maybe later port it to C#. Well, it turned out so nice and so easy to work with that I can't imagine using anything but Python for it now. (IronPython perhaps?)
So I ended up creating a generic framework in Python and exposing a scriptable API.
It can:
- parse MS Event Logs
- slice the data up into a time-series
- run some statistical calculations on the time series
- output graphs (to gif/png images for web display, or to a GUI with more powerful viewing)
It is built with:
- Python
- Matplotlib
- MS Log Parser 2.2
I had to do a lot of work with crunching data sequences and slicing up time series data.
Python's dynamic typing and simple data structures made it very flexible to handle all the data processing with a minimal amount of code. The most useful thing was Python's List Comprehension features. List Comprehensions are very powerful constructs for list processing that allow you to do some heavy lifting with numeric sequence processing in a very concise way.
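To give a feel for the kind of slicing I mean, here is a minimal sketch. It is not code from the framework; the sample data, the `slice_series` name, and the 60 second interval are all made up for illustration. It groups (timestamp, value) samples into fixed-width intervals with a nested list comprehension, then averages each interval with another one.

```python
from datetime import datetime, timedelta

# Hypothetical sample data: (timestamp, response time in seconds) pairs,
# standing in for rows parsed out of a log.
start = datetime(2006, 6, 1, 12, 0, 0)
samples = [(start + timedelta(seconds=s), rt) for s, rt in
           [(1, 0.20), (15, 0.35), (42, 0.25), (61, 0.90), (75, 1.10), (130, 0.40)]]


def slice_series(samples, start, interval_secs):
    """Group sample values into consecutive fixed-width time intervals."""
    if not samples:
        return []
    last = max(ts for ts, _ in samples)
    n_intervals = int((last - start).total_seconds() // interval_secs) + 1
    # Outer comprehension walks the intervals, inner one picks the
    # values whose timestamp falls inside that interval.
    return [
        [value for ts, value in samples
         if i * interval_secs <= (ts - start).total_seconds() < (i + 1) * interval_secs]
        for i in range(n_intervals)
    ]


buckets = slice_series(samples, start, 60)

# Average per interval; an interval with no samples becomes None.
averages = [sum(b) / len(b) if b else None for b in buckets]
```

The inner comprehension rescans all samples for every interval, which is fine for a sketch but would be worth replacing with a single pass for large logs.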
MS Log Parser 2.2
Log Parser is a tool from Microsoft that lets you query log files with an SQL dialect. I built a wrapper around this with Python's popen methods.
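A rough sketch of what such a wrapper might look like is below. It is not my actual wrapper: it uses the newer `subprocess` module rather than the popen functions, the function names are invented, and the flags (`-i:EVT`, `-o:CSV`, `-q:ON`) are written as I remember them from Log Parser 2.2, so check them against `LogParser -h` before relying on this. It also assumes the CSV output starts with a header row.

```python
import csv
import io
import subprocess


def build_command(query, exe="LogParser.exe"):
    """Build the argument list for a query against the Windows event log.

    Assumed flags: -i:EVT (event log input), -o:CSV (CSV output),
    -q:ON (quiet mode, to keep statistics out of the output).
    """
    return [exe, query, "-i:EVT", "-o:CSV", "-q:ON"]


def parse_csv(text):
    """Turn CSV text (header row first) into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def run_query(query, exe="LogParser.exe"):
    """Run Log Parser and return its rows. Needs LogParser.exe on the PATH."""
    result = subprocess.run(build_command(query, exe),
                            capture_output=True, text=True, check=True)
    return parse_csv(result.stdout)
```

Something like `run_query("SELECT TOP 5 TimeGenerated, EventID FROM Application")` would then hand back a list of dicts keyed by column name, ready for the time-series code.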
Matplotlib
Matplotlib is a 2D plotting library written in Python. I created a graphing API for my framework that uses this. It was simple to create and the graphs look great.
corestats.py
One of the classes I wrote was for doing simple statistical calculations. You can grab a copy here if you are interested: http://www.goldb.org/corestats.html
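If you just want the flavor of it, here is a small sketch of that kind of class. To be clear, this is not the contents of corestats.py; the class name, the method names, and the choice of population standard deviation and nearest-rank percentiles are my own assumptions for the example.

```python
import math


class SimpleStats:
    """Basic descriptive statistics over a sequence of numbers (illustrative only)."""

    def __init__(self, values):
        self.values = sorted(values)
        if not self.values:
            raise ValueError("need at least one value")

    def mean(self):
        return sum(self.values) / len(self.values)

    def median(self):
        n = len(self.values)
        mid = n // 2
        if n % 2:
            return self.values[mid]
        return (self.values[mid - 1] + self.values[mid]) / 2

    def stdev(self):
        # Population standard deviation (divides by n, not n - 1).
        m = self.mean()
        return math.sqrt(sum((v - m) ** 2 for v in self.values) / len(self.values))

    def percentile(self, p):
        # Nearest-rank method: smallest value with at least p% of the data at or below it.
        rank = max(1, math.ceil(p * len(self.values) / 100))
        return self.values[rank - 1]
```

For response times, the 90th or 95th percentile usually tells you more than the mean does, which is why a percentile method is worth having next to the average.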
-Corey Goldberg
www.goldb.org
Google Trends - Interesting Search Volume Comparisons
Submitted by Corey Goldberg on Thu, 11/05/2006 - 19:23. other online resources
Google recently released Google Trends, which allows you to view a time-series of search volume for a given search string or keyword. You can also compare searches against each other. This gives you an interesting tool for analyzing trends.
Here are some interesting comparisons:
- Dynamic Programming Languages - Perl vs. Python vs. Ruby
- Linux Distros - Fedora vs. Ubuntu
- Mercury Tools - WinRunner vs. QTP
Google Calendar API With Your Monitoring Tools
Submitted by Corey Goldberg on Wed, 26/04/2006 - 04:38. Web Services
Wow.. you can integrate Google Calendar with your monitoring tools through its web service API. Too cool:
http://www.truepathtechnologies.com/gcal.html
-Corey Goldberg
www.goldb.org
Roll Your Own Tools.. Real-time Graphing and Round Robin Data Storage
Submitted by Corey Goldberg on Tue, 25/04/2006 - 04:18. java | Open Source | test tools
I have spent a lot of time playing around with graphics libraries and toolkits for integrating real-time graphs within my own testing and monitoring tools. There are many open source tools available in the world of performance testing and system monitoring, and lots of people roll their own tools in whatever programming language they are into... but many of these tools lack graphics capabilities.
Two of the toolkits/libraries I end up using often for my own homebrew test tools are RRDTool and JRobin.
from the RRDTool site:
"RRD is the Acronym for Round Robin Database. RRD is a system to store and display time-series data (i.e. network bandwidth, machine-room temperature, server load average). It stores the data in a very compact way that will not expand over time, and it can create beautiful graphs. It can be used via simple shell scripts or as a perl module."
So...
RRDTool is a really good back-end for storing time-series data, which is pretty much all we care about when we are doing performance testing. It has bindings for various scripting languages, or it can be invoked from the command line. If you are developing tools that need a data repository and graphing capabilities, this provides you with both. You create an RRD and then begin inserting data values at regular intervals. You then call the graphing API to have a graph displayed. The cool thing about this data storage is its “round robin” nature. You define various time spans and the granularity at which you want each one stored. A fixed-size binary file is created, and it never grows over time. As you insert more data, it is inserted into each span. As results are collected, they are averaged and rolled into successive time spans. This makes for a much more efficient system than using your own complex object structures, a relational database, or file system storage.
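To make the round robin idea concrete, here is a toy in-memory sketch. It is only an illustration of the concept and has nothing to do with RRDTool's actual file format or API; the class names and parameters are invented, and it ignores timestamps, missing samples, and everything else the real tool handles. Each archive holds a fixed number of rows, each row is the average of a fixed number of raw values, and once an archive is full the oldest row is overwritten.

```python
from collections import deque


class RoundRobinArchive:
    """Fixed-size store that keeps only the last `rows` consolidated values."""

    def __init__(self, steps_per_row, rows):
        self.steps_per_row = steps_per_row
        self.rows = deque(maxlen=rows)  # the oldest row drops off when full
        self._pending = []

    def update(self, value):
        self._pending.append(value)
        if len(self._pending) == self.steps_per_row:
            # Consolidate the pending raw values into one averaged row.
            self.rows.append(sum(self._pending) / len(self._pending))
            self._pending = []


class RoundRobinDatabase:
    """Feeds every incoming value to each archive (each time span)."""

    def __init__(self, archives):
        self.archives = archives

    def update(self, value):
        for archive in self.archives:
            archive.update(value)


# One fine-grained archive (every value, last 5 kept) and one coarse
# archive (average of every 5 values, last 3 kept).
fine = RoundRobinArchive(steps_per_row=1, rows=5)
coarse = RoundRobinArchive(steps_per_row=5, rows=3)
rrd = RoundRobinDatabase([fine, coarse])

for value in range(1, 21):
    rrd.update(value)
```

After 20 updates the fine archive holds only the 5 most recent values and the coarse archive holds the 3 most recent five-value averages, so the storage stays the same size no matter how long you keep feeding it.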
You will probably recognize the graphs it creates, as RRDTool is integrated in many popular monitoring tools (it is Free/Open Source, GPL License). I have built many tools around RRDTool, and it is really a nice system.
If you are in the Java world, there is a very cool project named JRobin. JRobin is a clone of RRDTool in pure Java, so you can create RRDs directly from your Java code.. and all in memory if you want to!
Some days I pretend to be a Java programmer, so I had to build a tool using JRobin. As a proof of concept, I wrote a small network latency monitoring tool. It shows off some of JRobin's capabilities. It pings a host at a given interval and records the latency. A graph of the network latency is rendered in real-time onto a Swing panel.
Here is my network latency monitoring tool: NetPlot (includes Java source code, GPL Licensed)
The tool itself is just a trivial example, and really isn't the point. But you could easily adapt this code or create your own to develop real-time graphs of your own time-series data.
(hmm.. I wonder if I could hook this into JMeter? probably..)
(How freaking ironic?.. I've been using this thing for a while now, but I decided to check the JRobin web site while I'm writing this.. and the developer just ceased development of the project and turned over all related rights to OpenNMS. can someone reading this please take over JRobin maintenance? .. erm seriously)
-Corey Goldberg
www.goldb.org
Reflections on WOPR6 (Google campus, April 20-22 2006)
Submitted by Corey Goldberg on Mon, 24/04/2006 - 14:47. performance testing
I'm sitting here in my office in Boston reflecting on the past few days. I returned yesterday from a trip to WOPR6 (Workshop On Performance and Reliability) at Google in Mountain View, California. It was an excellent workshop and the volume of ideas being thrown around was certainly impressive.
I have been to many software trade shows and general testing conferences before, but this was the first time I have been to an event based solely around “performance” (which is where my passion lies). Think of a larger conference.. now kick out all the vendors hawking their products.. scrap all the introductory material... send home the people who are there for a little vacation from work.. delete all the eye-glazing PowerPoint presentations... and concentrate on the meaty parts... and you can get an idea of what it was like. A group of people with a passion for system performance... sharing ideas... trading war stories... laying down the roadmap for the future.
It's refreshing to see software and system Performance becoming more of an area of focus, rather than an afterthought.
-Corey Goldberg
www.goldb.org
Open Source Tool Box for Performance Testing and Monitoring
Submitted by Corey Goldberg on Sat, 22/04/2006 - 19:18. Open Source | performance testing
I am a software engineer specializing in performance, automated testing, and tool development. I am also a Free and Open Source software advocate. I use a variety of open source tools to get my job done.
The following is a list of what is currently in my "tool box" for performance testing and monitoring. I use these tools daily and highly recommend all of them:
OpenSTA (www.opensta.org)
Load generator for HTTP performance/load/stress testing.
Programming Language: Written in C++/MFC. Scripts are written for it using its proprietary scripting language (SCL).
JMeter (jakarta.apache.org/jmeter)
Load generator for HTTP performance/load/stress testing.
Programming Language: Java/Swing
MRTG (www.mrtg.org)
System Monitoring (server/network utilization and transaction response timing)
Programming Language: Written in Perl. Plugins are written for it using any language.
RRDTool (www.rrdtool.org)
Data logging and graphing application.
Programming Language: C
Drraw (web.taranis.org/drraw)
Web front-end for viewing RRD data (system monitoring)
Programming Language: Perl/CGI with embedded HTML
WebInject (www.webinject.org)
Web/HTTP function/regression testing and performance monitoring.
Programming Language: Perl
PyMeter (coming soon to openqa.org)
Agentless system monitoring (server/network utilization)
Programming Language: Written in Python. Plugins are written for it using Python.
Nagios (www.nagios.org)
System monitoring (server/network utilization and transaction response timing)
Programming Language: Written in C. Plugins are written for it using any language.
Ethereal (www.ethereal.com)
Network Protocol Analyzer.
Programming Language: C
What's in your open source tool box?
-Corey Goldberg
www.goldb.org
