gcocco software, inc
Perspective on gcocco software, inc.

Users of high performance computing equipment have varied criteria when comparing systems for purchase. Performance benchmarks have always been a key component in all purchases of high performance gear.

Engineers have improved the building block technologies in hardware and software to increase the processing power, storage capacity and network bandwidth by orders of magnitude in the past 20 years. Benchmark technologies and strategies have not changed much over that time period.

Benchmarking was raised to an art form by the marketing team at Cray Research in the 1980s. Cray built the mystique that their systems were so fast they had to be super-cooled. From the Cray archives (no longer online):

"The CRAY 2 was completely Fluorinert cooled. The cooling fluid, made by 3M, allowed the whole computer to be immersed in the electrically insulating fluid, and yet conduct the heat away by conduction and ebullient vaporization. It looked much like a fish tank."
Cray Research even sold a special see-through waterfall cooling unit, so you could "see" the heat being dissipated and "understand" how fast the machine had to be. Look closely at the back of the Cray 2 supercomputer picture, now on Wikipedia: you can see the transparent waterfall cooling unit on the right of the picture!

Working with the MIT RLE lab, I had firsthand exposure to how the IBM mainframe (System/390) compared to the Cray 2. An MIT physics post-graduate researcher and I led the two-phase benchmark effort.

We first compared industry-standard floating point benchmarks, including the Livermore Loops and Linpack. This gave us some gross parameters with which to compare the machines.
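To make the flavor of these floating point benchmarks concrete: kernels like Linpack are built around simple vector operations such as DAXPY (y = a*x + y), timed to estimate a machine's floating point rate. The sketch below is an illustrative pure-Python stand-in, not the actual Livermore Loops or Linpack code, and pure Python will of course report rates orders of magnitude below compiled Fortran or C.

```python
import time

def daxpy(n, alpha, x, y):
    # y <- alpha*x + y, the core kernel of Linpack-style benchmarks
    for i in range(n):
        y[i] = alpha * x[i] + y[i]
    return y

def estimate_mflops(n=100_000, reps=10):
    # Time repeated DAXPY passes and convert to millions of
    # floating point operations per second.
    x = [1.0] * n
    y = [2.0] * n
    start = time.perf_counter()
    for _ in range(reps):
        daxpy(n, 0.5, x, y)
    elapsed = time.perf_counter() - start
    flops = 2.0 * n * reps  # one multiply + one add per element
    return flops / elapsed / 1e6

if __name__ == "__main__":
    print(f"~{estimate_mflops():.1f} MFLOPS (pure-Python, illustrative only)")
```

Even this toy version shows why such kernels give only "gross parameters": they exercise one regular access pattern, which real applications rarely resemble.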

For the custom head-to-head phase, we benchmarked workloads devised by various interested MIT professors that would "hopefully" simulate their "unique" workload. We had access to a Cray 2 and to an IBM 3090. We exploited all the tricks we could on each platform, and applied the tricks as effectively as possible to both machines. The resulting performance difference between the two machines for the custom workloads was astonishingly small and there was no clear winner!

We found that in preparing "custom" benchmarks (something the MIT professors had little or no experience with), many problems were created in the interpretation of the results. Some examples are:

  1. By overly "simplifying" their programs, the professors produced benchmark code whose runtime profiles no longer represented what their original programs really did.
  2. By making their data larger or smaller, they again changed critically how the results should be interpreted.
  3. One physics application in Chaos theory was numerically unstable and used an iterative solution. To measure the effectiveness of the execution of this code, the "metric" needed to be (seconds/iteration) and not (total CPU time). This was not obvious until the application itself was thoroughly understood.
  4. Since the "custom benchmarks" weren't multi-platform enabled, many changes were required to get things running. There was no definition of "what was allowed" and the actual level of change was vastly different from application to application. "Vendor neutral source code" was not a concept! Had this porting and tuning been done by separate, "competing" groups, you would have had "chaos" in reality!
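Point 3 above deserves a worked illustration. The machine names and numbers below are invented, but they show the trap: on an iterative, numerically sensitive solver, the iteration count to convergence can differ between machines, so total CPU time conflates hardware speed with the luck of the numerics. Seconds per iteration is the metric that actually measures the hardware.

```python
# Hypothetical results for an iterative solver run on two machines.
# Rounding differences cause the runs to converge in different
# numbers of iterations, so total CPU time alone is misleading.
runs = {
    # machine: (total_cpu_seconds, iterations_to_convergence)
    "machine_A": (120.0, 400),
    "machine_B": (150.0, 600),
}

for name, (cpu_seconds, iterations) in runs.items():
    print(f"{name}: total={cpu_seconds:.0f}s, "
          f"per-iteration={cpu_seconds / iterations:.3f}s")

# By total time machine_A "wins" (120s vs 150s), but per iteration
# machine_B is faster (0.250 s/iter vs 0.300 s/iter), which is what
# actually characterizes the machine rather than the particular run.
```

Until the application itself was thoroughly understood, nothing flagged that the two totals were not comparable — exactly the interpretation problem the list describes.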

These custom benchmarks are often thought to be "easier" to understand and compare, but generally are fraught with subtle problems. This is not to say that "custom benchmarks" cannot be done well, just that they usually are not done well. If done correctly and fairly, custom benchmarks should be the truest indicator of the performance you should expect to get.

Today, Sun, IBM, HP and DEC/Compaq dominate the UNIX landscape. Each has its own benchmark philosophy, and its benchmark numbers reflect that philosophy. SPEC, the Transaction Processing Performance Council (TPC) and other organizations have formed in the past 10 years to put some industry definition and rigor into the benchmarking business. Now customers and users have a solid frame of reference with which to compare and judge systems when performance is a factor in their decision.

Vendors have come to rely more and more on these industry benchmarks to reduce the requirement for customized application benchmarks, which can be very time consuming and expensive to run. However, as systems grow more powerful, the measurement environment required to support even industry-standard benchmarks has grown significantly. For instance, the current TPC-C leadership benchmark cost over $10,000,000 to set up. The mid-range server TPC-C benchmarks (servers costing less than $100,000) cost over $1,000,000 to set up. When TPC-C was first introduced in 1992, the top system setup cost was less than $100,000. The SPECweb99 benchmark has experienced the same price creep, such that benchmarking a mid-range server costs well over $250,000 today.

Benchmarking is still an art today. Engineers prefer to think that technical innovations are the key to winning the benchmark war, but in reality it is the expense of running these benchmarks, and the marketing of them, that determines the winners and the losers. All vendors have chosen to benchmark their high-end systems with the intent of winning the high-water benchmarks and being able to make the (in this case fictitious) claim:

"MORE Furlongs/Nano-century on stunoD® than anyone on the planet!"

As the market leader, Sun has chosen to publish selectively, perhaps to save money; benchmarking is very expensive in both equipment and labor. Recently, however, Sun has gotten back into the fray with announcements on its somewhat delayed UltraSPARC-III. Because Sun was "out of the game" for a while, its new benchmarks are getting plenty of focus and attention.

SPEC and the Transaction Processing Performance Council represent a major leap forward in improving fairness in benchmarking. These two entities (made up primarily of hardware and software vendors) have made a significant contribution to technical and commercial customers by leveling the playing field and forcing vendors to address performance problems or lose business. Neither group's benchmarks are perfect, but compared with where benchmarking was, this is nirvana.

A defined benchmark and an "unlimited" amount of preparation time have put vendors on a more equal footing. SPEC's full-disclosure requirement and the TPC's on-site audit have decreased the likelihood of one vendor gaining a significant advantage over another through any sort of "dirty trick".

However, other aspects of benchmarking remain unchanged from 20 years ago and still lead to faulty comparisons. One must always know what was run, who ran it, and how. Whenever you compare two results, the question must always be asked: "Can I really even compare these two results?"

Most customers and users are paralyzed by the thought of purchasing, installing and upgrading a computer system only to run out of steam shortly thereafter. Most vendors, system integrators and consulting practices therefore recommend purchasing much more than the client needs, to avoid "coming up short". The companies that can afford it simply "buy into" this strategy (so to speak). However, additional hardware brings system management issues; cost issues such as space, energy, maintenance and additional software licenses; the inconvenience of system downtime; and the risk of the new system not coming up when scheduled.

gcocco software believes that although many performance problems certainly can be "fixed" by purchasing new hardware or upgrading existing hardware, there is much that can be done by more effectively using what you have. Tuning is an iterative process, and becomes increasingly complex in today's heterogeneous networked environment. Application tuning, network management, database tuning (application and administration), and the I/O infrastructure are among the key factors that should be reviewed prior to purchasing additional hardware or software. You must first have a plan. A plan is based on understanding.

One analysis that gcocco software undertook for a major consumer goods manufacturer determined that an SAP R/3 application running on an Oracle database could increase its throughput by about 4X without the introduction of any new hardware or software. The Oracle database had not been properly tuned and was the limiting factor in the overall SAP performance problem. This pushed out the need for the planned CPU upgrade by over one year!

Another engagement, with a Fortune 100 company that could not manage its day-to-day operations in a timely fashion, gained over a 12X improvement in the turnaround time of its month-end/year-end processing by automating selected pieces of the process and improving the overall flow of data out of and back into the data warehouse. This too was accomplished without major new hardware. A modest amount of new software was developed (less than 10,000 lines of custom code), but most of the improvement came from understanding the overall flow of data in the system and processing the data in the correct existing "tier" of the data processing infrastructure.

As a former member of SPEC, gcocco software continues to focus on industry standard performance benchmarks to better understand system performance. This allows us to guide clients with complex environments to achieve their best possible performance, with the lowest cost and lowest risk. As an independent third party, we continue to "call them as we see them"!

gcocco software is an independent software development and consulting firm that is available to evaluate and/or implement solutions for optimal performance in highly complex computing environments.

webmaster at gcocco dot com
Copyright © 1998 - 2006 gcocco software, inc. All Rights Reserved.
URL: http://www.gcocco.com