Proceedings of the 2007 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems pages:375-376
ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems location:San Diego, California date:12-16 June 2007
Current practice in benchmarking commercial computer systems is to run a number of industry-standard benchmarks and to report performance numbers. The huge amount of machines and the large number of benchmarks for which performance numbers are published make it hard to observe clear performance trends though. In addition, these performance numbers for specific benchmarks do not provide insight into how applications of interest that are not part of the benchmark suite would perform on those machines.
In this work we build a methodology for analyzing published commercial machine performance data sets. We apply statistical data analysis techniques, more in particular principal components analysis and cluster analysis, to reduce the amount of information to a manageable amount to facilitate its understanding. Visualizing SPEC CPU2000 performance numbers for 26 benchmarks and 1000+ machines in just a few graphs gives insight into how commercial machines compare against each other.
In addition, we provide a way of relating inherent program behavior to these performance numbers so that insights can be gained into how the observed performance trends relate to the behavioral characteristics of computer programs. This results in a methodology for the ubiquitous benchmarking problem of predicting performance of an application of interest based on its similarities with the benchmarks in a published industry-standard benchmark suite.