The weighted arithmetic mean yields a single performance index, which does not show how performance is distributed among the microbenchmarks used. Consider the following example:
A benchmark suite consists of two microbenchmarks, X and Y, and the weights are distributed as follows: 80% of the weight is given to benchmark X, and 20% to benchmark Y. Assume that the suite is used to compare machines A and B with the following statistics:
                           Machine A      Machine B
Benchmark X                10 seconds     30 seconds
Benchmark Y                100 seconds    10 seconds
Weighted arithmetic mean   28             26
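The means in the table above can be reproduced with a short sketch. The function name `weighted_mean` and the variable names are illustrative, not from the original text:

```python
def weighted_mean(times, weights):
    """Weighted arithmetic mean of benchmark times (lower is better)."""
    assert abs(sum(weights) - 1.0) < 1e-9, "weights should sum to 1"
    return sum(t * w for t, w in zip(times, weights))

# Times are (benchmark X, benchmark Y); the suite's weights are 80:20.
weights = [0.80, 0.20]
machine_a = weighted_mean([10, 100], weights)   # 28.0
machine_b = weighted_mean([30, 10], weights)    # 26.0
print(machine_a, machine_b)
```
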
From the weighted arithmetic mean, we can conclude that machine B performs better. But is the mean a fair representation of machine A's performance on benchmark X? The problem arises when the workload mix selected by the benchmark suite, and its weight distribution, is only an estimate of actual usage. Suppose user Z wants to use the performance results for a workload mix that comprises 95% programs similar to benchmark X and only 5% programs similar to benchmark Y. The following table shows the problem with the weighted arithmetic mean calculation:
                           Results with old weights (80:20)    Results with new weights (95:5)
                           Machine A      Machine B            Machine A      Machine B
Benchmark X                10 seconds     30 seconds           10 seconds     30 seconds
Benchmark Y                100 seconds    10 seconds           100 seconds    10 seconds
Weighted arithmetic mean   28             26                   14.5           29
Since the benchmark suite uses a weight distribution that appears close to user Z's usage (80:20 versus 95:5), user Z will select machine B based on the published results. However, machine A actually performs better for this workload, which cannot be concluded from the weighted arithmetic mean results published by the benchmark suite.
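The reversal can be checked by recomputing the means under both weight distributions. This is a minimal sketch; the variable names are illustrative:

```python
# Times are (benchmark X, benchmark Y) in seconds for each machine.
times = {"A": [10, 100], "B": [30, 10]}
weightings = {"suite (80:20)": [0.80, 0.20], "user Z (95:5)": [0.95, 0.05]}

for label, weights in weightings.items():
    means = {m: sum(t * w for t, w in zip(ts, weights))
             for m, ts in times.items()}
    winner = min(means, key=means.get)   # lower mean time is better
    print(f"{label}: A={means['A']}, B={means['B']} -> machine {winner}")
```

Under the suite's weights the winner is machine B (28 versus 26), but under user Z's weights it flips to machine A (14.5 versus 29): the single published index cannot convey this sensitivity to the weight distribution.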