Analysis of reproducibility of SPECjbb2000 runs
Using the "-XX:+AggressiveHeap" option gives a boost of 4833 SPECjbb2000 ops/s (43%)
over the "normal" options.
This is an impressive improvement, and the option
should be accompanied by the "-XX:+UseISM" parameter to lock down
the memory.
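As a quick check of that arithmetic, the sketch below recovers the baseline implied by the two figures. It assumes (the text does not say so explicitly) that the 4833 boost is measured against the submitted score of 15,990 reported later on this page; the variable names are illustrative only.

```python
# Assumption: the boost is relative to the submitted -XX:+AggressiveHeap score.
aggressive_heap_score = 15990  # submitted result, ops/s
boost = 4833                   # reported gain over the "normal" options, ops/s

# Baseline implied by the two numbers above.
baseline = aggressive_heap_score - boost
percent_gain = 100 * boost / baseline

print(f"implied baseline: {baseline} ops/s")
print(f"gain over baseline: {percent_gain:.0f}%")
```

Under that assumption the implied baseline is 11,157 ops/s, which is consistent with the quoted 43%.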
Sun was very interested in us running with the "-XX:+UseISM" parameter in
order to generate more stable runs.
At first we were unable to measure the impact of the ISM parameter
on stability, because combining "-XX:+UseISM" with "-XX:+AggressiveHeap"
caused an ISM-generated core dump on our system.
We have since confirmed that the problem occurs only when both
"-XX:+UseISM" and "-XX:+AggressiveHeap" are specified.
Basically, only about 30% (7 of 23) of the attempts to run
this workload with a fully "tuned" setup
produce a "good" benchmark result.
Below is a set of 23 runs with the same parameters
on the same machine.
If you do use the -XX:+AggressiveHeap option, be sure to read the
"Problems and Dangers" section of the following Sun web page:
http://java.sun.com/docs/hotspot/ism.html.
Here are the runs with the -XX:+AggressiveHeap option from which the
submission (15,990) was chosen.
The set of runs from which the final run was selected is shown
in boldface in the table below:
   | Run Number | Result (ops/s) | Note |
 1 | 063 | 15990 | good |
 2 | 064 | Invalid | 1 |
 3 | 066 | 6597 | 2 |
 4 | 067 | 10516 | 2 |
 5 | 068 | 15869 | good |
 6 | 069 | 15184 | good |
 7 | 070 | 15493 | good |
 8 | 071 | 3805 | 2 |
 9 | 072 | Invalid | 1 |
10 | 073 | 10683 | 2 |
11 | 074 | 6796 | 2 |
12 | 075 | Invalid | 1 |
13 | 076 | Invalid | 1 |
14 | 077 | 10582 | 2 |
15 | 078 | 15660 | good |
16 | 079 | Invalid | 1 |
17 | 080 | Invalid | 1 |
18 | 081 | Invalid | 1 |
19 | 082 | Invalid | 1 |
20 | 083 | Invalid | 1 |
21 | 084 | Invalid | 1 |
22 | 085 | 16024 | good |
23 | 086 | 15894 | good |
Note (1):
"Invalid" runs are those which are marked by the benchmark harness as
invalid. These runs don't satisfy the run requirements for a valid
SPECjbb2000 run. They are disregarded.
Note (2):
The run completed but needed more than 8 points for a "high performing"
run. Since more load points will always lower the final number achieved,
one should always attempt to run in the minimum number of points allowed.
These runs are considered "good" by the benchmark harness,
but (obviously) are not used in selecting the run to submit.
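The "7 of 23" figure can be checked by tallying the Note column of the table. The following is a minimal sketch with the notes transcribed by hand from the table above; the list and label names are illustrative, not part of the benchmark harness.

```python
from collections import Counter

# Note column for the 23 runs, in table order (transcribed by hand).
# "1" = marked invalid by the harness, "2" = needed more than 8 points.
notes = [
    "good", "1", "2", "2", "good", "good", "good", "2",
    "1", "2", "2", "1", "1", "2", "good", "1",
    "1", "1", "1", "1", "1", "good", "good",
]

counts = Counter(notes)
total = len(notes)
good_fraction = counts["good"] / total

print(f"total runs:           {total}")
print(f"good:                 {counts['good']}")
print(f"invalid (note 1):     {counts['1']}")
print(f"extra points (note 2): {counts['2']}")
print(f"good fraction:        {good_fraction:.0%}")
```

This gives 7 good runs, 10 invalid runs and 6 runs that needed extra points, so roughly 30% of the attempts were candidates for submission.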
Summary of runs with -XX:+AggressiveHeap option:
max | SPECjbb2000 (ops/s) = 16,029 |
min | SPECjbb2000 (ops/s) = 15,184 (of valid runs) |
mean | SPECjbb2000 (ops/s) = 15,731 |
median | SPECjbb2000 (ops/s) = 15,869 |
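The summary figures can be recomputed from the seven "good" runs in the table. The sketch below uses Python's standard statistics module with the results transcribed by hand. Note that the table lists 16024 for run 085 while the summary states a maximum of 16,029; the sketch uses the table value and does not settle which of the two is correct. The rounded mean comes out the same either way.

```python
import statistics

# Results (ops/s) of the runs marked "good" in the table,
# i.e. runs 063, 068, 069, 070, 078, 085 and 086.
good_runs = [15990, 15869, 15184, 15493, 15660, 16024, 15894]

run_max = max(good_runs)
run_min = min(good_runs)
run_mean = round(statistics.mean(good_runs))
run_median = statistics.median(good_runs)

print(f"max    = {run_max}")
print(f"min    = {run_min}")
print(f"mean   = {run_mean}")
print(f"median = {run_median}")
```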
Generally SPEC tests are run multiple times and one of the runs is
"selected" as the official result.
In cases like SPECcpu2000 and SPECweb99 the median of 3 consecutive runs
is taken to be the "result". This not only implies some stability of the
run, but also lets you see the effect of a longer test.
SPECjbb2000 does not have this requirement. If you get a "big" number,
within the run rules, you can publish it. We were somewhat concerned about
the "limited" reproducibility (our words, our concern)
of these runs, and produced a larger set of runs.
When it came time to submit the official test,
rather than simply take the "big dog"
run, we used a more "SPEC friendly - median" approach.
The run submitted was 15,990
(slightly above the median achieved and above the average).