Changeset 221659 in webkit


Timestamp: Sep 5, 2017 7:37:41 PM
Author: rniwa@webkit.org
Message:

Compute the final score using geometric mean in Speedometer 2.0
https://bugs.webkit.org/show_bug.cgi?id=172968

Reviewed by Saam Barati.

Make Speedometer 2.0 use the geometric mean of each test suite's subtotal instead of the total.

In Speedometer 1.0, we used the total time to compute the final score because we wanted to make
the slowest frameworks and libraries faster. The fastest suite (FlightJS) still accounted for ~6%
of the total and the slowest (React) for ~25%, so we felt that the total time, or equivalently the
arithmetic mean scaled by a constant factor, was a good metric to track.

In the latest version of Speedometer 2.0, however, the fastest suite (Preact) runs in ~55ms whereas
the slowest suite (Inferno) takes ~1.5s on Safari. Since the total time is ~6.5s, Preact's suite
accounts for only ~0.8% of the total score while Inferno's suite accounts for ~23%. Because the goal
of Speedometer is to approximate different kinds of DOM API use patterns on the Web, we want each
framework and library to have a meaningful impact on the overall benchmark score.
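
A minimal sketch of the difference (the subtotals below are made up; only the ~55ms and ~1.5s
figures come from the measurements above):

    // Hypothetical per-suite subtotals in ms; 55 and 1500 are the approximate Preact and
    // Inferno timings above, 4945 stands in for the remaining suites so that the total is ~6.5s.
    var subtotals = [55, 1500, 4945];

    function total(values) { return values.reduce(function (a, b) { return a + b; }); }
    function geomean(values) {
        var product = values.reduce(function (a, b) { return a * b; });
        return Math.pow(product, 1 / values.length);
    }

    // Doubling the fastest suite barely moves the total (and hence a total-time-based score)...
    console.log(total([110, 1500, 4945]) / total(subtotals));     // ~1.008, i.e. +0.8%
    // ...but moves the geometric mean by the same factor as doubling any other suite would.
    console.log(geomean([110, 1500, 4945]) / geomean(subtotals)); // 2^(1/3) ≈ 1.26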

Furthermore, after r221205, we test both the debug and the release builds of Ember.js. Since the
debug build is ~4x slower, using the total time, or the arithmetic mean thereof, effectively gives
the debug build of Ember.js 4x as much weight as the release build. Given that only ~5% of websites
that deploy Ember.js use the debug build, this weighting is clearly not right.

This patch therefore replaces the arithmetic mean with the geometric mean when computing the final
score. It also moves the score computation into BenchmarkRunner so that it can be shared between
main.js and InteractiveRunner.html.
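
In terms of formulas, the scoring changes from the computeScore helper that main.js applied to the
total time (deleted in this patch; see the main.js diff below) to a score derived from the geometric
mean, as added to BenchmarkRunner._finalize. A minimal side-by-side sketch, with suiteTotals
standing in for the per-suite subtotals:

    var suiteTotals = [55, 420, 1500];  // hypothetical per-suite subtotals in ms
    var suitesCount = suiteTotals.length;
    var total = suiteTotals.reduce(function (a, b) { return a + b; }, 0);
    var geomean = Math.pow(suiteTotals.reduce(function (a, b) { return a * b; }, 1), 1 / suitesCount);

    // Before: runs/min derived from the total time (equivalently, from the arithmetic mean).
    var oldScore = 60 * 1000 * suitesCount / total;

    // After: runs/min derived from the geometric mean, with a correction factor of 3
    // chosen so that scores land roughly in the 0 to 140 range.
    var correctionFactor = 3;
    var newScore = 60 * 1000 / geomean / correctionFactor;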

  • Speedometer/InteractiveRunner.html:

(.didRunSuites): Show the geometric mean, arithmetic mean, and total as well as the score, for
completeness, since this is a debugging page for developers.

  • Speedometer/resources/benchmark-runner.js:

(BenchmarkRunner.prototype.step): Added mean, geomean, and score as properties of measuredValues.
(BenchmarkRunner.prototype._runTestAndRecordResults): Removed dead code.
(BenchmarkRunner.prototype._finalize): Compute the total, the arithmetic mean (mean in the code),
the geometric mean (geomean), and the score, and add them to measuredValues.

  • Speedometer/resources/main.js:

(window.benchmarkClient): Replaced testsCount with stepCount and _timeValues with _measuredValuesList.
(window.benchmarkClient.willRunTest):
(window.benchmarkClient.didRunTest):
(window.benchmarkClient.didRunSuites): Store measuredValues object instead of just the total time.
(window.benchmarkClient.didFinishLastIteration):
(window.benchmarkClient._computeResults):
(window.benchmarkClient._computeResults.valueForUnit): Renamed from totalTimeInDisplayUnit. Now simply
retrieves the values computed by BenchmarkRunner's _finalize.
(startBenchmark):
(computeScore): Deleted.

Location: trunk/PerformanceTests
Files: 4 edited

  • trunk/PerformanceTests/ChangeLog

    r221636 → r221659

    (The new ChangeLog entry is identical to the commit message above.)

  • trunk/PerformanceTests/Speedometer/InteractiveRunner.html

    r221039 → r221659

                     results += suiteName + ' : ' + suiteResults.total + ' ms\n';
                 }
    +            results += 'Arithmetic Mean : ' + measuredValues.mean + ' ms\n';
    +            results += 'Geometric Mean : ' + measuredValues.geomean + ' ms\n';
                 results += 'Total : ' + measuredValues.total + ' ms\n';
    +            results += 'Score : ' + measuredValues.score + ' rpm\n';

                 if (!results)
  • trunk/PerformanceTests/Speedometer/resources/benchmark-runner.js

    r221106 → r221659

         if (!state) {
             state = new BenchmarkState(this._suites);
    -        this._measuredValues = {tests: {}, total: 0};
    +        this._measuredValues = {tests: {}, total: 0, mean: NaN, geomean: NaN, score: NaN};
         }

    …

         if (state.isFirstTest()) {
             this._removeFrame();
    -        this._masuredValuesForCurrentSuite = {};
             var self = this;
             return state.prepareCurrentSuite(this, this._appendFrame()).then(function (prepareReturnValue) {

    …

                 suiteResults.tests[test.name] = {tests: {'Sync': syncTime, 'Async': asyncTime}, total: total};
                 suiteResults.total += total;
    -            self._measuredValues.total += total;

                 if (self._client && self._client.didRunTest)

    …

         this._removeFrame();

    -    if (this._client && this._client.didRunSuites)
    +    if (this._client && this._client.didRunSuites) {
    +        var product = 1;
    +        var values = [];
    +        for (var suiteName in this._measuredValues.tests) {
    +            var suiteTotal = this._measuredValues.tests[suiteName].total;
    +            product *= suiteTotal;
    +            values.push(suiteTotal);
    +        }
    +
    +        values.sort(function (a, b) { return a - b; }); // Sort ascending to avoid loss of significance in the sum.
    +        var total = values.reduce(function (a, b) { return a + b; });
    +        var geomean = Math.pow(product, 1 / values.length);
    +
    +        var correctionFactor = 3; // This factor makes the test score fit reasonably within 0 to 140.
    +        this._measuredValues.total = total;
    +        this._measuredValues.mean = total / values.length;
    +        this._measuredValues.geomean = geomean;
    +        this._measuredValues.score = 60 * 1000 / geomean / correctionFactor;
             this._client.didRunSuites(this._measuredValues);
    +    }

         if (this._runNextIteration)
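
With this change, a didRunSuites client receives the fully populated measuredValues object rather
than a bare running total. For a rough sense of its shape and magnitudes (the numbers below are
illustrative only, assuming 16 suites, not actual measurements):

    // Illustrative shape of the object now passed to didRunSuites.
    var measuredValues = {
        tests: { /* per-suite results keyed by suite name */ },
        total: 6500,      // sum of the per-suite subtotals, in ms
        mean: 406.25,     // arithmetic mean: 6500 / 16
        geomean: 250,     // geometric mean of the subtotals (illustrative)
        score: 80         // 60 * 1000 / 250 / 3
    };
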
  • trunk/PerformanceTests/Speedometer/resources/main.js

    r221118 → r221659

         displayUnit: 'runs/min',
         iterationCount: 10,
    -    testsCount: null,
    +    stepCount: null,
         suitesCount: null,
    -    _timeValues: [],
    +    _measuredValuesList: [],
         _finishedTestCount: 0,
         _progressCompleted: null,

    …

         },
         willRunTest: function (suite, test) {
    -        document.getElementById('info').textContent = suite.name + ' ( ' + this._finishedTestCount + ' / ' + this.testsCount + ' )';
    +        document.getElementById('info').textContent = suite.name + ' ( ' + this._finishedTestCount + ' / ' + this.stepCount + ' )';
         },
         didRunTest: function () {
             this._finishedTestCount++;
    -        this._progressCompleted.style.width = (this._finishedTestCount * 100 / this.testsCount) + '%';
    +        this._progressCompleted.style.width = (this._finishedTestCount * 100 / this.stepCount) + '%';
         },
         didRunSuites: function (measuredValues) {
    -        this._timeValues.push(measuredValues.total);
    +        this._measuredValuesList.push(measuredValues);
         },
         willStartFirstIteration: function () {
    -        this._timeValues = [];
    +        this._measuredValuesList = [];
             this._finishedTestCount = 0;
             this._progressCompleted = document.getElementById('progress-completed');

    …

             document.getElementById('logo-link').onclick = null;

    -        var results = this._computeResults(this._timeValues, this.displayUnit);
    +        var results = this._computeResults(this._measuredValuesList, this.displayUnit);

             this._updateGaugeNeedle(results.mean);

    …

                 showResultsSummary();
         },
    -    _computeResults: function (timeValues, displayUnit) {
    +    _computeResults: function (measuredValuesList, displayUnit) {
             var suitesCount = this.suitesCount;
    -        function totalTimeInDisplayUnit(time) {
    +        function valueForUnit(measuredValues) {
                 if (displayUnit == 'ms')
    -                return time;
    -            return computeScore(time);
    +                return measuredValues.geomean;
    +            return measuredValues.score;
             }

    …

             }

    -        var values = timeValues.map(totalTimeInDisplayUnit);
    +        var values = measuredValuesList.map(valueForUnit);
             var sum = values.reduce(function (a, b) { return a + b; }, 0);
             var arithmeticMean = sum / values.length;

    …

             return {
    -            formattedValues: timeValues.map(function (time) {
    -                return toSigFigPrecision(totalTimeInDisplayUnit(time), 4) + ' ' + displayUnit;
    +            formattedValues: values.map(function (value) {
    +                return toSigFigPrecision(value, 4) + ' ' + displayUnit;
                 }),
                 mean: arithmeticMean,

    …

         var enabledSuites = Suites.filter(function (suite) { return !suite.disabled; });
    -    var totalSubtestCount = enabledSuites.reduce(function (testsCount, suite) { return testsCount + suite.tests.length; }, 0);
    -    benchmarkClient.testsCount = benchmarkClient.iterationCount * totalSubtestCount;
    +    var totalSubtestsCount = enabledSuites.reduce(function (testsCount, suite) { return testsCount + suite.tests.length; }, 0);
    +    benchmarkClient.stepCount = benchmarkClient.iterationCount * totalSubtestsCount;
         benchmarkClient.suitesCount = enabledSuites.length;
         var runner = new BenchmarkRunner(Suites, benchmarkClient);

    …

         return true;
    -}
    -
    -function computeScore(time) {
    -    return 60 * 1000 * benchmarkClient.suitesCount / time;
     }