Changeset 166700 in webkit

Timestamp:

Apr 3, 2014 12:07:02 AM

Author:

rniwa@webkit.org

Message:

WebKitPerfMonitor: Y-axis adjustment is too aggressive
https://bugs.webkit.org/show_bug.cgi?id=130937

Reviewed by Andreas Kling.

Previously, the adjusted min. and max. were defined as two standard deviations away from the EWMA of measured
results. This had two major problems:

1. Two standard deviations can be too small to show the confidence interval for results.
2. Sometimes baseline and target can be more than two standard deviations away.

Fixed the bug by completely rewriting the algorithm to compute the interval. Instead of blindly using two
standard deviations as margins, we keep adding a quarter of the standard deviation on each side until at least 90%
of points lie in the interval or we've expanded by 4 standard deviations. Once this condition is met, we reduce
the margin on each side separately to reduce the empty space on either side.

A more rigorous approach would involve computing the least squares value of results with respect to intervals,
but that seems like overkill for a simple UI problem; it's also computationally expensive.
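
The expand-then-shrink procedure described in this message can be sketched on plain numbers. This is a minimal illustration, not the committed code: `countInInterval` and `adjustedInterval` are hypothetical names, the points are bare values rather than results with confidence intervals, and the center and standard deviation are passed in instead of being computed from the run.

```javascript
// Hypothetical helper: counts points that fall inside [min, max].
function countInInterval(points, min, max) {
    return points.filter(function (p) { return min <= p && p <= max; }).length;
}

// Sketch of the interval search, assuming center and stdev are already known.
function adjustedInterval(points, center, stdev, minRatio) {
    var delta = stdev / 4;
    var min = center;
    var max = center;
    var steps;
    // Expand both sides by a quarter standard deviation at a time,
    // stopping at 4 standard deviations (16 steps).
    for (steps = 0; steps < 4 * 4; steps++) {
        min -= delta;
        max += delta;
        if (countInInterval(points, min, max) / points.length >= minRatio)
            break;
    }
    // Pull the lower bound back up while enough points still fit.
    for (var i = 0; i < steps; i++) {
        if (countInInterval(points, min + delta, max) / points.length < minRatio)
            break;
        min += delta;
    }
    // Pull the upper bound back down the same way.
    for (var j = 0; j < steps; j++) {
        if (countInInterval(points, min, max - delta) / points.length < minRatio)
            break;
        max -= delta;
    }
    return {min: min, max: max};
}

var samplePoints = [10, 10, 10, 10, 10, 10, 10, 10, 13, 13];
var sampleInterval = adjustedInterval(samplePoints, 10, 4, 0.9);
// sampleInterval is {min: 9, max: 13}
```

With these sample values the interval first grows symmetrically to [7, 13], then the lower bound is tightened to 9 while the upper bound stays at 13, because dropping it would exclude the two points at 13.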

public/index.html:

(Chart..adjustedIntervalForRun): Extracted from computeYAxisBoundsToFitLines.
(Chart..computeYAxisBoundsToFitLines): Compute the min. and max. adjusted intervals out of the adjusted intervals
for each run (current, baseline, and target) so that at least one point from each set of results is shown.
We wouldn't see the difference between the measured values and the baseline and target values otherwise.
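
A rough sketch of how the per-run intervals might be merged and clamped, assuming each run's interval has already been computed. `combinedAdjustedBounds` is a hypothetical name. The diff uses `Number.MAX_VALUE` and `Number.MIN_VALUE` as sentinels for a missing run; since `Number.MIN_VALUE` is the smallest positive double rather than the most negative one, this sketch skips missing runs instead.

```javascript
// Hypothetical helper: merges per-run intervals (null for a missing run)
// and clamps the result to the overall bounds plus a 10% margin.
function combinedAdjustedBounds(minOfAllRuns, maxOfAllRuns, intervals) {
    var marginSize = (maxOfAllRuns - minOfAllRuns) * 0.1;
    var present = intervals.filter(function (interval) { return !!interval; });
    var adjustedMin = Math.min.apply(null, present.map(function (interval) { return interval.min; }));
    var adjustedMax = Math.max.apply(null, present.map(function (interval) { return interval.max; }));
    var adjustedMarginSize = (adjustedMax - adjustedMin) * 0.1;
    return {
        min: minOfAllRuns - marginSize,
        max: maxOfAllRuns + marginSize,
        // Never zoom out further than the unadjusted bounds.
        adjustedMin: Math.max(minOfAllRuns - marginSize, adjustedMin - adjustedMarginSize),
        adjustedMax: Math.min(maxOfAllRuns + marginSize, adjustedMax + adjustedMarginSize)
    };
}

// Results interval, no baseline, and a target interval (made-up numbers).
var sampleBounds = combinedAdjustedBounds(0, 100, [{min: 40, max: 50}, null, {min: 60, max: 80}]);
```

Taking the minimum and maximum across runs is what guarantees that the adjusted view covers part of every run that is present.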

public/js/helper-classes.js:

(PerfTestResult.unscaledConfidenceIntervalDelta): Returns the default value if the confidence
interval delta cannot be computed.
(PerfTestResult.isInUnscaledInterval): Added. Returns true iff the confidence interval lies
within the given interval.
(PerfTestRuns..filteredResults): Extracted from unscaledMeansForAllResults now that PerfTestRuns.min and
PerfTestRuns.max need to use both mean and confidence interval delta for each result.
(PerfTestRuns..unscaledMeansForAllResults):
(PerfTestRuns.min): Take the confidence interval delta into account.
(PerfTestRuns.max): Ditto.
(PerfTestRuns.countResults): Returns the number of results in the given time frame (> minTime).
(PerfTestRuns.countResultsInInterval): Returns the number of results whose confidence interval lies within the
given interval.
(PerfTestRuns.exponentialMovingArithmeticMean): Fixed the typo so that it actually computes the EWMA.
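
To illustrate the EWMA fix, here is a small stand-alone sketch comparing the corrected reducer with the pre-fix expression from the diff. `ewma` and `ewmaWithTypo` are hypothetical names, and the sample values are made up.

```javascript
// Corrected form: blends the current value into the running average.
function ewma(values, alpha) {
    return values.reduce(function (movingAverage, currentMean) {
        return alpha * currentMean + (1 - alpha) * movingAverage;
    });
}

// Pre-fix form: currentMean is never used, so the average never moves.
function ewmaWithTypo(values, alpha) {
    return values.reduce(function (movingAverage, currentMean) {
        return alpha * movingAverage + (1 - alpha) * movingAverage;
    });
}

var sampleMeans = [8, 16, 32];
var fixedAverage = ewma(sampleMeans, 0.5);          // 22
var brokenAverage = ewmaWithTypo(sampleMeans, 0.5); // 8, the first sample
```

Because `reduce` is called without an initial value, the first sample seeds the running average. The pre-fix expression then reproduces that seed on every step, which is why it did not compute an EWMA at all.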

Location:

trunk/Websites/perf.webkit.org

Files:

3 edited

ChangeLog (modified) (1 diff)
public/index.html (modified) (1 diff)
public/js/helper-classes.js (modified) (3 diffs)

Legend:

(space) Unmodified
+ Added
- Removed

trunk/Websites/perf.webkit.org/ChangeLog

-                      r166481
+                      r166700
+2014-04-03  Ryosuke Niwa  <rniwa@webkit.org>
+        WebKitPerfMonitor: Y-axis adjustment is too aggressive
+        https://bugs.webkit.org/show_bug.cgi?id=130937
+        Reviewed by Andreas Kling.
+        Previously, adjusted min. and max. were defined as the two standards deviations away from EWMA of measured
+        results. This had two major problems:
+        1. Two standard deviations can be too small to show the confidence interval for results.
+        2. Sometimes baseline and target can be more than two standards deviations away.
+        Fixed the bug by completely rewriting the algorithm to compute the interval. Instead of blindly using two
+        standard deviations as margins, we keep adding quarter the standard deviation on each side until more than 90%
+        of points lie in the interval or we've expanded 4 standard deviations. Once this condition is met, we reduce
+        the margin on each side separately to reduce the empty space on either side.
+        A more rigorous approach would involve computing least squared value of results with respect to intervals
+        but that seems like an overkill for a simple UI problem; it's also computationally expensive.
+        * public/index.html:
+        (Chart..adjustedIntervalForRun): Extracted from computeYAxisBoundsToFitLines.
+        (Chart..computeYAxisBoundsToFitLines): Compute the min. and max. adjusted intervals out of adjusted intervals
+        for each runs (current, baseline, and target) so that at least one point from each set of results is shown.
+        We wouldn't see the difference between measured values versus baseline and target values otherwise.
+        * public/js/helper-classes.js:
+        (PerfTestResult.unscaledConfidenceIntervalDelta): Returns the default value if the confidence
+        interval delta cannot be computed.
+        (PerfTestResult.isInUnscaledInterval): Added. Returns true iff the confidence intervals lies
+        within the given interval.
+        (PerfTestRuns..filteredResults): Extracted from unscaledMeansForAllResults now that PerfTestRuns.min and
+        PerfTestRuns.max need to use both mean and confidence interval delta for each result.
+        (PerfTestRuns..unscaledMeansForAllResults):
+        (PerfTestRuns.min): Take the confidence interval delta into account.
+        (PerfTestRuns.max): Ditto.
+        (PerfTestRuns.countResults): Returns the number of results in the given time frame (> minTime).
+        (PerfTestRuns.countResultsInInterval): Returns the number of results whose confidence interval lie within the
+        given interval.
+        (PerfTestRuns.exponentialMovingArithmeticMean): Fixed the typo so that it actually computes the EWMA.
 2014-03-31  Ryosuke Niwa  <rniwa@webkit.org>

trunk/Websites/perf.webkit.org/public/index.html

-                      r166481
+                      r166700
     };
+    function adjustedIntervalForRun(results, minTime, minRatioToFitInAdjustedInterval) {
+        if (!results)
+            return {min: Number.MAX_VALUE, max: Number.MIN_VALUE};
+        var degreeOfWeightingDecrease = 0.2;
+        var movingAverage = results.exponentialMovingArithmeticMean(minTime, degreeOfWeightingDecrease);
+        var resultsCount = results.countResults(minTime);
+        var adjustmentDelta = results.sampleStandardDeviation(minTime) / 4;
+        var adjustedMin = movingAverage;
+        var adjustedMax = movingAverage;
+        var adjustmentCount;
+        for (adjustmentCount = 0; adjustmentCount < 4 * 4; adjustmentCount++) { // Don't expand beyond 4 standard deviations.
+            adjustedMin -= adjustmentDelta;
+            adjustedMax += adjustmentDelta;
+            if (results.countResultsInInterval(minTime, adjustedMin, adjustedMax) / resultsCount >= minRatioToFitInAdjustedInterval)
+                break;
+        }
+        for (var i = 0; i < adjustmentCount; i++) {
+            if (results.countResultsInInterval(minTime, adjustedMin + adjustmentDelta, adjustedMax) / resultsCount < minRatioToFitInAdjustedInterval)
+                break;
+            adjustedMin += adjustmentDelta;
+        }
+        for (var i = 0; i < adjustmentCount; i++) {
+            if (results.countResultsInInterval(minTime, adjustedMin, adjustedMax - adjustmentDelta) / resultsCount < minRatioToFitInAdjustedInterval)
+                break;
+            adjustedMax -= adjustmentDelta;
+        }
+        return {min: adjustedMin, max: adjustedMax};
+    }
     function computeYAxisBoundsToFitLines(minTime, results, baseline, target) {
-        var stdevOfAllRuns = results.sampleStandardDeviation(minTime);
-        var movingAverage = results.exponentialMovingArithmeticMean(minTime, /* alpha, the degree of weighting decrease */ 0.3);
-        var min = results.min(minTime);
-        var max = results.max(minTime);
+        var minOfAllRuns = results.min(minTime);
+        var maxOfAllRuns = results.max(minTime);
         if (baseline) {
-            min = Math.min(min, baseline.min(minTime));
-            max = Math.max(max, baseline.max(minTime));
+            minOfAllRuns = Math.min(minOfAllRuns, baseline.min(minTime));
+            maxOfAllRuns = Math.max(maxOfAllRuns, baseline.max(minTime));
         }
         if (target) {
-            min = Math.min(min, target.min(minTime));
-            max = Math.max(max, target.max(minTime));
-        }
-        var marginSize = (max - min) * 0.1;
-        return {min: min - marginSize, max: max + marginSize,
-            adjustedMin: Math.min(results.lastResult().mean() - marginSize, Math.max(movingAverage - stdevOfAllRuns * 2, min) - marginSize),
-            adjustedMax: Math.max(results.lastResult().mean() + marginSize, Math.min(movingAverage + stdevOfAllRuns * 2, max) + marginSize) };
+            minOfAllRuns = Math.min(minOfAllRuns, target.min(minTime));
+            maxOfAllRuns = Math.max(maxOfAllRuns, target.max(minTime));
+        }
+        var marginSize = (maxOfAllRuns - minOfAllRuns) * 0.1;
+        var minRatioToFitInAdjustedInterval = 0.9;
+        var intervalForResults = adjustedIntervalForRun(results, minTime, minRatioToFitInAdjustedInterval);
+        var intervalForBaseline = adjustedIntervalForRun(baseline, minTime, minRatioToFitInAdjustedInterval);
+        var intervalForTarget = adjustedIntervalForRun(target, minTime, minRatioToFitInAdjustedInterval);
+        var adjustedMin = Math.min(intervalForResults.min, intervalForBaseline.min, intervalForTarget.min);
+        var adjustedMax = Math.max(intervalForResults.max, intervalForBaseline.max, intervalForTarget.max);
+        var adjsutedMarginSize = (adjustedMax - adjustedMin) * 0.1;
+        return {min: minOfAllRuns - marginSize, max: maxOfAllRuns + marginSize,
+            adjustedMin: Math.max(minOfAllRuns - marginSize, adjustedMin - adjsutedMarginSize),
+            adjustedMax: Math.min(maxOfAllRuns + marginSize, adjustedMax + adjsutedMarginSize)};
     }

trunk/Websites/perf.webkit.org/public/js/helper-classes.js

-                      r166479
+                      r166700
         return runs.scalingFactor() * this.unscaledConfidenceIntervalDelta();
     }
-    this.unscaledConfidenceIntervalDelta = function () {
-        return Statistics.confidenceIntervalDelta(0.95, result.iterationCount, result.sum, result.squareSum);
+    this.unscaledConfidenceIntervalDelta = function (defaultValue) {
+        var delta = Statistics.confidenceIntervalDelta(0.95, result.iterationCount, result.sum, result.squareSum);
+        if (isNaN(delta) && defaultValue !== undefined)
+            return defaultValue;
+        return delta;
     }
+    this.isInUnscaledInterval = function (min, max) {
+        var mean = this.unscaledMean();
+        var delta = this.unscaledConfidenceIntervalDelta(0);
+        return min <= mean - delta && mean + delta <= max;
+    }
     this.isBetterThan = function(other) { return runs.smallerIsBetter() == (this.mean() < other.mean()); }
 …
     this.resultAt = function (i) { return results[i]; }
-    var unscaledMeansCache;
-    var unscaledMeansCacheMinTime;
+    var resultsFilterCache;
+    var resultsFilterCacheMinTime;
+    function filteredResults(minTime) {
+        if (!minTime)
+            return results;
+        if (resultsFilterCacheMinTime != minTime) {
+            resultsFilterCache = results.filter(function (result) { return !minTime || result.build().time() >= minTime; });
+            resultsFilterCacheMinTime = minTime;
+        }
+        return resultsFilterCache;
+    }
     function unscaledMeansForAllResults(minTime) {
-        if (unscaledMeansCacheMinTime == minTime && unscaledMeansCache)
-            return unscaledMeansCache;
-        unscaledMeansCache = results.filter(function (result) { return !minTime || result.build().time() >= minTime; })
-            .map(function (result) { return result.unscaledMean(); });
-        unscaledMeansCacheMinTime = minTime;
-        return unscaledMeansCache;
+        return filteredResults(minTime).map(function (result) { return result.unscaledMean(); });
     }
     this.min = function (minTime) {
-        return this.scalingFactor() * unscaledMeansForAllResults(minTime)
-            .reduce(function (minSoFar, currentMean) { return Math.min(minSoFar, currentMean); }, Number.MAX_VALUE);
+        return this.scalingFactor() * filteredResults(minTime)
+            .reduce(function (minSoFar, result) { return Math.min(minSoFar, result.unscaledMean() - result.unscaledConfidenceIntervalDelta(0)); }, Number.MAX_VALUE);
     }
     this.max = function (minTime, baselineName) {
-        return this.scalingFactor() * unscaledMeansForAllResults(minTime)
-            .reduce(function (maxSoFar, currentMean) { return Math.max(maxSoFar, currentMean); }, Number.MIN_VALUE);
+        return this.scalingFactor() * filteredResults(minTime)
+            .reduce(function (maxSoFar, result) { return Math.max(maxSoFar, result.unscaledMean() + result.unscaledConfidenceIntervalDelta(0)); }, Number.MIN_VALUE);
     }
+    this.countResults = function (minTime) {
+        return unscaledMeansForAllResults(minTime).length;
+    }
+    this.countResultsInInterval = function (minTime, min, max) {
+        var unscaledMin = min / this.scalingFactor();
+        var unscaledMax = max / this.scalingFactor();
+        return filteredResults(minTime).reduce(function (count, currentResult) {
+            return count + (currentResult.isInUnscaledInterval(unscaledMin, unscaledMax) ? 1 : 0); }, 0);
+    }
     this.sampleStandardDeviation = function (minTime) {
 …
         if (!unscaledMeans.length)
             return NaN;
-        return this.scalingFactor() * unscaledMeans.reduce(function (movingAverage, currentMean) { return alpha * movingAverage + (1 - alpha) * movingAverage; });
+        return this.scalingFactor() * unscaledMeans.reduce(function (movingAverage, currentMean) { return alpha * currentMean + (1 - alpha) * movingAverage; });
     }
     this.hasConfidenceInterval = function () { return !isNaN(this.lastResult().unscaledConfidenceIntervalDelta()); }
