Changeset 166700 in webkit


Timestamp:
Apr 3, 2014 12:07:02 AM
Author:
rniwa@webkit.org
Message:

WebKitPerfMonitor: Y-axis adjustment is too aggressive
https://bugs.webkit.org/show_bug.cgi?id=130937

Reviewed by Andreas Kling.

Previously, the adjusted min. and max. were defined as two standard deviations away from the EWMA of measured
results. This had two major problems:

  1. Two standard deviations can be too small to show the confidence interval for results.
  2. Sometimes baseline and target can be more than two standard deviations away.

Fixed the bug by completely rewriting the algorithm to compute the interval. Instead of blindly using two
standard deviations as margins, we keep adding a quarter of the standard deviation on each side until at least 90%
of points lie in the interval or we've expanded by 4 standard deviations. Once this condition is met, we shrink
the margin on each side separately to reduce the empty space on either side.
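
The expand-then-shrink procedure described above can be sketched on plain data as follows. This is a minimal illustration, not the code from this changeset: `countInInterval`, `adjustedInterval`, and the `{mean, delta}` point shape are hypothetical stand-ins for the PerfTestRuns methods, and the center and standard deviation are passed in rather than computed.

```javascript
// Hypothetical stand-in: each point is {mean, delta}, where delta is the
// half-width of its confidence interval (0 when unknown).
function countInInterval(points, min, max) {
    return points.filter(function (p) {
        return min <= p.mean - p.delta && p.mean + p.delta <= max;
    }).length;
}

function adjustedInterval(points, center, stdev, minRatio) {
    var step = stdev / 4;
    var maxSteps = 4 * 4; // Never expand beyond 4 standard deviations.
    var min = center;
    var max = center;
    var steps;
    function fits(lo, hi) {
        return countInInterval(points, lo, hi) / points.length >= minRatio;
    }
    // Phase 1: grow both sides in lockstep until enough points fit.
    for (steps = 0; steps < maxSteps; steps++) {
        min -= step;
        max += step;
        if (fits(min, max))
            break;
    }
    // Phase 2: pull each side back in independently while the ratio still holds.
    for (var i = 0; i < steps && fits(min + step, max); i++)
        min += step;
    for (var j = 0; j < steps && fits(min, max - step); j++)
        max -= step;
    return {min: min, max: max};
}

var points = [10, 11, 12, 12, 13, 13, 12, 11, 12, 30].map(function (m) {
    return {mean: m, delta: 0};
});
var interval = adjustedInterval(points, 10, 4, 0.9);
console.log(interval.min, interval.max); // 9 13; the outlier at 30 stays outside.
```

In this example the lower bound is pulled back from 7 to 9 while the upper bound stays at 13, which is the asymmetric trimming the last sentence of the paragraph refers to.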

A more rigorous approach would involve computing the least-squares value of results with respect to intervals,
but that seems like overkill for a simple UI problem; it's also computationally expensive.

  • public/index.html:

(Chart..adjustedIntervalForRun): Extracted from computeYAxisBoundsToFitLines.
(Chart..computeYAxisBoundsToFitLines): Compute the min. and max. adjusted intervals out of the adjusted intervals
for each run (current, baseline, and target) so that at least one point from each set of results is shown.
We wouldn't see the difference between the measured values and the baseline and target values otherwise.
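
As a rough sketch of how the per-run intervals combine into the final bounds: take the union of the adjusted intervals, pad it by 10%, and clamp it to the padded full range. The names `unionOfIntervals` and `yAxisBounds` are hypothetical, and intervals are plain `{min, max}` objects. Note that in JavaScript `Number.MIN_VALUE` is the smallest positive double rather than the most negative number, so this sketch uses `-Infinity` as the neutral element for `Math.max`; that substitution is a choice of this sketch, not part of the changeset.

```javascript
// Hypothetical helper: the smallest interval covering every interval given.
// Missing runs (e.g. no baseline) are passed as null and skipped.
function unionOfIntervals(intervals) {
    return intervals.reduce(function (union, interval) {
        if (!interval)
            return union;
        return {
            min: Math.min(union.min, interval.min),
            max: Math.max(union.max, interval.max)
        };
    }, {min: Infinity, max: -Infinity});
}

// fullRange covers every result; adjustedIntervals holds one entry per run.
function yAxisBounds(fullRange, adjustedIntervals) {
    var margin = (fullRange.max - fullRange.min) * 0.1;
    var adjusted = unionOfIntervals(adjustedIntervals);
    var adjustedMargin = (adjusted.max - adjusted.min) * 0.1;
    return {
        min: fullRange.min - margin,
        max: fullRange.max + margin,
        // The adjusted view never extends past the full view.
        adjustedMin: Math.max(fullRange.min - margin, adjusted.min - adjustedMargin),
        adjustedMax: Math.min(fullRange.max + margin, adjusted.max + adjustedMargin)
    };
}

var bounds = yAxisBounds({min: 0, max: 100}, [
    {min: 40, max: 60}, // current results
    {min: 50, max: 80}, // baseline
    null                // no target
]);
console.log(bounds.adjustedMin, bounds.adjustedMax); // 36 84
```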

  • public/js/helper-classes.js:

(PerfTestResult.unscaledConfidenceIntervalDelta): Returns the default value if the confidence
interval delta cannot be computed.
(PerfTestResult.isInUnscaledInterval): Added. Returns true iff the confidence interval lies
within the given interval.
(PerfTestRuns..filteredResults): Extracted from unscaledMeansForAllResults now that PerfTestRuns.min and
PerfTestRuns.max need to use both mean and confidence interval delta for each result.
(PerfTestRuns..unscaledMeansForAllResults):
(PerfTestRuns.min): Take the confidence interval delta into account.
(PerfTestRuns.max): Ditto.
(PerfTestRuns.countResults): Returns the number of results in the given time frame (> minTime).
(PerfTestRuns.countResultsInInterval): Returns the number of results whose confidence intervals lie within the
given interval.
(PerfTestRuns.exponentialMovingArithmeticMean): Fixed the typo so that it actually computes the EWMA.
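
To see why the typo mattered, here is a small standalone comparison of the two `reduce` callbacks. The function names `ewma` and `ewmaWithTypo` are hypothetical; only the two arithmetic expressions are taken from the diff.

```javascript
// EWMA via reduce: the first value seeds the average, and every later value
// is blended in with weight alpha. Assumes a non-empty array; reduce without
// an initial value throws on an empty one.
function ewma(values, alpha) {
    return values.reduce(function (movingAverage, current) {
        return alpha * current + (1 - alpha) * movingAverage;
    });
}

// The pre-fix expression, kept for comparison: it never reads `current`.
function ewmaWithTypo(values, alpha) {
    return values.reduce(function (movingAverage, current) {
        return alpha * movingAverage + (1 - alpha) * movingAverage;
    });
}

console.log(ewma([10, 20, 30], 0.5));         // 22.5
console.log(ewmaWithTypo([10, 20, 30], 0.5)); // 10, i.e. just the first value
```

With the typo the "moving average" stays at the first result in the time frame (up to rounding), so the old Y-axis adjustment was centered on the oldest point rather than on a weighted recent average.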

Location:
trunk/Websites/perf.webkit.org
Files:
3 edited

  • trunk/Websites/perf.webkit.org/ChangeLog

    r166481 r166700  
    @@ -1,2 +1,42 @@
    +2014-04-03  Ryosuke Niwa  <rniwa@webkit.org>
    +
    +        WebKitPerfMonitor: Y-axis adjustment is too aggressive
    +        https://bugs.webkit.org/show_bug.cgi?id=130937
    +
    +        Reviewed by Andreas Kling.
    +
    +        Previously, the adjusted min. and max. were defined as two standard deviations away from the EWMA of measured
    +        results. This had two major problems:
    +        1. Two standard deviations can be too small to show the confidence interval for results.
    +        2. Sometimes baseline and target can be more than two standard deviations away.
    +
    +        Fixed the bug by completely rewriting the algorithm to compute the interval. Instead of blindly using two
    +        standard deviations as margins, we keep adding a quarter of the standard deviation on each side until at least 90%
    +        of points lie in the interval or we've expanded by 4 standard deviations. Once this condition is met, we shrink
    +        the margin on each side separately to reduce the empty space on either side.
    +
    +        A more rigorous approach would involve computing the least-squares value of results with respect to intervals,
    +        but that seems like overkill for a simple UI problem; it's also computationally expensive.
    +
    +        * public/index.html:
    +        (Chart..adjustedIntervalForRun): Extracted from computeYAxisBoundsToFitLines.
    +        (Chart..computeYAxisBoundsToFitLines): Compute the min. and max. adjusted intervals out of the adjusted intervals
    +        for each run (current, baseline, and target) so that at least one point from each set of results is shown.
    +        We wouldn't see the difference between the measured values and the baseline and target values otherwise.
    +        * public/js/helper-classes.js:
    +        (PerfTestResult.unscaledConfidenceIntervalDelta): Returns the default value if the confidence
    +        interval delta cannot be computed.
    +        (PerfTestResult.isInUnscaledInterval): Added. Returns true iff the confidence interval lies
    +        within the given interval.
    +        (PerfTestRuns..filteredResults): Extracted from unscaledMeansForAllResults now that PerfTestRuns.min and
    +        PerfTestRuns.max need to use both mean and confidence interval delta for each result.
    +        (PerfTestRuns..unscaledMeansForAllResults):
    +        (PerfTestRuns.min): Take the confidence interval delta into account.
    +        (PerfTestRuns.max): Ditto.
    +        (PerfTestRuns.countResults): Returns the number of results in the given time frame (> minTime).
    +        (PerfTestRuns.countResultsInInterval): Returns the number of results whose confidence intervals lie within the
    +        given interval.
    +        (PerfTestRuns.exponentialMovingArithmeticMean): Fixed the typo so that it actually computes the EWMA.
    +
     2014-03-31  Ryosuke Niwa  <rniwa@webkit.org>

  • trunk/Websites/perf.webkit.org/public/index.html

    r166481 r166700  
    @@ -315,23 +315,56 @@
         };

    +    function adjustedIntervalForRun(results, minTime, minRatioToFitInAdjustedInterval) {
    +        if (!results)
    +            return {min: Number.MAX_VALUE, max: Number.MIN_VALUE};
    +        var degreeOfWeightingDecrease = 0.2;
    +        var movingAverage = results.exponentialMovingArithmeticMean(minTime, degreeOfWeightingDecrease);
    +        var resultsCount = results.countResults(minTime);
    +        var adjustmentDelta = results.sampleStandardDeviation(minTime) / 4;
    +        var adjustedMin = movingAverage;
    +        var adjustedMax = movingAverage;
    +        var adjustmentCount;
    +        for (adjustmentCount = 0; adjustmentCount < 4 * 4; adjustmentCount++) { // Don't expand beyond 4 standard deviations.
    +            adjustedMin -= adjustmentDelta;
    +            adjustedMax += adjustmentDelta;
    +            if (results.countResultsInInterval(minTime, adjustedMin, adjustedMax) / resultsCount >= minRatioToFitInAdjustedInterval)
    +                break;
    +        }
    +        for (var i = 0; i < adjustmentCount; i++) {
    +            if (results.countResultsInInterval(minTime, adjustedMin + adjustmentDelta, adjustedMax) / resultsCount < minRatioToFitInAdjustedInterval)
    +                break;
    +            adjustedMin += adjustmentDelta;
    +        }
    +        for (var i = 0; i < adjustmentCount; i++) {
    +            if (results.countResultsInInterval(minTime, adjustedMin, adjustedMax - adjustmentDelta) / resultsCount < minRatioToFitInAdjustedInterval)
    +                break;
    +            adjustedMax -= adjustmentDelta;
    +        }
    +        return {min: adjustedMin, max: adjustedMax};
    +    }
    +
         function computeYAxisBoundsToFitLines(minTime, results, baseline, target) {
    -        var stdevOfAllRuns = results.sampleStandardDeviation(minTime);
    -        var movingAverage = results.exponentialMovingArithmeticMean(minTime, /* alpha, the degree of weighting decrease */ 0.3);
    -        var min = results.min(minTime);
    -        var max = results.max(minTime);
    -
    +        var minOfAllRuns = results.min(minTime);
    +        var maxOfAllRuns = results.max(minTime);
             if (baseline) {
    -            min = Math.min(min, baseline.min(minTime));
    -            max = Math.max(max, baseline.max(minTime));
    +            minOfAllRuns = Math.min(minOfAllRuns, baseline.min(minTime));
    +            maxOfAllRuns = Math.max(maxOfAllRuns, baseline.max(minTime));
             }
             if (target) {
    -            min = Math.min(min, target.min(minTime));
    -            max = Math.max(max, target.max(minTime));
    -        }
    -
    -        var marginSize = (max - min) * 0.1;
    -        return {min: min - marginSize, max: max + marginSize,
    -            adjustedMin: Math.min(results.lastResult().mean() - marginSize, Math.max(movingAverage - stdevOfAllRuns * 2, min) - marginSize),
    -            adjustedMax: Math.max(results.lastResult().mean() + marginSize, Math.min(movingAverage + stdevOfAllRuns * 2, max) + marginSize) };
    +            minOfAllRuns = Math.min(minOfAllRuns, target.min(minTime));
    +            maxOfAllRuns = Math.max(maxOfAllRuns, target.max(minTime));
    +        }
    +        var marginSize = (maxOfAllRuns - minOfAllRuns) * 0.1;
    +
    +        var minRatioToFitInAdjustedInterval = 0.9;
    +        var intervalForResults = adjustedIntervalForRun(results, minTime, minRatioToFitInAdjustedInterval);
    +        var intervalForBaseline = adjustedIntervalForRun(baseline, minTime, minRatioToFitInAdjustedInterval);
    +        var intervalForTarget = adjustedIntervalForRun(target, minTime, minRatioToFitInAdjustedInterval);
    +        var adjustedMin = Math.min(intervalForResults.min, intervalForBaseline.min, intervalForTarget.min);
    +        var adjustedMax = Math.max(intervalForResults.max, intervalForBaseline.max, intervalForTarget.max);
    +        var adjsutedMarginSize = (adjustedMax - adjustedMin) * 0.1;
    +        return {min: minOfAllRuns - marginSize, max: maxOfAllRuns + marginSize,
    +            adjustedMin: Math.max(minOfAllRuns - marginSize, adjustedMin - adjsutedMarginSize),
    +            adjustedMax: Math.min(maxOfAllRuns + marginSize, adjustedMax + adjsutedMarginSize)};
         }

  • trunk/Websites/perf.webkit.org/public/js/helper-classes.js

    r166479 r166700  
    @@ -9,6 +9,14 @@
             return runs.scalingFactor() * this.unscaledConfidenceIntervalDelta();
         }
    -    this.unscaledConfidenceIntervalDelta = function () {
    -        return Statistics.confidenceIntervalDelta(0.95, result.iterationCount, result.sum, result.squareSum);
    +    this.unscaledConfidenceIntervalDelta = function (defaultValue) {
    +        var delta = Statistics.confidenceIntervalDelta(0.95, result.iterationCount, result.sum, result.squareSum);
    +        if (isNaN(delta) && defaultValue !== undefined)
    +            return defaultValue;
    +        return delta;
    +    }
    +    this.isInUnscaledInterval = function (min, max) {
    +        var mean = this.unscaledMean();
    +        var delta = this.unscaledConfidenceIntervalDelta(0);
    +        return min <= mean - delta && mean + delta <= max;
         }
         this.isBetterThan = function(other) { return runs.smallerIsBetter() == (this.mean() < other.mean()); }
    @@ -170,22 +178,36 @@
         this.resultAt = function (i) { return results[i]; }

    -    var unscaledMeansCache;
    -    var unscaledMeansCacheMinTime;
    +    var resultsFilterCache;
    +    var resultsFilterCacheMinTime;
    +    function filteredResults(minTime) {
    +        if (!minTime)
    +            return results;
    +        if (resultsFilterCacheMinTime != minTime) {
    +            resultsFilterCache = results.filter(function (result) { return !minTime || result.build().time() >= minTime; });
    +            resultsFilterCacheMinTime = minTime;
    +        }
    +        return resultsFilterCache;
    +    }
    +
         function unscaledMeansForAllResults(minTime) {
    -        if (unscaledMeansCacheMinTime == minTime && unscaledMeansCache)
    -            return unscaledMeansCache;
    -        unscaledMeansCache = results.filter(function (result) { return !minTime || result.build().time() >= minTime; })
    -            .map(function (result) { return result.unscaledMean(); });
    -        unscaledMeansCacheMinTime = minTime;
    -        return unscaledMeansCache;
    +        return filteredResults(minTime).map(function (result) { return result.unscaledMean(); });
         }

         this.min = function (minTime) {
    -        return this.scalingFactor() * unscaledMeansForAllResults(minTime)
    -            .reduce(function (minSoFar, currentMean) { return Math.min(minSoFar, currentMean); }, Number.MAX_VALUE);
    +        return this.scalingFactor() * filteredResults(minTime)
    +            .reduce(function (minSoFar, result) { return Math.min(minSoFar, result.unscaledMean() - result.unscaledConfidenceIntervalDelta(0)); }, Number.MAX_VALUE);
         }
         this.max = function (minTime, baselineName) {
    -        return this.scalingFactor() * unscaledMeansForAllResults(minTime)
    -            .reduce(function (maxSoFar, currentMean) { return Math.max(maxSoFar, currentMean); }, Number.MIN_VALUE);
    +        return this.scalingFactor() * filteredResults(minTime)
    +            .reduce(function (maxSoFar, result) { return Math.max(maxSoFar, result.unscaledMean() + result.unscaledConfidenceIntervalDelta(0)); }, Number.MIN_VALUE);
    +    }
    +    this.countResults = function (minTime) {
    +        return unscaledMeansForAllResults(minTime).length;
    +    }
    +    this.countResultsInInterval = function (minTime, min, max) {
    +        var unscaledMin = min / this.scalingFactor();
    +        var unscaledMax = max / this.scalingFactor();
    +        return filteredResults(minTime).reduce(function (count, currentResult) {
    +            return count + (currentResult.isInUnscaledInterval(unscaledMin, unscaledMax) ? 1 : 0); }, 0);
         }
         this.sampleStandardDeviation = function (minTime) {
    @@ -197,5 +219,5 @@
             if (!unscaledMeans.length)
                 return NaN;
    -        return this.scalingFactor() * unscaledMeans.reduce(function (movingAverage, currentMean) { return alpha * movingAverage + (1 - alpha) * movingAverage; });
    +        return this.scalingFactor() * unscaledMeans.reduce(function (movingAverage, currentMean) { return alpha * currentMean + (1 - alpha) * movingAverage; });
         }
         this.hasConfidenceInterval = function () { return !isNaN(this.lastResult().unscaledConfidenceIntervalDelta()); }