[[PageOutline]]

= What is a Performance Test =

A performance test measures the run-time performance and memory usage of WebKit. Unlike [http://www.webkit.org/quality/testing.html regression tests] (a.k.a layout tests) or conformance tests such as of W3C's, it doesn't necessarily test the correctness of WebKit features. Since the run-time and memory used by each run of the same test may vary, we can't conclude whether a given performance test passed or failed by just looking at a single run. For this reason, performance tests yields "values" such as the time taken to run the test instead of simple PASS and FAIL.

= Performance Test Results  =

We have continuous performance test bots on [http://build.webkit.org/waterfall?show=Apple%20MountainLion%20Release%20%28Perf%29&show=Apple%20Mavericks%20Release%20%28Perf%29&show=EFL%20Linux%2064-bit%20Release%20WK2%20%28Perf%29&show=GTK%20Linux%2064-bit%20Release%20%28Perf%29 build.webkit.org]. You can see test results submitted by these bots on http://perf.webkit.org/

{{{#!td style="padding: 1em;"
||''' The waterfall of the Performance bots on the [http://build.webkit.org/waterfall?show=Apple%20MountainLion%20Release%20%28Perf%29&show=Apple%20Mavericks%20Release%20%28Perf%29&show=EFL%20Linux%2064-bit%20Release%20WK2%20%28Perf%29&show=GTK%20Linux%2064-bit%20Release%20%28Perf%29 Buildbot page]'''  || '''Platform name on the [http://perf.webkit.org results page]''' ||
|| [http://build.webkit.org/waterfall?show=Apple%20Mavericks%20Release%20%28Perf%29 Apple Mavericks Release (Perf)] || [https://perf.webkit.org/#mode=charts&chartList=%5b%5b%22mac-mavericks%22%2C%22DoYouEvenBench%2FFull%3ATime%3ATotal%22%5d%2C%5b%22mac-mavericks%22%2C%22Parser%2Fhtml5-full-render%3ATime%22%5d%5d mac-mavericks] ||
|| [http://build.webkit.org/waterfall?show=Apple%20MountainLion%20Release%20%28Perf%29 Apple MountainLion Release (Perf)] || [https://perf.webkit.org/#mode=charts&chartList=%5b%5b%22mac-mountainlion%22%2C%22DoYouEvenBench%2FFull%3ATime%3ATotal%22%5d%2C%5b%22mac-mountainlion%22%2C%22Parser%2Fhtml5-full-render%3ATime%22%5d%5d mac-mountainlion] ||
|| [http://build.webkit.org/waterfall?show=EFL%20Linux%2064-bit%20Release%20WK2%20%28Perf%29 EFL Linux 64-bit Release WK2 (Perf)] || [https://perf.webkit.org/#mode=charts&chartList=%5b%5b%22efl%22%2C%22DoYouEvenBench%2FFull%3ATime%3ATotal%22%5d%2C%5b%22efl%22%2C%22Parser%2Fhtml5-full-render%3ATime%22%5d%5d efl] ||
|| [http://build.webkit.org/waterfall?show=GTK%20Linux%2064-bit%20Release%20%28Perf%29 GTK Linux 64-bit Release (Perf)] || [https://perf.webkit.org/#mode=charts&chartList=%5b%5b%22gtk%22%2C%22DoYouEvenBench%2FFull%3ATime%3ATotal%22%5d%2C%5b%22gtk%22%2C%22Parser%2Fhtml5-full-render%3ATime%22%5d%5d gtk] ||
}}}

= How to Run Performance Tests =

WebKit's performance tests can be run by [http://trac.webkit.org/browser/trunk/Tools/Scripts/run-perf-tests run-perf-tests]. Specify a list of directories or performance tests to run a subset. e.g. {{{run-perf-tests PerformanceTests/DOM}}} or {{{run-perf-tests DOM}}} will only run tests in [http://trac.webkit.org/browser/trunk/PerformanceTests/DOM]. It will automatically build DumpRenderTree and WebKitTestRunner as needed just like {{{run-webkit-tests}}}.

== Reducing noise on your machine ==

Before running performance tests, you may want to reboot your machine, disable screen savers and power saving features, and turn off anti-virus software to reduce the potential noise.
Also disable network, bluetooth, and other network and wireless devices as they might cause undesirable CPU interrupts and context switches.

On Mac, you can run the following command to disable Spotlight:
{{{
sudo launchctl unload -w /System/Library/LaunchDaemons/com.apple.metadata.mds.plist
}}}
(To re-enable, run the same command with {{{load}}} in place of {{{unload}}})

== Aggregating and Comparing Results ==

If you're running a performance test locally in order to verify your patch doesn't regress or improves the performance of WebKit, you may find {{{--output-json-path}}} useful. Specify a file path such as {{{perf-test.json}}} and {{{run-perf-tests}}} will automatically store the results in the JSON file and creates {{{perf-test.html}}} that visualizes the test results. Execute {{{run-perf-tests}}} multiple times with the same output JSON path and it will automatically aggregate results in the JSON and the corresponding HTML document.

Suppose we have two WebKit checkouts: one without a patch and another one with the patch applied. By executing {{{run-perf-tests --output-json-path=/Users/WebKitten/perf-test.json}}} in both checkouts, we can easily compare the test results from two runs by opening {{{~/perf-test.html}}}.

You can also specify a build directory as follows along with the output JSON path:
{{{
run-perf-tests --no-build --build-directory /Users/WebKitten/MyCustomBuild/ --output-json-path=/Users/WebKitten/perf-test.json
}}}
This allows you to compare results from different builds without having to locally build DumpRenderTree or WebKitTestRunner.

=== Bisecting regressions ===

Suppose you're bisecting a regression for a performance regression on [http://trac.webkit.org/browser/trunk/PerformanceTests/Bindings/node-list-access.html Bindings/node-list-access.html] as seen [http://webkit-perf.appspot.com/graph.html#tests=%5B%5B2966378%2C2001%2C32196%5D%5D&sel=1343895254495.4062,1344094768078.8474,116.85483870967741,148.30645161290323&displayrange=30&datatype=running here]. Looking at the graph we see that the culprit lies [http://trac.webkit.org/log/?rev=124582&stop_rev=124567&verbose=on "between r124567 and r124582"].

To bisect this regression, I create two WebKit checkouts one synced to r124567 and another synced to r124582, and run the following commands in each checkout:
{{{
svn up PerformanceTests
svn up Tools/Scripts/webkitpy/performance_tests
Tools/Scripts/build-webkit
Tools/Scripts/run-perf-tests --output-json-path=/Users/WebKitten/Desktop/node-list-access.json PerformanceTests/Bindings/node-list-access.html
}}}

This step automatically produces {{{/Users/WebKitten/Desktop/node-list-access.html}}} for me to compare the results, each results labeled r124567 and r124582 (you can use {{{--description}}} option to annotate the results further) and I can confirm whether the regression reproduces locally or not. Sometimes, regression doesn't produce on your local machine due to differences in environment such as compilers used, memory size, and CPU speed.

Once I confirmed that the regression is reproducible on my machine, I can start bisecting builds. Here, I sync the checkout initially synced to r124582 to a slightly older version, say, r124580 and generate results again as follows:
{{{
svn up -r 124580
svn up PerformanceTests
svn up Tools/Scripts/webkitpy/performance_tests
Tools/Scripts/build-webkit
Tools/Scripts/run-perf-tests --output-json-path=/Users/WebKitten/Desktop/node-list-access.json PerformanceTests/Bindings/node-list-access.html
}}}
I repeat this process until the results recovers to the level we had at r124567, at which I identified the culprit. I don't typically do a strict binary search on perf. regressions because that typically results to avoid rebuilding the entire WebKit all the time.

= Writing a Performance Test Using runner.js =

The easiest way to write a performance test is using [http://trac.webkit.org/browser/trunk/PerformanceTests/resources/runner.js runner.js], which provides {{{PerfTestRunner}}} with various utility functions.  Once you wrote a test, put it inside [http://trac.webkit.org/browser/trunk/PerformanceTests PerformanceTests] directory to be ran by run-perf-tests and performance bots.

== Measuring Runs Per Second ==

Our preferred method of measurement is runs (function calls) per second.  With runner.js, we can measure this metric by calling {{{PerfTestRunner.measureRunsPerSecond}}} with a test function. {{{PerfTestRunner.measureRunsPerSecond}}} measures the time of times {{{run}}} function could be called in one second, and reports the statistics after repeating it 20 times (configurable via {{{run-perf-tests}}}). The statistics includes arithmetic mean, standard deviation, median, minimum, and maximum values.

For example, see [http://trac.webkit.org/browser/trunk/PerformanceTests/Parser/tiny-innerHTML.html Parser/tiny-innerHTML.html]:
{{{
<!DOCTYPE html>
<body>
<script src="../resources/runner.js"></script>
<script>
PerfTestRunner.measureRunsPerSecond({run:function() {
    var testDiv = document.createElement("div");
    testDiv.style.display = "none";
    document.body.appendChild(testDiv);
    for (var x = 0; x < 100000; x++) {
        testDiv.innerHTML = "This is a tiny HTML document";
    }
    document.body.removeChild(testDiv);
}});
</script>
</body>
}}}

== Measuring Time ==

In some tests, however, we cannot call the {{{run}}} function for an arbitrary number of times as done in {{{measureRunsPerSecond}}}.  In those tests, we can use {{{PerfTestRunner.measureTime}}} to measure the time {{{run}}} took to execute. {{{measureTime}}} calls the specified function once in each iteration and runs 20 iterations by default.

Note that the runtime of a function gets smaller relative to the granularity of time measurement we can make as the WebKit's performance (or of the machines that run performance tests) improves.

== Measuring Asynchronous Results ==

In some tests such as ones that measure fps, values are measured asynchronously.  In those tests, we can use {{{PerfTestRunner.prepareToMeasureValuesAsync}}} and {{{PerfTestRunner.measureValueAsync}}} to report measured value at an arbitrary time.  At the beginning of a test, call {{{PerfTestRunner.prepareToMeasureValuesAsync}}} with an object with {{{unit}}} property, which specifies the name of the unit (either one of "ms", "fps", or "runs/s").  Call {{{PerfTestRunner.measureValueAsync}}} as a newly measured value comes in.  Once enough values are measured (20 by default), {{{PerfTestRunner.measureValueAsync}}} will automatically stop the test; do not expect or manually track the number of iterations in your test as this must be configurable via {{{run-perf-tests}}}.

For example, see [http://trac.webkit.org/browser/trunk/PerformanceTests/Interactive/SelectAll.html Interactive/SelectAll.html]:
{{{
<!DOCTYPE html>
<html>
<body>
<script src="../resources/runner.js"></script>
<script>

PerfTestRunner.prepareToMeasureValuesAsync({
    unit: 'ms',
    done: function () {
        var iframe = document.querySelector('iframe');
        iframe.parentNode.removeChild(iframe);
    }
});

function runTest() {
    var iframe = document.querySelector('iframe');
    iframe.contentWindow.getSelection().removeAllRanges();
    iframe.contentDocument.body.offsetTop;

    setTimeout(function () {
        var startTime = PerfTestRunner.now();
        iframe.contentDocument.execCommand('SelectAll');
        iframe.contentDocument.body.offsetTop;
        setTimeout(function () {
            if (!PerfTestRunner.measureValueAsync(PerfTestRunner.now() - startTime))
                return;
            PerfTestRunner.gc();
            setTimeout(runTest, 0);
        }, 0);
    }, 0);
}

</script>
<iframe src="../Parser/resources/html5.html" onload="runTest()" width="800" height="600">
</body>
</html>
}}}

== Optional Arguments ==

{{{measureRunsPerSecond}}}, {{{measureTime}}}, and {{{PerfTestRunner.prepareToMeasureValuesAsync}}} described above optionally support the following arguments (as properties in the object they take):
 * {{{setup}}} - In {{{measureRunsPerSecond}}} and {{{measureTime}}}, this function gets called before the start of each iteration.  With {{{measureRunsPerSecond}}}, this function gets called exactly once before {{{run}}} gets called many times in each iteration.  If there is some work to be done before {{{run}}} can be called each time, use {{{measureTime}}} instead.  {{{PerfTestRunner.prepareToMeasureValuesAsync}}} ignores this function as it can't know when the next iteration starts.
 * {{{done}}} - This function gets called once all iterations are finished.
 * {{{description}}} - The description of the test.  {{{run-perf-tests}}} will print out this text.


= Profiling Performance Tests =

{{{run-perf-tests --profile}}} can be used to attach and run the default platform CPU profiler against the provided test(s).

Additionally, the {{{--profiler=PROFILER}}} option can select which profiler to use from the built-in profilers:

|| perf || linux (default), chromium-android (default) ||
|| iprofiler || mac (default) ||
|| sample || mac ||
|| pprof || mac, linux (chromium-only, requires using tcmalloc) ||

For perf and pprof profilers {{{--profile}}} provides per-test "10 hottest functions" output, which is useful for obtaining a high level overview of where the test is spending it's time.  This has been surprisingly helpful for finding hot non-inlined functions, or other low-hanging fruit.

{{{
% run-perf-tests --profile
Running 113 tests
Running Animation/balls.html (1 of 113)
Finished: 3.079851 s

[ perf record: Woken up 3 times to write data ]
[ perf record: Captured and wrote 0.678 MB /src/WebKit/Source/WebKit/chromium/webkit/Release/layout-test-results/test-44.data (~29642 samples) ]
     5.99%  DumpRenderTree  perf-5981.map            [.] 0x250f5ef06321  
     2.46%  DumpRenderTree  DumpRenderTree           [.] v8::internal::FastDtoa(double, v8::internal::FastDtoaMode, int, v8::internal::Vector<char>, int*, int*)
     1.86%  DumpRenderTree  DumpRenderTree           [.] WebCore::Length::initFromLength(WebCore::Length const&)
     1.74%  DumpRenderTree  DumpRenderTree           [.] WebCore::RenderStyle::diff(WebCore::RenderStyle const*, unsigned int&) const
     1.69%  DumpRenderTree  libfreetype.so.6.8.0     [.] 0x493ab         
     1.35%  DumpRenderTree  DumpRenderTree           [.] tc_free
     1.30%  DumpRenderTree  [kernel.kallsyms]        [k] 0xffffffff8103b51a
     1.27%  DumpRenderTree  DumpRenderTree           [.] tc_malloc
     1.25%  DumpRenderTree  DumpRenderTree           [.] v8::internal::JSObject::SetPropertyWithInterceptor(v8::internal::String*, v8::internal::Object*, PropertyAttributes, v8::internal::StrictModeFlag)
     1.22%  DumpRenderTree  DumpRenderTree           [.] WTF::double_conversion::Strtod(WTF::double_conversion::Vector<char const>, int)

To view the full profile, run:
perf report -i /src/WebKit/Source/WebKit/chromium/webkit/Release/layout-test-results/test-44.data
}}}
(Note: perf prints out a bunch of kernel-related warnings which I've stripped from the above output sample.)

For more in-depth analysis, all profilers print instructions after profiling as to how to explore the full sample data, as shown above.

== Adding support for a new profiler==

{{{run-perf-tests --profile --profiler=PROFILER}}} uses Profilers provided by [http://trac.webkit.org/browser/trunk/Tools/Scripts/webkitpy/common/system/profiler.py webkitpy.common.system.profiler], except for {{{perf}}} on Android which is currently implemented in [http://trac.webkit.org/browser/trunk/Tools/Scripts/webkitpy/layout_tests/port/chromium_android.py chromium_android.py].  Profilers should conform to the {{{Profiler}}} class interface, and {{{ProfilerFactory}}} needs to know how to create one.  All of this is defined in profiler.py.   If you have any trouble, email eric@webkit.