[[PageOutline]]

= What is a Performance Test =

A performance test measures the run-time performance and memory usage (coming!) of WebKit. Unlike [http://www.webkit.org/quality/testing.html regression tests] (a.k.a. layout tests) or conformance tests such as the W3C's, it doesn't necessarily test the correctness of WebKit features. Since the run time and memory used by each run of the same test may vary, we can't conclude whether a given performance test passed or failed by looking at a single run. For this reason, performance tests yield "values", such as the time taken to run the test, instead of a simple PASS or FAIL.

= Performance Test Results =

You can find the performance results on http://webkit-perf.appspot.com/

The performance bots are listed on the [http://build.webkit.org/waterfall?show=Apple%20Lion%20Release%20%28Perf%29&show=Apple%20MountainLion%20Release%20%28Perf%29&show=Chromium%20Linux%20Release%20%28Perf%29&show=Chromium%20Mac%20Release%20%28Perf%29&show=Chromium%20Win%20Release%20%28Perf%29&show=Qt%20Linux%2064-bit%20Release%20%28Perf%29&show=Qt%20Linux%2064-bit%20Release%20%28WebKit2%20Perf%29 buildbot page].

{{{#!td style="padding: 1em;"
||'''The waterfall of the performance bots on the [http://build.webkit.org/waterfall?show=Apple%20Lion%20Release%20%28Perf%29&show=Apple%20MountainLion%20Release%20%28Perf%29&show=Chromium%20Linux%20Release%20%28Perf%29&show=Chromium%20Mac%20Release%20%28Perf%29&show=Chromium%20Win%20Release%20%28Perf%29&show=Qt%20Linux%2064-bit%20Release%20%28Perf%29&show=Qt%20Linux%2064-bit%20Release%20%28WebKit2%20Perf%29 Buildbot page]''' || '''Platform name on the [http://webkit-perf.appspot.com/ Perf-o-matic results page]''' ||
|| [http://build.webkit.org/waterfall?show=Apple%20Lion%20Release%20(Perf) Apple Lion Release (Perf)] || [http://webkit-perf.appspot.com/#displayrange=7&branch=WebKit+trunk&platform=%5B%22Mac+Lion%22%5D&test=%5B%22Parser%2Fhtml5-full-render%22%5D Mac Lion] ||
|| [http://build.webkit.org/waterfall?show=Apple%20MountainLion%20Release%20(Perf) Apple MountainLion Release (Perf)] || [http://webkit-perf.appspot.com/#displayrange=7&branch=WebKit+trunk&platform=%5B%22Mac+MountainLion%22%5D&test=%5B%22Parser%2Fhtml5-full-render%22%5D Mac MountainLion] ||
|| [http://build.webkit.org/waterfall?show=Chromium%20Linux%20Release%20(Perf) Chromium Linux Release (Perf)] || [http://webkit-perf.appspot.com/#displayrange=7&branch=WebKit+trunk&platform=%5B%22Chromium+Linux%22%5D&test=%5B%22Parser%2Fhtml5-full-render%22%5D Chromium Linux] ||
|| [http://build.webkit.org/waterfall?show=Chromium%20Mac%20Release%20(Perf) Chromium Mac Release (Perf)] || [http://webkit-perf.appspot.com/#displayrange=7&branch=WebKit+trunk&platform=%5B%22Chromium+Mac%22%5D&test=%5B%22Parser%2Fhtml5-full-render%22%5D Chromium Mac] ||
|| [http://build.webkit.org/waterfall?show=Chromium%20Win%20Release%20(Perf) Chromium Win Release (Perf)] || [http://webkit-perf.appspot.com/#displayrange=7&branch=WebKit+trunk&platform=%5B%22Chromium+Win%22%5D&test=%5B%22Parser%2Fhtml5-full-render%22%5D Chromium Win] ||
|| [http://build.webkit.org/waterfall?show=Qt%20Linux%2064-bit%20Release%20(Perf) Qt Linux 64-bit Release (Perf)] || [http://webkit-perf.appspot.com/#displayrange=7&branch=WebKit+trunk&platform=%5B%22Qt+5.0%22%5D&test=%5B%22Parser%2Fhtml5-full-render%22%5D Qt 5.0] ||
|| [http://build.webkit.org/waterfall?show=Qt%20Linux%2064-bit%20Release%20(WebKit2%20Perf) Qt Linux 64-bit Release (WebKit2 Perf)] || [http://webkit-perf.appspot.com/#displayrange=7&branch=WebKit+trunk&platform=%5B%22Qt+5.0+WebKit+2%22%5D&test=%5B%22Parser%2Fhtml5-full-render%22%5D Qt 5.0 WebKit 2] ||
}}}

= How to Run Performance Tests =

WebKit's performance tests can be run with [http://trac.webkit.org/browser/trunk/Tools/Scripts/run-perf-tests run-perf-tests]. Specify a list of directories or performance tests to run a subset; e.g. {{{run-perf-tests PerformanceTests/DOM}}} or {{{run-perf-tests DOM}}} will only run the tests in [http://trac.webkit.org/browser/trunk/PerformanceTests/DOM PerformanceTests/DOM]. It will automatically build DumpRenderTree and WebKitTestRunner as needed, just like {{{run-webkit-tests}}}.

We also have performance bots on http://build.webkit.org/, which submit results to http://webkit-perf.appspot.com/

== Reducing noise on your machine ==

Before running performance tests, you may want to reboot your machine, disable screen savers and power-saving features, and turn off anti-virus software to reduce potential noise. Also disable the network, Bluetooth, and other network and wireless devices, as they might cause undesirable CPU interrupts and context switches.
On Mac, you can run the following command to disable Spotlight:
{{{
sudo launchctl unload -w /System/Library/LaunchDaemons/com.apple.metadata.mds.plist
}}}
(To re-enable it, run {{{sudo launchctl load -w /System/Library/LaunchDaemons/com.apple.metadata.mds.plist}}}.)

== Aggregating and Comparing Results ==

If you're running a performance test locally to verify that your patch doesn't regress, or that it improves, the performance of WebKit, you may find {{{--output-json-path}}} useful. Specify a file path such as {{{perf-test.json}}}, and {{{run-perf-tests}}} will automatically store the results in the JSON file and create {{{perf-test.html}}}, which visualizes the test results. Execute {{{run-perf-tests}}} multiple times with the same output JSON path, and it will automatically aggregate the results in the JSON file and the corresponding HTML document.

Suppose we have two WebKit checkouts: one without the patch and another with the patch. By executing {{{run-perf-tests --output-json-path=/Users/WebKitten/perf-test.json}}} in both checkouts, we can easily compare the test results from the two runs by opening {{{~/perf-test.html}}}.

You can also specify a build directory along with the output JSON path as follows:
{{{
run-perf-tests --no-build --build-directory /Users/WebKitten/MyCustomBuild/ --output-json-path=/Users/WebKitten/perf-test.json
}}}

This allows you to compare results from different builds without having to locally build DumpRenderTree or WebKitTestRunner.

=== Bisecting regressions ===

Suppose you're bisecting a performance regression on [http://trac.webkit.org/browser/trunk/PerformanceTests/Bindings/node-list-access.html Bindings/node-list-access.html] as seen [http://webkit-perf.appspot.com/graph.html#tests=%5B%5B2966378%2C2001%2C32196%5D%5D&sel=1343895254495.4062,1344094768078.8474,116.85483870967741,148.30645161290323&displayrange=30&datatype=running here].
Looking at the graph, we see that the culprit lies [http://trac.webkit.org/log/?rev=124582&stop_rev=124567&verbose=on "between r124567 and r124582"].

To bisect this regression, I prepare two WebKit checkouts, one synced to r124567 and another synced to r124582, and run the following commands in each checkout:
{{{
svn up PerformanceTests
svn up Tools/Scripts/webkitpy/performance_tests
Tools/Scripts/build-webkit
Tools/Scripts/run-perf-tests --output-json-path=/Users/WebKitten/Desktop/node-list-access.json PerformanceTests/Bindings/node-list-access.html
}}}

This step automatically produces {{{/Users/WebKitten/Desktop/node-list-access.html}}}, in which each set of results is labeled with its revision (r124567 or r124582), so I can compare them and confirm whether the regression reproduces locally. You can use the {{{--description}}} option to annotate the results further. Sometimes a regression doesn't reproduce on your local machine due to differences in environment, such as the compiler used, memory size, and CPU speed.

Once I've confirmed that the regression is reproducible on my machine, I start bisecting builds. Here, I sync the checkout initially synced to r124582 to a slightly older revision, say r124580, and generate results again as follows:
{{{
svn up -r 124580
svn up PerformanceTests
svn up Tools/Scripts/webkitpy/performance_tests
Tools/Scripts/build-webkit
Tools/Scripts/run-perf-tests --output-json-path=/Users/WebKitten/Desktop/node-list-access.json PerformanceTests/Bindings/node-list-access.html
}}}

I repeat this process until the results recover to the level we had at r124567, at which point I have identified the culprit. I don't typically do a strict binary search on performance regressions because that typically results in spending a lot of time rebuilding WebKit.
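
The walk-back described in this section can be sketched in Python. This is only an illustration of the procedure, not part of {{{run-perf-tests}}}: the names {{{walk_back}}}, {{{has_regression}}}, and {{{step}}} are hypothetical, and {{{has_regression}}} stands in for the manual work of syncing to a revision, rebuilding, running the test, and judging the results page by eye.

```python
def walk_back(good_rev, bad_rev, has_regression, step=2):
    """Step from bad_rev toward good_rev until the regression disappears.

    Returns (last_good, first_bad); the culprit r satisfies
    last_good < r <= first_bad.
    """
    first_bad = bad_rev
    rev = bad_rev
    while rev > good_rev:
        # Move to a slightly older revision, never past the known-good one.
        rev = max(good_rev, rev - step)
        # good_rev is already known to be good, so it is not tested again.
        if rev == good_rev or not has_regression(rev):
            return rev, first_bad
        first_bad = rev
    return good_rev, first_bad


# Pretend the regression was introduced in r124575 (made up for this example).
pretend_culprit = 124575
last_good, first_bad = walk_back(124567, 124582, lambda rev: rev >= pretend_culprit)
print("culprit is after r%d and no later than r%d" % (last_good, first_bad))
```

A small {{{step}}} keeps each rebuild incremental, which is the reason given above for preferring this over a strict binary search.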

= Writing a Performance Test Using runner.js =

The easiest way to write a performance test is to use [http://trac.webkit.org/browser/trunk/PerformanceTests/resources/runner.js runner.js], which provides {{{PerfTestRunner}}} with various utility functions. The simplest way to use this class is to call {{{PerfTestRunner.runPerSecond}}} with a test function. For example, see [http://trac.webkit.org/browser/trunk/PerformanceTests/Parser/tiny-innerHTML.html Parser/tiny-innerHTML.html]:
{{{
}}}

{{{PerfTestRunner.runPerSecond}}} measures the number of times the {{{run}}} function can be called in one second, and reports the statistics after repeating the measurement 20 times. The statistics include the arithmetic mean, standard deviation, median, minimum, and maximum values. You can optionally specify the number of iterations (defaults to 20) and the test description as follows:
{{{
PerfTestRunner.runPerSecond({
    runCount: 50,
    description: "Test with lots of iterations",
    run: function () {...}
});
}}}

There is also {{{PerfTestRunner.run}}}, which measures the run time of a function. The use of this method is '''discouraged''' because, as the performance of WebKit (or of the machines that run performance tests) improves, the measured time gets smaller relative to the granularity of the time measurement we can make. Nonetheless, it has been used in some animation tests where we cannot dynamically adjust the number of function calls as is done in {{{runPerSecond}}}. {{{run}}} calls the specified function 10 times in each iteration and runs 20 iterations by default. You can optionally specify how many times the function is called in each iteration and how many iterations are executed. You can also specify a function to be called after all iterations.
{{{
PerfTestRunner.run(runFunction, callsPerIteration, numberOfIterations, doneFunction)
}}}

Once you have written a test, put it inside the [http://trac.webkit.org/browser/trunk/PerformanceTests PerformanceTests] directory so that it is run by run-perf-tests and the performance bots.
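
For readers who want to check reported numbers by hand, the five statistics listed in this section (arithmetic mean, standard deviation, median, minimum, and maximum) can be reproduced with a short Python sketch. This is not the runner.js implementation: the {{{summarize}}} helper and the sample values are made up for illustration, and the use of the sample standard deviation (n - 1 denominator) is an assumption about how the deviation is computed.

```python
import statistics


def summarize(values):
    """Return the five summary statistics for one test's measurements."""
    return {
        "mean": statistics.mean(values),
        # Sample standard deviation (n - 1 denominator); an assumption.
        "stdev": statistics.stdev(values),
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
    }


# Made-up measurements (runs per second), not real WebKit results.
samples = [98.0, 102.0, 100.0, 101.0, 99.0]
summary = summarize(samples)
for name in ("mean", "stdev", "median", "min", "max"):
    print("%s= %s" % (name, summary[name]))
```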

= Replay Performance Tests =

Replay tests are highly "experimental" page loading tests. Historically, Apple and Google have used PLT and [http://www.chromium.org/developers/testing/page-cyclers Page Cycler Tests], but they could not be part of WebKit or otherwise publicly distributed because they contain copyrighted material. WebKit replay tests work around this problem by measuring the page loading time of web pages on the [http://www.archive.org Internet Archive], using local caches provided by [http://code.google.com/p/web-page-replay/ web-page-replay] as a Web proxy.

A replay test consists of a single text file with a URL in it. For example, as of July 27th, 2012, [http://trac.webkit.org/browser/trunk/PerformanceTests/Replay/English/digg.com.replay digg.com.replay] contains:
{{{
http://web.archive.org/web/20100730073647/http://digg.com/
}}}

{{{run-perf-tests}}} creates {{{digg.com.wpr}}} and {{{digg.com-expected.png}}} when preparing the local cache, and creates {{{digg.com-actual.png}}} as it runs the test.

=== How to Run Replay Tests ===

Replay tests are currently supported on the Mac port, and on the Chromium port on Mac and Linux.

To run the tests, you must set the local proxy to localhost at port 8080 for HTTP and port 8443 for HTTPS. This allows DumpRenderTree or WebKitTestRunner to talk to web-page-replay, which caches pages locally, instead of directly accessing archive.org. Exclude {{{*.googlecode.com}}} from the proxy, as web-page-replay needs to be downloaded from Google Code on the initial run.

 - On Mac, the proxy can be set at System Preferences > Network > Advanced > Proxies.
 - On Linux, the proxy can be set with the {{{$http_proxy}}}, {{{$https_proxy}}}, and {{{$no_proxy}}} (specifies hosts to be excluded) environment variables.

Once the proxy is set up, run {{{run-perf-tests --replay}}}. Since all replay tests are located in {{{PerformanceTests/Replay}}}, you can run only the replay tests with {{{run-perf-tests --replay PerformanceTests/Replay}}}.
{{{run-perf-tests}}} first prepares local caches using web-page-replay's record mode, and then makes 20 measurements of page load times using the play mode.

Make sure that the .wpr files created for each test contain actual content. For example, if a .wpr file is smaller than 100KB, it's likely that the test runner is accessing the remote servers directly and not going through web-page-replay. You can also make sure that tests are running properly by comparing the contents of {{{digg.com-expected.png}}} and {{{digg.com-actual.png}}}. Unfortunately, this image comparison cannot be automated, because the image contains copyrighted material (which prevents it from being checked into the SVN repository) and it changes as WebKit is updated.

=== How to Write a Replay Test ===

To write a new replay test for a URL, visit the [http://www.archive.org/ Internet Archive] and look for an archive of the URL. If there is no archive for the URL, then we cannot create a replay test for this page. Also, if the archive doesn't contain a significant amount of essential non-HTML content such as images, CSS, and plugins, it may not be suitable as a replay test.

Once you've obtained an archive.org URL, create a .replay file in [http://trac.webkit.org/browser/trunk/PerformanceTests/Replay/ PerformanceTests/Replay] and run {{{run-perf-tests --replay }}} (don't forget to set up the proxy). Look for any errors web-page-replay reports. For example, a failure to inject a script is a very common error and can be ignored in most cases. However, "pipe broken" errors and other Python exceptions tend to indicate that the content is not being served properly via web-page-replay. If these errors occur, try other archives of the same URL.

When the tests finish successfully without errors, look at the mean and the standard deviation of the test. If the standard deviation is higher than 4-6% of the mean, try other archives of the same URL.
It's important to recognize that different archives of the same URL can yield significantly different variances, as follows:
{{{
http://web.archive.org/web/20110729050650/http://www.kp.ru/
RESULT Replay: Russian: www.kp.ru.replay= 2164.80790941 ms
median= 2173.47407341 ms, stdev= 216.239083036 ms, min= 2067.66700745 ms, max= 2248.77309799 ms

http://web.archive.org/web/20110119021944/http://www.kp.ru/
RESULT Replay: Russian: www.kp.ru.replay= 3299.88499692 ms
median= 3802.16002464 ms, stdev= 4244.51026529 ms, min= 1394.64211464 ms, max= 3824.78284836 ms

http://web.archive.org/web/20110318023959/http://kp.ru/
RESULT Replay: Russian: www.kp.ru.replay= 1667.41889401 ms
median= 1667.26398468 ms, stdev= 67.7172770899 ms, min= 1643.22805405 ms, max= 1702.38494873 ms
}}}

If Python exceptions or other serious errors persist, or the ratio of standard deviation to mean is consistently higher than 7-10%, don't add the URL as a replay test, regardless of how important that website is, because we can't make use of performance tests that have 10% variance.
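
The ratio of standard deviation to mean for the three www.kp.ru archives listed earlier can be checked with a few lines of Python. This is a sketch for doing the arithmetic by hand, not something {{{run-perf-tests}}} provides: {{{relative_stdev}}} is a hypothetical helper, and the mean and stdev figures are copied from the RESULT lines in this section.

```python
def relative_stdev(mean_ms, stdev_ms):
    """Return the standard deviation as a fraction of the mean."""
    return stdev_ms / mean_ms


# (archive URL, mean in ms, stdev in ms), copied from the RESULT lines.
archives = [
    ("http://web.archive.org/web/20110729050650/http://www.kp.ru/", 2164.80790941, 216.239083036),
    ("http://web.archive.org/web/20110119021944/http://www.kp.ru/", 3299.88499692, 4244.51026529),
    ("http://web.archive.org/web/20110318023959/http://kp.ru/", 1667.41889401, 67.7172770899),
]

ratios = {url: relative_stdev(mean, stdev) for url, mean, stdev in archives}
for url, ratio in ratios.items():
    print("%s: %.1f%%" % (url, ratio * 100))

# The archive with the smallest ratio is the most stable candidate.
best_url = min(ratios, key=ratios.get)
print("most stable archive:", best_url)
```

The ratios come out to roughly 10%, 129%, and 4% respectively, so only the third archive comes close to the 4-6% guideline; the first sits at the upper end of the 7-10% range and the second is far beyond it.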