Version 8 (modified by Adam Roben, 10 years ago)

Introduction

The build bots are most useful when they are "green" (i.e., the build isn't broken and there are no unexpected test failures). When the bots are green and a regression is introduced, the bots go from green to red, which is very easy to notice and respond to. When the bots are red and a regression is introduced, they stay red, so it's hard to notice the regression occurred at all.

This guide walks you through a process for getting red bots green again: triaging test failures, filing bugs on them, and then checking in new results or skipping tests.

Find out what is failing

There are two main ways to do this:

  • Browse build.webkit.org (you should probably start with this one)
    1. Find recent builds that have failed
      • Windows:
        1. Go to http://build.webkit.org/buildslaves.
        2. Click on the name of a slave you're interested in to see a summary of its recent builds.
      • Other platforms:
        1. Go to http://build.webkit.org/builders.
        2. Click on the name of a builder you're interested in to see a summary of its recent builds.
      • The Info column will tell you if any tests failed for that build.
      • To see the test output for a particular build, click the link in the Build # column, then view results, then results.html.
  • Use webkit-patch
    1. Run this command:
      webkit-patch failure-reason
      
    2. When prompted, specify which builder you're interested in.
    3. Press Enter to continue. webkit-patch will look back through the recent builds for that builder until it has found when all current failures were introduced.
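
The "look back through recent builds" step can be sketched in a few lines of Python. This is only an illustration of the idea, not webkit-patch's actual code: the function name find_failure_origins and the input format (a newest-first list of build numbers with their failing tests) are assumptions made for this example.

```python
from typing import Dict, List, Set, Tuple


def find_failure_origins(builds: List[Tuple[int, Set[str]]]) -> Dict[str, int]:
    """Map each currently failing test to the build where it started failing.

    `builds` is a hypothetical list of (build_number, failing_tests) pairs,
    ordered newest first. For every test failing in the newest build, walk
    backwards until a build is found where the test did not fail.
    """
    if not builds:
        return {}

    still_tracking = set(builds[0][1])  # tests failing in the newest build
    origins: Dict[str, int] = {}

    for build_number, failing in builds:
        for test in list(still_tracking):
            if test in failing:
                # Still failing this far back; remember the oldest such build.
                origins[test] = build_number
            else:
                # The test passed here, so the failure began in a newer build.
                still_tracking.discard(test)
        if not still_tracking:
            break  # every current failure has been traced to its origin

    return origins


# Example with made-up data (newest build first).
history = [
    (105, {"fast/js/a.html", "fast/js/b.html"}),
    (104, {"fast/js/a.html", "fast/js/b.html"}),
    (103, {"fast/js/a.html"}),
    (102, set()),
    (101, {"fast/js/a.html"}),
]
origins = find_failure_origins(history)
```

In this made-up history, a.html is traced back to build 103 and b.html to build 104. The older failure of a.html in build 101 is ignored because the test passed again in build 102.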

Find out when each test started failing

You can do any of the following:

  • Look back through old builds on build.webkit.org
  • Use the output from webkit-patch failure-reason
  • Use svn log/git log to find out when the test or its results were last changed
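
For the git log option, a small helper along these lines can list the most recent commits that touched a test or its expected results. It is a sketch under assumptions: it assumes a git checkout of WebKit, that expected results live in a file named after the test with an -expected.txt suffix, and that platform-specific results sit under LayoutTests/platform/<platform>/. The function names are hypothetical.

```python
import os
import subprocess
from typing import List


def candidate_paths(test: str, platform: str = "") -> List[str]:
    """Return the test file and the result files whose history is relevant.

    Assumes results are stored as <test name>-expected.txt, either next to
    the test or beneath LayoutTests/platform/<platform>/.
    """
    stem, _ = os.path.splitext(test)
    paths = ["LayoutTests/" + test, "LayoutTests/" + stem + "-expected.txt"]
    if platform:
        paths.append("LayoutTests/platform/" + platform + "/" + stem + "-expected.txt")
    return paths


def git_log_command(paths: List[str], limit: int = 5) -> List[str]:
    """Build a git log command limited to the given paths."""
    return ["git", "log", "-n", str(limit), "--date=short",
            "--format=%h %ad %an %s", "--"] + list(paths)


def recent_changes(test: str, platform: str = "", limit: int = 5,
                   repo_dir: str = ".") -> List[str]:
    """Run git log in repo_dir and return one line per matching commit."""
    result = subprocess.run(git_log_command(candidate_paths(test, platform), limit),
                            cwd=repo_dir, capture_output=True, text=True, check=True)
    return result.stdout.splitlines()
```

Calling recent_changes("fast/js/some-cool-test.html", "mac-leopard") from the top of a checkout would print the abbreviated hash, date, author, and subject of each matching commit, newest first.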

Try to figure out why each test is failing

(You probably won't be able to figure out exactly why every test is failing, but the more information you can get now, the better.)

Look at the revision range where the failure was introduced. If you find that:

  • The test and/or its expected output was modified
    • The test might need new results for the failing platform(s).
    • Are the test's results platform-specific (i.e., are they beneath LayoutTests/platform/)?
      • Yes: the failing platforms might just need new results checked in. You'll have to verify that the current output from those platforms is correct.
      • No: the failing platforms might have some missing functionality in WebKit or DumpRenderTree.
  • Related areas of WebKit were modified
    • Were the modifications platform-specific?
      • Yes: the failing platforms might need similar modifications made.
      • No: there might be some existing platform-specific code that is responsible for the different results.
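
The checklist above can also be read as a small decision procedure. The sketch below restates it in Python purely as a reading aid: the function name and its four boolean parameters are invented for this example, and the returned strings are hypotheses to investigate, not conclusions.

```python
from typing import List


def triage_hypotheses(test_or_results_modified: bool,
                      results_platform_specific: bool,
                      related_code_modified: bool,
                      code_change_platform_specific: bool) -> List[str]:
    """Return the hypotheses suggested by the revision range, in checklist order."""
    hypotheses: List[str] = []

    if test_or_results_modified:
        hypotheses.append("test might need new results for the failing platform(s)")
        if results_platform_specific:
            hypotheses.append("failing platforms might just need new results "
                              "checked in; verify their current output is correct")
        else:
            hypotheses.append("failing platforms might be missing functionality "
                              "in WebKit or DumpRenderTree")

    if related_code_modified:
        if code_change_platform_specific:
            hypotheses.append("failing platforms might need similar modifications")
        else:
            hypotheses.append("existing platform-specific code might be "
                              "responsible for the different results")

    return hypotheses
```

Both branches can apply to the same revision range, which is why the function collects a list rather than returning a single answer.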

File bugs for the failures

If multiple tests are failing for the same reason, you should group them together into a single bug. If a test fails on multiple platforms and those platforms will need separate fixes, you should file one bug for each failing platform.

  1. Go to http://webkit.org/new-bug
  2. Include in your report:
  3. Apply keywords
    • LayoutTestFailure
    • Regression, if the failure is due to a regression in WebKit
    • PlatformOnly, if the test only fails on one platform
  4. If the failure affects one of Apple's ports, and you work for Apple, you should migrate the bug into Radar.
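
As a quick illustration of the keyword rules in step 3, the helper below picks keywords from two facts about a failure. The function and its parameters are hypothetical; only the three keyword names come from this page.

```python
from typing import List


def bug_keywords(is_regression: bool, failing_platform_count: int) -> List[str]:
    """Choose Bugzilla keywords for a layout test failure."""
    keywords = ["LayoutTestFailure"]  # always applied to these bugs
    if is_regression:
        keywords.append("Regression")
    if failing_platform_count == 1:
        keywords.append("PlatformOnly")  # the test fails on one platform only
    return keywords
```

For example, a regression that only shows up on one platform would get all three keywords.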

Get the bots green again

  • If you know what the root cause is, and know how to fix it (e.g., by making a change to WebKit or checking in new correct results), then fix it and close the bug!
  • If the tests fail every time and the tests' output is the same every time, check in new results for the tests and include the bug URL in your ChangeLog.
    • You should do this even if the test is "failing". By running the test against these "expected failure" results rather than skipping the test entirely, we can discover when new regressions are introduced.
  • If the tests fail intermittently, or crash, or hang, add the tests to the appropriate Skipped files (e.g., LayoutTests/platform/mac-leopard/Skipped). Include a comment in the Skipped file with the bug URL and a brief description of how it fails, e.g.:
    # Sometimes times out http://webkit.org/b/12345
    fast/js/some-cool-test.html
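
The choice between fixing, checking in new results, and skipping can be summarized as follows. This is a sketch with hypothetical names. The case of a test that fails every time but with varying output is not covered by the rules above, so treating it as "skip" is an assumption of this example.

```python
def choose_action(fix_known: bool, crashes_or_hangs: bool,
                  fails_every_time: bool, output_stable: bool) -> str:
    """Return "fix", "rebaseline", or "skip" following the rules above."""
    if fix_known:
        return "fix"  # fix the root cause and close the bug
    if crashes_or_hangs or not fails_every_time:
        return "skip"  # intermittent failures, crashes, and hangs are skipped
    if output_stable:
        return "rebaseline"  # check in the "expected failure" results
    return "skip"  # assumption: unstable output cannot be rebaselined


def skipped_entry(test: str, description: str, bug_url: str) -> str:
    """Format a Skipped file entry: a comment line followed by the test path."""
    return "# {} {}\n{}\n".format(description, bug_url, test)


entry = skipped_entry("fast/js/some-cool-test.html",
                      "Sometimes times out",
                      "http://webkit.org/b/12345")
```

Here entry holds the same two lines as the example above, ready to append to the appropriate Skipped file.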