Version 2 (modified by abarth@webkit.org, 14 years ago)

Keeping the Tree Green

In the past, red bots on http://build.webkit.org/waterfall went unnoticed and failures (including real regressions) accumulated in the tree. The longer a failure persists in the tree, the harder it is to track down its source and the higher the chance that it is hiding another regression. Keeping the tree green will require effort on the part of every contributor but, in the end, makes the project more efficient.

Sheriffbot

Sheriffbot helps us keep the tree green. If a core builder fails twice in a row, sheriffbot computes the blame list for the failure, notifies the responsible parties (committer, author, and reviewer) in IRC, and comments on the relevant bugs.
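The two-in-a-row rule and the blame list can be sketched roughly as follows. This is a minimal illustration, not sheriffbot's actual code: the `Build` record, its fields, and both function names are hypothetical, and the blame list is assumed to be every revision that landed after the last green build, up to and including the first red build.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Build:
    """One finished build on a builder (hypothetical record)."""
    revision: int  # last revision included in this build
    passed: bool


def first_of_two_failures(builds: List[Build]) -> Optional[int]:
    """Return the index of the first build in the current failing run if the
    builder has failed at least twice in a row, otherwise None.

    `builds` is ordered oldest to newest.
    """
    if len(builds) < 2 or builds[-1].passed or builds[-2].passed:
        return None
    index = len(builds) - 1
    # Walk back to the start of the current run of red builds.
    while index > 0 and not builds[index - 1].passed:
        index -= 1
    return index


def blame_list(builds: List[Build]) -> List[int]:
    """Revisions suspected of causing the failure (assumed definition):
    everything after the last green build, up to the first red build."""
    first_red = first_of_two_failures(builds)
    if first_red is None:
        return []
    if first_red == 0:
        # No green build in the history; only the first red revision is known.
        return [builds[0].revision]
    last_green = builds[first_red - 1].revision
    return list(range(last_green + 1, builds[first_red].revision + 1))
```

A single red build, or a red build followed by a green one, yields an empty blame list under this sketch, which is the point of waiting for a second failure.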

Sheriffbot can't fix the tree himself: only you can prevent forest fires. If sheriffbot complains about your patch, please take a moment to investigate whether your patch (or the patch you reviewed) actually caused a failure, ideally coordinating with other folks on IRC. It's possible that your change broke the tree, but it's also possible that your change was blamed because it was caught in the same cycle as the change that really caused the failure, or that sheriffbot was tricked by two flaky tests in a row.

Q: Why wait for two failures in a row? A: We wait for two failures in a row to avoid spamming in the case of a flaky test. That system isn't perfect, and we've seen cases where sheriffbot is fooled by flaky tests. We're going to continue to iterate on the failure detection algorithm to reduce false alarms.

Q: How does sheriffbot know my IRC nick? A: We've added the IRC nicks for most members of the project to http://trac.webkit.org/browser/trunk/WebKitTools/Scripts/webkitpy/common/config/committers.py. Please take a minute to verify that your nick is correct or add your nick if it's missing. Adding your nick to committers.py doesn't require review.

Responding to Failures

Generally, there are two approaches to responding to failures: (1) attempt to fix the failure live on the tree, or (2) roll out the offending change. Historically, the WebKit project has favored fixing live because, I think, rolling changes in and out of the tree was cumbersome. The tradeoff between these approaches is that fixing live imposes the cost of a broken tree on every member of the project, whereas a rollout imposes a cost only on the committer of the patch. As the number of people involved in the project grows, the cost of a broken tree scales with it while the cost of a rollout remains fixed. At some point, attempting to fix the tree live is more costly to the project than rolling out the patch.
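The scaling argument can be made concrete with a toy cost model. The sketch below is purely illustrative: the function names and every number in it are invented for the example, and it assumes each contributor loses the same amount of time for every hour the tree stays broken.

```python
def cost_of_fixing_live(contributors: int, hours_broken: float,
                        hours_lost_per_contributor_hour: float) -> float:
    """Total project time lost while the tree stays red (toy model).

    Every contributor pays for the breakage, so the cost grows with team size.
    """
    return contributors * hours_broken * hours_lost_per_contributor_hour


def break_even_contributors(rollout_cost_hours: float, hours_broken: float,
                            hours_lost_per_contributor_hour: float) -> float:
    """Team size above which rolling out is cheaper than fixing live."""
    return rollout_cost_hours / (hours_broken * hours_lost_per_contributor_hour)


if __name__ == "__main__":
    # Hypothetical numbers: a 3-hour live fix, 0.1 hours lost per contributor
    # per broken hour, and a rollout that costs the committer 2 hours.
    for team_size in (5, 20, 100):
        live = cost_of_fixing_live(team_size, 3.0, 0.1)
        print(f"{team_size:>3} contributors: fix live = {live:.1f} h, rollout = 2.0 h")
    print("break-even team size:", break_even_contributors(2.0, 3.0, 0.1))
```

The rollout cost does not depend on team size in this model, so any sufficiently large project ends up past the break-even point.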

If you cause a failure, please consider rolling out your patch instead of trying to fix the failure live. There's no shame in rolling out your patch, and you can always land it again once you've tracked down the failure. Often re-landing your change doesn't require an additional review. Of course, rolling out a patch isn't appropriate for all situations. The next time you find yourself trying to fix a failure live on the tree, ask yourself whether you're selfishly imposing costs on other members of the project.

Using Sheriffbot to Roll Out a Patch

Sheriffbot can help you roll out a patch. Here's how it works. Suppose revision 57047 broke Tiger. You can send sheriffbot the following command in #webkit:

sheriffbot: rollout 57047 This patch broke xyz test on Tiger

Sheriffbot will file a bug about the failure, cc the appropriate people, mark the bug as blocking the original bug, attach a rollout patch, and give you a link to the bug in IRC. All you need to do is go to the bug and mark the rollout patch as commit-queue+. The commit-queue will then land your rollout and reopen the original bug.
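For illustration, the command format shown above could be parsed along these lines. This is a sketch only, not sheriffbot's real parser: `RolloutRequest`, `parse_rollout`, and the regular expression are hypothetical, and requiring a non-empty reason after the revision number is an assumption.

```python
import re
from typing import NamedTuple, Optional


class RolloutRequest(NamedTuple):
    """A parsed rollout command (hypothetical type)."""
    revision: int
    reason: str


# Matches e.g. "sheriffbot: rollout 57047 This patch broke xyz test on Tiger".
# Assumption: a revision number followed by a non-empty reason is required.
_ROLLOUT_RE = re.compile(r"^sheriffbot:\s*rollout\s+(\d+)\s+(\S.*)$")


def parse_rollout(message: str) -> Optional[RolloutRequest]:
    """Parse an IRC message into a rollout request, or None if it is not one."""
    match = _ROLLOUT_RE.match(message.strip())
    if match is None:
        return None
    return RolloutRequest(revision=int(match.group(1)),
                          reason=match.group(2).strip())
```

The revision identifies which change to revert, and the free-text reason is what would end up in the bug that sheriffbot files.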

Q: Why do I have to mark the patch as commit-queue+? Haven't I already told sheriffbot to roll out the patch? A: We don't trust commands received on IRC. We need you to authorize the rollout using your bugs.webkit.org credentials.

Q: Won't spammers create infinite bugs by poking the sheriffbot? A: We might need to restrict use of the rollout command to committers. We'd still use bugs.webkit.org to authorize landing the rollout, but that should reduce the spam problem, if there is one.