Changeset 83795 in webkit

Timestamp:

Apr 13, 2011 6:08:25 PM

Author:

eric@webkit.org

Message:

2011-04-13 Eric Seidel <eric@webkit.org>

Reviewed by Adam Barth.

commit-queue should be able to land when tree is red
https://bugs.webkit.org/show_bug.cgi?id=58494

There is some yak hair on my hands, I will admit.

This change is mostly about adding an ExpectedFailures
class to track when the bots are red and we should be
ignoring failures when landing from the commit-queue.

However, to make intelligent decisions about patches we
need to know whether the run hit the --exit-after-N-failures limit
or not. Right now that information is not saved off in results.html
so we have to pull the information from RunTests.

I've plumbed the --exit-after-N-failures information into
LayoutTestResults for now to make the ExpectedFailures code cleaner.

As a result of adding all these additional calls to delegate.layout_test_results()
I broke some of our flaky test detection tests and had to re-write them
to not depend on the number of layout_test_results() calls.

At the same time I updated the commit-queue to use the newer filesystem
API (to allow us to use MockFileSystem) which required further changes
to the layout tests. Changes were required in either case, since
we're now calling layout_test_results() in more cases, which previously
would try and hit the disk (until I moved it to use tool.filesystem).

I should note that *all* of this code is disabled for now, since our
--exit-after-N-failures limit is currently 1! (Thus we're always in the
case where we can't actually tell if the layout test results are legit.)
I will up that limit in a second patch (which may require a couple more unit test tweaks).

Scripts/webkitpy/common/net/layouttestresults.py:
Scripts/webkitpy/tool/bot/commitqueuetask.py:
Scripts/webkitpy/tool/bot/commitqueuetask_unittest.py:
Scripts/webkitpy/tool/bot/expectedfailures.py: Added.
Scripts/webkitpy/tool/bot/expectedfailures_unittest.py: Added.
Scripts/webkitpy/tool/commands/queues.py:
Scripts/webkitpy/tool/commands/queues_unittest.py:
Scripts/webkitpy/tool/commands/queuestest.py:
Scripts/webkitpy/tool/steps/runtests.py:
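
This change adds expectedfailures.py, but its contents are not shown in this changeset view. The following is only a minimal sketch of how such a tracker might behave, assuming nothing beyond the three method names that commitqueuetask.py calls (shrink_expected_failures, grow_expected_failures, failures_were_expected). The names ExpectedFailuresSketch and FakeResults, the set-based bookkeeping, and the "trust" rule are all assumptions for illustration, not the real implementation.

```python
class FakeResults:
    """Hypothetical stand-in for LayoutTestResults, for illustration only."""

    def __init__(self, failing_tests, failure_limit_count=None):
        self._failing_tests = list(failing_tests)
        self._failure_limit_count = failure_limit_count

    def failing_tests(self):
        return self._failing_tests

    def failure_limit_count(self):
        return self._failure_limit_count


class ExpectedFailuresSketch:
    """Assumed behavior: remember which tests fail on a clean (red) tree."""

    def __init__(self):
        self._failures = set()

    @staticmethod
    def _can_trust(results):
        # Assumption: a run that reached the failure limit stopped early,
        # so its failure list is incomplete and cannot be relied on.
        if results is None or results.failure_limit_count() is None:
            return False
        return len(results.failing_tests()) < results.failure_limit_count()

    def failures_were_expected(self, results):
        if not self._can_trust(results):
            return False
        return set(results.failing_tests()) <= self._failures

    def shrink_expected_failures(self, results, run_success):
        if run_success:
            self._failures = set()  # The tree is green again.
        elif self._can_trust(results):
            self._failures &= set(results.failing_tests())

    def grow_expected_failures(self, results):
        if self._can_trust(results):
            self._failures = set(results.failing_tests())
```

Under this sketch, a failure limit of 1 makes every failing run untrusted, which matches the message's note that the feature is effectively disabled until the limit is raised.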

Location:

trunk/Tools

Files:

2 added
8 edited

ChangeLog (modified) (1 diff)
Scripts/webkitpy/common/net/layouttestresults.py (modified) (1 diff)
Scripts/webkitpy/tool/bot/commitqueuetask.py (modified) (7 diffs)
Scripts/webkitpy/tool/bot/commitqueuetask_unittest.py (modified) (8 diffs)
Scripts/webkitpy/tool/bot/expectedfailures.py (added)
Scripts/webkitpy/tool/bot/expectedfailures_unittest.py (added)
Scripts/webkitpy/tool/commands/queues.py (modified) (3 diffs)
Scripts/webkitpy/tool/commands/queues_unittest.py (modified) (2 diffs)
Scripts/webkitpy/tool/commands/queuestest.py (modified) (1 diff)
Scripts/webkitpy/tool/steps/runtests.py (modified) (2 diffs)

trunk/Tools/ChangeLog

-                      r83793
+                      r83795
+2011-04-13  Eric Seidel  <eric@webkit.org>
+        Reviewed by Adam Barth.
+        commit-queue should be able to land when tree is red
+        https://bugs.webkit.org/show_bug.cgi?id=58494
+        There is some yak hair on my hands, I will admit.
+        This change is mostly about adding an ExpectedFailures
+        class to track when the bots are red and we should be
+        ignoring failures when landing from the commit-queue.
+        However, to make intelligent decisions about patches we
+        need to know whether the run hit the --exit-after-N-failures limit
+        or not.  Right now that information is not saved off in results.html
+        so we have to pull the information from RunTests.
+        I've plumbed the --exit-after-N-failures information into
+        LayoutTestResults for now to make the ExpectedFailures code cleaner.
+        As a result of adding all these additional calls to delegate.layout_test_results()
+        I broke some of our flaky test detection tests and had to re-write them
+        to not depend on the number of layout_test_results() calls.
+        At the same time I updated the commit-queue to use the newer filesystem
+        API (to allow us to use MockFileSystem) which required further changes
+        to the layout tests.  Changes were required in either case, since
+        we're now calling layout_test_results() in more cases, which previously
+        would try and hit the disk (until I moved it to use tool.filesystem).
+        I should note that *all* of this code is disabled for now, since our
+        --exit-after-N-failures limit is currently 1!  (Thus we're always in the
+        case where we can't actually tell if the layout test results are legit.)
+        I will up that limit in a second patch (which may require a couple more unit test tweaks).
+        * Scripts/webkitpy/common/net/layouttestresults.py:
+        * Scripts/webkitpy/tool/bot/commitqueuetask.py:
+        * Scripts/webkitpy/tool/bot/commitqueuetask_unittest.py:
+        * Scripts/webkitpy/tool/bot/expectedfailures.py: Added.
+        * Scripts/webkitpy/tool/bot/expectedfailures_unittest.py: Added.
+        * Scripts/webkitpy/tool/commands/queues.py:
+        * Scripts/webkitpy/tool/commands/queues_unittest.py:
+        * Scripts/webkitpy/tool/commands/queuestest.py:
+        * Scripts/webkitpy/tool/steps/runtests.py:
 2011-04-13  Brent Fulgham  <bfulgham@webkit.org>

trunk/Tools/Scripts/webkitpy/common/net/layouttestresults.py

-                      r77763
+                      r83795
         return cls(test_results)
-    def __init__(self, test_results):
+    # FIXME: run-webkit-tests should store the --exit-after-N-failures value
+    # (or some indication of early exit) somewhere in the results.html/results.json
+    # file.  Until it does, we depend on callers of this function to create
+    # LayoutTestResults with a failure_limit_count value, representing the
+    # --exit-after-N-failures value used in that run.  Consumers of LayoutTestResults
+    # will use that value to know if absence from the failure list means PASS.
+    # https://bugs.webkit.org/show_bug.cgi?id=58481
+    def __init__(self, test_results, failure_limit_count=None):
         self._test_results = test_results
+        self._failure_limit_count = failure_limit_count
+    def failure_limit_count(self):
+        return self._failure_limit_count
     def test_results(self):
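
The FIXME above says consumers use failure_limit_count to know whether absence from the failure list means PASS. A minimal sketch of that reasoning follows; the function outcome_for and its string return values are hypothetical and do not appear in this change.

```python
def outcome_for(test_name, failing_tests, failure_limit_count):
    """Classify one test given a run's failure list (illustrative only)."""
    if test_name in failing_tests:
        return "FAIL"
    # If the limit is unknown, or the run collected as many failures as the
    # limit allows, the run may have exited early and never reached this test.
    if failure_limit_count is None or len(failing_tests) >= failure_limit_count:
        return "UNKNOWN"
    return "PASS"


# With a limit of 1, a single failure makes every other test's outcome unknown.
print(outcome_for("b.html", ["a.html"], 1))
print(outcome_for("b.html", ["a.html"], 10))
```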

trunk/Tools/Scripts/webkitpy/tool/bot/commitqueuetask.py

-                      r83614
+                      r83795
 from webkitpy.common.system.executive import ScriptError
 from webkitpy.common.net.layouttestresults import LayoutTestResults
+from webkitpy.tool.bot.expectedfailures import ExpectedFailures
 …
         self._script_error = None
         self._results_archive_from_patch_test_run = None
+        self._expected_failures = ExpectedFailures()
     def _validate(self):
 …
     def _test(self):
-        return self._run_command([
+        success = self._run_command([
             "build-and-test",
             "--no-clean",
 …
         "Patch does not pass tests")
+        self._expected_failures.shrink_expected_failures(self._delegate.layout_test_results(), success)
+        return success
     def _build_and_test_without_patch(self):
-        return self._run_command([
+        success = self._run_command([
             "build-and-test",
             "--force-clean",
 …
         "Unable to pass tests without patch (tree is red?)")
-    def _failing_results_from_last_run(self):
-        results = self._delegate.layout_test_results()
-        if not results:
-            return []  # Makes callers slighty cleaner to not have to deal with None
-        return results.failing_test_results()
+        self._expected_failures.shrink_expected_failures(self._delegate.layout_test_results(), success)
+        return success
     def _land(self):
 …
         self._delegate.report_flaky_tests(self._patch, flaky_test_results, results_archive)
+    def _results_failed_different_tests(self, first, second):
+        first_failing_tests = [] if not first else first.failing_tests()
+        second_failing_tests = [] if not second else second.failing_tests()
+        return first_failing_tests != second_failing_tests
     def _test_patch(self):
         if self._test():
             return True
-        first_results = self._failing_results_from_last_run()
-        first_failing_tests = [result.filename for result in first_results]
+        # Note: archive_last_layout_test_results deletes the results directory, making these calls order-sensitive.
+        # We could remove this dependency by building the layout_test_results from the archive.
+        first_results = self._delegate.layout_test_results()
         first_results_archive = self._delegate.archive_last_layout_test_results(self._patch)
+        if self._expected_failures.failures_were_expected(first_results):
+            return True
         if self._test():
-            # Only report flaky tests if we were successful at archiving results.
-            if first_results_archive:
-                self._report_flaky_tests(first_results, first_results_archive)
-            return True
-        second_results = self._failing_results_from_last_run()
-        second_failing_tests = [result.filename for result in second_results]
-        if first_failing_tests != second_failing_tests:
+            # Only report flaky tests if we were successful at parsing results.html and archiving results.
+            if first_results and first_results_archive:
+                self._report_flaky_tests(first_results.failing_test_results(), first_results_archive)
+            return True
+        second_results = self._delegate.layout_test_results()
+        if self._results_failed_different_tests(first_results, second_results):
             # We could report flaky tests here, but since run-webkit-tests
-            # is run with --exit-after-N-failures=1, we would need to
+            # is currently run with --exit-after-N-failures=1, we would need to
             # be careful not to report constant failures as flaky due to earlier
             # flaky test making them not fail (no results) in one of the runs.
 …
             return False
+        # Archive (and remove) second results so layout_test_results() after
+        # build_and_test_without_patch won't use second results instead of the clean-tree results.
+        second_results_archive = self._delegate.archive_last_layout_test_results(self._patch)
         if self._build_and_test_without_patch():
-            return self.report_failure(first_results_archive)  # The error from the previous ._test() run is real, report it.
-        return False  # Tree must be red, just retry later.
+            # The error from the previous ._test() run is real, report it.
+            return self.report_failure(first_results_archive)
+        clean_tree_results = self._delegate.layout_test_results()
+        self._expected_failures.grow_expected_failures(clean_tree_results)
+        return False  # Tree must be redder than we expected, just retry later.
     def results_archive_from_patch_test_run(self, patch):
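
The control flow of the new _test_patch can be hard to follow in diff form. The sketch below restates it as a standalone function under stated assumptions: the Outcome enum, the function decide_patch_outcome, and its callable parameters are hypothetical, and mapping a bare `return False` to "retry" is inferred from the "just retry later" comments. Archiving and flaky-test reporting are left out.

```python
from enum import Enum


class Outcome(Enum):
    PROCEED = "proceed"  # _test_patch returns True
    REJECT = "reject"    # report_failure(): the patch itself is at fault
    RETRY = "retry"      # return False: try again later


def decide_patch_outcome(run_with_patch, run_clean_tree, failures_were_expected):
    """run_with_patch() -> (passed, failing_tests); run_clean_tree() -> passed."""
    passed, first_failures = run_with_patch()
    if passed:
        return Outcome.PROCEED
    if failures_were_expected(first_failures):
        return Outcome.PROCEED  # The tree is red in exactly this known way.
    passed, second_failures = run_with_patch()
    if passed:
        return Outcome.PROCEED  # The first failure was flaky.
    if first_failures != second_failures:
        return Outcome.RETRY    # Inconsistent failures: no verdict yet.
    if run_clean_tree():
        return Outcome.REJECT   # Clean tree passes, so the patch broke it.
    return Outcome.RETRY        # Tree is redder than expected.
```

In the real code the final branch also calls grow_expected_failures() with the clean-tree results before returning.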

trunk/Tools/Scripts/webkitpy/tool/bot/commitqueuetask_unittest.py

-                      r76071
+                      r83795
 from webkitpy.common.net import bugzilla
+from webkitpy.common.net.layouttestresults import LayoutTestResults
 from webkitpy.common.system.deprecated_logging import error, log
 from webkitpy.common.system.outputcapture import OutputCapture
 …
 class CommitQueueTaskTest(unittest.TestCase):
-    def _mock_test_result(self, testname):
-        return test_results.TestResult(testname, [test_failures.FailureTextMismatch()])
     def _run_through_task(self, commit_queue, expected_stderr, expected_exception=None, expect_retry=False):
         tool = MockTool(log_executive=True)
 …
             ScriptError("MOCK tests failure"),
         ])
+        # CommitQueueTask will only report flaky tests if we successfully parsed
+        # results.html and returned a LayoutTestResults object, so we fake one.
+        commit_queue.layout_test_results = lambda: LayoutTestResults([])
         expected_stderr = """run_webkit_patch: ['clean']
 command_passed: success_message='Cleaned working directory' patch='197'
 …
             ScriptError("MOCK tests failure"),
         ])
+        commit_queue.layout_test_results = lambda: LayoutTestResults([])
         # It's possible delegate to fail to archive layout tests, don't try to report
         # flaky tests when that happens.
 …
         self._run_through_task(commit_queue, expected_stderr)
-    _double_flaky_test_counter = 0
     def test_double_flaky_test_failure(self):
-        commit_queue = MockCommitQueue([
+        class DoubleFlakyCommitQueue(MockCommitQueue):
+            def __init__(self, error_plan):
+                MockCommitQueue.__init__(self, error_plan)
+                self._double_flaky_test_counter = 0
+            def run_command(self, command):
+                self._double_flaky_test_counter += 1
+                MockCommitQueue.run_command(self, command)
+            def _mock_test_result(self, testname):
+                return test_results.TestResult(testname, [test_failures.FailureTextMismatch()])
+            def layout_test_results(self):
+                if self._double_flaky_test_counter % 2:
+                    return LayoutTestResults([self._mock_test_result('foo.html')])
+                return LayoutTestResults([self._mock_test_result('bar.html')])
+        commit_queue = DoubleFlakyCommitQueue([
             None,
             None,
 …
         patch = tool.bugs.fetch_attachment(197)
         task = CommitQueueTask(commit_queue, patch)
-        self._double_flaky_test_counter = 0
-        def mock_failing_results_from_last_run():
-            CommitQueueTaskTest._double_flaky_test_counter += 1
-            if CommitQueueTaskTest._double_flaky_test_counter % 2:
-                return [self._mock_test_result('foo.html')]
-            return [self._mock_test_result('bar.html')]
-        task._failing_results_from_last_run = mock_failing_results_from_last_run
         success = OutputCapture().assert_outputs(self, task.run, expected_stderr=expected_stderr)
         self.assertEqual(success, False)
 …
 run_webkit_patch: ['build-and-test', '--no-clean', '--no-update', '--test', '--non-interactive']
 command_failed: failure_message='Patch does not pass tests' script_error='MOCK test failure again' patch='197'
+archive_last_layout_test_results: patch='197'
 run_webkit_patch: ['build-and-test', '--force-clean', '--no-update', '--build', '--test', '--non-interactive']
 command_passed: success_message='Able to pass tests without patch' patch='197'
 …
 run_webkit_patch: ['build-and-test', '--no-clean', '--no-update', '--test', '--non-interactive']
 command_failed: failure_message='Patch does not pass tests' script_error='MOCK test failure again' patch='197'
+archive_last_layout_test_results: patch='197'
 run_webkit_patch: ['build-and-test', '--force-clean', '--no-update', '--build', '--test', '--non-interactive']
 command_failed: failure_message='Unable to pass tests without patch (tree is red?)' script_error='MOCK clean test failure' patch='197'
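
The rewritten test ties the alternating results to the number of commands run rather than to how often layout_test_results() is called. A minimal sketch of that stubbing pattern, using a hypothetical AlternatingDelegate class and plain lists of test names in place of the MockCommitQueue and LayoutTestResults types used above:

```python
class AlternatingDelegate:
    """Illustrative stub: results depend on the command count, not read count."""

    def __init__(self):
        self._commands_run = 0

    def run_command(self, command):
        # Each command (for example, a test run) advances the counter once.
        self._commands_run += 1

    def layout_test_results(self):
        # Reading results has no side effects, so extra calls are harmless.
        if self._commands_run % 2:
            return ["foo.html"]
        return ["bar.html"]


delegate = AlternatingDelegate()
delegate.run_command(["build-and-test"])
first = delegate.layout_test_results()
again = delegate.layout_test_results()  # Same answer as `first`.
delegate.run_command(["build-and-test"])
second = delegate.layout_test_results()
```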

trunk/Tools/Scripts/webkitpy/tool/commands/queues.py

-                      r83614
+                      r83795
 from webkitpy.tool.bot.flakytestreporter import FlakyTestReporter
 from webkitpy.tool.commands.stepsequence import StepSequenceErrorHandler
+from webkitpy.tool.steps.runtests import RunTests
 from webkitpy.tool.multicommandtool import Command, TryAgain
 …
     def _read_file_contents(self, path):
         try:
-            with codecs.open(path, "r", "utf-8") as open_file:
-                return open_file.read()
+            return self._tool.filesystem.read_text_file(path)
         except OSError, e:  # File does not exist or can't be read.
             return None
 …
         if not results_html:
             return None
-        return LayoutTestResults.results_from_string(results_html)
+        # FIXME: We should not have to pass a failure_limit_count, but we
+        # do until run-webkit-tests can be updated to save off the value
+        # of --exit-after-N-failures in results.html/results.json.
+        # https://bugs.webkit.org/show_bug.cgi?id=58481
+        return LayoutTestResults.results_from_string(results_html, failure_limit_count=RunTests.NON_INTERACTIVE_FAILURE_LIMIT_COUNT)
     def _results_directory(self):
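
Routing file reads through tool.filesystem is what lets tests substitute an in-memory filesystem. The sketch below illustrates the idea and why the unit tests in this change pre-populate '/mock/results.html'. InMemoryFileSystem is a hypothetical stand-in, not webkitpy's MockFileSystem; only the method names read_text_file and write_text_file, and the KeyError behavior noted in the test comments, are taken from this changeset.

```python
class InMemoryFileSystem:
    """Hypothetical dict-backed filesystem, for illustration only."""

    def __init__(self):
        self._files = {}

    def write_text_file(self, path, contents):
        self._files[path] = contents

    def read_text_file(self, path):
        # A missing path raises KeyError here rather than OSError.
        return self._files[path]


def read_file_contents(filesystem, path):
    # Mirrors the shape of _read_file_contents: only OSError is handled,
    # so a KeyError from the in-memory filesystem propagates to the caller.
    try:
        return filesystem.read_text_file(path)
    except OSError:
        return None


fs = InMemoryFileSystem()
fs.write_text_file("/mock/results.html", "")  # Same setup the tests perform.
contents = read_file_contents(fs, "/mock/results.html")
```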

trunk/Tools/Scripts/webkitpy/tool/commands/queues_unittest.py

-                      r83614
+                      r83795
     def test_rollout(self):
         tool = MockTool(log_executive=True)
+        tool.filesystem.write_text_file('/mock/results.html', '')  # Otherwise the commit-queue will hit a KeyError trying to read the results from the MockFileSystem.
         tool.buildbot.light_tree_on_fire()
         expected_stderr = {
 …
         queue = SecondThoughtsCommitQueue()
         queue.bind_to_tool(MockTool())
+        queue._tool.filesystem.write_text_file('/mock/results.html', '')  # Otherwise the commit-queue will hit a KeyError trying to read the results from the MockFileSystem.
         queue._options = Mock()
         queue._options.port = None

trunk/Tools/Scripts/webkitpy/tool/commands/queuestest.py

-                      r70328
+                      r83795
         if not tool:
             tool = MockTool()
+            # This is a hack to make it easy for callers to not have to setup a custom MockFileSystem just to test the commit-queue
+            # the cq tries to read the layout test results, and will hit a KeyError in MockFileSystem if we don't do this.
+            tool.filesystem.write_text_file('/mock/results.html', "")
         if not expected_stdout:
             expected_stdout = {}

trunk/Tools/Scripts/webkitpy/tool/steps/runtests.py

-                      r70220
+                      r83795
 class RunTests(AbstractStep):
+    # FIXME: This knowledge really belongs in the commit-queue.
+    NON_INTERACTIVE_FAILURE_LIMIT_COUNT = 1
     @classmethod
     def options(cls):
 …
             args.append("--no-new-test-results")
             args.append("--no-launch-safari")
-            args.append("--exit-after-n-failures=1")
+            args.append("--exit-after-n-failures=%s" % self.NON_INTERACTIVE_FAILURE_LIMIT_COUNT)
             args.append("--wait-for-httpd")
             # FIXME: Hack to work around https://bugs.webkit.org/show_bug.cgi?id=38912
