Changeset 83795 in webkit


Ignore:
Timestamp:
Apr 13, 2011 6:08:25 PM (13 years ago)
Author:
eric@webkit.org
Message:

2011-04-13 Eric Seidel <eric@webkit.org>

Reviewed by Adam Barth.

commit-queue should be able to land when tree is red
https://bugs.webkit.org/show_bug.cgi?id=58494

There is some yak hair on my hands, I will admit.

This change is mostly about adding an ExpectedFailures
class to track when the bots are red and we should be
ignoring failures when landing from the commit-queue.

However, to make intelligent decisions about patches we
need to know whether the run hit the --exit-after-N-failures limit
or not. Right now that information is not saved off in results.html
so we have to pull the information from RunTests.

I've plumbed the --exit-after-N-failures information into
LayoutTestResults for now to make the ExpectedFailures code cleaner.

As a result of adding all these additional calls to delegate.layout_test_results()
I broke some of our flaky test detection tests and had to re-write them
to not depend on the number of layout_test_results code.

At the same time I updated the commit-queue to use the newer filesystem
API (to allow us to use MockFileSystem) which required further changes
to the layout tests. Changes were required in either case, since
we're now calling layout_test_results() in more cases, which previously
would try and hit the disk (until I moved it to use tool.filesystem).

I should note that *all* of this code is disabled for now, since our
--exit-after-N-failures limit is currently 1! (Thus were always in the
case where we can't actually tell if the layout test results are legit.)
I will up that limit in a second patch (which may require a couple more unit test tweaks).

  • Scripts/webkitpy/common/net/layouttestresults.py:
  • Scripts/webkitpy/tool/bot/commitqueuetask.py:
  • Scripts/webkitpy/tool/bot/commitqueuetask_unittest.py:
  • Scripts/webkitpy/tool/bot/expectedfailures.py: Added.
  • Scripts/webkitpy/tool/bot/expectedfailures_unittest.py: Added.
  • Scripts/webkitpy/tool/commands/queues.py:
  • Scripts/webkitpy/tool/commands/queues_unittest.py:
  • Scripts/webkitpy/tool/commands/queuestest.py:
  • Scripts/webkitpy/tool/steps/runtests.py:
Location:
trunk/Tools
Files:
2 added
8 edited

Legend:

Unmodified
Added
Removed
  • trunk/Tools/ChangeLog

    r83793 r83795  
     12011-04-13  Eric Seidel  <eric@webkit.org>
     2
     3        Reviewed by Adam Barth.
     4
     5        commit-queue should be able to land when tree is red
     6        https://bugs.webkit.org/show_bug.cgi?id=58494
     7
     8        There is some yak hair on my hands, I will admit.
     9
     10        This change is mostly about adding an ExpectedFailures
     11        class to track when the bots are red and we should be
     12        ignoring failures when landing from the commit-queue.
     13
     14        However, to make intelligent decisions about patches we
     15        need to know whether the run hit the --exit-after-N-failures limit
     16        or not.  Right now that information is not saved off in results.html
     17        so we have to pull the information from RunTests.
     18
     19        I've plumbed the --exit-after-N-failures information into
     20        LayoutTestResults for now to make the ExpectedFailures code cleaner.
     21
     22        As a result of adding all these additional calls to delegate.layout_test_results()
     23        I broke some of our flaky test detection tests and had to re-write them
     24        to not depend on the number of layout_test_results code.
     25
     26        At the same time I updated the commit-queue to use the newer filesystem
     27        API (to allow us to use MockFileSystem) which required further changes
     28        to the layout tests.  Changes were required in either case, since
     29        we're now calling layout_test_results() in more cases, which previously
     30        would try and hit the disk (until I moved it to use tool.filesystem).
     31
     32        I should note that *all* of this code is disabled for now, since our
     33        --exit-after-N-failures limit is currently 1!  (Thus were always in the
     34        case where we can't actually tell if the layout test results are legit.)
     35        I will up that limit in a second patch (which may require a couple more unit test tweaks).
     36
     37        * Scripts/webkitpy/common/net/layouttestresults.py:
     38        * Scripts/webkitpy/tool/bot/commitqueuetask.py:
     39        * Scripts/webkitpy/tool/bot/commitqueuetask_unittest.py:
     40        * Scripts/webkitpy/tool/bot/expectedfailures.py: Added.
     41        * Scripts/webkitpy/tool/bot/expectedfailures_unittest.py: Added.
     42        * Scripts/webkitpy/tool/commands/queues.py:
     43        * Scripts/webkitpy/tool/commands/queues_unittest.py:
     44        * Scripts/webkitpy/tool/commands/queuestest.py:
     45        * Scripts/webkitpy/tool/steps/runtests.py:
     46
    1472011-04-13  Brent Fulgham  <bfulgham@webkit.org>
    248
  • trunk/Tools/Scripts/webkitpy/common/net/layouttestresults.py

    r77763 r83795  
    133133        return cls(test_results)
    134134
    135     def __init__(self, test_results):
     135    # FIXME: run-webkit-tests should store the --exit-after-N-failures value
     136    # (or some indication of early exit) somewhere in the results.html/results.json
     137    # file.  Until it does, we depend on callers of this function to create
     138    # LayoutTestResults with a failure_limit_count value, representing the
     139    # --exit-after-N-failures value used in that run.  Consumers of LayoutTestResults
     140    # will use that value to know if absence from the failure list means PASS.
     141    # https://bugs.webkit.org/show_bug.cgi?id=58481
     142    def __init__(self, test_results, failure_limit_count=None):
    136143        self._test_results = test_results
     144        self._failure_limit_count = failure_limit_count
     145
     146    def failure_limit_count(self):
     147        return self._failure_limit_count
    137148
    138149    def test_results(self):
  • trunk/Tools/Scripts/webkitpy/tool/bot/commitqueuetask.py

    r83614 r83795  
    2929from webkitpy.common.system.executive import ScriptError
    3030from webkitpy.common.net.layouttestresults import LayoutTestResults
     31from webkitpy.tool.bot.expectedfailures import ExpectedFailures
    3132
    3233
     
    6162        self._script_error = None
    6263        self._results_archive_from_patch_test_run = None
     64        self._expected_failures = ExpectedFailures()
    6365
    6466    def _validate(self):
     
    134136
    135137    def _test(self):
    136         return self._run_command([
     138        success = self._run_command([
    137139            "build-and-test",
    138140            "--no-clean",
     
    145147        "Patch does not pass tests")
    146148
     149        self._expected_failures.shrink_expected_failures(self._delegate.layout_test_results(), success)
     150        return success
     151
    147152    def _build_and_test_without_patch(self):
    148         return self._run_command([
     153        success = self._run_command([
    149154            "build-and-test",
    150155            "--force-clean",
     
    157162        "Unable to pass tests without patch (tree is red?)")
    158163
    159     def _failing_results_from_last_run(self):
    160         results = self._delegate.layout_test_results()
    161         if not results:
    162             return []  # Makes callers slighty cleaner to not have to deal with None
    163         return results.failing_test_results()
     164        self._expected_failures.shrink_expected_failures(self._delegate.layout_test_results(), success)
     165        return success
    164166
    165167    def _land(self):
     
    179181        self._delegate.report_flaky_tests(self._patch, flaky_test_results, results_archive)
    180182
     183    def _results_failed_different_tests(self, first, second):
     184        first_failing_tests = [] if not first else first.failing_tests()
     185        second_failing_tests = [] if not second else second.failing_tests()
     186        return first_failing_tests != second_failing_tests
     187
    181188    def _test_patch(self):
    182189        if self._test():
    183190            return True
    184191
    185         first_results = self._failing_results_from_last_run()
    186         first_failing_tests = [result.filename for result in first_results]
     192        # Note: archive_last_layout_test_results deletes the results directory, making these calls order-sensitve.
     193        # We could remove this dependency by building the layout_test_results from the archive.
     194        first_results = self._delegate.layout_test_results()
    187195        first_results_archive = self._delegate.archive_last_layout_test_results(self._patch)
     196
     197        if self._expected_failures.failures_were_expected(first_results):
     198            return True
     199
    188200        if self._test():
    189             # Only report flaky tests if we were successful at archiving results.
    190             if first_results_archive:
    191                 self._report_flaky_tests(first_results, first_results_archive)
    192             return True
    193 
    194         second_results = self._failing_results_from_last_run()
    195         second_failing_tests = [result.filename for result in second_results]
    196         if first_failing_tests != second_failing_tests:
     201            # Only report flaky tests if we were successful at parsing results.html and archiving results.
     202            if first_results and first_results_archive:
     203                self._report_flaky_tests(first_results.failing_test_results(), first_results_archive)
     204            return True
     205
     206        second_results = self._delegate.layout_test_results()
     207        if self._results_failed_different_tests(first_results, second_results):
    197208            # We could report flaky tests here, but since run-webkit-tests
    198             # is run with --exit-after-N-failures=1, we would need to
     209            # is currently run with --exit-after-N-failures=1, we would need to
    199210            # be careful not to report constant failures as flaky due to earlier
    200211            # flaky test making them not fail (no results) in one of the runs.
     
    202213            return False
    203214
     215        # Archive (and remove) second results so layout_test_results() after
     216        # build_and_test_without_patch won't use second results instead of the clean-tree results.
     217        second_results_archive = self._delegate.archive_last_layout_test_results(self._patch)
     218
    204219        if self._build_and_test_without_patch():
    205             return self.report_failure(first_results_archive)  # The error from the previous ._test() run is real, report it.
    206         return False  # Tree must be red, just retry later.
     220            # The error from the previous ._test() run is real, report it.
     221            return self.report_failure(first_results_archive)
     222
     223        clean_tree_results = self._delegate.layout_test_results()
     224        self._expected_failures.grow_expected_failures(clean_tree_results)
     225
     226        return False  # Tree must be redder than we expected, just retry later.
    207227
    208228    def results_archive_from_patch_test_run(self, patch):
  • trunk/Tools/Scripts/webkitpy/tool/bot/commitqueuetask_unittest.py

    r76071 r83795  
    3131
    3232from webkitpy.common.net import bugzilla
     33from webkitpy.common.net.layouttestresults import LayoutTestResults
    3334from webkitpy.common.system.deprecated_logging import error, log
    3435from webkitpy.common.system.outputcapture import OutputCapture
     
    7879
    7980class CommitQueueTaskTest(unittest.TestCase):
    80     def _mock_test_result(self, testname):
    81         return test_results.TestResult(testname, [test_failures.FailureTextMismatch()])
    82 
    8381    def _run_through_task(self, commit_queue, expected_stderr, expected_exception=None, expect_retry=False):
    8482        tool = MockTool(log_executive=True)
     
    191189            ScriptError("MOCK tests failure"),
    192190        ])
     191        # CommitQueueTask will only report flaky tests if we successfully parsed
     192        # results.html and returned a LayoutTestResults object, so we fake one.
     193        commit_queue.layout_test_results = lambda: LayoutTestResults([])
    193194        expected_stderr = """run_webkit_patch: ['clean']
    194195command_passed: success_message='Cleaned working directory' patch='197'
     
    218219            ScriptError("MOCK tests failure"),
    219220        ])
     221        commit_queue.layout_test_results = lambda: LayoutTestResults([])
    220222        # It's possible delegate to fail to archive layout tests, don't try to report
    221223        # flaky tests when that happens.
     
    238240        self._run_through_task(commit_queue, expected_stderr)
    239241
    240     _double_flaky_test_counter = 0
    241 
    242242    def test_double_flaky_test_failure(self):
    243         commit_queue = MockCommitQueue([
     243        class DoubleFlakyCommitQueue(MockCommitQueue):
     244            def __init__(self, error_plan):
     245                MockCommitQueue.__init__(self, error_plan)
     246                self._double_flaky_test_counter = 0
     247
     248            def run_command(self, command):
     249                self._double_flaky_test_counter += 1
     250                MockCommitQueue.run_command(self, command)
     251
     252            def _mock_test_result(self, testname):
     253                return test_results.TestResult(testname, [test_failures.FailureTextMismatch()])
     254
     255            def layout_test_results(self):
     256                if self._double_flaky_test_counter % 2:
     257                    return LayoutTestResults([self._mock_test_result('foo.html')])
     258                return LayoutTestResults([self._mock_test_result('bar.html')])
     259
     260        commit_queue = DoubleFlakyCommitQueue([
    244261            None,
    245262            None,
     
    269286        patch = tool.bugs.fetch_attachment(197)
    270287        task = CommitQueueTask(commit_queue, patch)
    271         self._double_flaky_test_counter = 0
    272 
    273         def mock_failing_results_from_last_run():
    274             CommitQueueTaskTest._double_flaky_test_counter += 1
    275             if CommitQueueTaskTest._double_flaky_test_counter % 2:
    276                 return [self._mock_test_result('foo.html')]
    277             return [self._mock_test_result('bar.html')]
    278 
    279         task._failing_results_from_last_run = mock_failing_results_from_last_run
    280288        success = OutputCapture().assert_outputs(self, task.run, expected_stderr=expected_stderr)
    281289        self.assertEqual(success, False)
     
    303311run_webkit_patch: ['build-and-test', '--no-clean', '--no-update', '--test', '--non-interactive']
    304312command_failed: failure_message='Patch does not pass tests' script_error='MOCK test failure again' patch='197'
     313archive_last_layout_test_results: patch='197'
    305314run_webkit_patch: ['build-and-test', '--force-clean', '--no-update', '--build', '--test', '--non-interactive']
    306315command_passed: success_message='Able to pass tests without patch' patch='197'
     
    331340run_webkit_patch: ['build-and-test', '--no-clean', '--no-update', '--test', '--non-interactive']
    332341command_failed: failure_message='Patch does not pass tests' script_error='MOCK test failure again' patch='197'
     342archive_last_layout_test_results: patch='197'
    333343run_webkit_patch: ['build-and-test', '--force-clean', '--no-update', '--build', '--test', '--non-interactive']
    334344command_failed: failure_message='Unable to pass tests without patch (tree is red?)' script_error='MOCK clean test failure' patch='197'
  • trunk/Tools/Scripts/webkitpy/tool/commands/queues.py

    r83614 r83795  
    5151from webkitpy.tool.bot.flakytestreporter import FlakyTestReporter
    5252from webkitpy.tool.commands.stepsequence import StepSequenceErrorHandler
     53from webkitpy.tool.steps.runtests import RunTests
    5354from webkitpy.tool.multicommandtool import Command, TryAgain
    5455
     
    315316    def _read_file_contents(self, path):
    316317        try:
    317             with codecs.open(path, "r", "utf-8") as open_file:
    318                 return open_file.read()
     318            return self._tool.filesystem.read_text_file(path)
    319319        except OSError, e:  # File does not exist or can't be read.
    320320            return None
     
    326326        if not results_html:
    327327            return None
    328         return LayoutTestResults.results_from_string(results_html)
     328        # FIXME: We should not have to pass a failure_limit_count, but we
     329        # do until run-webkit-tests can be updated save off the value
     330        # of --exit-after-N-failures in results.html/results.json.
     331        # https://bugs.webkit.org/show_bug.cgi?id=58481
     332        return LayoutTestResults.results_from_string(results_html, failure_limit_count=RunTests.NON_INTERACTIVE_FAILURE_LIMIT_COUNT)
    329333
    330334    def _results_directory(self):
  • trunk/Tools/Scripts/webkitpy/tool/commands/queues_unittest.py

    r83614 r83795  
    254254    def test_rollout(self):
    255255        tool = MockTool(log_executive=True)
     256        tool.filesystem.write_text_file('/mock/results.html', '')  # Otherwise the commit-queue will hit a KeyError trying to read the results from the MockFileSystem.
    256257        tool.buildbot.light_tree_on_fire()
    257258        expected_stderr = {
     
    322323        queue = SecondThoughtsCommitQueue()
    323324        queue.bind_to_tool(MockTool())
     325        queue._tool.filesystem.write_text_file('/mock/results.html', '')  # Otherwise the commit-queue will hit a KeyError trying to read the results from the MockFileSystem.
    324326        queue._options = Mock()
    325327        queue._options.port = None
  • trunk/Tools/Scripts/webkitpy/tool/commands/queuestest.py

    r70328 r83795  
    6868        if not tool:
    6969            tool = MockTool()
     70            # This is a hack to make it easy for callers to not have to setup a custom MockFileSystem just to test the commit-queue
     71            # the cq tries to read the layout test results, and will hit a KeyError in MockFileSystem if we don't do this.
     72            tool.filesystem.write_text_file('/mock/results.html', "")
    7073        if not expected_stdout:
    7174            expected_stdout = {}
  • trunk/Tools/Scripts/webkitpy/tool/steps/runtests.py

    r70220 r83795  
    3232
    3333class RunTests(AbstractStep):
     34    # FIXME: This knowledge really belongs in the commit-queue.
     35    NON_INTERACTIVE_FAILURE_LIMIT_COUNT = 1
     36
    3437    @classmethod
    3538    def options(cls):
     
    6063            args.append("--no-new-test-results")
    6164            args.append("--no-launch-safari")
    62             args.append("--exit-after-n-failures=1")
     65            args.append("--exit-after-n-failures=%s" % self.NON_INTERACTIVE_FAILURE_LIMIT_COUNT)
    6366            args.append("--wait-for-httpd")
    6467            # FIXME: Hack to work around https://bugs.webkit.org/show_bug.cgi?id=38912
Note: See TracChangeset for help on using the changeset viewer.