Changeset 221474 in webkit
- Timestamp:
- Aug 31, 2017 11:36:32 PM (7 years ago)
- Location:
- trunk/Tools
- Files:
-
- 3 edited
Legend:
- Unmodified
- Added
- Removed
-
trunk/Tools/BuildSlaveSupport/build.webkit.org-config/master.cfg
r220238 r221474 659 659 self.statusLine = [] 660 660 661 foundItems = re.findall(" Tests failed\((\d+)\):", logText)661 foundItems = re.findall("Unexpected failures \((\d+)\):", logText) 662 662 if (foundItems): 663 663 self.incorrectTests = int(foundItems[0]) 664 664 665 foundItems = re.findall(" Tests that crashed\((\d+)\):", logText)665 foundItems = re.findall("Unexpected crashes \((\d+)\):", logText) 666 666 if (foundItems): 667 667 self.crashedTests = int(foundItems[0]) 668 668 669 foundItems = re.findall(" Tests that timed out\((\d+)\):", logText)669 foundItems = re.findall("Unexpected timeouts \((\d+)\):", logText) 670 670 if (foundItems): 671 671 self.timedOutTests = int(foundItems[0]) 672 672 673 foundItems = re.findall("Tests skipped \((\d+)\):", logText)674 if (foundItems):675 self.skippedTests = int(foundItems[0])676 677 673 self.totalFailedTests = self.incorrectTests + self.crashedTests + self.timedOutTests 678 674 679 675 if self.totalFailedTests > 0: 680 676 self.statusLine = [ 681 "%d API tests failed, %d crashed, %d timed out, %d skipped" % 682 (self.incorrectTests, self.crashedTests, self.timedOutTests, self.skippedTests) 677 "%d API tests failed, %d crashed, %d timed out" % (self.incorrectTests, self.crashedTests, self.timedOutTests) 683 678 ] 684 679 -
trunk/Tools/ChangeLog
r221468 r221474 1 2017-08-31 Carlos Garcia Campos <cgarcia@igalia.com> 2 3 [GTK] Improve the way unit test are run and the results reported 4 https://bugs.webkit.org/show_bug.cgi?id=176104 5 6 Reviewed by Carlos Alberto Lopez Perez. 7 8 There are several issues with the way unit tests are run by run-gtk-tests and also with the way results are 9 reported: 10 11 - The results summary only mentions the test binaries, not the actual test cases, so you always have to scroll 12 up to find the actual test cases failing. 13 - The number of reported failures is the number of test binaries that failed, so if a new test case fails for 14 the same binary in a new revision, we won't notice it just looking at the number of failures. 15 - We show detailed information about skipped test in the results summary, which is just noise. 16 - In the case of glib tests, when a test case times out, we finish the test suite, instead of continuing with the 17 rest of the test cases like we do for normal failures or crashes. If a new test case fails after a test case that 18 timed out we will not notice it until we fix or skip the test cases timing out. 19 - In the case of glib tests, the timeout is applied to the whole suite, instead of per test case, we have a hack 20 to make it longer only for that. It has worked so far, but it doesn't scale, and it's an ugly hack. 21 - It's not currently possible to detect flaky tests, because again, we know the binaries/suites that failed but 22 not the actual test cases. 23 24 This patch fixes all these issues and makes it possible to add support for flaky tests in a follow up patch. 25 26 * BuildSlaveSupport/build.webkit.org-config/master.cfg: 27 (RunGtkAPITests.commandComplete): Update the RunGtkAPITests step to parse the new output. 28 * Scripts/run-gtk-tests: 29 (TestRunner._start_timeout): Helper to start the timeout if needed. 30 (TestRunner._start_timeout._alarm_handler): Raise timeout exception. 31 (TestRunner._stop_timeout): Helper to stop the timeout if needed. 32 (TestRunner._waitpid): Merged waitpid and return_code_from_exit_status. 33 (TestRunner._run_test_glib): Do not double the timeout anymore, we now start/stop the timeout for every test 34 case. Return a dictionary where keys are test cases and values the results only in case of failures. 35 (TestRunner._run_test_glib.parse_line): Update the parser to also detect test cases and the results. 36 (TestRunner._run_test_glib.parse_line.set_test_result): Helper to set the result of a test case. 37 (TestRunner._run_google_test): Return a dictionary where key is the test case and value is the result only in 38 case of failure. 39 (TestRunner._run_google_test_suite): Updated now that _run_google_test returns a dictionary. 40 (TestRunner.run_tests): Handle the results dictionary and show the results with information about test cases 41 failing and grouped by test binaries. 42 1 43 2017-08-31 Filip Pizlo <fpizlo@apple.com> 2 44 -
trunk/Tools/Scripts/run-gtk-tests
r221295 r221474 259 259 return True 260 260 261 def _get_child_pid_from_test_output(self, output):262 if not output:263 return -1264 match = re.search(r'\(pid=(?P<child_pid>[0-9]+)\)', output)265 if not match:266 return -1267 return int(match.group('child_pid'))268 269 261 def _kill_process(self, pid): 270 262 try: … … 274 266 pass 275 267 276 def _run_test_command(self, command, timeout=-1): 277 def alarm_handler(signum, frame): 268 @staticmethod 269 def _start_timeout(timeout): 270 if timeout <= 0: 271 return 272 273 def _alarm_handler(signum, frame): 278 274 raise TestTimeout 279 275 280 child_pid = [-1] 281 def parse_line(line, child_pid = child_pid): 282 if child_pid[0] == -1: 283 child_pid[0] = self._get_child_pid_from_test_output(line) 284 285 sys.stdout.write(line) 286 287 def waitpid(pid): 288 while True: 289 try: 290 return os.waitpid(pid, 0) 291 except (OSError, IOError) as e: 292 if e.errno == errno.EINTR: 293 continue 294 raise 295 296 def return_code_from_exit_status(status): 297 if os.WIFSIGNALED(status): 298 return -os.WTERMSIG(status) 299 elif os.WIFEXITED(status): 300 return os.WEXITSTATUS(status) 301 else: 276 signal(SIGALRM, _alarm_handler) 277 alarm(timeout) 278 279 @staticmethod 280 def _stop_timeout(timeout): 281 if timeout <= 0: 282 return 283 284 alarm(0) 285 286 def _waitpid(self, pid): 287 while True: 288 try: 289 dummy, status = os.waitpid(pid, 0) 290 if os.WIFSIGNALED(status): 291 return -os.WTERMSIG(status) 292 if os.WIFEXITED(status): 293 return os.WEXITSTATUS(status) 294 302 295 # Should never happen 303 296 raise RuntimeError("Unknown child exit status!") 297 except (OSError, IOError) as e: 298 if e.errno == errno.EINTR: 299 continue 300 if e.errno == errno.ECHILD: 301 # This happens if SIGCLD is set to be ignored or waiting 302 # for child processes has otherwise been disabled for our 303 # process. This child is dead, we can't get the status. 304 return 0 305 raise 306 307 def _run_test_glib(self, test_program): 308 command = ['gtester', '-k'] 309 if self._options.verbose: 310 command.append('--verbose') 311 for test_case in self._test_cases_to_skip(test_program): 312 command.extend(['-s', test_case]) 313 command.append(test_program) 314 315 timeout = self._options.timeout 316 test = os.path.join(os.path.basename(os.path.dirname(test_program)), os.path.basename(test_program)) 317 if test in TestRunner.SLOW: 318 timeout *= 5 319 320 test_context = { "child-pid" : -1, "did-timeout" : False, "current_test" : None } 321 def parse_line(line, test_context = test_context): 322 if not line: 323 return 324 325 match = re.search(r'\(pid=(?P<child_pid>[0-9]+)\)', line) 326 if match: 327 test_context["child-pid"] = int(match.group('child_pid')) 328 sys.stdout.write(line) 329 return 330 331 def set_test_result(test, result): 332 if result == "FAIL": 333 if test_context["did-timeout"] and result == "FAIL": 334 test_context[test] = "TIMEOUT" 335 else: 336 test_context[test] = result 337 test_context["did-timeout"] = False 338 test_context["current_test"] = None 339 self._stop_timeout(timeout) 340 self._start_timeout(timeout) 341 342 normalized_line = line.strip().replace(' ', '') 343 if not normalized_line: 344 return 345 346 if normalized_line[0] == '/': 347 test, result = normalized_line.split(':', 1) 348 if result in ["OK", "FAIL"]: 349 set_test_result(test, result) 350 else: 351 test_context["current_test"] = test 352 elif normalized_line in ["OK", "FAIL"]: 353 set_test_result(test_context["current_test"], normalized_line) 354 355 sys.stdout.write(line) 304 356 305 357 pid, fd = os.forkpty() … … 308 360 sys.exit(0) 309 361 310 if timeout > 0: 311 signal(SIGALRM, alarm_handler) 312 alarm(timeout) 313 314 try: 315 common.parse_output_lines(fd, parse_line) 316 if timeout > 0: 317 alarm(0) 318 except TestTimeout: 319 self._kill_process(pid) 320 if child_pid[0] > 0: 321 self._kill_process(child_pid[0]) 322 raise 323 324 try: 325 dummy, status = waitpid(pid) 326 except OSError as e: 327 if e.errno != errno.ECHILD: 328 raise 329 # This happens if SIGCLD is set to be ignored or waiting 330 # for child processes has otherwise been disabled for our 331 # process. This child is dead, we can't get the status. 332 status = 0 333 334 return return_code_from_exit_status(status) 335 336 def _run_test_glib(self, test_program): 337 tester_command = ['gtester', '-k'] 338 if self._options.verbose: 339 tester_command.append('--verbose') 340 for test_case in self._test_cases_to_skip(test_program): 341 tester_command.extend(['-s', test_case]) 342 tester_command.append(test_program) 343 # This timeout is supposed to be per test case, but in the case of GLib tests it affects all the tests cases of 344 # the same test program. Some test programs like TestLoaderClient, that have a lot of test cases, often time out 345 # in the bots because the timeout is not enough to run all the tests cases. So, we use a longer timeout for GLib 346 # tests (timeout * 2). 347 timeout = self._options.timeout * 2 348 test = os.path.join(os.path.basename(os.path.dirname(test_program)), os.path.basename(test_program)) 349 if test in TestRunner.SLOW: 350 timeout *= 5 351 352 return self._run_test_command(tester_command, timeout) 362 self._start_timeout(timeout) 363 364 while (True): 365 try: 366 common.parse_output_lines(fd, parse_line) 367 break 368 except TestTimeout: 369 assert test_context["child-pid"] > 0 370 self._kill_process(test_context["child-pid"]) 371 test_context["child-pid"] = -1 372 test_context["did-timeout"] = True 373 374 self._stop_timeout(timeout) 375 del test_context["child-pid"] 376 del test_context["did-timeout"] 377 del test_context["current_test"] 378 379 self._waitpid(pid) 380 return test_context 353 381 354 382 def _get_tests_from_google_test_suite(self, test_program): … … 375 403 376 404 def _run_google_test(self, test_program, subtest): 377 test_command = [test_program, '--gtest_filter=%s' % (subtest)]405 command = [test_program, '--gtest_filter=%s' % (subtest)] 378 406 timeout = self._options.timeout 379 407 if subtest in TestRunner.SLOW: 380 408 timeout *= 5 381 409 382 status = self._run_test_command(test_command, timeout) 410 pid, fd = os.forkpty() 411 if pid == 0: 412 os.execvpe(command[0], command, self._test_env) 413 sys.exit(0) 414 415 self._start_timeout(timeout) 416 try: 417 common.parse_output_lines(fd, sys.stdout.write) 418 status = self._waitpid(pid) 419 except TestTimeout: 420 self._kill_process(pid) 421 return { subtest : "TIMEOUT" } 422 423 self._stop_timeout(timeout) 424 383 425 if status == -SIGSEGV: 384 426 sys.stdout.write("**CRASH** %s\n" % subtest) 385 427 sys.stdout.flush() 386 return status 428 return { subtest : "CRASH" } 429 430 if status != 0: 431 return { subtest : "FAIL" } 432 433 return { } 387 434 388 435 def _run_google_test_suite(self, test_program): 389 re tcode = 0436 result = { } 390 437 for subtest in self._get_tests_from_google_test_suite(test_program): 391 if self._run_google_test(test_program, subtest): 392 retcode = 1 393 return retcode 438 result.update(self._run_google_test(test_program, subtest)) 439 return result 394 440 395 441 def _run_test(self, test_program): … … 416 462 self._tests = [test for test in self._tests if self._should_run_test_program(test)] 417 463 418 crashed_tests = []419 failed_tests = []420 timed_out_tests = []464 crashed_tests = {} 465 failed_tests = {} 466 timed_out_tests = {} 421 467 try: 422 468 for test in self._tests: 423 exit_status_code = 0 424 try: 425 exit_status_code = self._run_test(test) 426 except TestTimeout: 427 sys.stdout.write("TEST: %s: TIMEOUT\n" % test) 428 sys.stdout.flush() 429 timed_out_tests.append(test) 430 431 if exit_status_code == -SIGSEGV: 432 sys.stdout.write("TEST: %s: CRASHED\n" % test) 433 sys.stdout.flush() 434 crashed_tests.append(test) 435 elif exit_status_code != 0: 436 failed_tests.append(test) 469 results = self._run_test(test) 470 for test_case, result in results.iteritems(): 471 if result == "FAIL": 472 failed_tests.setdefault(test, []).append(test_case) 473 elif result == "TIMEOUT": 474 timed_out_tests.setdefault(test, []).append(test_case) 475 elif result == "CRASH": 476 crashed_tests.setdefault(test, []).append(test_case) 437 477 finally: 438 478 self._tear_down_testing_environment() 439 479 440 480 if failed_tests: 441 names = [test.replace(self._test_programs_base_dir(), '', 1) for test in failed_tests] 442 sys.stdout.write("Tests failed (%d): %s\n" % (len(names), ", ".join(names))) 481 sys.stdout.write("\nUnexpected failures (%d)\n" % (sum(len(value) for value in failed_tests.itervalues()))) 482 for test in failed_tests: 483 sys.stdout.write(" %s\n" % (test.replace(self._test_programs_base_dir(), '', 1))) 484 for test_case in failed_tests[test]: 485 sys.stdout.write(" %s\n" % (test_case)) 443 486 sys.stdout.flush() 444 487 445 488 if crashed_tests: 446 names = [test.replace(self._test_programs_base_dir(), '', 1) for test in crashed_tests] 447 sys.stdout.write("Tests that crashed (%d): %s\n" % (len(names), ", ".join(names))) 489 sys.stdout.write("\nUnexpected crashes (%d)\n" % (sum(len(value) for value in crashed_tests.itervalues()))) 490 for test in crashed_tests: 491 sys.stdout.write(" %s\n" % (test.replace(self._test_programs_base_dir(), '', 1))) 492 for test_case in crashed_tests[test]: 493 sys.stdout.write(" %s\n" % (test_case)) 448 494 sys.stdout.flush() 449 495 450 496 if timed_out_tests: 451 names = [test.replace(self._test_programs_base_dir(), '', 1) for test in timed_out_tests] 452 sys.stdout.write("Tests that timed out (%d): %s\n" % (len(names), ", ".join(names))) 453 sys.stdout.flush() 454 455 if self._skipped_tests and self._options.skipped_action == 'skip': 456 sys.stdout.write("Tests skipped (%d):\n%s\n" % 457 (len(self._skipped_tests), 458 "\n".join([str(skipped) for skipped in self._skipped_tests]))) 497 sys.stdout.write("\nUnexpected timeouts (%d)\n" % (sum(len(value) for value in timed_out_tests.itervalues()))) 498 for test in timed_out_tests: 499 sys.stdout.write(" %s\n" % (test.replace(self._test_programs_base_dir(), '', 1))) 500 for test_case in timed_out_tests[test]: 501 sys.stdout.write(" %s\n" % (test_case)) 459 502 sys.stdout.flush() 460 503
Note: See TracChangeset
for help on using the changeset viewer.