

Further research with regression tests

Isolation of flaky tests

Flaky tests are a nightmare for systematic analysis and enhancement of the regression tests. In many cases it was nearly impossible to draw reasonable conclusions from the test results without tedious manual analysis. The goal of this project is to systematically assess how flaky the tests are. Can we reliably separate the test cases that are usually flaky from those that never are? If so, we could create more intelligent test sets that behave consistently and are therefore more predictable. Currently we are working on:

  • Randomize the order of test execution and observe the differences in the results (run-webkit-tests --randomize-order)
  • Execute all tests individually and compare the results with those of batch execution (run-webkit-tests --run-singly)
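Once several such runs have been collected, the stable/flaky distinction could be made concrete by classifying each test from its repeated outcomes. A minimal sketch; the result format and test names are illustrative, not the actual run-webkit-tests output:

```python
"""Classify tests as stable-pass, stable-fail, or flaky from repeated runs."""
from collections import defaultdict

def classify(results_per_run):
    """results_per_run: list of {test_name: passed?} dicts, one per run.

    A test that ever produced both outcomes is flagged as flaky;
    everything else is stable in one direction.
    """
    outcomes = defaultdict(set)
    for run in results_per_run:
        for test, passed in run.items():
            outcomes[test].add(passed)
    classes = {}
    for test, seen in outcomes.items():
        if seen == {True}:
            classes[test] = "stable-pass"
        elif seen == {False}:
            classes[test] = "stable-fail"
        else:
            classes[test] = "flaky"
    return classes

# Simulated results from two runs (hypothetical data).
runs = [
    {"a.html": True, "b.html": False, "c.html": True},
    {"a.html": True, "b.html": False, "c.html": False},
]
print(classify(runs))
# {'a.html': 'stable-pass', 'b.html': 'stable-fail', 'c.html': 'flaky'}
```

With enough randomized and single-run repetitions, the "flaky" bucket is exactly the set of tests to quarantine from the consistent, predictable test set.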

Invariant analysis of top-100 websites

Layout tests could be extended with real online websites, such as the top-100 (or top-300) sites, by using invariant analysis, which hides changing content (see Top100Invariant).
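A much-simplified illustration of the invariant idea: take several snapshots of the same page and mask every line that differs between them, keeping only the stable ("invariant") content for comparison. Real invariant analysis of rendered pages is considerably more involved; the positional line-by-line comparison here is an assumption made for illustration:

```python
def invariant_lines(snapshots):
    """Keep only lines identical across all snapshots; mask the rest.

    Each snapshot is a page's rendered text split into lines. Changing
    content (ads, timestamps, news items) is replaced with a marker so
    later regression comparisons ignore it.
    """
    first = snapshots[0]
    masked = []
    for i, line in enumerate(first):
        if all(len(s) > i and s[i] == line for s in snapshots[1:]):
            masked.append(line)
        else:
            masked.append("<MASKED>")
    return masked

# Two hypothetical snapshots of the same page taken at different times.
s1 = ["<h1>News</h1>", "Story: storm hits coast", "Ad #1234"]
s2 = ["<h1>News</h1>", "Story: storm hits coast", "Ad #9876"]
print(invariant_lines([s1, s2]))
# ['<h1>News</h1>', 'Story: storm hits coast', '<MASKED>']
```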

Other research ideas:

  • Adding static code metrics and code warnings to EWS
  • Analyze which are the ‘good’ tests (those that often fail). Employ different methods to analyze the relationship between the failing test cases and any available data such as code coverage and code metrics.
  • Research on moving the test coverage measurement and selection to class level.
  • Research on prioritization based on execution times.
  • Research on code complexity and prioritization of tests based on it. Code complexity traditionally correlates strongly with the optimal number of test cases, so it is potentially a good basis for test prioritization.
  • Change impact analysis as a service to developers. This will enable developers to check the impact of their modifications, review impacted elements and possibly eliminate bugs before the patch lands in the repository.
  • Branch-level coverage. A fine-grained coverage measurement that reflects the internal structure of methods. Branch-level coverage measurement is more reliable than method-level coverage, but it is also more complex to implement.
  • Bugzilla analysis. The bug database could be processed and searched for some relevant information that could aid the impact analysis and defect analysis tasks.
  • Research on test creation. A tool that could give at least a hint on how to test a specific piece of code would be very helpful to the developers when developing new test cases.
  • Research on static analysis results. Experiments can be conducted to investigate, using statistical analysis, the relationship between code complexity and test results, and the effect of using class-level instead of method-level information on test case selection.
  • Monitoring of code and test quality attributes using a persistence database.
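Several of the prioritization ideas above (coverage-based selection, execution-time-based and complexity-based ordering) share a common greedy core: repeatedly pick the test with the best payoff per unit cost. A minimal sketch, with hypothetical coverage and timing data standing in for real instrumentation output:

```python
def prioritize(tests):
    """Greedy prioritization: repeatedly pick the test that covers the
    most not-yet-covered methods per second of execution time.

    `tests` maps test name -> (set of covered methods, runtime in seconds).
    """
    remaining = dict(tests)
    covered = set()
    order = []
    while remaining:
        def gain(name):
            cov, secs = remaining[name]
            return len(cov - covered) / secs
        best = max(remaining, key=gain)
        order.append(best)
        covered |= remaining.pop(best)[0]
    return order

# Illustrative data: a fast test, a slow superset test, a disjoint test.
tests = {
    "fast":  ({"m1", "m2"}, 1.0),
    "slow":  ({"m1", "m2", "m3"}, 10.0),
    "other": ({"m4"}, 2.0),
}
print(prioritize(tests))
# ['fast', 'other', 'slow']
```

The cost term could just as well be a complexity score instead of runtime; the structure of the selection loop stays the same.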