Changeset 96462 in webkit

Timestamp:

Oct 1, 2011 2:58:45 PM (13 years ago)

Author:

fpizlo@apple.com

Message:

Bencher script makes it difficult to do automated performance testing
https://bugs.webkit.org/show_bug.cgi?id=69207

Reviewed by Sam Weinig.

This adds two new features:

The ability to disable automatic VM detection, which is flaky if any
profiling features are enabled in jsc.

The ability to compute, and report, a scaled result for all benchmark
suites. It is the geometric mean of three numbers: SunSpider's
arithmetic mean, V8's geometric mean, and Kraken's arithmetic mean.
It is also possible to turn off all other output from bencher and just
get this number with the --brief option.

Scripts/bencher:
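The scaled result described in the message can be sketched as follows. This is a minimal illustration, not the code in bencher itself: the helper names (`arithmetic_mean`, `geometric_mean`, `scaled_result`) and the timings are made up for the example, and bencher's own `Stats` class is not reproduced.

```ruby
# Hypothetical helpers for illustration; bencher uses its own Stats class.
def arithmetic_mean(samples)
  samples.sum.to_f / samples.size
end

def geometric_mean(samples)
  # Average in log space, then exponentiate.
  Math.exp(samples.sum { |x| Math.log(x) } / samples.size)
end

# Geometric mean of each suite's preferred mean:
# arithmetic for SunSpider and Kraken, geometric for V8.
def scaled_result(sunspider_times, v8_times, kraken_times)
  geometric_mean([
    arithmetic_mean(sunspider_times),
    geometric_mean(v8_times),
    arithmetic_mean(kraken_times)
  ])
end

# Made-up timings, for illustration only.
puts scaled_result([10.0, 20.0], [4.0, 9.0], [100.0, 300.0])
```

Taking a geometric mean across suites keeps a suite with large absolute timings (such as Kraken) from dominating the combined number.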

Location:

trunk/Tools

Files:

2 edited

ChangeLog (modified) (1 diff)
Scripts/bencher (modified) (24 diffs)

trunk/Tools/ChangeLog

-                      r96460
+                      r96462
+2011-10-01  Filip Pizlo  <fpizlo@apple.com>
+        Bencher script makes it difficult to do automated performance testing
+        https://bugs.webkit.org/show_bug.cgi?id=69207
+        Reviewed by Sam Weinig.
+        This adds two new features:
+        The ability to disable automatic VM detection, which is flaky if any
+        profiling features are enabled in jsc.
+        The ability to compute, and report, a scaled result for all benchmark
+        suites. It is the geometric mean of three numbers: SunSpider's
+        arithmetic mean, V8's geometric mean, and Kraken's arithmetic mean.
+        It is also possible to turn off all other output from bencher and just
+        get this number with the --brief option.
+        * Scripts/bencher:
 2011-10-01  Sam Weinig  <sam@webkit.org>

trunk/Tools/Scripts/bencher

-                      r94103
+                      r96462
 $timeMode=:auto
 $keepFiles=false
+$forceVMKind=nil
+$brief=false
 # Helpful functions and classes
 …
   puts "                     are 'preciseTime', 'date', and 'auto'.  Default is"
   puts "                     'auto', which automatically detects the best way."
+  puts "--force-vm-kind      Turn off auto-detection of VM kind, and assume that it is"
+  puts "                     the one specified.  Valid arguments are 'jsc' or"
+  puts "                     'DumpRenderTree'."
   puts "--v8-only            Only run V8."
   puts "--sunspider-only     Only run SunSpider."
 …
   puts "--keep-files         Keep temporary files.  Useful for debugging."
   puts "--verbose or -v      Print more stuff."
+  puts "--brief              Print only the final result for each VM."
   puts "--help or -h         Display this message."
   puts
 …
     @nameKind = nameKind
-    Tempfile.open("bencher-vmtest") {
-      | file |
-      file.puts "print(\"here\");"
-      file.flush
-      result = nil
-      @vmType = :jsc
-      run(file.path) {
-        | inp |
-        result = inp.read
+    if $forceVMKind
+      @vmType = $forceVMKind
+    else
+      Tempfile.open("bencher-vmtest") {
+        | file |
+        file.puts "print(\"here\");"
+        file.flush
+        result = nil
+        @vmType = :jsc
+        run(file.path) {
+          | inp |
+          result = inp.read
+          $stderr.puts "stdout: #{result}" if $verbosity>=2
+        }
+        if result.chomp == "here"
+          $stderr.puts "#{@name} is definitely a jsc-style VM." if $verbosity>=1
+          @vmType = :jsc
+        else
+          $stderr.puts "Assuming that #{@name} is a DumpRenderTree-style VM." if $verbosity>=1
+          @vmType = :dumpRenderTree
+        end
+      }
-      if result.chomp == "here"
-        $stderr.puts "#{@name} is definitely a jsc-style VM." if $verbosity>=1
-        @vmType = :jsc
-      else
-        $stderr.puts "Assuming that #{@name} is a DumpRenderTree-style VM." if $verbosity>=1
-        @vmType = :dumpRenderTree
-      end
-    }
+    end
   end
 …
 class BenchmarkSuite
-  def initialize(name, path)
+  def initialize(name, path, preferredMean)
     @name = name
     @path = path
+    @preferredMean = preferredMean
     @benchmarks = []
   end
 …
       not yield benchmark
+    }
+  end
+  def preferredMean
+    @preferredMean
+  end
+  def computeMean(stat)
+    stat.send @preferredMean
   end
 end
 …
 def statsToStr(stats)
-  lpad(numToStr(stats.mean),11)+"+-"+rpad(numToStr(stats.confInt),9)
+  if $inner*$outer == 1
+    string = numToStr(stats.mean)
+    raise unless string =~ /\./
+    left = $~.pre_match
+    right = $~.post_match
+    lpad(left,12)+"."+rpad(right,9)
+  else
+    lpad(numToStr(stats.mean),11)+"+-"+rpad(numToStr(stats.confInt),9)
+  end
 end
 …
                  ['--exclude-kraken', GetoptLong::NO_ARGUMENT],
                  ['--benchmarks', GetoptLong::REQUIRED_ARGUMENT],
+                 ['--force-vm-kind', GetoptLong::REQUIRED_ARGUMENT],
                  ['--load-once', GetoptLong::NO_ARGUMENT],
                  ['--keep-files', GetoptLong::NO_ARGUMENT],
                  ['--verbose', '-v', GetoptLong::NO_ARGUMENT],
+                 ['--brief', GetoptLong::NO_ARGUMENT],
                  ['--help', '-h', GetoptLong::NO_ARGUMENT]).each {
     | opt, arg |
 …
       else
         quickFail("Expected either 'preciseTime', 'date', or 'auto' for --time-mode, but got '#{arg}'.",
+                  "Invalid argument for command-line option")
+      end
+    when '--force-vm-kind'
+      if arg.upcase == "JSC"
+        $forceVMKind = :jsc
+      elsif arg.upcase == "DUMPRENDERTREE"
+        $forceVMKind = :dumpRenderTree
+      elsif arg.upcase == "AUTO"
+        $forceVMKind = nil
+      else
+        quickFail("Expected either 'jsc' or 'DumpRenderTree' for --force-vm-kind, but got '#{arg}'.",
                   "Invalid argument for command-line option")
       end
 …
     when '--verbose'
       $verbosity += 1
+    when '--brief'
+      $brief = true
     when '--help'
       usage
 …
   end
-  SUNSPIDER = BenchmarkSuite.new("SunSpider", SUNSPIDER_PATH)
+  SUNSPIDER = BenchmarkSuite.new("SunSpider", SUNSPIDER_PATH, :arithmeticMean)
   ["3d-cube", "3d-morph", "3d-raytrace", "access-binary-trees",
    "access-fannkuch", "access-nbody", "access-nsieve",
 …
+  }
-  V8 = BenchmarkSuite.new("V8", V8_PATH)
+  V8 = BenchmarkSuite.new("V8", V8_PATH, :geometricMean)
   ["crypto", "deltablue", "earley-boyer", "raytrace",
    "regexp", "richards", "splay"].each {
 …
+  }
-  KRAKEN = BenchmarkSuite.new("Kraken", KRAKEN_PATH)
+  KRAKEN = BenchmarkSuite.new("Kraken", KRAKEN_PATH, :arithmeticMean)
   ["ai-astar", "audio-beat-detection", "audio-dft", "audio-fft",
    "audio-oscillator", "imaging-darkroom", "imaging-desaturate",
 …
     $suitesOnVMsForSuite[suite] = []
+  }
+  $suitesOnVMsForVM = {}
+  $vms.each {
+    | vm |
+    $suitesOnVMsForVM[vm] = []
+  }
   $benchmarksOnVMs = []
 …
       $suitesOnVMs << suiteOnVM
       $suitesOnVMsForSuite[suite] << suiteOnVM
+      $suitesOnVMsForVM[vm] << suiteOnVM
       suite.benchmarks.each {
         | benchmark |
 …
   $benchpad = ($benchmarks +
-               ["<arithmetic>", "<geometric>", "<harmonic>"]).collect {
+               ["<arithmetic> *", "<geometric> *", "<harmonic> *"]).collect {
     | benchmark |
     benchmark.to_s.size
 …
     vm.to_s.size
   }.max + 1
+  unless $brief
+    3.times {
+      | idx |
+      $stderr.print "\rStarting in #{3-idx}..."
+      $stderr.flush
+      sleep 1
+    }
+    $stderr.print "\r                       \r"
+    $stderr.flush
+  end
   $plans.each_with_index {
     | plan, idx |
-    if $verbosity == 0
+    if $verbosity == 0 and not $brief
       text1 = lpad(idx.to_s,$plans.size.to_s.size)+"/"+$plans.size.to_s
       text2 = plan.suite.to_s+"/"+plan.benchmark.to_s+"/"+plan.vm.to_s
 …
+  }
-  if $verbosity == 0
+  if $verbosity == 0 and not $brief
     $stderr.print "\r#{$plans.size}/#{$plans.size} #{' '*($suitepad+1+$benchpad+1+$vmpad)}"
     $stderr.puts "\r#{$plans.size}/#{$plans.size}"
   end
+  # Compute the geomean of the preferred means of results on a SuiteOnVM
+  $overallResults = []
+  $vms.each {
+    | vm |
+    result = Stats.new
+    $outer.times {
+      | outerIndex |
+      $inner.times {
+        | innerIndex |
+        curResult = Stats.new
+        $suitesOnVMsForVM[vm].each {
+          | suiteOnVM |
+          # For a given iteration, suite, and VM, compute the suite's preferred mean
+          # over the data collected for all benchmarks in that suite. We'll have one
+          # sample per benchmark. For example on V8 this will be the geomean of 1
+          # sample for crypto, 1 sample for deltablue, and so on, and 1 sample for
+          # splay.
+          curResult.add(suiteOnVM.suite.computeMean(suiteOnVM.statsForIteration(outerIndex, innerIndex)))
+        }
+        # curResult now holds 1 sample for each of the means computed in the above
+        # loop. Compute the geomean over this, and store it.
+        result.add(curResult.geometricMean)
+      }
+    }
+    # $overallResults will have a Stats for each VM. That Stats object will hold
+    # $inner*$outer geomeans, allowing us to compute the arithmetic mean and
+    # confidence interval of the geomeans of preferred means. Convoluted, but
+    # useful and probably sound.
+    $overallResults << result
+  }
   if $verbosity >= 2
 …
+    }
   end
   reportName =
     (if ($vms.collect {
 …
      end) +
     "_benchReport.txt"
-  $stderr.puts "Generating benchmark report at #{reportName}"
+  unless $brief
+    $stderr.puts "Generating benchmark report at #{reportName}"
+  end
   outp = $stdout
 …
   end
-  def allSummaryStats(outp, accumulators)
-    summaryStats(outp, accumulators, "<arithmetic>") {
+  def meanName(currentMean, preferredMean)
+    result = "<#{currentMean}>"
+    if "#{currentMean}Mean" == preferredMean.to_s
+      result += " *"
+    end
+    result
+  end
+  def allSummaryStats(outp, accumulators, preferredMean)
+    summaryStats(outp, accumulators, meanName("arithmetic", preferredMean)) {
       | stat |
       stat.arithmeticMean
+    }
-    summaryStats(outp, accumulators, "<geometric>") {
+    summaryStats(outp, accumulators, meanName("geometric", preferredMean)) {
       | stat |
       stat.geometricMean
+    }
-    summaryStats(outp, accumulators, "<harmonic>") {
+    summaryStats(outp, accumulators, meanName("harmonic", preferredMean)) {
       | stat |
       stat.harmonicMean
 …
+    }
     outp.puts
-    allSummaryStats(outp, $suitesOnVMsForSuite[suite])
+    allSummaryStats(outp, $suitesOnVMsForSuite[suite], suite.preferredMean)
     outp.puts if $suites.size > 1
+  }
 …
     printVMs(outp)
     outp.puts "All benchmarks:"
-    allSummaryStats(outp, $vms)
-  end
-  if outp != $stdout
+    allSummaryStats(outp, $vms, nil)
+    outp.puts
+    printVMs(outp)
+    outp.puts "Geomean of preferred means:"
+    outp.print "   "
+    outp.print rpad("<scaled-result>", $benchpad)
+    outp.print " "
+    $vms.size.times {
+      | index |
+      if index != 0
+        outp.print " "+$overallResults[index].compareTo($overallResults[index-1]).shortForm
+      end
+      outp.print statsToStr($overallResults[index])
+    }
+    if $overallResults.size>=2
+      outp.print("    "+$overallResults[-1].compareTo($overallResults[0]).to_s)
+    end
+    outp.puts
+    outp.puts
+  end
+  if outp != $stdout and not $brief
     outp.close
     puts
 …
   end
+  if $brief
+    puts($overallResults.collect{|stats| stats.mean}.join("\t"))
+    puts($overallResults.collect{|stats| stats.confInt}.join("\t"))
+  end
 rescue => e
   fail(e)
