Changes between Initial Version and Version 1 of b3Discussion2015

Nov 12, 2015 10:07:36 AM

added b3 discussion notes


     Ars Technica data - JSC is much faster on similar CPUs when drawing to the screen, slightly slower for purely computational tasks
     FTL motivation: use a C-like compiler to do final optimizations
     when compiling with FTL, 10-50x as much time is spent in llvm as in JSC code
     b3 goal: reduce compile time by 5x, which would increase scores on the benchmarks we currently lose
     wrote a new compiler from scratch, 10x faster compile time than llvm
     b3 uses a better instruction selector and register allocator
     llvm's instruction selector accounts for most of its compile time
     not done yet, will probably have measurement data within a month
     targeting all 64-bit architectures; right now it works best on x86_64, arm64 is in progress
     b3 IR has two div instructions; chillDiv does double division then converts to int, arm div is more like chillDiv
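A minimal sketch of what "double division then convert to int" means for 32-bit operands, in JavaScript terms roughly `(a / b) | 0`. The function names are hypothetical, and the two edge cases (x / 0 gives 0, INT32_MIN / -1 wraps back to INT32_MIN) are stated here as an assumption about chillDiv rather than something the notes spell out. The result is modeled with integer arithmetic instead of real doubles.

```python
INT32_MIN = -2**31

def to_int32(x: int) -> int:
    """Wrap an arbitrary integer to signed 32-bit two's complement."""
    x &= 0xFFFFFFFF
    return x - 2**32 if x >= 2**31 else x

def chill_div(a: int, b: int) -> int:
    """Hypothetical model of a 'chill' int32 division: never traps."""
    if b == 0:
        # Double division would give Infinity or NaN, which converts to 0.
        return 0
    # Truncate toward zero, like converting a double quotient to int.
    q = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        q = -q
    # INT32_MIN / -1 overflows int32; wrapping gives INT32_MIN again.
    return to_int32(q)
```

An ordinary div would leave both edge cases undefined or trapping, which is presumably why the IR keeps the two instructions separate.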
     JSC regex engine slows down JetStream and Octane2 benchmarks
     Kraken should speed up
     llvm doesn’t do or want tail duplication optimization, which would speed up JSC
     parallel compiling with llvm has a memory bottleneck, but works ok on computers with lots of CPU cores
     Octane2 needs better garbage collection and better regex engine
     goal is to be faster than Edge on all benchmarks
     b3 - “barebones backend” - SSA (“static single assignment”) compiler, like llvm
     air - assembly intermediate representation
       register allocation, macro assembler
     “bacon, butter, biscuit” - appetizer at a restaurant in Campbell
     air probably takes more memory than b3
     b3 is lowered to air
     b3 is the equivalent of llvm IR
     air is the equivalent of MC
     calling conventions! We’re going to follow a calling convention, almost the C calling convention.
     use more registers for arguments to avoid storing to and loading from the stack, which slows down function calls
     plan to follow calling conventions so that LLINT or baseline JIT can be called from anywhere
     sometimes requires shuffling registers around so that parameters are in the correct registers
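The register shuffling mentioned above is a "parallel move" problem: all arguments have to land in their convention registers as if moved at once, even when the moves overlap. A minimal sketch follows, assuming a free scratch register is available. The register names and the `sequentialize` and `run` helpers are illustrative only, not JSC code.

```python
def sequentialize(moves: dict, scratch: str) -> list:
    """Turn a parallel move {dst: src} into an ordered list of (dst, src) movs.

    Assumes `scratch` is not a destination of any move.
    """
    # Moves from a register to itself need no instruction.
    pending = {dst: src for dst, src in moves.items() if dst != src}
    out = []
    while pending:
        sources = set(pending.values())
        # A destination nobody still reads from can be overwritten safely.
        ready = [dst for dst in pending if dst not in sources]
        if ready:
            dst = ready[0]
            out.append((dst, pending.pop(dst)))
        else:
            # Only cycles remain: save one register in the scratch register
            # and redirect its readers there, which breaks the cycle.
            dst = next(iter(pending))
            out.append((scratch, dst))
            pending = {d: (scratch if s == dst else s) for d, s in pending.items()}
    return out

def run(seq, regs):
    """Execute a mov sequence on a register file, for checking the result."""
    regs = dict(regs)
    for dst, src in seq:
        regs[dst] = regs[src]
    return regs
```

Swapping two argument registers, for example `sequentialize({"rdi": "rsi", "rsi": "rdi"}, "r11")`, needs three movs through the scratch register, while a simple chain needs no scratch at all.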
     JSC on 64-bit platforms dedicates two registers to tag values; these are callee-saved registers and need to be pushed and popped when using llvm, because llvm allocates registers differently than JSC
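To illustrate why keeping tag constants in registers pays off, here is a rough sketch of tag checks on a 64-bit boxed value. The constant values and helper names are assumptions made for illustration (modeled on a NaN-boxing-style encoding) and should not be read as JSC's exact definitions.

```python
# Assumed tag constants; a JIT would keep these two in dedicated registers.
TAG_TYPE_NUMBER = 0xFFFF000000000000
TAG_BIT_TYPE_OTHER = 0x2
TAG_MASK = TAG_TYPE_NUMBER | TAG_BIT_TYPE_OTHER

def box_int32(i: int) -> int:
    """Box a 32-bit integer by setting all the number tag bits."""
    return TAG_TYPE_NUMBER | (i & 0xFFFFFFFF)

def is_int32(v: int) -> bool:
    """An int32 has every number tag bit set."""
    return (v & TAG_TYPE_NUMBER) == TAG_TYPE_NUMBER

def is_cell(v: int) -> bool:
    """A heap pointer has no tag bits set at all."""
    return (v & TAG_MASK) == 0
```

Each check is a single AND plus a compare against a value that is already in a register, so generated code never has to materialize a 64-bit immediate for it.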
     coalesces unnecessary mov operations
     tail call optimizations (unrelated) allow recursive functions to run without growing the stack on each call
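As a sketch of the effect of tail call optimization: a call in tail position can reuse the current frame, which makes the recursion behave like a loop. Python itself does not perform this optimization, so the second function is written out by hand to show the shape an optimizer would effectively produce. Both function names are made up for the example.

```python
def sum_to(n: int, acc: int = 0) -> int:
    """Tail-recursive form: the recursive call is the last thing done."""
    if n == 0:
        return acc
    return sum_to(n - 1, acc + n)  # tail call; without TCO this adds a frame

def sum_to_tco(n: int, acc: int = 0) -> int:
    """Same computation with the tail call turned into a jump (a loop)."""
    while n != 0:
        # Rebind the parameters instead of pushing a new stack frame.
        n, acc = n - 1, acc + n
    return acc
```

The loop version handles arbitrarily large `n` in constant stack space, while the recursive version is limited by the stack depth when no tail call optimization is applied.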
     if a calling function knows which registers the callee uses, it could break the calling convention and do more optimizations, but if the callee is recompiled, everything that calls it would need to be recompiled too - this is what inlining is for
     armv7 doesn’t have enough registers for this optimization to be useful. i386 doesn’t have any argument registers for calls.  pizlo: If we needed this optimization enough, we could invent our own calling convention.
     pizlo: If a function is well-behaved, we should raise its inlining threshold
     profiling bug - need profiling on math
     rniwa: es6 slides