wiki:b3Discussion2015

Geoff:
Ars Technica data - JSC is much faster on similar CPUs if we're drawing to the screen, slightly slower for purely computational tasks

pizlo:
FTL motivation: use a C-style compiler (LLVM) to do the final optimizations
10-50x as much time is spent in LLVM as in JSC's own code when compiling with FTL
B3 goal: reduce compile time by 5x; this would raise scores on the benchmarks we currently lose
wrote a new compiler from scratch; its compile time is 10x faster than LLVM's
B3 uses a better instruction selector and register allocator
LLVM's instruction selector accounts for most of its compile time
not done yet; probably will have measured data within a month
targeting all 64-bit architectures; right now it works best on x86_64, and arm64 support is in progress
B3 IR has two div instructions; ChillDiv does a double division and then converts to int, and ARM's integer div behaves more like ChillDiv (it doesn't trap on division by zero)
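Below is a minimal sketch of "chill" division as the note above reads: divide as doubles, then convert back to int32 roughly the way JavaScript's (a / b) | 0 would, so division by zero yields 0 instead of trapping. The name chillDiv32 and all details are illustrative, not JSC's actual implementation.

{{{#!cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Illustrative "chill" division: never traps.
int32_t chillDiv32(int32_t numerator, int32_t denominator)
{
    // Divide as doubles; int32 -> double conversion is exact, so no precision is lost.
    double quotient = static_cast<double>(numerator) / static_cast<double>(denominator);
    if (!std::isfinite(quotient))
        return 0; // x / 0 is +/-infinity (or NaN for 0 / 0); the "chill" result is 0.
    if (quotient >= 2147483648.0)
        return std::numeric_limits<int32_t>::min(); // INT_MIN / -1 wraps around, like ToInt32.
    return static_cast<int32_t>(quotient); // truncate toward zero
}
}}}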
JSC's regex engine slows down the JetStream and Octane2 benchmark scores
Kraken should speed up
LLVM doesn't do (or want) the tail duplication optimization, which would speed up JSC
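For context, here is a schematic, source-level illustration of what tail duplication does (not JSC code): the shared code after a branch is copied into each predecessor, which removes the merge point so each copy can be optimized using what is known on that path.

{{{#!cpp
// Before: both branches merge into one shared tail, so the compiler has to
// treat x as a double even on the path where it was an integer.
double beforeTailDup(bool isSmallInt, int intValue, double doubleValue)
{
    double x;
    if (isSmallInt)
        x = intValue;
    else
        x = doubleValue;
    return x * 2; // shared tail at the merge point
}

// After tail duplication: the tail is copied into each branch, so the integer
// path can use integer arithmetic only (assume intValue stays small).
double afterTailDup(bool isSmallInt, int intValue, double doubleValue)
{
    if (isSmallInt)
        return intValue * 2;  // specialized copy of the tail
    return doubleValue * 2;   // double copy of the tail
}
}}}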
parallel compilation with LLVM has a memory bottleneck, but it works OK on machines with lots of CPU cores
Octane2 needs better garbage collection and a better regex engine
goal is to be faster than Edge on all benchmarks
B3 - "Bare Bones Backend" - an SSA ("static single assignment") compiler like LLVM
Air - Assembly Intermediate Representation

Air handles register allocation and emits code through the macro assembler

"bacon, butter, biscuit" - appetizer at a restaurant in Campbell
Air probably takes more memory than B3
B3 is lowered to Air
B3 is the equivalent of LLVM IR
Air is the equivalent of LLVM MC
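To make the two levels concrete, here is a small illustrative sketch; the IR spellings in the comments are invented for illustration and are not actual B3 or Air syntax. B3 is an SSA IR in which every value is defined exactly once, while Air sits just above the machine, with two-operand instructions and explicit registers.

{{{#!cpp
// The source function being compiled.
int addMul(int a, int b, int c)
{
    return (a + b) * c;
}

// A B3-like SSA view of the same function (illustrative spelling):
//   v1 = Argument 0
//   v2 = Argument 1
//   v3 = Argument 2
//   v4 = Add(v1, v2)
//   v5 = Mul(v4, v3)
//   Return(v5)
//
// An Air-like lowering (illustrative, x86_64-flavored two-operand form):
//   Move32 %edi, %eax
//   Add32  %esi, %eax   // eax = eax + esi
//   Mul32  %edx, %eax   // eax = eax * edx
//   Ret32  %eax
}}}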

Michael:
calling conventions! We’re going to follow a calling convention. Almost C calling convention.
use more registers for arguments to avoid stores and loads on the stack, which slow down code when calling functions
plan to follow calling conventions so that the LLInt or baseline JIT can be called from anywhere
this sometimes requires shuffling values between registers so that parameters end up in the correct registers
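A sketch of that register shuffle, assuming it boils down to sequentializing a set of parallel register-to-register moves without clobbering a source that is still needed, and breaking cycles with a scratch register (it also assumes each destination register appears at most once). The Move struct, function names, and register names are illustrative, not JSC's Air code.

{{{#!cpp
#include <algorithm>
#include <cstdio>
#include <string>
#include <vector>

// One register-to-register move.
struct Move { std::string dst, src; };

// Order the moves so that no source is overwritten before it is read; break
// cycles (for example swapping two argument registers) via a scratch register.
std::vector<Move> sequentialize(std::vector<Move> pending, const std::string& scratch)
{
    std::vector<Move> ordered;
    pending.erase(std::remove_if(pending.begin(), pending.end(),
                      [](const Move& m) { return m.dst == m.src; }),
                  pending.end());
    while (!pending.empty()) {
        bool emitted = false;
        for (size_t i = 0; i < pending.size(); ++i) {
            bool dstStillRead = false;
            for (size_t j = 0; j < pending.size(); ++j) {
                if (j != i && pending[j].src == pending[i].dst)
                    dstStillRead = true;
            }
            if (!dstStillRead) {
                ordered.push_back(pending[i]); // safe: no other move still reads dst
                pending.erase(pending.begin() + i);
                emitted = true;
                break;
            }
        }
        if (!emitted) {
            // Every destination is still read by some other move: a cycle.
            // Save one destination into the scratch register and redirect its readers.
            Move blocked = pending.back();
            ordered.push_back({scratch, blocked.dst});
            for (Move& m : pending) {
                if (m.src == blocked.dst)
                    m.src = scratch;
            }
        }
    }
    return ordered;
}

int main()
{
    // Example: a call site that needs the values in rdi and rsi swapped.
    std::vector<Move> moves = { {"rdi", "rsi"}, {"rsi", "rdi"} };
    for (const Move& m : sequentialize(moves, "r11"))
        std::printf("mov %s <- %s\n", m.dst.c_str(), m.src.c_str());
    return 0;
}
}}}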
JSC on 64-bit platforms dedicates two registers to tag values; these are callee-saved registers that have to be pushed and popped when using LLVM, because LLVM's register allocation differs from JSC's
the register allocator coalesces unnecessary mov operations
tail call optimization (a separate topic) lets recursive functions run without growing the stack on each call.
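A quick illustration in plain C++ (nothing JSC-specific): when the recursive call is the last thing the function does, a compiler that performs tail call optimization can reuse the current stack frame instead of pushing a new one. C++ compilers are allowed but not required to do this.

{{{#!cpp
#include <cstdint>

// Tail-recursive sum of 1..n: the recursive call is in tail position, so with
// tail call optimization it becomes a loop and runs in constant stack space.
uint64_t sumTo(uint64_t n, uint64_t accumulator = 0)
{
    if (n == 0)
        return accumulator;
    return sumTo(n - 1, accumulator + n); // tail call: nothing left to do after it
}
}}}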
if the calling function knows about the register allocation inside the callee, it could break the calling convention and optimize more aggressively, but then if the callee is recompiled, everything that calls it would also need to be recompiled; this is what inlining is for
ARMv7 doesn't have enough registers for this optimization to be useful, and i386 doesn't pass any arguments in registers. pizlo: If we needed this optimization badly enough, we could invent our own calling convention.
pizlo: If a function is well-behaved, we should raise its inlining threshold
profiling bug - we need profiling on math operations

rniwa: ES6 slides