
Version 6 (modified by mjs@apple.com, 14 years ago)

--

Ideas for new optimizations for SquirrelFish

  • reduce cost of calling with too many arguments - at least one SunSpider test does this a lot. Some possibilities:
    • Allocate a few extra parameter slots for every function, so passing just a few extra arguments is cheap
    • If we can limit or remove the functionality of randomFunc.arguments, then for functions that do not themselves use arguments we could just overwrite the extra args.
  • Make variants of instructions that can read directly from the constant pool, to avoid all the "load" insns you get when dealing with constants.
  • Atomize constant strings, so any given string is only in the constant pool once.
  • Maybe sorting opcode implementations by frequency of use would make things faster.
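
As an illustration of the string atomization idea above, here is a minimal sketch of a constant pool that stores each distinct string once. The class ConstantPool and its methods are hypothetical names chosen for this example, not SquirrelFish's actual data structures.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical constant pool (an assumption for illustration only).
class ConstantPool {
public:
    // Returns the pool index for `s`, adding it only if it is not already present.
    std::size_t addString(const std::string& s)
    {
        auto it = m_indexByString.find(s);
        if (it != m_indexByString.end())
            return it->second; // Already atomized: reuse the existing slot.
        std::size_t index = m_strings.size();
        m_strings.push_back(s);
        m_indexByString.emplace(s, index);
        return index;
    }

    const std::string& stringAt(std::size_t index) const
    {
        assert(index < m_strings.size());
        return m_strings[index];
    }

    std::size_t size() const { return m_strings.size(); }

private:
    std::vector<std::string> m_strings;                           // Pool storage, one entry per distinct string.
    std::unordered_map<std::string, std::size_t> m_indexByString; // Lookup from string to its pool index.
};
```

With this shape, a code generator that sees the same property name many times would always emit the same pool index, so the pool grows with the number of distinct strings rather than the number of uses.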

Larger ideas:

  • Store primitive types directly in registers, along with type info. Tamarin's 64 bit NaN encoding trick may work well here. We could have both instructions that statically infer a specific type, and dynamic type inference which uses instructions that think a particular type is more likely and optimize for that case with checks.
  • Explicit vtables. Right now we use C++ virtual methods for polymorphic behavior of JS types. An explicit vtable could store per-type pure data as well as functions, turning some things that are currently virtual method calls into simple pointer derefs.
  • Better codegen framework. We don't have a great way to pick from one of several instructions; a "tile matching" algorithm may be a good approach. This could enable super-instructions, type-specialized instructions, and handling of the fact that you may want different codegen in value, condition and void contexts.
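
To make the first idea more concrete, here is a rough sketch of a 64-bit NaN-tagged value with an int32 fast path for addition. The class BoxedValue, the function add, and the specific tag bits are assumptions made for this example; they are not Tamarin's or JavaScriptCore's actual encoding. The sketch relies on the fact that, once all NaNs are collapsed to one canonical pattern, the remaining NaN bit patterns are free to carry a type tag and a payload.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical 64-bit boxed value; the tag layout below is an assumption.
class BoxedValue {
public:
    static BoxedValue fromDouble(double d)
    {
        // Collapse every NaN to one canonical pattern so that a computed NaN
        // can never collide with the tagged (non-double) bit patterns.
        if (d != d)
            return BoxedValue(kCanonicalNaN);
        std::uint64_t bits;
        std::memcpy(&bits, &d, sizeof bits);
        return BoxedValue(bits);
    }

    static BoxedValue fromInt32(std::int32_t i)
    {
        std::uint32_t payload;
        std::memcpy(&payload, &i, sizeof payload);
        return BoxedValue(kInt32Tag | payload);
    }

    bool isInt32() const { return (m_bits & kTagMask) == kInt32Tag; }
    bool isDouble() const { return !isInt32(); }

    std::int32_t asInt32() const
    {
        assert(isInt32());
        std::uint32_t payload = static_cast<std::uint32_t>(m_bits);
        std::int32_t i;
        std::memcpy(&i, &payload, sizeof i);
        return i;
    }

    double asDouble() const
    {
        assert(isDouble());
        double d;
        std::memcpy(&d, &m_bits, sizeof d);
        return d;
    }

    double toNumber() const { return isInt32() ? asInt32() : asDouble(); }

private:
    explicit BoxedValue(std::uint64_t bits) : m_bits(bits) { }

    static constexpr std::uint64_t kTagMask = 0xFFFF000000000000ULL;
    static constexpr std::uint64_t kInt32Tag = 0xFFF9000000000000ULL;     // A NaN pattern reserved for int32.
    static constexpr std::uint64_t kCanonicalNaN = 0x7FF8000000000000ULL; // The only NaN stored as a double.

    std::uint64_t m_bits;
};

// Addition that guesses "both operands are int32" and checks the guess,
// falling back to double arithmetic on a type mismatch or on overflow.
BoxedValue add(BoxedValue a, BoxedValue b)
{
    if (a.isInt32() && b.isInt32()) {
        std::int64_t sum = static_cast<std::int64_t>(a.asInt32()) + b.asInt32();
        if (sum >= std::numeric_limits<std::int32_t>::min()
            && sum <= std::numeric_limits<std::int32_t>::max())
            return BoxedValue::fromInt32(static_cast<std::int32_t>(sum));
        return BoxedValue::fromDouble(static_cast<double>(sum));
    }
    return BoxedValue::fromDouble(a.toNumber() + b.toNumber());
}
```

The add function is a small instance of the "optimize for the likely type, with checks" approach: the common case costs one tag comparison per operand, and no heap-allocated number cell is needed in either path.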

Analysis of SunSpider tests that show little improvement

These tests don't show as much improvement from SquirrelFish as expected (though in some cases they are as much as 7% improved).

3d-morph: Suffering from missing cross-scope access optimization and DontDelete global optimization (9.4% deep time in resolve()). Probably also suffering from lack of static type inference (lots of time in jsNumberCell, JSImmediate, NumberImp::toNumber, etc.).

access-nbody: Major factor seems to be lack of type specialization (lots of time in number-related stuff). A huge proportion of time is flat time in privateExecute. Some hit (1%?) from lack of multiscope.

date-format-tofte: Lots of time spent in parsing and code generation for eval. Can codegen itself be optimized? Also lots of time in makeFunction(); a big chunk of this is making the empty prototype for the function object, as well as setting the special properties (prototype, constructor, length). Perhaps those could be handled in a smarter way. There is also function call overhead for FuncDeclNode::makeFunction itself; perhaps it should be inlined. It is suspicious that call overhead for makeFunction would be a bottleneck: is it getting called more often than it should? Also taking some hit from lack of multiscope.

date-format-xparb: taking a 5% hit from lack of multiscope optimization. Also spending an awful lot of time in string appends.

math-partial-sums: Lots of time in resolve/resolveBase. (Lack of global opt + possible bug in test). Hit from lack of type specialization. This is taking a fair hit from the VM_EXCEPTION_CHECK call in put_prop_id. We should move this onto our new zero cost exception check scheme (pass vPC in and out etc).

regexp-dna: practically all time in this test is in the regexp engine. Will need regexp engine hacking to improve.

string-base64: 5% of time spent in slow global lookup. Other improvement opportunities: some form of immediate for one-char strings; avoid allocating a fresh StringInstance wrapper all the time.
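
One simple way to approximate the one-char string idea is a small cache of shared single-character strings. Note that this is a lookup table, which is a simpler technique than a true immediate encoding, and the names StringValue and SingleCharStringCache are hypothetical stand-ins rather than engine classes.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical stand-in for an engine's heap-allocated string value.
struct StringValue {
    std::string data;
};

// Hypothetical cache: hands out one shared StringValue per 8-bit character.
class SingleCharStringCache {
public:
    // Returns the shared StringValue for `c`, allocating it on first use only.
    const StringValue* get(unsigned char c)
    {
        std::unique_ptr<StringValue>& slot = m_table[c];
        if (!slot) {
            slot = std::make_unique<StringValue>(
                StringValue{std::string(1, static_cast<char>(c))});
            ++m_allocations;
        }
        return slot.get();
    }

    // Number of StringValue objects allocated so far.
    std::size_t allocations() const { return m_allocations; }

private:
    std::array<std::unique_ptr<StringValue>, 256> m_table;
    std::size_t m_allocations = 0;
};
```

In a loop that repeatedly extracts single characters, as base64 encoding does, repeated requests for the same character would then return the same object instead of allocating a new one each time.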

string-fasta:

string-tagcloud:

string-unpack-code:

string-validate-input: