== squirrelfish ==

'''Major bits left over:'''

 * Getters and Setters
 * toString/valueOf
 * It occurs to me that both (s|g)etters and toString could be "trivially" handled through the standard js -> native -> js call model -- at least in the short-to-medium term
 * eval
 * proper values for "this"

'''List of tasks in no order -- pick one, tell everyone what you're doing: '''

'''Geoff is working on: '''

Re-entry into Machine::privateExecute:
 * More indirection in the register file
 * Don't overwrite nested register frames when re-entering global execution
 * Provide API for re-entering non-global execution (i.e., function callbacks)

Arguments object

'''Cameron is working on: '''

Investigating performance regressions caused by the introduction of new opcodes. In particular, a simple for loop with no body regresses by about 25%. At first this seemed to be because these instructions call arbitrary external functions, but some odd performance differences still occur without these calls.

'''Done''': the regression seems to be related to inlining of large function bodies into Machine::privateExecute(). To solve this, we have to remove ALWAYS_INLINE from some larger functions, and move some opcodes out into individual functions marked NEVER_INLINE. This is suboptimal, and we still don't know exactly why it happens, but hopefully it will be easy enough to work around for now. Geoff and I worked this out and he landed [http://trac.webkit.org/projects/webkit/changeset/31277 r31277], which fixes the problems we have seen thus far.

Better code generation. We have been pondering whether to have a separate peephole optimization pass or to incorporate peephole optimization into code generation. Either way, we should look at some code generation algorithms based on tile matching. We also want to choose an approach that will be compatible with planned extensions, e.g. superinstructions.
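
As a rough illustration of the "separate peephole pass" option, the sketch below fuses a less followed by a jtrue on the same register into a single jless. The instruction layout, the `Op` enum and `peephole()` are hypothetical and not taken from the squirrelfish CodeGenerator. It also assumes that the temporary written by less is dead after the jump and that nothing jumps directly to the jtrue.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical three-operand instruction; not the real squirrelfish encoding.
enum class Op { Less, JTrue, JLess, Other };

struct Instruction {
    Op op;
    int a; // Less: dst register | JTrue: condition register | JLess: lhs register
    int b; // Less: lhs register | JTrue: jump target        | JLess: rhs register
    int c; // Less: rhs register | JTrue: unused             | JLess: jump target
};

// Separate peephole pass: rewrite "less t, x, y; jtrue t, L" into "jless x, y, L".
// Assumes t is a temporary that is dead after the jump and that nothing jumps
// to the jtrue itself. Jump targets are treated as opaque labels, so removing
// an instruction does not require patching them in this sketch.
std::vector<Instruction> peephole(const std::vector<Instruction>& in)
{
    std::vector<Instruction> out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        bool fusable = in[i].op == Op::Less
            && i + 1 < in.size()
            && in[i + 1].op == Op::JTrue
            && in[i + 1].a == in[i].a; // jump tests the register less just wrote
        if (fusable) {
            out.push_back({ Op::JLess, in[i].b, in[i].c, in[i + 1].b });
            ++i; // skip the consumed jtrue
        } else {
            out.push_back(in[i]);
        }
    }
    return out;
}
```

Folding the same match into code generation would avoid the second walk over the instruction stream, at the cost of the generator having to remember the previously emitted instruction.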

Function-related nodes

'''Oliver is working on: '''

Making VM throw exceptions for invalid behaviour; Finally blocks.

'''Sam is working on (when he sees fit to do so): '''

Implement more "emitCode" functions, along with support in the CodeGenerator and the Machine. IfNode would be a good place to start.

'''Maciej is working on: '''

Bracket nodes

'''You could take something from Geoff, or make something up yourself, or do one of these:'''

Leftover opcodes:
 * ArgumentsNode
 * AssignErrorNode
 * BreakpointCheckStatement
 * ConstDeclNode
 * ConstStatementNode
 * ElementNode
 * EvalFunctionCallNode
 * FunctionCallResolveNode (still needs to be fixed to handle thisObj correctly)
 * NewExprNode (still needs to be fixed to work with js functions)
 * ParameterNode
 * PostDecConstNode
 * PostfixErrorNode
 * PreDecConstNode
 * PrefixErrorNode
 * PropertyNode
 * ReadModifyConstNode
 * ThrowNode
 * TryNode

Where SunSpider tests currently fail codegen:
 * 3d-cube: ungettableGetter
 * 3d-morph: ungettableGetter
 * 3d-raytrace: crash on failing toObject conversion for base of call (bad Args to constructor)
 * access-binary-trees: crash on failing toObject conversion for base of call (bad Args to constructor)
 * access-fannkuch: crash on failing toObject conversion for base of call (bad Args to constructor)
 * access-nbody: new expr (NBodySystem)
 * access-nsieve: ungettableGetter
 * bitops-3bit-bits-in-byte: SUCCESS
 * bitops-bits-in-byte: SUCCESS
 * bitops-bitwise-and: SUCCESS
 * bitops-nsieve-bits: ungettableGetter
 * controlflow-recursive: ungettableGetter
 * crypto-aes: crash on failing toObject conversion for base of call (bad Args to constructor)
 * crypto-md5: ungettableGetter
 * crypto-sha1: ungettableGetter
 * date-format-tofte: local eval - eval(ia[ij] + "()")
 * date-format-xparb: bracket call - this[func]()
 * math-cordic: ungettableGetter
 * math-partial-sums: crash on failing toObject conversion for base of call (bad Args to constructor)
 * math-spectral-norm: ungettableGetter
 * regexp-dna: regexp literal
 * string-base64: crash on failing toObject conversion for base of call (bad Args to constructor)
 * string-fasta: crash on failing toObject conversion for base of call (bad Args to constructor)
 * string-tagcloud: trying to get from a base register containing 0x0 (bad codegen?)
 * string-unpack-code: crash on failing toObject conversion for base of call (bad Args to constructor)
 * string-validate-input: regexp literal

Harness failures:
 * sunspider-analyze-results: KJS::resolve SHOULD NEVER BE REACHED
 * sunspider-compare-results: SUCCESS (but not running yet)
 * sunspider-standalone-compare: KJS::resolve SHOULD NEVER BE REACHED
 * sunspider-standalone-driver: new expr (invoking Array constructor)

Optimize dynamic scopes that aren't closures not to save the environment on return

Statically detect presence of "with" and/or "catch" in the parser.

Evaluation of a script is supposed to produce a value. This requires storing the value of the last value-producing statement to execute. We need to detect the last top-level value-producing statement in a program, and save its value. Basically, that just means passing an explicit "dst" register to its emitCode function.

Make const work -- const info has to go in the symbol table, so writes to const vars can turn to no-ops at compile time.

Recover lost optimizations:

{{{
optimized multiscope access
optimized access to global built-ins (http://trac.webkit.org/projects/webkit/changeset/31226)
static type inference of things like "numeric less than" and "string add"

// WARNING: If code generation wants to optimize resolves to parent scopes,
// it needs to be aware that, for functions that require activations,
// the scope chain is off by one, since the activation hasn't been pushed yet.
}}}

Is it safe for Lists to store a direct pointer to the register file? What if the register file reallocates?
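
To make the reallocation concern concrete, here is a small sketch that uses std::vector as a stand-in for the register file. The real register file and List classes are not shown on this page, so `Register`, `RegisterFile`, `RegisterRef` and `growAndCheckMoved()` are hypothetical names. The point it illustrates: a raw pointer into the buffer can dangle once the buffer grows, whereas a (file, index) pair stays meaningful.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-ins for illustration only; not the squirrelfish types.
using Register = long;
using RegisterFile = std::vector<Register>;

// Refers to a register by position instead of by address, so it survives
// any reallocation of the underlying buffer.
struct RegisterRef {
    RegisterFile* file;
    std::size_t index;
    Register& get() const { return (*file)[index]; }
};

// Grows the file past its capacity to force a new allocation. Returns true
// if the storage moved, in which case any old raw pointers are dangling.
bool growAndCheckMoved(RegisterFile& file)
{
    // Record the address as an integer so we never touch a stale pointer.
    std::uintptr_t before = reinterpret_cast<std::uintptr_t>(file.data());
    file.resize(file.capacity() * 2 + 1); // exceeds capacity, so must reallocate
    return reinterpret_cast<std::uintptr_t>(file.data()) != before;
}
```

An index costs one extra add per access compared with a direct pointer, which may matter on hot paths; the alternative is to fix up every outstanding pointer whenever the file moves.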

Verify that current function gets marked by virtue of being in the register file

Change conservative mark of register file to exact mark -- use zero fill plus type tagging to know whether to mark a register

Must mark all scope chains in all active scopes -- can do this by walking up the scopeChain pointers in the register file

Automatic conversion of "this" to global object doesn't work.

Phase out implementsCall in favor of all clients using an inline function that calls getCallData.

Phase out implementsConstruct in favor of all clients using an inline function that calls getConstructData.

For memory's sake, functions should probably shrink the register file when they return, but doing so causes a minor performance regression.

Turn built-in object construct functions non-virtual, since their callers inside the engine know their types.

Pointers to registers and labels become invalid if the register or label vector resizes.

Mark constant pools for global and eval code

Avoid copying the register file when adding globals by keeping spare capacity at the beginning of the register file, just like at the end.

GC mark for possibly uninitialized register file

Add relevant files to AllInOneFile.cpp. Remove irrelevant files.

If a nested program overwrites the global slot holding a currently executing function, the function won't be marked during GC

What things should go in dedicated local variables? CodeBlock::jsValues? CodeBlock::identifiers?

VarStatementNode should just be nixed in favor of AssignmentNode.

Remove ::execute, ::evaluate, ::optimizeVariableAccess

Future optimizations:

Find a way to put pre-capacity at the beginning of the register file, so we can add new global symbols without having to move or copy anything.

Use RefPtr to indicate use of register -- moves to un-refed registers should be stripped or consolidated to other instructions.
 - i++ => ++i
 - less, jtrue => jless

Optimize out redundant initializations of vars -- often, the var initialization will be dead code. Any read of a variable before init can statically become "load undefined".

A single run of SunSpider performs 1,191,803 var initializations.

-1 means "never happened"

var buckets: [846461] [40445] [350197] [9412] [7531] [50] [9] [178] [35000] [3] [1022] [3] [4499] [-1] [1353] [-1] [-1] [1851] [-1] [0] [-1] [0] [-1] [0] [-1] [0] (every remaining bucket is [-1])

fun buckets: [1297008] [7] [3] [2] [1] [0] [-1] [0] [-1] [-1] [-1] [-1] [-1] [-1] [-1] [0] [-1] [1] [-1] [-1] [0] [-1] [0] [-1] [-1] [-1] [-1] [-1] [999] (every remaining bucket is [-1])

For resolve-evaluate-put, we can have a { DontCare, Clean, Dirty } switch -- get slot and if DontCare, set clean, evaluate, set slot if clean

Instead of branching to see if you've emitted code, just start out with a stub that does that emitting when invoked.

Single, shared constant pool

At least for loops with fewer iterations it would probably be a win to duplicate the loop condition at the start and end of the loop

Perhaps we should have a distinguished "condition code" register for expressions in a boolean context. For relational and logical operators we can output directly to the condition code register; for other opcodes you get an extra instruction. Jump instructions can read implicitly from the condition code. That avoids having less write to r0; it just puts a bool in the condition code register.

Can't you just make all opcodes have variants that use constant table operands directly?

A named function expression can just enter its name into the symbol table instead of adding an object to the scope chain.

Shrink instructions -- usually, don't need a whole word to store int values. Perhaps use tagging of opcodes to encode the first operand. Special work-around instructions when whole words are needed

GCC is crazy:

{{{
For the program

for (var i = 0; i < 100000000; ++i) ;

at r31276 of the squirrelfish branch, adding the line

Machine.cpp:354 scopeChain = new (&returnInfo[6]) ScopeChain(function->scope()); // scope chain for this activation

causes a ~25% slowdown

We should write a reduction of this issue for the compiler team, and see what they have to say
}}}

{{{
Revision 31432 was a 1.4% performance regression because it moved the register vector from a local to a parameter. Making the register vector a data member has the same effect. WTF?
}}}

{{{
Exception handling throw logic has to pass vPC to a function, and assign the result to vPC, e.g.

    if (!(vPC = throwException(codeBlock, k, scopeChain, registers, r, vPC)))

But this causes a 25% regression on the above empty-for-loop test, despite never being hit. To avoid this we need to do:

void* throwTarget;
...
void Machine::privateExecute(..) {
    ...
    // in address table initialiser
    throwTarget = &&gcc_dependency_hack;
    ...
    BEGIN_OPCODE(op_throw) {
        ...
        if (!(exceptionTarget = throwException(codeBlock, k, scopeChain, registers, r, vPC))) {
            ...
        }
        ...
        goto *throwTarget;
    }
    gcc_dependency_hack: {
        vPC = exceptionTarget;
        NEXT_OPCODE;
    }
}

Without this _indirect_ goto we get a 25% regression; if we use a direct goto we still get an 18% regression.
}}}
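
For readers who want to experiment with the shape of this workaround outside the Machine, the following is a minimal, self-contained sketch of a direct-threaded loop in which the throw opcode leaves through an indirect goto. Everything in it (`Opcode`, `findHandler()`, `run()`, the label names) is invented for illustration. It relies on the GCC/Clang "labels as values" extension and makes no claim to reproduce the regression numbers above.

```cpp
#include <cassert>

// Hypothetical miniature interpreter; names are invented for illustration.
// Requires the GCC/Clang "labels as values" extension (&&label, goto *ptr).
enum Opcode { OpInc, OpThrow, OpEnd };

// Global, as in the notes above, and set once before dispatch begins.
static void* throwTarget;

// Stand-in for throwException(): returns the handler's instruction index,
// or -1 when nothing catches the exception.
static int findHandler(int handlerIndex) { return handlerIndex; }

int run(const int* code, int handlerIndex)
{
    // All locals are declared before any label so no goto skips an initialization.
    void* dispatch[] = { &&op_inc, &&op_throw, &&op_end };
    int pc = 0;
    int acc = 0;
    int exceptionTarget = -1;

    throwTarget = &&resume_at_handler; // done alongside address table setup
    goto *dispatch[code[pc]];

op_inc:
    ++acc;
    ++pc;
    goto *dispatch[code[pc]];

op_throw:
    exceptionTarget = findHandler(handlerIndex);
    if (exceptionTarget < 0)
        return -1; // uncaught
    goto *throwTarget; // indirect jump keeps the pc update out of the opcode body

resume_at_handler:
    pc = exceptionTarget;
    goto *dispatch[code[pc]];

op_end:
    return acc;
}
```

Running the program `{ OpInc, OpThrow, OpInc, OpInc, OpEnd }` with a handler at index 3 increments once, jumps to the handler, increments once more, and returns 2. Passing -1 as the handler index takes the uncaught path and returns -1.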