Changeset 278576 in WebKit


Timestamp:
Jun 7, 2021 3:51:25 PM
Author:
mark.lam@apple.com
Message:

Put the Baseline JIT prologue and op_loop_hint code in JIT thunks.
https://bugs.webkit.org/show_bug.cgi?id=226375

Reviewed by Keith Miller and Robin Morisset.

Baseline JIT prologue code varies in behavior based on several variables. These
variables include (1) whether the prologue does any arguments value profiling,
(2) whether the prologue is for a constructor, and (3) whether the compiled
CodeBlock's frame is so large that it exceeds the stack reserved zone (aka red
zone), which requires additional stack check logic.
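
For reference, these three variables are computed at JIT compile time as follows
(condensed from the new code in JIT::compileAndLinkWithoutFinalizing below):

    int frameTopOffset = stackPointerOffsetFor(m_codeBlock) * sizeof(Register);
    unsigned maxFrameSize = -frameTopOffset;

    bool doesProfiling = (m_codeBlock->codeType() == FunctionCode) && shouldEmitProfiling();
    bool isConstructor = m_codeBlock->isConstructor();
    bool hasHugeFrame = maxFrameSize > Options::reservedZoneSize();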

The pre-existing code generated specialized code based on these (and other)
variables. In converting the prologue to thunks, we opt not to convert these
specializations into runtime checks. Instead, the implementation uses 1 of 8
specialized thunks, which reduces the number of arguments that would otherwise
have to be passed for runtime checks. The only argument passed to the prologue
thunks is the codeBlock pointer.

There are 8 possible thunks because we specialize based on 3 variables:

  1. doesProfiling
  2. isConstructor
  3. hasHugeFrame

2^3 yields 8 permutations of prologue thunk specializations.
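
The thunk is selected by bit-packing the three variables into an index into a
static table of ThunkGenerators (from the new code in jit/JIT.cpp below):

    static inline unsigned prologueGeneratorSelector(bool doesProfiling, bool isConstructor, bool hasHugeFrame)
    {
        return doesProfiling << 2 | isConstructor << 1 | hasHugeFrame << 0;
    }

At the emit site, the JIT loads the codeBlock pointer into regT7 and emits a
near call to generators[prologueGeneratorSelector(...)].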

There are also 8 analogous arity fixup prologue thunks, specialized the same way.

The op_loop_hint thunk only takes 1 runtime argument: the bytecode offset.

We tried doing the loop_hint optimization check in the thunk (in order to move
both the fast and slow path into the thunk for maximum space savings). However,
this seemed to have a slight negative impact on benchmark performance. We ended
up keeping the fast path inline and instead having the slow path call a thunk to
do its work. This realizes the bulk of the size savings without the perf impact.
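
Concretely, the slow path now just materializes the bytecode offset in a
register and calls the thunk (condensed from the new emitSlow_op_loop_hint
below):

    constexpr GPRReg bytecodeOffsetGPR = regT7; // incoming argument to the thunk.
    move(TrustedImm32(m_bytecodeIndex.offset()), bytecodeOffsetGPR);
    emitNakedNearCall(vm().getCTIStub(op_loop_hint_Generator).retaggedCode<NoPtrTag>());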

This patch also optimizes op_enter a bit more by eliminating the need to pass any
arguments to the thunk. The thunk previously took 2 arguments: localsToInit and
canBeOptimized. localsToInit is now computed in the thunk at runtime, and
canBeOptimized is used as a specialization argument to generate 2 variants of the
op_enter thunk: op_enter_canBeOptimized_Generator and op_enter_cannotBeOptimized_Generator,
thereby removing the need to pass it as a runtime argument.
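
The emit site then just picks between the two specializations (from the new
emit_op_enter below):

    ThunkGenerator generator = canBeOptimized() ? op_enter_canBeOptimized_Generator : op_enter_cannotBeOptimized_Generator;
    emitNakedNearCall(vm().getCTIStub(generator).retaggedCode<NoPtrTag>());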

LinkBuffer size results (from a single run of Speedometer2):

      BaselineJIT: 93319628 (88.996532 MB)   => 83851824 (79.967331 MB)   0.90x
    ExtraCTIThunk: 5992 (5.851562 KB)        => 6984 (6.820312 KB)        1.17x
                   ...
            Total: 197530008 (188.379295 MB) => 188459444 (179.728931 MB) 0.95x

Speedometer2 and JetStream2 results (as measured on an M1 Mac) are neutral.

  • assembler/AbstractMacroAssembler.h:

(JSC::AbstractMacroAssembler::untagReturnAddressWithoutExtraValidation):

  • assembler/MacroAssemblerARM64E.h:

(JSC::MacroAssemblerARM64E::untagReturnAddress):
(JSC::MacroAssemblerARM64E::untagReturnAddressWithoutExtraValidation):

  • assembler/MacroAssemblerARMv7.h:

(JSC::MacroAssemblerARMv7::branchAdd32):

  • assembler/MacroAssemblerMIPS.h:

(JSC::MacroAssemblerMIPS::branchAdd32):

  • bytecode/CodeBlock.h:

(JSC::CodeBlock::offsetOfNumCalleeLocals):
(JSC::CodeBlock::offsetOfNumVars):
(JSC::CodeBlock::offsetOfArgumentValueProfiles):
(JSC::CodeBlock::offsetOfShouldAlwaysBeInlined):

  • jit/AssemblyHelpers.h:

(JSC::AssemblyHelpers::emitSaveCalleeSavesFor):
(JSC::AssemblyHelpers::emitSaveCalleeSavesForBaselineJIT):
(JSC::AssemblyHelpers::emitRestoreCalleeSavesForBaselineJIT):

  • jit/JIT.cpp:

(JSC::JIT::compileAndLinkWithoutFinalizing):
(JSC::JIT::prologueGenerator):
(JSC::JIT::arityFixupPrologueGenerator):
(JSC::JIT::privateCompileExceptionHandlers):

  • jit/JIT.h:
  • jit/JITInlines.h:

(JSC::JIT::emitNakedNearCall):

  • jit/JITOpcodes.cpp:

(JSC::JIT::op_ret_handlerGenerator):
(JSC::JIT::emit_op_enter):
(JSC::JIT::op_enter_Generator):
(JSC::JIT::op_enter_canBeOptimized_Generator):
(JSC::JIT::op_enter_cannotBeOptimized_Generator):
(JSC::JIT::emit_op_loop_hint):
(JSC::JIT::emitSlow_op_loop_hint):
(JSC::JIT::op_loop_hint_Generator):
(JSC::JIT::op_enter_handlerGenerator): Deleted.

  • jit/JITOpcodes32_64.cpp:

(JSC::JIT::emit_op_enter):

  • jit/ThunkGenerators.cpp:

(JSC::popThunkStackPreservesAndHandleExceptionGenerator):

Location:
trunk/Source/JavaScriptCore
Files:
13 edited

Legend (for the diffs below):

+ added line
- removed line
… omitted unchanged lines
  • trunk/Source/JavaScriptCore/ChangeLog

r278568 → r278576

(This diff adds the ChangeLog entry for this patch; its text is identical to the commit message reproduced above, so it is not repeated here.)
  • trunk/Source/JavaScriptCore/assembler/AbstractMacroAssembler.h

r277680 → r278576

     ALWAYS_INLINE void tagReturnAddress() { }
     ALWAYS_INLINE void untagReturnAddress(RegisterID = RegisterID::InvalidGPRReg) { }
+    ALWAYS_INLINE void untagReturnAddressWithoutExtraValidation() { }

     ALWAYS_INLINE void tagPtr(PtrTag, RegisterID) { }
  • trunk/Source/JavaScriptCore/assembler/MacroAssemblerARM64E.h

r275597 → r278576

     ALWAYS_INLINE void untagReturnAddress(RegisterID scratch = InvalidGPR)
     {
+        untagReturnAddressWithoutExtraValidation();
+        validateUntaggedPtr(ARM64Registers::lr, scratch);
+    }
+
+    ALWAYS_INLINE void untagReturnAddressWithoutExtraValidation()
+    {
         untagPtr(ARM64Registers::sp, ARM64Registers::lr);
-        validateUntaggedPtr(ARM64Registers::lr, scratch);
     }
  • trunk/Source/JavaScriptCore/assembler/MacroAssemblerARMv7.h

r275797 → r278576

     }

+    Jump branchAdd32(ResultCondition cond, TrustedImm32 imm, Address dest)
+    {
+        load32(dest, dataTempRegister);
+
+        // Do the add.
+        ARMThumbImmediate armImm = ARMThumbImmediate::makeEncodedImm(imm.m_value);
+        if (armImm.isValid())
+            m_assembler.add_S(dataTempRegister, dataTempRegister, armImm);
+        else {
+            move(imm, addressTempRegister);
+            m_assembler.add_S(dataTempRegister, dataTempRegister, addressTempRegister);
+        }
+
+        store32(dataTempRegister, dest);
+        return Jump(makeBranch(cond));
+    }
+
     Jump branchAdd32(ResultCondition cond, TrustedImm32 imm, AbsoluteAddress dest)
     {
  • trunk/Source/JavaScriptCore/assembler/MacroAssemblerMIPS.h

r275797 → r278576

     }

+    Jump branchAdd32(ResultCondition cond, TrustedImm32 imm, ImplicitAddress destAddress)
+    {
+        bool useAddrTempRegister = !(destAddress.offset >= -32768 && destAddress.offset <= 32767
+            && !m_fixedWidth);
+
+        if (useAddrTempRegister) {
+            m_assembler.lui(addrTempRegister, (destAddress.offset + 0x8000) >> 16);
+            m_assembler.addu(addrTempRegister, addrTempRegister, destAddress.base);
+        }
+
+        auto loadDest = [&] (RegisterID dest) {
+            if (useAddrTempRegister)
+                m_assembler.lw(dest, addrTempRegister, destAddress.offset);
+            else
+                m_assembler.lw(dest, destAddress.base, destAddress.offset);
+        };
+
+        auto storeDest = [&] (RegisterID src) {
+            if (useAddrTempRegister)
+                m_assembler.sw(src, addrTempRegister, destAddress.offset);
+            else
+                m_assembler.sw(src, destAddress.base, destAddress.offset);
+        };
+
+        ASSERT((cond == Overflow) || (cond == Signed) || (cond == PositiveOrZero) || (cond == Zero) || (cond == NonZero));
+        if (cond == Overflow) {
+            if (m_fixedWidth) {
+                /*
+                    load    dest, dataTemp
+                    move    imm, immTemp
+                    xor     cmpTemp, dataTemp, immTemp
+                    addu    dataTemp, dataTemp, immTemp
+                    store   dataTemp, dest
+                    bltz    cmpTemp, No_overflow    # diff sign bit -> no overflow
+                    xor     cmpTemp, dataTemp, immTemp
+                    bgez    cmpTemp, No_overflow    # same sign bit -> no overflow
+                    nop
+                    b       Overflow
+                    nop
+                    b       No_overflow
+                    nop
+                    nop
+                    nop
+                No_overflow:
+                */
+                loadDest(dataTempRegister);
+                move(imm, immTempRegister);
+                m_assembler.xorInsn(cmpTempRegister, dataTempRegister, immTempRegister);
+                m_assembler.addu(dataTempRegister, dataTempRegister, immTempRegister);
+                storeDest(dataTempRegister);
+                m_assembler.bltz(cmpTempRegister, 9);
+                m_assembler.xorInsn(cmpTempRegister, dataTempRegister, immTempRegister);
+                m_assembler.bgez(cmpTempRegister, 7);
+                m_assembler.nop();
+            } else {
+                loadDest(dataTempRegister);
+                if (imm.m_value >= 0 && imm.m_value <= 32767) {
+                    move(dataTempRegister, cmpTempRegister);
+                    m_assembler.addiu(dataTempRegister, dataTempRegister, imm.m_value);
+                    m_assembler.bltz(cmpTempRegister, 9);
+                    storeDest(dataTempRegister);
+                    m_assembler.bgez(dataTempRegister, 7);
+                    m_assembler.nop();
+                } else if (imm.m_value >= -32768 && imm.m_value < 0) {
+                    move(dataTempRegister, cmpTempRegister);
+                    m_assembler.addiu(dataTempRegister, dataTempRegister, imm.m_value);
+                    m_assembler.bgez(cmpTempRegister, 9);
+                    storeDest(dataTempRegister);
+                    m_assembler.bltz(cmpTempRegister, 7);
+                    m_assembler.nop();
+                } else {
+                    move(imm, immTempRegister);
+                    m_assembler.xorInsn(cmpTempRegister, dataTempRegister, immTempRegister);
+                    m_assembler.addu(dataTempRegister, dataTempRegister, immTempRegister);
+                    m_assembler.bltz(cmpTempRegister, 10);
+                    storeDest(dataTempRegister);
+                    m_assembler.xorInsn(cmpTempRegister, dataTempRegister, immTempRegister);
+                    m_assembler.bgez(cmpTempRegister, 7);
+                    m_assembler.nop();
+                }
+            }
+            return jump();
+        }
+        move(imm, immTempRegister);
+        loadDest(dataTempRegister);
+        add32(immTempRegister, dataTempRegister);
+        storeDest(dataTempRegister);
+        if (cond == Signed) {
+            // Check if dest is negative.
+            m_assembler.slt(cmpTempRegister, dataTempRegister, MIPSRegisters::zero);
+            return branchNotEqual(cmpTempRegister, MIPSRegisters::zero);
+        }
+        if (cond == PositiveOrZero) {
+            // Check if dest is not negative.
+            m_assembler.slt(cmpTempRegister, dataTempRegister, MIPSRegisters::zero);
+            return branchEqual(cmpTempRegister, MIPSRegisters::zero);
+        }
+        if (cond == Zero)
+            return branchEqual(dataTempRegister, MIPSRegisters::zero);
+        if (cond == NonZero)
+            return branchNotEqual(dataTempRegister, MIPSRegisters::zero);
+        ASSERT(0);
+        return Jump();
+    }
+
     Jump branchMul32(ResultCondition cond, RegisterID src1, RegisterID src2, RegisterID dest)
     {
  • trunk/Source/JavaScriptCore/bytecode/CodeBlock.h

r278253 → r278576

     unsigned* addressOfNumParameters() { return &m_numParameters; }
+
+    static ptrdiff_t offsetOfNumCalleeLocals() { return OBJECT_OFFSETOF(CodeBlock, m_numCalleeLocals); }
     static ptrdiff_t offsetOfNumParameters() { return OBJECT_OFFSETOF(CodeBlock, m_numParameters); }
+    static ptrdiff_t offsetOfNumVars() { return OBJECT_OFFSETOF(CodeBlock, m_numVars); }

     CodeBlock* alternative() const { return static_cast<CodeBlock*>(m_alternative.get()); }
…
         return result;
     }
+
+    static ptrdiff_t offsetOfArgumentValueProfiles() { return OBJECT_OFFSETOF(CodeBlock, m_argumentValueProfiles); }

     ValueProfile& valueProfileForBytecodeIndex(BytecodeIndex);
…

     bool wasCompiledWithDebuggingOpcodes() const { return m_unlinkedCode->wasCompiledWithDebuggingOpcodes(); }
-
+
     // This is intentionally public; it's the responsibility of anyone doing any
     // of the following to hold the lock:
…
     static ptrdiff_t offsetOfMetadataTable() { return OBJECT_OFFSETOF(CodeBlock, m_metadata); }
     static ptrdiff_t offsetOfInstructionsRawPointer() { return OBJECT_OFFSETOF(CodeBlock, m_instructionsRawPointer); }
+    static ptrdiff_t offsetOfShouldAlwaysBeInlined() { return OBJECT_OFFSETOF(CodeBlock, m_shouldAlwaysBeInlined); }

     bool loopHintsAreEligibleForFuzzingEarlyReturn()
  • trunk/Source/JavaScriptCore/jit/AssemblyHelpers.h

r278253 → r278576

         const RegisterAtOffsetList* calleeSaves = codeBlock->calleeSaveRegisters();
+        emitSaveCalleeSavesFor(calleeSaves);
+    }
+
+    void emitSaveCalleeSavesFor(const RegisterAtOffsetList* calleeSaves)
+    {
         RegisterSet dontSaveRegisters = RegisterSet(RegisterSet::stackRegisters(), RegisterSet::allFPRs());
         unsigned registerCount = calleeSaves->size();
…
     }

+    void emitSaveCalleeSavesForBaselineJIT()
+    {
+        emitSaveCalleeSavesFor(&RegisterAtOffsetList::llintBaselineCalleeSaveRegisters());
+    }
+
     void emitSaveThenMaterializeTagRegisters()
     {
…
     {
         emitRestoreCalleeSavesFor(codeBlock());
+    }
+
+    void emitRestoreCalleeSavesForBaselineJIT()
+    {
+        emitRestoreCalleeSavesFor(&RegisterAtOffsetList::llintBaselineCalleeSaveRegisters());
     }
  • trunk/Source/JavaScriptCore/jit/JIT.cpp

r278445 → r278576

 }

+#if ENABLE(EXTRA_CTI_THUNKS)
+#if CPU(ARM64) || (CPU(X86_64) && !OS(WINDOWS))
+// These are supported ports.
+#else
+// This is a courtesy reminder (and warning) that the implementation of EXTRA_CTI_THUNKS can
+// use up to 6 argument registers and/or 6/7 temp registers, and make use of ARM64 like
+// features. Hence, it may not work for many other ports without significant work. If you
+// plan on adding EXTRA_CTI_THUNKS support for your port, please remember to search the
+// EXTRA_CTI_THUNKS code for CPU(ARM64) and CPU(X86_64) conditional code, and add support
+// for your port there as well.
+#error "unsupported architecture"
+#endif
+#endif // ENABLE(EXTRA_CTI_THUNKS)
+
 Seconds totalBaselineCompileTime;
 Seconds totalDFGCompileTime;
…
 }

-#if ENABLE(DFG_JIT)
+#if ENABLE(DFG_JIT) && !ENABLE(EXTRA_CTI_THUNKS)
 void JIT::emitEnterOptimizationCheck()
 {
…
     skipOptimize.link(this);
 }
-#endif
+#endif // ENABLE(DFG_JIT) && !ENABLE(EXTRA_CTI_THUNKS)

 void JIT::emitNotifyWrite(WatchpointSet* set)
…
 }

+static inline unsigned prologueGeneratorSelector(bool doesProfiling, bool isConstructor, bool hasHugeFrame)
+{
+    return doesProfiling << 2 | isConstructor << 1 | hasHugeFrame << 0;
+}
+
+#define FOR_EACH_NON_PROFILING_PROLOGUE_GENERATOR(v) \
+    v(!doesProfiling, !isConstructor, !hasHugeFrame, prologueGenerator0, arityFixup_prologueGenerator0) \
+    v(!doesProfiling, !isConstructor,  hasHugeFrame, prologueGenerator1, arityFixup_prologueGenerator1) \
+    v(!doesProfiling,  isConstructor, !hasHugeFrame, prologueGenerator2, arityFixup_prologueGenerator2) \
+    v(!doesProfiling,  isConstructor,  hasHugeFrame, prologueGenerator3, arityFixup_prologueGenerator3)
+
+#if ENABLE(DFG_JIT)
+#define FOR_EACH_PROFILING_PROLOGUE_GENERATOR(v) \
+    v( doesProfiling, !isConstructor, !hasHugeFrame, prologueGenerator4, arityFixup_prologueGenerator4) \
+    v( doesProfiling, !isConstructor,  hasHugeFrame, prologueGenerator5, arityFixup_prologueGenerator5) \
+    v( doesProfiling,  isConstructor, !hasHugeFrame, prologueGenerator6, arityFixup_prologueGenerator6) \
+    v( doesProfiling,  isConstructor,  hasHugeFrame, prologueGenerator7, arityFixup_prologueGenerator7)
+
+#else // not ENABLE(DFG_JIT)
+#define FOR_EACH_PROFILING_PROLOGUE_GENERATOR(v)
+#endif // ENABLE(DFG_JIT)
+
+#define FOR_EACH_PROLOGUE_GENERATOR(v) \
+    FOR_EACH_NON_PROFILING_PROLOGUE_GENERATOR(v) \
+    FOR_EACH_PROFILING_PROLOGUE_GENERATOR(v)
+
 void JIT::compileAndLinkWithoutFinalizing(JITCompilationEffort effort)
 {
…

     emitFunctionPrologue();
+
+#if !ENABLE(EXTRA_CTI_THUNKS)
     emitPutToCallFrameHeader(m_codeBlock, CallFrameSlot::codeBlock);

…
         ASSERT(!m_bytecodeIndex);
         if (shouldEmitProfiling()) {
-            for (unsigned argument = 0; argument < m_codeBlock->numParameters(); ++argument) {
-                // If this is a constructor, then we want to put in a dummy profiling site (to
-                // keep things consistent) but we don't actually want to record the dummy value.
-                if (m_codeBlock->isConstructor() && !argument)
-                    continue;
+            // If this is a constructor, then we want to put in a dummy profiling site (to
+            // keep things consistent) but we don't actually want to record the dummy value.
+            unsigned startArgument = m_codeBlock->isConstructor() ? 1 : 0;
+            for (unsigned argument = startArgument; argument < m_codeBlock->numParameters(); ++argument) {
                 int offset = CallFrame::argumentOffsetIncludingThis(argument) * static_cast<int>(sizeof(Register));
 #if USE(JSVALUE64)
…
         }
     }
-
+#else // ENABLE(EXTRA_CTI_THUNKS)
+    constexpr GPRReg codeBlockGPR = regT7;
+    ASSERT(!m_bytecodeIndex);
+
+    int frameTopOffset = stackPointerOffsetFor(m_codeBlock) * sizeof(Register);
+    unsigned maxFrameSize = -frameTopOffset;
+
+    bool doesProfiling = (m_codeBlock->codeType() == FunctionCode) && shouldEmitProfiling();
+    bool isConstructor = m_codeBlock->isConstructor();
+    bool hasHugeFrame = maxFrameSize > Options::reservedZoneSize();
+
+    static constexpr ThunkGenerator generators[] = {
+#define USE_PROLOGUE_GENERATOR(doesProfiling, isConstructor, hasHugeFrame, name, arityFixupName) name,
+        FOR_EACH_PROLOGUE_GENERATOR(USE_PROLOGUE_GENERATOR)
+#undef USE_PROLOGUE_GENERATOR
+    };
+    static constexpr unsigned numberOfGenerators = sizeof(generators) / sizeof(generators[0]);
+
+    move(TrustedImmPtr(m_codeBlock), codeBlockGPR);
+
+    unsigned generatorSelector = prologueGeneratorSelector(doesProfiling, isConstructor, hasHugeFrame);
+    RELEASE_ASSERT(generatorSelector < numberOfGenerators);
+    auto generator = generators[generatorSelector];
+    emitNakedNearCall(vm().getCTIStub(generator).retaggedCode<NoPtrTag>());
+
+    Label bodyLabel(this);
+#endif // !ENABLE(EXTRA_CTI_THUNKS)

     RELEASE_ASSERT(!JITCode::isJIT(m_codeBlock->jitType()));

…
     m_pcToCodeOriginMapBuilder.appendItem(label(), PCToCodeOriginMapBuilder::defaultCodeOrigin());

+#if !ENABLE(EXTRA_CTI_THUNKS)
     stackOverflow.link(this);
     m_bytecodeIndex = BytecodeIndex(0);
…
         addPtr(TrustedImm32(-static_cast<int32_t>(maxFrameExtentForSlowPathCall)), stackPointerRegister);
     callOperationWithCallFrameRollbackOnException(operationThrowStackOverflowError, m_codeBlock);
+#endif

     // If the number of parameters is 1, we never require arity fixup.
…
     if (m_codeBlock->codeType() == FunctionCode && requiresArityFixup) {
         m_arityCheck = label();
+#if !ENABLE(EXTRA_CTI_THUNKS)
         store8(TrustedImm32(0), &m_codeBlock->m_shouldAlwaysBeInlined);
         emitFunctionPrologue();
…
         emitNakedNearCall(m_vm->getCTIStub(arityFixupGenerator).retaggedCode<NoPtrTag>());

+        jump(beginLabel);
+
+#else // ENABLE(EXTRA_CTI_THUNKS)
+        emitFunctionPrologue();
+
+        static_assert(codeBlockGPR == regT7);
+        ASSERT(!m_bytecodeIndex);
+
+        static constexpr ThunkGenerator generators[] = {
+#define USE_PROLOGUE_GENERATOR(doesProfiling, isConstructor, hasHugeFrame, name, arityFixupName) arityFixupName,
+            FOR_EACH_PROLOGUE_GENERATOR(USE_PROLOGUE_GENERATOR)
+#undef USE_PROLOGUE_GENERATOR
+        };
+        static constexpr unsigned numberOfGenerators = sizeof(generators) / sizeof(generators[0]);
+
+        move(TrustedImmPtr(m_codeBlock), codeBlockGPR);
+
+        RELEASE_ASSERT(generatorSelector < numberOfGenerators);
+        auto generator = generators[generatorSelector];
+        RELEASE_ASSERT(generator);
+        emitNakedNearCall(vm().getCTIStub(generator).retaggedCode<NoPtrTag>());
+
+        jump(bodyLabel);
+#endif // !ENABLE(EXTRA_CTI_THUNKS)
+
 #if ASSERT_ENABLED
         m_bytecodeIndex = BytecodeIndex(); // Reset this, in order to guard its use with ASSERTs.
 #endif
-
-        jump(beginLabel);
     } else
         m_arityCheck = entryLabel; // Never require arity fixup.
…
     ASSERT(m_jmpTable.isEmpty());

+#if !ENABLE(EXTRA_CTI_THUNKS)
     privateCompileExceptionHandlers();
+#endif

     if (m_disassembler)
…
     link();
 }
+
+#if ENABLE(EXTRA_CTI_THUNKS)
+MacroAssemblerCodeRef<JITThunkPtrTag> JIT::prologueGenerator(VM& vm, bool doesProfiling, bool isConstructor, bool hasHugeFrame, const char* thunkName)
+{
+    // This function generates the Baseline JIT's prologue code. It is not usable by other tiers.
+    constexpr GPRReg codeBlockGPR = regT7; // incoming.
+
+    constexpr int virtualRegisterSize = static_cast<int>(sizeof(Register));
+    constexpr int virtualRegisterSizeShift = 3;
+    static_assert((1 << virtualRegisterSizeShift) == virtualRegisterSize);
+
+    tagReturnAddress();
+
+    storePtr(codeBlockGPR, addressFor(CallFrameSlot::codeBlock));
+
+    load32(Address(codeBlockGPR, CodeBlock::offsetOfNumCalleeLocals()), regT1);
+    if constexpr (maxFrameExtentForSlowPathCallInRegisters)
+        add32(TrustedImm32(maxFrameExtentForSlowPathCallInRegisters), regT1);
+    lshift32(TrustedImm32(virtualRegisterSizeShift), regT1);
+    neg64(regT1);
+#if ASSERT_ENABLED
+    Probe::Function probeFunction = [] (Probe::Context& context) {
+        CodeBlock* codeBlock = context.fp<CallFrame*>()->codeBlock();
+        int64_t frameTopOffset = stackPointerOffsetFor(codeBlock) * sizeof(Register);
+        RELEASE_ASSERT(context.gpr<intptr_t>(regT1) == frameTopOffset);
+    };
+    probe(tagCFunctionPtr<JITProbePtrTag>(probeFunction), nullptr);
+#endif
+
+    addPtr(callFrameRegister, regT1);
+
+    JumpList stackOverflow;
+    if (hasHugeFrame)
+        stackOverflow.append(branchPtr(Above, regT1, callFrameRegister));
+    stackOverflow.append(branchPtr(Above, AbsoluteAddress(vm.addressOfSoftStackLimit()), regT1));
+
+    // We'll be imminently returning with a `retab` (ARM64E's return with authentication
+    // using the B key) in the normal path (see MacroAssemblerARM64E's implementation of
+    // ret()), which will do validation. So, extra validation here is redundant and unnecessary.
+    untagReturnAddressWithoutExtraValidation();
+#if CPU(X86_64)
+    pop(regT2); // Save the return address.
+#endif
+    move(regT1, stackPointerRegister);
+    tagReturnAddress();
+    checkStackPointerAlignment();
+#if CPU(X86_64)
+    push(regT2); // Restore the return address.
+#endif
+
+    emitSaveCalleeSavesForBaselineJIT();
+    emitMaterializeTagCheckRegisters();
+
+    if (doesProfiling) {
+        constexpr GPRReg argumentValueProfileGPR = regT6;
+        constexpr GPRReg numParametersGPR = regT5;
+        constexpr GPRReg argumentGPR = regT4;
+
+        load32(Address(codeBlockGPR, CodeBlock::offsetOfNumParameters()), numParametersGPR);
+        loadPtr(Address(codeBlockGPR, CodeBlock::offsetOfArgumentValueProfiles()), argumentValueProfileGPR);
+        if (isConstructor)
+            addPtr(TrustedImm32(sizeof(ValueProfile)), argumentValueProfileGPR);
+
+        int startArgument = CallFrameSlot::thisArgument + (isConstructor ? 1 : 0);
+        int startArgumentOffset = startArgument * virtualRegisterSize;
+        move(TrustedImm64(startArgumentOffset), argumentGPR);
+
+        add32(TrustedImm32(static_cast<int>(CallFrameSlot::thisArgument)), numParametersGPR);
+        lshift32(TrustedImm32(virtualRegisterSizeShift), numParametersGPR);
+
+        addPtr(callFrameRegister, argumentGPR);
+        addPtr(callFrameRegister, numParametersGPR);
+
+        Label loopStart(this);
+        Jump done = branchPtr(AboveOrEqual, argumentGPR, numParametersGPR);
+        {
+            load64(Address(argumentGPR), regT0);
+            store64(regT0, Address(argumentValueProfileGPR, OBJECT_OFFSETOF(ValueProfile, m_buckets)));
+
+            // The argument ValueProfiles are stored in a FixedVector. Hence, the
+            // address of the next profile can be trivially computed with an increment.
+            addPtr(TrustedImm32(sizeof(ValueProfile)), argumentValueProfileGPR);
+            addPtr(TrustedImm32(virtualRegisterSize), argumentGPR);
+            jump().linkTo(loopStart, this);
+        }
+        done.link(this);
+    }
+    ret();
+
+    stackOverflow.link(this);
+#if CPU(X86_64)
+    addPtr(TrustedImm32(1 * sizeof(CPURegister)), stackPointerRegister); // discard return address.
+#endif
+
+    uint32_t locationBits = CallSiteIndex(0).bits();
+    store32(TrustedImm32(locationBits), tagFor(CallFrameSlot::argumentCountIncludingThis));
+
+    if (maxFrameExtentForSlowPathCall)
+        addPtr(TrustedImm32(-static_cast<int32_t>(maxFrameExtentForSlowPathCall)), stackPointerRegister);
+
+    setupArguments<decltype(operationThrowStackOverflowError)>(codeBlockGPR);
+    prepareCallOperation(vm);
+    MacroAssembler::Call operationCall = call(OperationPtrTag);
+    Jump handleExceptionJump = jump();
+
+    auto handler = vm.getCTIStub(handleExceptionWithCallFrameRollbackGenerator);
+
+    LinkBuffer patchBuffer(*this, GLOBAL_THUNK_ID, LinkBuffer::Profile::ExtraCTIThunk);
+    patchBuffer.link(operationCall, FunctionPtr<OperationPtrTag>(operationThrowStackOverflowError));
+    patchBuffer.link(handleExceptionJump, CodeLocationLabel(handler.retaggedCode<NoPtrTag>()));
+    return FINALIZE_CODE(patchBuffer, JITThunkPtrTag, thunkName);
+}
+
+static constexpr bool doesProfiling = true;
+static constexpr bool isConstructor = true;
+static constexpr bool hasHugeFrame = true;
+
+#define DEFINE_PROLOGUE_GENERATOR(doesProfiling, isConstructor, hasHugeFrame, name, arityFixupName) \
+    MacroAssemblerCodeRef<JITThunkPtrTag> JIT::name(VM& vm) \
+    { \
+        JIT jit(vm); \
+        return jit.prologueGenerator(vm, doesProfiling, isConstructor, hasHugeFrame, "Baseline: " #name); \
+    }
+
+FOR_EACH_PROLOGUE_GENERATOR(DEFINE_PROLOGUE_GENERATOR)
+#undef DEFINE_PROLOGUE_GENERATOR
+
+MacroAssemblerCodeRef<JITThunkPtrTag> JIT::arityFixupPrologueGenerator(VM& vm, bool isConstructor, ThunkGenerator normalPrologueGenerator, const char* thunkName)
+{
+    // This function generates the Baseline JIT's arity fixup prologue code. It is not usable by other tiers.
+    constexpr GPRReg codeBlockGPR = regT7; // incoming.
+    constexpr GPRReg numParametersGPR = regT6;
+
+    tagReturnAddress();
+#if CPU(X86_64)
+    push(framePointerRegister);
+#elif CPU(ARM64)
+    pushPair(framePointerRegister, linkRegister);
+#endif
+
+    storePtr(codeBlockGPR, addressFor(CallFrameSlot::codeBlock));
+    store8(TrustedImm32(0), Address(codeBlockGPR, CodeBlock::offsetOfShouldAlwaysBeInlined()));
+
+    load32(payloadFor(CallFrameSlot::argumentCountIncludingThis), regT1);
+    load32(Address(codeBlockGPR, CodeBlock::offsetOfNumParameters()), numParametersGPR);
+    Jump noFixupNeeded = branch32(AboveOrEqual, regT1, numParametersGPR);
+
+    if constexpr (maxFrameExtentForSlowPathCall)
+        addPtr(TrustedImm32(-static_cast<int32_t>(maxFrameExtentForSlowPathCall)), stackPointerRegister);
+
+    loadPtr(Address(codeBlockGPR, CodeBlock::offsetOfGlobalObject()), argumentGPR0);
+
+    static_assert(std::is_same<decltype(operationConstructArityCheck), decltype(operationCallArityCheck)>::value);
+    setupArguments<decltype(operationCallArityCheck)>(argumentGPR0);
+    prepareCallOperation(vm);
+
+    MacroAssembler::Call arityCheckCall = call(OperationPtrTag);
+    Jump handleExceptionJump = emitNonPatchableExceptionCheck(vm);
+
+    if constexpr (maxFrameExtentForSlowPathCall)
+        addPtr(TrustedImm32(maxFrameExtentForSlowPathCall), stackPointerRegister);
+    Jump needFixup = branchTest32(NonZero, returnValueGPR);
+    noFixupNeeded.link(this);
+
+    // The normal prologue expects incoming codeBlockGPR.
+    load64(addressFor(CallFrameSlot::codeBlock), codeBlockGPR);
+
+#if CPU(X86_64)
+    pop(framePointerRegister);
+#elif CPU(ARM64)
+    popPair(framePointerRegister, linkRegister);
+#endif
+    untagReturnAddress();
+
+    JumpList normalPrologueJump;
+    normalPrologueJump.append(jump());
+
+    needFixup.link(this);
+
+    // Restore the stack for arity fixup, and preserve the return address.
+    // arityFixupGenerator will be shifting the stack. So, we can't use the stack to
+    // preserve the return address. We also can't use callee saved registers because
+    // they haven't been saved yet.
+    //
+    // arityFixupGenerator is carefully crafted to only use a0, a1, a2, t3, t4 and t5.
+    // So, the return address can be preserved in regT7.
+#if CPU(X86_64)
+    pop(argumentGPR2); // discard.
+    pop(regT7); // save return address.
+#elif CPU(ARM64)
+    popPair(framePointerRegister, linkRegister);
+    untagReturnAddress();
+    move(linkRegister, regT7);
+    auto randomReturnAddressTag = random();
+    move(TrustedImm32(randomReturnAddressTag), regT1);
+    tagPtr(regT1, regT7);
+#endif
+    move(returnValueGPR, GPRInfo::argumentGPR0);
+    Call arityFixupCall = nearCall();
+
+#if CPU(X86_64)
+    push(regT7); // restore return address.
+#elif CPU(ARM64)
+    move(TrustedImm32(randomReturnAddressTag), regT1);
+    untagPtr(regT1, regT7);
+    move(regT7, linkRegister);
+#endif
+
+    load64(addressFor(CallFrameSlot::codeBlock), codeBlockGPR);
+    normalPrologueJump.append(jump());
+
+    auto arityCheckOperation = isConstructor ? operationConstructArityCheck : operationCallArityCheck;
+    auto arityFixup = vm.getCTIStub(arityFixupGenerator);
+    auto normalPrologue = vm.getCTIStub(normalPrologueGenerator);
+    auto exceptionHandler = vm.getCTIStub(popThunkStackPreservesAndHandleExceptionGenerator);
+
+    LinkBuffer patchBuffer(*this, GLOBAL_THUNK_ID, LinkBuffer::Profile::ExtraCTIThunk);
+    patchBuffer.link(arityCheckCall, FunctionPtr<OperationPtrTag>(arityCheckOperation));
+    patchBuffer.link(arityFixupCall, FunctionPtr(arityFixup.retaggedCode<NoPtrTag>()));
+    patchBuffer.link(normalPrologueJump, CodeLocationLabel(normalPrologue.retaggedCode<NoPtrTag>()));
+    patchBuffer.link(handleExceptionJump, CodeLocationLabel(exceptionHandler.retaggedCode<NoPtrTag>()));
+    return FINALIZE_CODE(patchBuffer, JITThunkPtrTag, thunkName);
+}
+
+#define DEFINE_ARITY_PROLOGUE_GENERATOR(doesProfiling, isConstructor, hasHugeFrame, name, arityFixupName) \
+    MacroAssemblerCodeRef<JITThunkPtrTag> JIT::arityFixupName(VM& vm) \
+    { \
+        JIT jit(vm); \
+        return jit.arityFixupPrologueGenerator(vm, isConstructor, name, "Baseline: " #arityFixupName); \
+    }
+
+FOR_EACH_PROLOGUE_GENERATOR(DEFINE_ARITY_PROLOGUE_GENERATOR)
+#undef DEFINE_ARITY_PROLOGUE_GENERATOR
+
+#endif // ENABLE(EXTRA_CTI_THUNKS)

 void JIT::link()
…
 }

+#if !ENABLE(EXTRA_CTI_THUNKS)
 void JIT::privateCompileExceptionHandlers()
 {
-#if !ENABLE(EXTRA_CTI_THUNKS)
     if (!m_exceptionChecksWithCallFrameRollback.empty()) {
         m_exceptionChecksWithCallFrameRollback.link(this);
…
         jumpToExceptionHandler(vm());
     }
-#endif // ENABLE(EXTRA_CTI_THUNKS)
-}
+}
+#endif // !ENABLE(EXTRA_CTI_THUNKS)

 void JIT::doMainThreadPreparationBeforeCompile()
  • trunk/Source/JavaScriptCore/jit/JIT.h

r278445 → r278576

         }

+#if !ENABLE(EXTRA_CTI_THUNKS)
         void privateCompileExceptionHandlers();
+#endif

         void advanceToNextCheckpoint();
…
 #if ENABLE(EXTRA_CTI_THUNKS)
         // Thunk generators.
+        static MacroAssemblerCodeRef<JITThunkPtrTag> prologueGenerator0(VM&);
+        static MacroAssemblerCodeRef<JITThunkPtrTag> prologueGenerator1(VM&);
+        static MacroAssemblerCodeRef<JITThunkPtrTag> prologueGenerator2(VM&);
+        static MacroAssemblerCodeRef<JITThunkPtrTag> prologueGenerator3(VM&);
+        static MacroAssemblerCodeRef<JITThunkPtrTag> prologueGenerator4(VM&);
+        static MacroAssemblerCodeRef<JITThunkPtrTag> prologueGenerator5(VM&);
+        static MacroAssemblerCodeRef<JITThunkPtrTag> prologueGenerator6(VM&);
+        static MacroAssemblerCodeRef<JITThunkPtrTag> prologueGenerator7(VM&);
+        MacroAssemblerCodeRef<JITThunkPtrTag> prologueGenerator(VM&, bool doesProfiling, bool isConstructor, bool hasHugeFrame, const char* name);
+
+        static MacroAssemblerCodeRef<JITThunkPtrTag> arityFixup_prologueGenerator0(VM&);
+        static MacroAssemblerCodeRef<JITThunkPtrTag> arityFixup_prologueGenerator1(VM&);
+        static MacroAssemblerCodeRef<JITThunkPtrTag> arityFixup_prologueGenerator2(VM&);
+        static MacroAssemblerCodeRef<JITThunkPtrTag> arityFixup_prologueGenerator3(VM&);
+        static MacroAssemblerCodeRef<JITThunkPtrTag> arityFixup_prologueGenerator4(VM&);
+        static MacroAssemblerCodeRef<JITThunkPtrTag> arityFixup_prologueGenerator5(VM&);
+        static MacroAssemblerCodeRef<JITThunkPtrTag> arityFixup_prologueGenerator6(VM&);
+        static MacroAssemblerCodeRef<JITThunkPtrTag> arityFixup_prologueGenerator7(VM&);
+        MacroAssemblerCodeRef<JITThunkPtrTag> arityFixupPrologueGenerator(VM&, bool isConstructor, ThunkGenerator normalPrologueGenerator, const char* name);
+
         static MacroAssemblerCodeRef<JITThunkPtrTag> slow_op_del_by_id_prepareCallGenerator(VM&);
         static MacroAssemblerCodeRef<JITThunkPtrTag> slow_op_del_by_val_prepareCallGenerator(VM&);
…

         static MacroAssemblerCodeRef<JITThunkPtrTag> op_check_traps_handlerGenerator(VM&);
-        static MacroAssemblerCodeRef<JITThunkPtrTag> op_enter_handlerGenerator(VM&);
+
+        static MacroAssemblerCodeRef<JITThunkPtrTag> op_enter_canBeOptimized_Generator(VM&);
+        static MacroAssemblerCodeRef<JITThunkPtrTag> op_enter_cannotBeOptimized_Generator(VM&);
+        MacroAssemblerCodeRef<JITThunkPtrTag> op_enter_Generator(VM&, bool canBeOptimized, const char* thunkName);
+
+#if ENABLE(DFG_JIT)
+        static MacroAssemblerCodeRef<JITThunkPtrTag> op_loop_hint_Generator(VM&);
+#endif
         static MacroAssemblerCodeRef<JITThunkPtrTag> op_ret_handlerGenerator(VM&);
         static MacroAssemblerCodeRef<JITThunkPtrTag> op_throw_handlerGenerator(VM&);
  • trunk/Source/JavaScriptCore/jit/JITInlines.h

r277576 → r278576

 ALWAYS_INLINE JIT::Call JIT::emitNakedNearCall(CodePtr<NoPtrTag> target)
 {
-    ASSERT(m_bytecodeIndex); // This method should only be called during hot/cold path generation, so that m_bytecodeIndex is set.
     Call nakedCall = nearCall();
     m_nearCalls.append(NearCallRecord(nakedCall, FunctionPtr<JSInternalPtrTag>(target.retagged<JSInternalPtrTag>())));
  • trunk/Source/JavaScriptCore/jit/JITOpcodes.cpp

r278029 → r278576

     jit.checkStackPointerAlignment();
-    jit.emitRestoreCalleeSavesFor(&RegisterAtOffsetList::llintBaselineCalleeSaveRegisters());
+    jit.emitRestoreCalleeSavesForBaselineJIT();
     jit.emitFunctionEpilogue();
     jit.ret();
…
 #else
     ASSERT(m_bytecodeIndex.offset() == 0);
-    constexpr GPRReg localsToInitGPR = argumentGPR0;
-    constexpr GPRReg canBeOptimizedGPR = argumentGPR4;
-
     unsigned localsToInit = count - CodeBlock::llintBaselineCalleeSaveSpaceAsVirtualRegisters();
     RELEASE_ASSERT(localsToInit < count);
-    move(TrustedImm32(localsToInit * sizeof(Register)), localsToInitGPR);
-    move(TrustedImm32(canBeOptimized()), canBeOptimizedGPR);
-    emitNakedNearCall(vm().getCTIStub(op_enter_handlerGenerator).retaggedCode<NoPtrTag>());
+    ThunkGenerator generator = canBeOptimized() ? op_enter_canBeOptimized_Generator : op_enter_cannotBeOptimized_Generator;
+    emitNakedNearCall(vm().getCTIStub(generator).retaggedCode<NoPtrTag>());
 #endif // ENABLE(EXTRA_CTI_THUNKS)
 }

 #if ENABLE(EXTRA_CTI_THUNKS)
-MacroAssemblerCodeRef<JITThunkPtrTag> JIT::op_enter_handlerGenerator(VM& vm)
-{
-    JIT jit(vm);
-
+MacroAssemblerCodeRef<JITThunkPtrTag> JIT::op_enter_Generator(VM& vm, bool canBeOptimized, const char* thunkName)
+{
 #if CPU(X86_64)
-    jit.push(X86Registers::ebp);
+    push(X86Registers::ebp);
 #elif CPU(ARM64)
-    jit.tagReturnAddress();
-    jit.pushPair(framePointerRegister, linkRegister);
+    tagReturnAddress();
+    pushPair(framePointerRegister, linkRegister);
 #endif
     // op_enter is always at bytecodeOffset 0.
-    jit.store32(TrustedImm32(0), tagFor(CallFrameSlot::argumentCountIncludingThis));
+    store32(TrustedImm32(0), tagFor(CallFrameSlot::argumentCountIncludingThis));

     constexpr GPRReg localsToInitGPR = argumentGPR0;
…
     constexpr GPRReg endGPR = argumentGPR2;
     constexpr GPRReg undefinedGPR = argumentGPR3;
-    constexpr GPRReg canBeOptimizedGPR = argumentGPR4;
+    constexpr GPRReg codeBlockGPR = argumentGPR4;
+
+    constexpr int virtualRegisterSizeShift = 3;
+    static_assert((1 << virtualRegisterSizeShift) == sizeof(Register));
+
+    loadPtr(addressFor(CallFrameSlot::codeBlock), codeBlockGPR);
+    load32(Address(codeBlockGPR, CodeBlock::offsetOfNumVars()), localsToInitGPR);
+    sub32(TrustedImm32(CodeBlock::llintBaselineCalleeSaveSpaceAsVirtualRegisters()), localsToInitGPR);
+    lshift32(TrustedImm32(virtualRegisterSizeShift), localsToInitGPR);

     size_t startLocal = CodeBlock::llintBaselineCalleeSaveSpaceAsVirtualRegisters();
     int startOffset = virtualRegisterForLocal(startLocal).offset();
-    jit.move(TrustedImm64(startOffset * sizeof(Register)), iteratorGPR);
-    jit.sub64(iteratorGPR, localsToInitGPR, endGPR);
-
-    jit.move(TrustedImm64(JSValue::encode(jsUndefined())), undefinedGPR);
-    auto initLoop = jit.label();
-    Jump initDone = jit.branch32(LessThanOrEqual, iteratorGPR, endGPR);
+    move(TrustedImm64(startOffset * sizeof(Register)), iteratorGPR);
+    sub64(iteratorGPR, localsToInitGPR, endGPR);
+
+    move(TrustedImm64(JSValue::encode(jsUndefined())), undefinedGPR);
+    auto initLoop = label();
+    Jump initDone = branch32(LessThanOrEqual, iteratorGPR, endGPR);
     {
-        jit.store64(undefinedGPR, BaseIndex(GPRInfo::callFrameRegister, iteratorGPR, TimesOne));
-        jit.sub64(TrustedImm32(sizeof(Register)), iteratorGPR);
-        jit.jump(initLoop);
+        store64(undefinedGPR, BaseIndex(GPRInfo::callFrameRegister, iteratorGPR, TimesOne));
+        sub64(TrustedImm32(sizeof(Register)), iteratorGPR);
+        jump(initLoop);
     }
-    initDone.link(&jit);
-
-    // emitWriteBarrier(m_codeBlock).
-    jit.loadPtr(addressFor(CallFrameSlot::codeBlock), argumentGPR1);
-    Jump ownerIsRememberedOrInEden = jit.barrierBranch(vm, argumentGPR1, argumentGPR2);
-
-    jit.move(canBeOptimizedGPR, GPRInfo::numberTagRegister); // save.
-    jit.setupArguments<decltype(operationWriteBarrierSlowPath)>(&vm, argumentGPR1);
-    jit.prepareCallOperation(vm);
-    Call operationWriteBarrierCall = jit.call(OperationPtrTag);
-
-    jit.move(GPRInfo::numberTagRegister, canBeOptimizedGPR); // restore.
-    jit.move(TrustedImm64(JSValue::NumberTag), GPRInfo::numberTagRegister);
-    ownerIsRememberedOrInEden.link(&jit);
+    initDone.link(this);
+
+    // Implementing emitWriteBarrier(m_codeBlock).
+    Jump ownerIsRememberedOrInEden = barrierBranch(vm, codeBlockGPR, argumentGPR2);
+
+    setupArguments<decltype(operationWriteBarrierSlowPath)>(&vm, codeBlockGPR);
+    prepareCallOperation(vm);
+    Call operationWriteBarrierCall = call(OperationPtrTag);
+
+    if (canBeOptimized)
+        loadPtr(addressFor(CallFrameSlot::codeBlock), codeBlockGPR);
+
+    ownerIsRememberedOrInEden.link(this);

 #if ENABLE(DFG_JIT)
+    // Implementing emitEnterOptimizationCheck().
     Call operationOptimizeCall;
-    if (Options::useDFGJIT()) {
-        // emitEnterOptimizationCheck().
+    if (canBeOptimized) {
         JumpList skipOptimize;

-        skipOptimize.append(jit.branchTest32(Zero, canBeOptimizedGPR));
-
-        jit.loadPtr(addressFor(CallFrameSlot::codeBlock), argumentGPR1);
-        skipOptimize.append(jit.branchAdd32(Signed, TrustedImm32(Options::executionCounterIncrementForEntry()), Address(argumentGPR1, CodeBlock::offsetOfJITExecuteCounter())));
-
-        jit.copyLLIntBaselineCalleeSavesFromFrameOrRegisterToEntryFrameCalleeSavesBuffer(vm.topEntryFrame);
-
-        jit.setupArguments<decltype(operationOptimize)>(&vm, TrustedImm32(0));
-        jit.prepareCallOperation(vm);
-        operationOptimizeCall = jit.call(OperationPtrTag);
-
-        skipOptimize.append(jit.branchTestPtr(Zero, returnValueGPR));
-        jit.farJump(returnValueGPR, GPRInfo::callFrameRegister);
-
-        skipOptimize.link(&jit);
+        skipOptimize.append(branchAdd32(Signed, TrustedImm32(Options::executionCounterIncrementForEntry()), Address(codeBlockGPR, CodeBlock::offsetOfJITExecuteCounter())));
+
+        copyLLIntBaselineCalleeSavesFromFrameOrRegisterToEntryFrameCalleeSavesBuffer(vm.topEntryFrame);
+
+        setupArguments<decltype(operationOptimize)>(&vm, TrustedImm32(0));
+        prepareCallOperation(vm);
+        operationOptimizeCall = call(OperationPtrTag);
+
+        skipOptimize.append(branchTestPtr(Zero, returnValueGPR));
+        farJump(returnValueGPR, GPRInfo::callFrameRegister);
+
+        skipOptimize.link(this);
     }
 #endif // ENABLE(DFG_JIT)

 #if CPU(X86_64)
-    jit.pop(X86Registers::ebp);
+    pop(X86Registers::ebp);
 #elif CPU(ARM64)
-    jit.popPair(framePointerRegister, linkRegister);
-#endif
-    jit.ret();
-
-    LinkBuffer patchBuffer(jit, GLOBAL_THUNK_ID, LinkBuffer::Profile::ExtraCTIThunk);
+    popPair(framePointerRegister, linkRegister);
+#endif
+    ret();
+
+    LinkBuffer patchBuffer(*this, GLOBAL_THUNK_ID, LinkBuffer::Profile::ExtraCTIThunk);
     patchBuffer.link(operationWriteBarrierCall, FunctionPtr<OperationPtrTag>(operationWriteBarrierSlowPath));
 #if ENABLE(DFG_JIT)
-    if (Options::useDFGJIT())
+    if (canBeOptimized)
         patchBuffer.link(operationOptimizeCall, FunctionPtr<OperationPtrTag>(operationOptimize));
 #endif
-    return FINALIZE_CODE(patchBuffer, JITThunkPtrTag, "Baseline: op_enter_handler");
+    return FINALIZE_CODE(patchBuffer, JITThunkPtrTag, thunkName);
+}
+
+MacroAssemblerCodeRef<JITThunkPtrTag> JIT::op_enter_canBeOptimized_Generator(VM& vm)
+{
+    JIT jit(vm);
+    constexpr bool canBeOptimized = true;
+    return jit.op_enter_Generator(vm, canBeOptimized, "Baseline: op_enter_canBeOptimized");
+}
+
+MacroAssemblerCodeRef<JITThunkPtrTag> JIT::op_enter_cannotBeOptimized_Generator(VM& vm)
+{
+    JIT jit(vm);
+    constexpr bool canBeOptimized = false;
+    return jit.op_enter_Generator(vm, canBeOptimized, "Baseline: op_enter_cannotBeOptimized");
 }
 #endif // ENABLE(EXTRA_CTI_THUNKS)
…
         store64(regT0, ptr);
     }
-#endif
-
-    // Emit the JIT optimization check:
+#else
+    UNUSED_PARAM(instruction);
+#endif
+
+    // Emit the JIT optimization check:
     if (canBeOptimized()) {
+        constexpr GPRReg codeBlockGPR = regT0;
+        loadPtr(addressFor(CallFrameSlot::codeBlock), codeBlockGPR);
         addSlowCase(branchAdd32(PositiveOrZero, TrustedImm32(Options::executionCounterIncrementForLoop()),
-            AbsoluteAddress(m_codeBlock->addressOfJITExecuteCounter())));
+            Address(codeBlockGPR, CodeBlock::offsetOfJITExecuteCounter())));
     }
 }

-void JIT::emitSlow_op_loop_hint(const Instruction* currentInstruction, Vector<SlowCaseEntry>::iterator& iter)
+void JIT::emitSlow_op_loop_hint(const Instruction* instruction, Vector<SlowCaseEntry>::iterator& iter)
 {
 #if ENABLE(DFG_JIT)
…
         linkAllSlowCases(iter);

+#if !ENABLE(EXTRA_CTI_THUNKS)
         copyLLIntBaselineCalleeSavesFromFrameOrRegisterToEntryFrameCalleeSavesBuffer(vm().topEntryFrame);

…
         noOptimizedEntry.link(this);

-        emitJumpSlowToHot(jump(), currentInstruction->size());
+#else // ENABLE(EXTRA_CTI_THUNKS)
+        uint32_t bytecodeOffset = m_bytecodeIndex.offset();
+        ASSERT(BytecodeIndex(bytecodeOffset) == m_bytecodeIndex);
+        ASSERT(m_codeBlock->instructionAt(m_bytecodeIndex) == instruction);
+
+        constexpr GPRReg bytecodeOffsetGPR = regT7;
+
+        move(TrustedImm32(bytecodeOffset), bytecodeOffsetGPR);
+        emitNakedNearCall(vm().getCTIStub(op_loop_hint_Generator).retaggedCode<NoPtrTag>());
+#endif // !ENABLE(EXTRA_CTI_THUNKS)
     }
-#else
-    UNUSED_PARAM(currentInstruction);
+#endif // ENABLE(DFG_JIT)
     UNUSED_PARAM(iter);
-#endif
-}
+    UNUSED_PARAM(instruction);
+}
+
+#if ENABLE(EXTRA_CTI_THUNKS)
+
+#if ENABLE(DFG_JIT)
+MacroAssemblerCodeRef<JITThunkPtrTag> JIT::op_loop_hint_Generator(VM& vm)
+{
+    // The thunk generated by this function can only work with the LLInt / Baseline JIT because
+    // it makes assumptions about the right globalObject being available from CallFrame::codeBlock().
+    // DFG/FTL may inline functions belonging to other globalObjects, which may not match
+    // CallFrame::codeBlock().
+    JIT jit(vm);
+
+    jit.tagReturnAddress();
+
+    constexpr GPRReg bytecodeOffsetGPR = regT7; // incoming.
+
+#if CPU(X86_64)
+    jit.push(framePointerRegister);
+#elif CPU(ARM64)
+    jit.pushPair(framePointerRegister, linkRegister);
+#endif
+
+    auto usedRegisters = RegisterSet::stubUnavailableRegisters();
+    usedRegisters.add(bytecodeOffsetGPR);
+    jit.copyLLIntBaselineCalleeSavesFromFrameOrRegisterToEntryFrameCalleeSavesBuffer(vm.topEntryFrame, usedRegisters);
+
+    jit.store32(bytecodeOffsetGPR, CCallHelpers::tagFor(CallFrameSlot::argumentCountIncludingThis));
+    jit.lshift32(TrustedImm32(BytecodeIndex::checkpointShift), bytecodeOffsetGPR);
+    jit.setupArguments<decltype(operationOptimize)>(TrustedImmPtr(&vm), bytecodeOffsetGPR);
+    jit.prepareCallOperation(vm);
+    Call operationCall = jit.call(OperationPtrTag);
+    Jump hasOptimizedEntry = jit.branchTestPtr(NonZero, returnValueGPR);
+
+#if CPU(X86_64)
+    jit.pop(framePointerRegister);
+#elif CPU(ARM64)
+    jit.popPair(framePointerRegister, linkRegister);
+#endif
+    jit.ret();
+
+    hasOptimizedEntry.link(&jit);
+#if CPU(X86_64)
+    jit.addPtr(CCallHelpers::TrustedImm32(2 * sizeof(CPURegister)), stackPointerRegister);
+#elif CPU(ARM64)
+    jit.popPair(framePointerRegister, linkRegister);
+#endif
+    if (ASSERT_ENABLED) {
+        Jump ok = jit.branchPtr(MacroAssembler::Above, returnValueGPR, TrustedImmPtr(bitwise_cast<void*>(static_cast<intptr_t>(1000))));
+        jit.abortWithReason(JITUnreasonableLoopHintJumpTarget);
+        ok.link(&jit);
+    }
+
+    jit.farJump(returnValueGPR, GPRInfo::callFrameRegister);
+
+    LinkBuffer patchBuffer(jit, GLOBAL_THUNK_ID, LinkBuffer::Profile::ExtraCTIThunk);
+    patchBuffer.link(operationCall, FunctionPtr<OperationPtrTag>(operationOptimize));
+    return FINALIZE_CODE(patchBuffer, JITThunkPtrTag, "Baseline: op_loop_hint");
+}
+#endif // ENABLE(DFG_JIT)
+#endif // ENABLE(EXTRA_CTI_THUNKS)

 void JIT::emit_op_check_traps(const Instruction*)
  • trunk/Source/JavaScriptCore/jit/JITOpcodes32_64.cpp

r277902 → r278576

     // registers to zap stale pointers, to avoid unnecessarily prolonging
     // object lifetime and increasing GC pressure.
-    for (int i = CodeBlock::llintBaselineCalleeSaveSpaceAsVirtualRegisters(); i < m_codeBlock->numVars(); ++i)
+    for (unsigned i = CodeBlock::llintBaselineCalleeSaveSpaceAsVirtualRegisters(); i < m_codeBlock->numVars(); ++i)
         emitStore(virtualRegisterForLocal(i), jsUndefined());
  • trunk/Source/JavaScriptCore/jit/ThunkGenerators.cpp

r278458 → r278576

     CCallHelpers jit;

-#if CPU(X86_64)
-    jit.addPtr(CCallHelpers::TrustedImm32(2 * sizeof(CPURegister)), X86Registers::esp);
-#elif CPU(ARM64)
-    jit.popPair(CCallHelpers::framePointerRegister, CCallHelpers::linkRegister);
-#endif
+    jit.addPtr(CCallHelpers::TrustedImm32(2 * sizeof(CPURegister)), CCallHelpers::stackPointerRegister);

     CCallHelpers::Jump continuation = jit.jump();