Changeset 191865 in webkit


Ignore:
Timestamp:
Nov 1, 2015 3:37:03 PM (8 years ago)
Author:
fpizlo@apple.com
Message:

B3::reduceStrength's DCE should be more agro and less wrong
https://bugs.webkit.org/show_bug.cgi?id=150748

Reviewed by Geoffrey Garen.

Source/JavaScriptCore:

First of all, our DCE had a bug where it would keep Upsilons after it deleted the Phis that
they referenced. But our B3 DCE was also not aggressive enough. It would not eliminate
cycles. It was also probably slower than it needed to be, since it would eliminate all
never-referenced things on each fixpoint.

This adds a presume-everyone-is-dead-and-find-live-things style DCE. This is very natural to
write, except for Upsilons. For everything but Upsilons, it's just a worklist algorithm. For
Upsilons, it's a fixpoint. It works fine in the end.

I kept finding bugs in this algorithm when I tested it against my "Complex" test that I was
writing as a compile time benchmark. So, I include that test in this change. I also include
the small lowering extensions that it needed - shifting and zero extending.

This change also adds an LLVM version of the Complex test. Though the LLVM version feels
more natural to write because LLVM has traditional Phi's rather than our quirky Phi's, in
the end LLVM ends up performing very badly - 10x to 20x worse than B3. Some of that gap will
close once we give B3 a register allocator, but still, that's pretty good news for our B3
strategy.

  • JavaScriptCore.xcodeproj/project.pbxproj:
  • assembler/MacroAssemblerX86_64.h:

(JSC::MacroAssemblerX86_64::lshift64):
(JSC::MacroAssemblerX86_64::rshift64):

  • assembler/X86Assembler.h:

(JSC::X86Assembler::shlq_i8r):
(JSC::X86Assembler::shlq_CLr):
(JSC::X86Assembler::imull_rr):

  • b3/B3BasicBlock.cpp:

(JSC::B3::BasicBlock::replacePredecessor):
(JSC::B3::BasicBlock::dump):
(JSC::B3::BasicBlock::removeNops): Deleted.

  • b3/B3BasicBlock.h:

(JSC::B3::BasicBlock::frequency):

  • b3/B3Common.cpp:

(JSC::B3::shouldSaveIRBeforePhase):
(JSC::B3::shouldMeasurePhaseTiming):

  • b3/B3Common.h:

(JSC::B3::isRepresentableAsImpl):

  • b3/B3Generate.cpp:

(JSC::B3::generate):
(JSC::B3::generateToAir):

  • b3/B3LowerToAir.cpp:

(JSC::B3::Air::LowerToAir::tryAnd):
(JSC::B3::Air::LowerToAir::tryShl):
(JSC::B3::Air::LowerToAir::tryStoreAddLoad):
(JSC::B3::Air::LowerToAir::tryTrunc):
(JSC::B3::Air::LowerToAir::tryZExt32):
(JSC::B3::Air::LowerToAir::tryArgumentReg):

  • b3/B3LoweringMatcher.patterns:
  • b3/B3PhaseScope.cpp:

(JSC::B3::PhaseScope::PhaseScope):

  • b3/B3PhaseScope.h:
  • b3/B3ReduceStrength.cpp:
  • b3/B3TimingScope.cpp: Added.

(JSC::B3::TimingScope::TimingScope):
(JSC::B3::TimingScope::~TimingScope):

  • b3/B3TimingScope.h: Added.
  • b3/B3Validate.cpp:
  • b3/air/AirAllocateStack.cpp:

(JSC::B3::Air::allocateStack):

  • b3/air/AirGenerate.cpp:

(JSC::B3::Air::generate):

  • b3/air/AirInstInlines.h:

(JSC::B3::Air::ForEach<Arg>::forEach):
(JSC::B3::Air::Inst::forEach):
(JSC::B3::Air::isLshift32Valid):
(JSC::B3::Air::isLshift64Valid):

  • b3/air/AirLiveness.h:

(JSC::B3::Air::Liveness::isAlive):
(JSC::B3::Air::Liveness::Liveness):
(JSC::B3::Air::Liveness::LocalCalc::execute):

  • b3/air/AirOpcode.opcodes:
  • b3/air/AirPhaseScope.cpp:

(JSC::B3::Air::PhaseScope::PhaseScope):

  • b3/air/AirPhaseScope.h:
  • b3/testb3.cpp:

(JSC::B3::testBranchEqualFoldPtr):
(JSC::B3::testComplex):
(JSC::B3::run):

  • runtime/Options.h:

Source/WTF:

  • wtf/GraphNodeWorklist.h:

(WTF::GraphNodeWorklist::saw): This method is super useful.

Tools:

Add an LLVM version of testb3's "testComplex".

  • ReducedFTL/ComplexTest.cpp: Added.
Location:
trunk
Files:
3 added
27 edited

Legend:

Unmodified
Added
Removed
  • trunk/Source/JavaScriptCore/ChangeLog

    r191864 r191865  
     12015-10-31  Filip Pizlo  <fpizlo@apple.com>
     2
     3        B3::reduceStrength's DCE should be more agro and less wrong
     4        https://bugs.webkit.org/show_bug.cgi?id=150748
     5
     6        Reviewed by Geoffrey Garen.
     7
     8        First of all, our DCE had a bug where it would keep Upsilons after it deleted the Phis that
     9        they referenced. But our B3 DCE was also not aggressive enough. It would not eliminate
     10        cycles. It was also probably slower than it needed to be, since it would eliminate all
     11        never-referenced things on each fixpoint.
     12
     13        This adds a presume-everyone-is-dead-and-find-live-things style DCE. This is very natural to
     14        write, except for Upsilons. For everything but Upsilons, it's just a worklist algorithm. For
     15        Upsilons, it's a fixpoint. It works fine in the end.
     16
     17        I kept finding bugs in this algorithm when I tested it against my "Complex" test that I was
     18        writing as a compile time benchmark. So, I include that test in this change. I also include
     19        the small lowering extensions that it needed - shifting and zero extending.
     20
     21        This change also adds an LLVM version of the Complex test. Though the LLVM version feels
     22        more natural to write because LLVM has traditional Phi's rather than our quirky Phi's, in
     23        the end LLVM ends up performing very badly - 10x to 20x worse than B3. Some of that gap will
     24        close once we give B3 a register allocator, but still, that's pretty good news for our B3
     25        strategy.
     26
     27        * JavaScriptCore.xcodeproj/project.pbxproj:
     28        * assembler/MacroAssemblerX86_64.h:
     29        (JSC::MacroAssemblerX86_64::lshift64):
     30        (JSC::MacroAssemblerX86_64::rshift64):
     31        * assembler/X86Assembler.h:
     32        (JSC::X86Assembler::shlq_i8r):
     33        (JSC::X86Assembler::shlq_CLr):
     34        (JSC::X86Assembler::imull_rr):
     35        * b3/B3BasicBlock.cpp:
     36        (JSC::B3::BasicBlock::replacePredecessor):
     37        (JSC::B3::BasicBlock::dump):
     38        (JSC::B3::BasicBlock::removeNops): Deleted.
     39        * b3/B3BasicBlock.h:
     40        (JSC::B3::BasicBlock::frequency):
     41        * b3/B3Common.cpp:
     42        (JSC::B3::shouldSaveIRBeforePhase):
     43        (JSC::B3::shouldMeasurePhaseTiming):
     44        * b3/B3Common.h:
     45        (JSC::B3::isRepresentableAsImpl):
     46        * b3/B3Generate.cpp:
     47        (JSC::B3::generate):
     48        (JSC::B3::generateToAir):
     49        * b3/B3LowerToAir.cpp:
     50        (JSC::B3::Air::LowerToAir::tryAnd):
     51        (JSC::B3::Air::LowerToAir::tryShl):
     52        (JSC::B3::Air::LowerToAir::tryStoreAddLoad):
     53        (JSC::B3::Air::LowerToAir::tryTrunc):
     54        (JSC::B3::Air::LowerToAir::tryZExt32):
     55        (JSC::B3::Air::LowerToAir::tryArgumentReg):
     56        * b3/B3LoweringMatcher.patterns:
     57        * b3/B3PhaseScope.cpp:
     58        (JSC::B3::PhaseScope::PhaseScope):
     59        * b3/B3PhaseScope.h:
     60        * b3/B3ReduceStrength.cpp:
     61        * b3/B3TimingScope.cpp: Added.
     62        (JSC::B3::TimingScope::TimingScope):
     63        (JSC::B3::TimingScope::~TimingScope):
     64        * b3/B3TimingScope.h: Added.
     65        * b3/B3Validate.cpp:
     66        * b3/air/AirAllocateStack.cpp:
     67        (JSC::B3::Air::allocateStack):
     68        * b3/air/AirGenerate.cpp:
     69        (JSC::B3::Air::generate):
     70        * b3/air/AirInstInlines.h:
     71        (JSC::B3::Air::ForEach<Arg>::forEach):
     72        (JSC::B3::Air::Inst::forEach):
     73        (JSC::B3::Air::isLshift32Valid):
     74        (JSC::B3::Air::isLshift64Valid):
     75        * b3/air/AirLiveness.h:
     76        (JSC::B3::Air::Liveness::isAlive):
     77        (JSC::B3::Air::Liveness::Liveness):
     78        (JSC::B3::Air::Liveness::LocalCalc::execute):
     79        * b3/air/AirOpcode.opcodes:
     80        * b3/air/AirPhaseScope.cpp:
     81        (JSC::B3::Air::PhaseScope::PhaseScope):
     82        * b3/air/AirPhaseScope.h:
     83        * b3/testb3.cpp:
     84        (JSC::B3::testBranchEqualFoldPtr):
     85        (JSC::B3::testComplex):
     86        (JSC::B3::run):
     87        * runtime/Options.h:
     88
    1892015-11-01  Alexey Proskuryakov  <ap@apple.com>
    290
  • trunk/Source/JavaScriptCore/JavaScriptCore.xcodeproj/project.pbxproj

    r191846 r191865  
    316316                0F45703C1BE45F0A0062A629 /* AirReportUsedRegisters.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0F45703A1BE45F0A0062A629 /* AirReportUsedRegisters.cpp */; };
    317317                0F45703D1BE45F0A0062A629 /* AirReportUsedRegisters.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F45703B1BE45F0A0062A629 /* AirReportUsedRegisters.h */; };
     318                0F4570401BE584CA0062A629 /* B3TimingScope.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0F45703E1BE584CA0062A629 /* B3TimingScope.cpp */; };
     319                0F4570411BE584CA0062A629 /* B3TimingScope.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F45703F1BE584CA0062A629 /* B3TimingScope.h */; };
    318320                0F46808214BA572D00BFE272 /* JITExceptions.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F46808014BA572700BFE272 /* JITExceptions.h */; settings = {ATTRIBUTES = (Private, ); }; };
    319321                0F46808314BA573100BFE272 /* JITExceptions.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0F46807F14BA572700BFE272 /* JITExceptions.cpp */; };
     
    23342336                0F45703A1BE45F0A0062A629 /* AirReportUsedRegisters.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; name = AirReportUsedRegisters.cpp; path = b3/air/AirReportUsedRegisters.cpp; sourceTree = "<group>"; };
    23352337                0F45703B1BE45F0A0062A629 /* AirReportUsedRegisters.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = AirReportUsedRegisters.h; path = b3/air/AirReportUsedRegisters.h; sourceTree = "<group>"; };
     2338                0F45703E1BE584CA0062A629 /* B3TimingScope.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; name = B3TimingScope.cpp; path = b3/B3TimingScope.cpp; sourceTree = "<group>"; };
     2339                0F45703F1BE584CA0062A629 /* B3TimingScope.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = B3TimingScope.h; path = b3/B3TimingScope.h; sourceTree = "<group>"; };
    23362340                0F46807F14BA572700BFE272 /* JITExceptions.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = JITExceptions.cpp; sourceTree = "<group>"; };
    23372341                0F46808014BA572700BFE272 /* JITExceptions.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = JITExceptions.h; sourceTree = "<group>"; };
     
    44564460                                0FEC84EF1BDACDAC0080FF74 /* B3SwitchValue.cpp */,
    44574461                                0FEC84F01BDACDAC0080FF74 /* B3SwitchValue.h */,
     4462                                0F45703E1BE584CA0062A629 /* B3TimingScope.cpp */,
     4463                                0F45703F1BE584CA0062A629 /* B3TimingScope.h */,
    44584464                                0FEC84F11BDACDAC0080FF74 /* B3Type.cpp */,
    44594465                                0FEC84F21BDACDAC0080FF74 /* B3Type.h */,
     
    66096615                                BC18C3F40E16F5CD00B34460 /* Completion.h in Headers */,
    66106616                                996B731D1BDA08EF00331B84 /* JSGlobalObject.lut.h in Headers */,
     6617                                0F4570411BE584CA0062A629 /* B3TimingScope.h in Headers */,
    66116618                                996B731F1BDA08EF00331B84 /* JSPromisePrototype.lut.h in Headers */,
    66126619                                0F6FC751196110A800E1D02D /* ComplexGetStatus.h in Headers */,
     
    85768583                                7C184E2217BEE240007CB63A /* JSPromiseConstructor.cpp in Sources */,
    85778584                                7C008CDA187124BB00955C24 /* JSPromiseDeferred.cpp in Sources */,
     8585                                0F4570401BE584CA0062A629 /* B3TimingScope.cpp in Sources */,
    85788586                                7C184E1E17BEE22E007CB63A /* JSPromisePrototype.cpp in Sources */,
    85798587                                2A05ABD51961DF2400341750 /* JSPropertyNameEnumerator.cpp in Sources */,
  • trunk/Source/JavaScriptCore/assembler/MacroAssemblerX86_64.h

    r191742 r191865  
    330330    }
    331331   
     332    void lshift64(RegisterID src, RegisterID dest)
     333    {
     334        ASSERT(src != dest);
     335       
     336        if (src == X86Registers::ecx)
     337            m_assembler.shlq_CLr(dest);
     338        else {
     339            // Can only shift by ecx, so we do some swapping if we see anything else.
     340            swap(src, X86Registers::ecx);
     341            m_assembler.shlq_CLr(dest);
     342            swap(src, X86Registers::ecx);
     343        }
     344    }
     345   
    332346    void rshift64(TrustedImm32 imm, RegisterID dest)
    333347    {
  • trunk/Source/JavaScriptCore/assembler/X86Assembler.h

    r189918 r191865  
    911911        }
    912912    }
     913
     914    void shlq_CLr(RegisterID dst)
     915    {
     916        m_formatter.oneByteOp64(OP_GROUP2_EvCL, GROUP2_OP_SHL, dst);
     917    }
    913918#endif // CPU(X86_64)
    914919
  • trunk/Source/JavaScriptCore/b3/B3BasicBlock.cpp

    r191718 r191865  
    6868}
    6969
    70 void BasicBlock::removeNops(Procedure& procedure)
    71 {
    72     unsigned sourceIndex = 0;
    73     unsigned targetIndex = 0;
    74     while (sourceIndex < size()) {
    75         Value* value = m_values[sourceIndex++];
    76         if (value->opcode() == Nop)
    77             procedure.deleteValue(value);
    78         else
    79             m_values[targetIndex++] = value;
    80     }
    81     m_values.resize(targetIndex);
    82 }
    83 
    8470void BasicBlock::dump(PrintStream& out) const
    8571{
  • trunk/Source/JavaScriptCore/b3/B3BasicBlock.h

    r191718 r191865  
    9696    double frequency() const { return m_frequency; }
    9797
    98     void removeNops(Procedure&);
    99 
    10098    void dump(PrintStream&) const;
    10199    void deepDump(PrintStream&) const;
  • trunk/Source/JavaScriptCore/b3/B3Common.cpp

    r191705 r191865  
    6060}
    6161
     62bool shouldMeasurePhaseTiming()
     63{
     64    return Options::logB3PhaseTimes();
     65}
     66
    6267} } // namespace JSC::B3
    6368
  • trunk/Source/JavaScriptCore/b3/B3Common.h

    r191816 r191865  
    3939bool shouldValidateIRAtEachPhase();
    4040bool shouldSaveIRBeforePhase();
     41bool shouldMeasurePhaseTiming();
    4142
    4243template<typename ResultType, typename InputType, typename BitsType>
  • trunk/Source/JavaScriptCore/b3/B3Generate.cpp

    r191705 r191865  
    3636#include "B3Procedure.h"
    3737#include "B3ReduceStrength.h"
     38#include "B3TimingScope.h"
    3839#include "B3Validate.h"
    3940
     
    4243void generate(Procedure& procedure, CCallHelpers& jit)
    4344{
     45    TimingScope timingScope("generate");
     46
    4447    Air::Code code;
    4548    generateToAir(procedure, code);
     
    4952void generateToAir(Procedure& procedure, Air::Code& code)
    5053{
     54    TimingScope timingScope("generateToAir");
     55   
    5156    // We don't require the incoming IR to have predecessors computed.
    5257    procedure.resetReachability();
  • trunk/Source/JavaScriptCore/b3/B3LowerToAir.cpp

    r191860 r191865  
    586586        }
    587587    }
     588
     589    bool tryShl(Value* value, Value* amount)
     590    {
     591        Air::Opcode opcode = value->type() == Int32 ? Lshift32 : Lshift64;
     592
     593        if (imm(amount)) {
     594            append(Move, tmp(value), tmp(currentValue));
     595            append(opcode, imm(amount), tmp(currentValue));
     596            return true;
     597        }
     598
     599        append(Move, tmp(value), tmp(currentValue));
     600        append(Move, tmp(amount), Tmp(X86Registers::ecx));
     601        append(opcode, Tmp(X86Registers::ecx), tmp(currentValue));
     602        return true;
     603    }
    588604   
    589605    bool tryStoreAddLoad(Value* left, Value* right, Value*)
     
    648664        // https://bugs.webkit.org/show_bug.cgi?id=150775
    649665        append(Move, tmp(value), tmp(currentValue));
     666        return true;
     667    }
     668
     669    bool tryZExt32(Value* value)
     670    {
     671        // FIXME: This could just be a copy propagation rule, with caveats. ;-)
     672        // https://bugs.webkit.org/show_bug.cgi?id=150775
     673        append(Move32, tmp(value), tmp(currentValue));
    650674        return true;
    651675    }
  • trunk/Source/JavaScriptCore/b3/B3LoweringMatcher.patterns

    r191838 r191865  
    3535And = BitAnd(left, right)
    3636
     37Shl = Shl(value, amount)
     38
    3739StoreAddLoad = Store(Add(left, right), address)
    3840StoreSubLoad = Store(Add(left, right), address)
     
    4244TruncArgumentReg = Trunc(argumentReg = ArgumentReg())
    4345Trunc = Trunc(value)
     46
     47ZExt32 = ZExt32(value)
    4448
    4549ArgumentReg = ArgumentReg()
  • trunk/Source/JavaScriptCore/b3/B3PhaseScope.cpp

    r191705 r191865  
    4040    : m_procedure(procedure)
    4141    , m_name(name)
     42    , m_timingScope(name)
    4243{
    4344    if (shouldDumpIRAtEachPhase()) {
  • trunk/Source/JavaScriptCore/b3/B3PhaseScope.h

    r191705 r191865  
    2929#if ENABLE(B3_JIT)
    3030
     31#include "B3TimingScope.h"
    3132#include <wtf/Noncopyable.h>
    3233#include <wtf/text/CString.h>
     
    4546    Procedure& m_procedure;
    4647    const char* m_name;
     48    TimingScope m_timingScope;
    4749    CString m_dumpBefore;
    4850};
  • trunk/Source/JavaScriptCore/b3/B3ReduceStrength.cpp

    r191838 r191865  
    3030
    3131#include "B3ControlValue.h"
     32#include "B3IndexSet.h"
    3233#include "B3InsertionSetInlines.h"
    3334#include "B3MemoryValue.h"
    3435#include "B3PhaseScope.h"
    3536#include "B3ProcedureInlines.h"
     37#include "B3UpsilonValue.h"
    3638#include "B3UseCounts.h"
    3739#include "B3ValueInlines.h"
     40#include <wtf/GraphNodeWorklist.h>
    3841
    3942namespace JSC { namespace B3 {
     
    6265                    process();
    6366                m_insertionSet.execute(m_block);
    64 
    65                 block->removeNops(m_proc);
    6667            }
    6768
     
    7172            }
    7273
    73             UseCounts useCounts(m_proc);
    74             for (Value* value : m_proc.values()) {
    75                 if (!useCounts[value] && !value->effects().mustExecute()) {
    76                     value->replaceWithNop();
    77                     m_changed = true;
    78                 }
    79             }
     74            killDeadCode();
    8075           
    8176            result |= m_changed;
     
    338333    }
    339334
     335    void killDeadCode()
     336    {
     337        GraphNodeWorklist<Value*, IndexSet<Value>> worklist;
     338        Vector<UpsilonValue*, 64> upsilons;
     339        for (BasicBlock* block : m_proc) {
     340            for (Value* value : *block) {
     341                Effects effects = value->effects();
     342                // We don't care about SSA Effects, since we model them more accurately than the
     343                // effects() method does.
     344                effects.writesSSAState = false;
     345                effects.readsSSAState = false;
     346                if (effects.mustExecute())
     347                    worklist.push(value);
     348                if (UpsilonValue* upsilon = value->as<UpsilonValue>())
     349                    upsilons.append(upsilon);
     350            }
     351        }
     352        for (;;) {
     353            while (Value* value = worklist.pop()) {
     354                for (Value* child : value->children())
     355                    worklist.push(child);
     356            }
     357           
     358            bool didPush = false;
     359            for (size_t upsilonIndex = 0; upsilonIndex < upsilons.size(); ++upsilonIndex) {
     360                UpsilonValue* upsilon = upsilons[upsilonIndex];
     361                if (worklist.saw(upsilon->phi())) {
     362                    worklist.push(upsilon);
     363                    upsilons[upsilonIndex--] = upsilons.last();
     364                    upsilons.takeLast();
     365                    didPush = true;
     366                }
     367            }
     368            if (!didPush)
     369                break;
     370        }
     371
     372        for (BasicBlock* block : m_proc) {
     373            size_t sourceIndex = 0;
     374            size_t targetIndex = 0;
     375            while (sourceIndex < block->size()) {
     376                Value* value = block->at(sourceIndex++);
     377                if (worklist.saw(value))
     378                    block->at(targetIndex++) = value;
     379                else {
     380                    m_proc.deleteValue(value);
     381                   
     382                    // It's not entirely clear if this is needed. I think it makes sense to have
     383                    // this force a rerun of the fixpoint for now, since that will make it easier
     384                    // to do peephole optimizations: removing dead code will make the peephole
     385                    // easier to spot.
     386                    m_changed = true;
     387                }
     388            }
     389            block->values().resize(targetIndex);
     390        }
     391    }
     392
    340393    Procedure& m_proc;
    341394    InsertionSet m_insertionSet;
  • trunk/Source/JavaScriptCore/b3/B3Validate.cpp

    r191816 r191865  
    6262        HashSet<BasicBlock*> blocks;
    6363        HashSet<Value*> valueInProc;
    64         HashSet<Value*> valueInBlock;
     64        HashMap<Value*, unsigned> valueInBlock;
    6565
    6666        for (BasicBlock* block : m_procedure) {
    6767            blocks.add(block);
    6868            for (Value* value : *block)
    69                 valueInBlock.add(value);
     69                valueInBlock.add(value, 0).iterator->value++;
    7070        }
    7171
     
    7575        for (Value* value : valueInProc)
    7676            VALIDATE(valueInBlock.contains(value), ("At ", *value));
    77         for (Value* value : valueInBlock)
    78             VALIDATE(valueInProc.contains(value), ("At ", *value));
     77        for (auto& entry : valueInBlock) {
     78            VALIDATE(valueInProc.contains(entry.key), ("At ", *entry.key));
     79            VALIDATE(entry.value == 1, ("At ", *entry.key));
     80        }
    7981
    8082        for (Value* value : valueInProc) {
  • trunk/Source/JavaScriptCore/b3/air/AirAllocateStack.cpp

    r191845 r191865  
    4242const bool verbose = false;
    4343
    44 bool argumentIsOnStack(const Arg& arg)
    45 {
    46     return arg.isStack() && arg.stackSlot()->kind() == StackSlotKind::Anonymous;
    47 }
    48 
    4944bool attemptAssignment(
    5045    StackSlot* slot, intptr_t offsetFromFP, const Vector<StackSlot*>& otherSlots)
     
    142137
    143138    // Now we handle the anonymous slots.
    144     // FIXME: We should tell Liveness to only track StackSlots.
    145     // https://bugs.webkit.org/show_bug.cgi?id=150751
    146     Liveness<Arg> liveness(code);
     139    Liveness<StackSlot*> liveness(code);
    147140    IndexMap<StackSlot, HashSet<StackSlot*>> interference(code.stackSlots().size());
    148141    Vector<StackSlot*> slots;
    149142
    150143    for (BasicBlock* block : code) {
    151         Liveness<Arg>::LocalCalc localCalc(liveness, block);
     144        Liveness<StackSlot*>::LocalCalc localCalc(liveness, block);
    152145
    153146        auto interfere = [&] (Inst& inst) {
    154147            if (verbose)
    155                 dataLog("Interfering: ", listDump(localCalc.live()), "\n");
     148                dataLog("Interfering: ", pointerListDump(localCalc.live()), "\n");
    156149           
    157150            // Form a clique of stack slots that interfere. First find the list of stack slots
    158151            // that are live right now.
    159152            slots.resize(0);
    160             for (Arg arg : localCalc.live()) {
    161                 if (argumentIsOnStack(arg))
    162                     slots.append(arg.stackSlot());
     153            for (StackSlot* slot : localCalc.live()) {
     154                if (slot->kind() == StackSlotKind::Anonymous)
     155                    slots.append(slot);
    163156            }
    164157
     
    167160            inst.forEachArg(
    168161                [&] (Arg& arg, Arg::Role role, Arg::Type) {
    169                     if (Arg::isDef(role) && argumentIsOnStack(arg)
    170                         && !localCalc.live().contains(arg))
    171                         slots.append(arg.stackSlot());
     162                    if (Arg::isDef(role) && arg.isStack()) {
     163                        StackSlot* slot = arg.stackSlot();
     164                        if (slot->kind() == StackSlotKind::Anonymous
     165                            && !localCalc.live().contains(slot))
     166                            slots.append(slot);
     167                    }
    172168                });
    173169           
  • trunk/Source/JavaScriptCore/b3/air/AirGenerate.cpp

    r191846 r191865  
    3939#include "B3Common.h"
    4040#include "B3IndexMap.h"
     41#include "B3TimingScope.h"
    4142#include "CCallHelpers.h"
    4243
     
    4546void generate(Code& code, CCallHelpers& jit)
    4647{
     48    TimingScope timingScope("Air::generate");
     49   
    4750    // We don't expect the incoming code to have predecessors computed.
    4851    code.resetReachability();
     
    9194        dataLog(code);
    9295    }
     96
     97    TimingScope codeGenTimingScope("Air::generate backend");
    9398
    9499    // And now, we generate code.
  • trunk/Source/JavaScriptCore/b3/air/AirInstInlines.h

    r191705 r191865  
    5252    }
    5353};
     54template<> struct ForEach<StackSlot*> {
     55    template<typename Functor>
     56    static void forEach(Inst& inst, const Functor& functor)
     57    {
     58        inst.forEachArg(
     59            [&] (Arg& arg, Arg::Role role, Arg::Type type) {
     60                if (!arg.isStack())
     61                    return;
     62                StackSlot* stackSlot = arg.stackSlot();
     63                functor(stackSlot, role, type);
     64                arg = Arg::stack(stackSlot);
     65            });
     66    }
     67};
    5468
    5569template<typename Thing, typename Functor>
     
    91105}
    92106
     107inline bool isLshift64Valid(const Inst& inst)
     108{
     109    return isShiftValid(inst);
     110}
     111
    93112} } } // namespace JSC::B3::Air
    94113
  • trunk/Source/JavaScriptCore/b3/air/AirLiveness.h

    r191705 r191865  
    4040class Liveness {
    4141public:
     42    template<typename T>
     43    static bool isAlive(const T& thing) { return thing.isAlive(); }
     44
     45    static bool isAlive(StackSlot* slot) { return slot->kind() == StackSlotKind::Anonymous; }
     46   
    4247    Liveness(Code& code)
    4348    {
     
    95100            inst.forEach<Thing>(
    96101                [this] (Thing& arg, Arg::Role role, Arg::Type) {
    97                     if (!arg.isAlive())
     102                    if (!isAlive(arg))
    98103                        return;
    99104                    if (Arg::isDef(role))
     
    101106                });
    102107
    103             // Next handle clobbered registers.
    104             if (inst.hasSpecial()) {
    105                 inst.extraClobberedRegs().forEach(
    106                     [this] (Reg reg) {
    107                         m_live.remove(Thing(Tmp(reg)));
    108                     });
    109             }
    110            
    111108            // Finally handle use's.
    112109            inst.forEach<Thing>(
    113110                [this] (Thing& arg, Arg::Role role, Arg::Type) {
    114                     if (!arg.isAlive())
     111                    if (!isAlive(arg))
    115112                        return;
    116113                    if (Arg::isUse(role))
  • trunk/Source/JavaScriptCore/b3/air/AirOpcode.opcodes

    r191846 r191865  
    108108    Imm, Tmp
    109109
     110Lshift64 U:G, UD:G
     111    Tmp*, Tmp
     112    Imm, Tmp
     113
    110114Mul32 U:G, UD:G
    111115    Tmp, Tmp
  • trunk/Source/JavaScriptCore/b3/air/AirPhaseScope.cpp

    r191705 r191865  
    3838    : m_code(code)
    3939    , m_name(name)
     40    , m_timingScope(name)
    4041{
    4142    if (shouldDumpIRAtEachPhase()) {
  • trunk/Source/JavaScriptCore/b3/air/AirPhaseScope.h

    r191705 r191865  
    2929#if ENABLE(B3_JIT)
    3030
     31#include "B3TimingScope.h"
    3132#include <wtf/Noncopyable.h>
    3233#include <wtf/text/CString.h>
     
    4546    Code& m_code;
    4647    const char* m_name;
     48    TimingScope m_timingScope;
    4749    CString m_dumpBefore;
    4850};
  • trunk/Source/JavaScriptCore/b3/testb3.cpp

    r191838 r191865  
    10651065
    10661066    CHECK(compileAndRun<int>(proc) == !value);
     1067}
     1068
     1069void testComplex(unsigned numVars, unsigned numConstructs)
     1070{
     1071    double before = monotonicallyIncreasingTimeMS();
     1072   
     1073    Procedure proc;
     1074    BasicBlock* current = proc.addBlock();
     1075
     1076    Const32Value* one = current->appendNew<Const32Value>(proc, Origin(), 1);
     1077
     1078    Vector<int32_t> varSlots;
     1079    for (unsigned i = numVars; i--;)
     1080        varSlots.append(i);
     1081
     1082    Vector<Value*> vars;
     1083    for (int32_t& varSlot : varSlots) {
     1084        Value* varSlotPtr = current->appendNew<ConstPtrValue>(proc, Origin(), &varSlot);
     1085        vars.append(current->appendNew<MemoryValue>(proc, Load, Int32, Origin(), varSlotPtr));
     1086    }
     1087
     1088    for (unsigned i = 0; i < numConstructs; ++i) {
     1089        if (i & 1) {
     1090            // Control flow diamond.
     1091            unsigned predicateVarIndex = (i >> 1) % numVars;
     1092            unsigned thenIncVarIndex = ((i >> 1) + 1) % numVars;
     1093            unsigned elseIncVarIndex = ((i >> 1) + 2) % numVars;
     1094
     1095            BasicBlock* thenBlock = proc.addBlock();
     1096            BasicBlock* elseBlock = proc.addBlock();
     1097            BasicBlock* continuation = proc.addBlock();
     1098
     1099            current->appendNew<ControlValue>(
     1100                proc, Branch, Origin(), vars[predicateVarIndex],
     1101                FrequentedBlock(thenBlock), FrequentedBlock(elseBlock));
     1102
     1103            UpsilonValue* thenThenResult = thenBlock->appendNew<UpsilonValue>(
     1104                proc, Origin(),
     1105                thenBlock->appendNew<Value>(proc, Add, Origin(), vars[thenIncVarIndex], one));
     1106            UpsilonValue* thenElseResult = thenBlock->appendNew<UpsilonValue>(
     1107                proc, Origin(), vars[elseIncVarIndex]);
     1108            thenBlock->appendNew<ControlValue>(proc, Jump, Origin(), FrequentedBlock(continuation));
     1109
     1110            UpsilonValue* elseElseResult = elseBlock->appendNew<UpsilonValue>(
     1111                proc, Origin(),
     1112                elseBlock->appendNew<Value>(proc, Add, Origin(), vars[elseIncVarIndex], one));
     1113            UpsilonValue* elseThenResult = elseBlock->appendNew<UpsilonValue>(
     1114                proc, Origin(), vars[thenIncVarIndex]);
     1115            elseBlock->appendNew<ControlValue>(proc, Jump, Origin(), FrequentedBlock(continuation));
     1116
     1117            Value* thenPhi = continuation->appendNew<Value>(proc, Phi, Int32, Origin());
     1118            thenThenResult->setPhi(thenPhi);
     1119            elseThenResult->setPhi(thenPhi);
     1120            vars[thenIncVarIndex] = thenPhi;
     1121           
     1122            Value* elsePhi = continuation->appendNew<Value>(proc, Phi, Int32, Origin());
     1123            thenElseResult->setPhi(elsePhi);
     1124            elseElseResult->setPhi(elsePhi);
     1125            vars[elseIncVarIndex] = thenPhi;
     1126           
     1127            current = continuation;
     1128        } else {
     1129            // Loop.
     1130
     1131            BasicBlock* loopEntry = proc.addBlock();
     1132            BasicBlock* loopReentry = proc.addBlock();
     1133            BasicBlock* loopBody = proc.addBlock();
     1134            BasicBlock* loopExit = proc.addBlock();
     1135            BasicBlock* loopSkip = proc.addBlock();
     1136            BasicBlock* continuation = proc.addBlock();
     1137           
     1138            Value* startIndex = vars[(i >> 1) % numVars];
     1139            Value* startSum = current->appendNew<Const32Value>(proc, Origin(), 0);
     1140            current->appendNew<ControlValue>(
     1141                proc, Branch, Origin(), startIndex,
     1142                FrequentedBlock(loopEntry), FrequentedBlock(loopSkip));
     1143
     1144            UpsilonValue* startIndexForBody = loopEntry->appendNew<UpsilonValue>(
     1145                proc, Origin(), startIndex);
     1146            UpsilonValue* startSumForBody = loopEntry->appendNew<UpsilonValue>(
     1147                proc, Origin(), startSum);
     1148            loopEntry->appendNew<ControlValue>(proc, Jump, Origin(), FrequentedBlock(loopBody));
     1149
     1150            Value* bodyIndex = loopBody->appendNew<Value>(proc, Phi, Int32, Origin());
     1151            startIndexForBody->setPhi(bodyIndex);
     1152            Value* bodySum = loopBody->appendNew<Value>(proc, Phi, Int32, Origin());
     1153            startSumForBody->setPhi(bodySum);
     1154            Value* newBodyIndex = loopBody->appendNew<Value>(proc, Sub, Origin(), bodyIndex, one);
     1155            Value* newBodySum = loopBody->appendNew<Value>(
     1156                proc, Add, Origin(),
     1157                bodySum,
     1158                loopBody->appendNew<MemoryValue>(
     1159                    proc, Load, Int32, Origin(),
     1160                    loopBody->appendNew<Value>(
     1161                        proc, Add, Origin(),
     1162                        loopBody->appendNew<ConstPtrValue>(proc, Origin(), varSlots.data()),
     1163                        loopBody->appendNew<Value>(
     1164                            proc, Shl, Origin(),
     1165                            loopBody->appendNew<Value>(
     1166                                proc, ZExt32, Origin(),
     1167                                loopBody->appendNew<Value>(
     1168                                    proc, BitAnd, Origin(),
     1169                                    newBodyIndex,
     1170                                    loopBody->appendNew<Const32Value>(
     1171                                        proc, Origin(), numVars - 1))),
     1172                            loopBody->appendNew<Const32Value>(proc, Origin(), 2)))));
     1173            loopBody->appendNew<ControlValue>(
     1174                proc, Branch, Origin(), newBodyIndex,
     1175                FrequentedBlock(loopReentry), FrequentedBlock(loopExit));
     1176
     1177            loopReentry->appendNew<UpsilonValue>(proc, Origin(), newBodyIndex, bodyIndex);
     1178            loopReentry->appendNew<UpsilonValue>(proc, Origin(), newBodySum, bodySum);
     1179            loopReentry->appendNew<ControlValue>(proc, Jump, Origin(), FrequentedBlock(loopBody));
     1180
     1181            UpsilonValue* exitSum = loopExit->appendNew<UpsilonValue>(proc, Origin(), newBodySum);
     1182            loopExit->appendNew<ControlValue>(proc, Jump, Origin(), FrequentedBlock(continuation));
     1183
     1184            UpsilonValue* skipSum = loopSkip->appendNew<UpsilonValue>(proc, Origin(), startSum);
     1185            loopSkip->appendNew<ControlValue>(proc, Jump, Origin(), FrequentedBlock(continuation));
     1186
     1187            Value* finalSum = continuation->appendNew<Value>(proc, Phi, Int32, Origin());
     1188            exitSum->setPhi(finalSum);
     1189            skipSum->setPhi(finalSum);
     1190
     1191            current = continuation;
     1192            vars[((i >> 1) + 1) % numVars] = finalSum;
     1193        }
     1194    }
     1195
     1196    current->appendNew<ControlValue>(proc, Return, Origin(), vars[0]);
     1197
     1198    compile(proc);
     1199
     1200    double after = monotonicallyIncreasingTimeMS();
     1201    dataLog("    That took ", after - before, " ms.\n");
    10671202}
    10681203
     
    11691304    RUN(testBranchEqualFoldPtr(0));
    11701305
     1306    RUN(testComplex(64, 128));
     1307    RUN(testComplex(64, 256));
     1308    RUN(testComplex(64, 384));
     1309    RUN(testComplex(4, 128));
     1310    RUN(testComplex(4, 256));
     1311    RUN(testComplex(4, 384));
     1312
    11711313    if (!didRun)
    11721314        usage();
  • trunk/Source/JavaScriptCore/runtime/Options.h

    r191840 r191865  
    338338    v(unsigned, fireOSRExitFuzzAtOrAfter, 0, nullptr) \
    339339    \
     340    v(bool, logB3PhaseTimes, false, nullptr) \
     341    \
    340342    v(bool, useDollarVM, false, "installs the $vm debugging tool in global objects") \
    341343    v(optionString, functionOverrides, nullptr, "file with debugging overrides for function bodies") \
  • trunk/Source/WTF/ChangeLog

    r191856 r191865  
     12015-10-31  Filip Pizlo  <fpizlo@apple.com>
     2
     3        B3::reduceStrength's DCE should be more agro and less wrong
     4        https://bugs.webkit.org/show_bug.cgi?id=150748
     5
     6        Reviewed by Geoffrey Garen.
     7
     8        * wtf/GraphNodeWorklist.h:
     9        (WTF::GraphNodeWorklist::saw): This method is super useful.
     10
    1112015-11-01  Philip Chimento  <philip.chimento@gmail.com>
    212
  • trunk/Source/WTF/wtf/GraphNodeWorklist.h

    r191424 r191865  
    5555    }
    5656
     57    bool saw(Node node) { return m_seen.contains(node); }
     58
    5759private:
    5860    Set m_seen;
  • trunk/Tools/ChangeLog

    r191857 r191865  
     12015-10-31  Filip Pizlo  <fpizlo@apple.com>
     2
     3        B3::reduceStrength's DCE should be more agro and less wrong
     4        https://bugs.webkit.org/show_bug.cgi?id=150748
     5
     6        Reviewed by Geoffrey Garen.
     7
     8        Add an LLVM version of testb3's "testComplex".
     9
     10        * ReducedFTL/ComplexTest.cpp: Added.
     11
    1122015-11-01  Commit Queue  <commit-queue@webkit.org>
    213
Note: See TracChangeset for help on using the changeset viewer.