Changeset 206226 in webkit


Timestamp:
Sep 21, 2016 12:09:24 PM
Author:
fpizlo@apple.com
Message:

Add a Fence opcode to B3
https://bugs.webkit.org/show_bug.cgi?id=162343

Reviewed by Geoffrey Garen.
Source/JavaScriptCore:


This adds the most basic fence support to B3. Currently, this is optimal on x86 and correct
on ARM. It also happens to be sufficient and optimal for what we'll do in the concurrent GC.

The idea of Fence is that it can represent any standalone fence instruction by having two
additional params: a read range and a write range. If the write range is empty, this is
taken to be a store-store fence, which turns into zero code on x86 and a cheaper fence on
ARM.
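
As a rough illustration of the rule described above, the following self-contained C++ sketch models the decision made by the Fence case in LowerToAir in this changeset. HeapRange, Fence, Arch, Lowering, and lowerFence are simplified, hypothetical stand-ins and not the real B3 classes. The default of both ranges being "top" is an assumption inferred from testX86MFence, which constructs a FenceValue with no ranges and expects mfence.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for B3's HeapRange: a half-open interval of abstract
// heap IDs. An interval with begin == end is empty.
struct HeapRange {
    uint32_t begin { 0 };
    uint32_t end { 0 };

    HeapRange() = default;
    HeapRange(uint32_t b, uint32_t e) : begin(b), end(e) { assert(b <= e); }

    // "top" covers every abstract heap.
    static HeapRange top() { return HeapRange(0, UINT32_MAX); }

    // True when the range is non-empty.
    explicit operator bool() const { return begin != end; }
};

// Hypothetical model of FenceValue: a fence plus its read and write ranges.
// Assumption: both ranges default to top.
struct Fence {
    HeapRange read;
    HeapRange write;

    Fence(HeapRange r = HeapRange::top(), HeapRange w = HeapRange::top())
        : read(r), write(w) { }
};

enum class Arch { X86_64, ARM64 };

enum class Lowering {
    CompilerOnly, // no instruction emitted; the fence only constrains the compiler
    FullFence     // the backend's full memory fence (mfence on x86)
};

// Mirrors the rule in this changeset: on x86 an empty write range means a
// store-store fence, which needs no instruction. Every other case currently
// gets the full fence; the cheaper ARM store-store fence is left to bug 162342.
Lowering lowerFence(Arch arch, const Fence& fence)
{
    if (arch == Arch::X86_64 && !fence.write)
        return Lowering::CompilerOnly;
    return Lowering::FullFence;
}
```

With these stand-ins, lowerFence(Arch::X86_64, Fence(HeapRange::top(), HeapRange())) yields Lowering::CompilerOnly, mirroring testX86CompilerFence, while a default-constructed Fence yields Lowering::FullFence on either architecture.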

It turns out that this is powerful enough to express store-load and store-store fences. For
load-store and load-load fences, you wouldn't have wanted to use any code on x86 and you
wouldn't have wanted a standalone barrier on ARM. For those cases, you'd want either a
fenced load (load acquire) or a dependency. See bug 162349 and bug 162350, respectively.
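
To make the distinction between a standalone fence and a fenced load more concrete, here is a minimal sketch in standard C++ atomics rather than B3 IR. Mailbox, publish, and tryConsume are hypothetical names introduced only for this illustration. The release fence is the closest portable analogue of a store-store fence (it is somewhat stronger, since it also orders earlier loads), and the acquire load plays the role of the load acquire mentioned above.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical shared state for a one-shot handoff between two threads.
struct Mailbox {
    int payload { 0 };
    std::atomic<bool> ready { false };
};

// Writer side: store the payload, issue a standalone fence, then set the flag.
// The fence keeps the payload store from being reordered after the flag store.
void publish(Mailbox& box, int value)
{
    assert(!box.ready.load(std::memory_order_relaxed)); // one-shot mailbox
    box.payload = value;
    std::atomic_thread_fence(std::memory_order_release);
    box.ready.store(true, std::memory_order_relaxed);
}

// Reader side: a fenced load (load acquire) of the flag. No standalone barrier
// is needed; the acquire load orders the later payload load after it.
bool tryConsume(Mailbox& box, int& out)
{
    if (!box.ready.load(std::memory_order_acquire))
        return false;
    out = box.payload;
    return true;
}
```

In real use, publish would run on one thread and tryConsume would be polled on another; calling them in sequence on a single thread exercises the same code paths.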

This isn't yet optimized for store-store fences on ARM because we don't have the
MacroAssembler support. Also, the support for "dmb ish" is not really what we want (it seems
to use a heavier fence). I don't think that this is urgent because of how the concurrent GC
will use this facility. I've left that to bug 162342.

  • CMakeLists.txt:
  • JavaScriptCore.xcodeproj/project.pbxproj:
  • assembler/MacroAssemblerCodeRef.cpp:

(JSC::MacroAssemblerCodeRef::tryToDisassemble):
(JSC::MacroAssemblerCodeRef::disassembly):

  • assembler/MacroAssemblerCodeRef.h:

(JSC::MacroAssemblerCodeRef::size): Deleted.
(JSC::MacroAssemblerCodeRef::tryToDisassemble): Deleted.

  • b3/B3Compilation.h:

(JSC::B3::Compilation::codeRef):
(JSC::B3::Compilation::disassembly):
(JSC::B3::Compilation::code): Deleted.

  • b3/B3Effects.h:
  • b3/B3FenceValue.cpp: Added.

(JSC::B3::FenceValue::~FenceValue):
(JSC::B3::FenceValue::cloneImpl):
(JSC::B3::FenceValue::FenceValue):

  • b3/B3FenceValue.h: Added.
  • b3/B3LowerToAir.cpp:

(JSC::B3::Air::LowerToAir::lower):

  • b3/B3Opcode.cpp:

(WTF::printInternal):

  • b3/B3Opcode.h:
  • b3/B3Validate.cpp:
  • b3/B3Value.cpp:

(JSC::B3::Value::effects):

  • b3/air/AirOpcode.opcodes:
  • b3/testb3.cpp:

(JSC::B3::checkUsesInstruction):
(JSC::B3::checkDoesNotUseInstruction):
(JSC::B3::testX86MFence):
(JSC::B3::testX86CompilerFence):
(JSC::B3::run):

Websites/webkit.org:

  • docs/b3/intermediate-representation.html:
Location:
trunk
Files:
2 added
17 edited

  • trunk/Source/JavaScriptCore/CMakeLists.txt

    r206110 r206226  
    126126    b3/B3Effects.cpp
    127127    b3/B3EliminateCommonSubexpressions.cpp
     128    b3/B3FenceValue.cpp
    128129    b3/B3FixSSA.cpp
    129130    b3/B3FoldPathConstants.cpp
  • trunk/Source/JavaScriptCore/ChangeLog

    r206222 r206226  
     12016-09-21  Filip Pizlo  <fpizlo@apple.com>
     2
     3        Add a Fence opcode to B3
     4        https://bugs.webkit.org/show_bug.cgi?id=162343
     5
     6        Reviewed by Geoffrey Garen.
     7       
     8        This adds the most basic fence support to B3. Currently, this is optimal on x86 and correct
     9        on ARM. It also happens to be sufficient and optimal for what we'll do in the concurrent GC.
     10       
     11        The idea of Fence is that it can represent any standalone fence instruction by having two
     12        additional params: a read range and a write range. If the write range is empty, this is
     13        taken to be a store-store fence, which turns into zero code on x86 and a cheaper fence on
     14        ARM.
     15       
     16        It turns out that this is powerful enough to express store-load and store-store fences. For
     17        load-store and load-load fences, you wouldn't have wanted to use any code on x86 and you
     18        wouldn't have wanted a standalone barrier on ARM. For those cases, you'd want either a
     19        fenced load (load acquire) or a dependency. See bug 162349 and bug 162350, respectively.
     20       
     21        This isn't yet optimized for store-store fences on ARM because we don't have the
     22        MacroAssembler support. Also, the support for "dmb ish" is not really what we want (it seems
     23        to use a heavier fence). I don't think that this is urgent because of how the concurrent GC
     24        will use this facility. I've left that to bug 162342.
     25
     26        * CMakeLists.txt:
     27        * JavaScriptCore.xcodeproj/project.pbxproj:
     28        * assembler/MacroAssemblerCodeRef.cpp:
     29        (JSC::MacroAssemblerCodeRef::tryToDisassemble):
     30        (JSC::MacroAssemblerCodeRef::disassembly):
     31        * assembler/MacroAssemblerCodeRef.h:
     32        (JSC::MacroAssemblerCodeRef::size): Deleted.
     33        (JSC::MacroAssemblerCodeRef::tryToDisassemble): Deleted.
     34        * b3/B3Compilation.h:
     35        (JSC::B3::Compilation::codeRef):
     36        (JSC::B3::Compilation::disassembly):
     37        (JSC::B3::Compilation::code): Deleted.
     38        * b3/B3Effects.h:
     39        * b3/B3FenceValue.cpp: Added.
     40        (JSC::B3::FenceValue::~FenceValue):
     41        (JSC::B3::FenceValue::cloneImpl):
     42        (JSC::B3::FenceValue::FenceValue):
     43        * b3/B3FenceValue.h: Added.
     44        * b3/B3LowerToAir.cpp:
     45        (JSC::B3::Air::LowerToAir::lower):
     46        * b3/B3Opcode.cpp:
     47        (WTF::printInternal):
     48        * b3/B3Opcode.h:
     49        * b3/B3Validate.cpp:
     50        * b3/B3Value.cpp:
     51        (JSC::B3::Value::effects):
     52        * b3/air/AirOpcode.opcodes:
     53        * b3/testb3.cpp:
     54        (JSC::B3::checkUsesInstruction):
     55        (JSC::B3::checkDoesNotUseInstruction):
     56        (JSC::B3::testX86MFence):
     57        (JSC::B3::testX86CompilerFence):
     58        (JSC::B3::run):
     59
    1602016-09-21  Keith Miller  <keith_miller@apple.com>
    261
  • trunk/Source/JavaScriptCore/JavaScriptCore.xcodeproj/project.pbxproj

    r206154 r206226  
    453453                0F682FB219BCB36400FA3BAD /* DFGSSACalculator.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0F682FB019BCB36400FA3BAD /* DFGSSACalculator.cpp */; };
    454454                0F682FB319BCB36400FA3BAD /* DFGSSACalculator.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F682FB119BCB36400FA3BAD /* DFGSSACalculator.h */; };
     455                0F6971EA1D92F42400BA02A5 /* B3FenceValue.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F6971E91D92F42100BA02A5 /* B3FenceValue.h */; };
     456                0F6971EB1D92F42D00BA02A5 /* B3FenceValue.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0F6971E81D92F42100BA02A5 /* B3FenceValue.cpp */; };
    455457                0F69CC88193AC60A0045759E /* DFGFrozenValue.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0F69CC86193AC60A0045759E /* DFGFrozenValue.cpp */; };
    456458                0F69CC89193AC60A0045759E /* DFGFrozenValue.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F69CC87193AC60A0045759E /* DFGFrozenValue.h */; };
     
    26922694                0F682FB019BCB36400FA3BAD /* DFGSSACalculator.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; name = DFGSSACalculator.cpp; path = dfg/DFGSSACalculator.cpp; sourceTree = "<group>"; };
    26932695                0F682FB119BCB36400FA3BAD /* DFGSSACalculator.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = DFGSSACalculator.h; path = dfg/DFGSSACalculator.h; sourceTree = "<group>"; };
     2696                0F6971E81D92F42100BA02A5 /* B3FenceValue.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; name = B3FenceValue.cpp; path = b3/B3FenceValue.cpp; sourceTree = "<group>"; };
     2697                0F6971E91D92F42100BA02A5 /* B3FenceValue.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = B3FenceValue.h; path = b3/B3FenceValue.h; sourceTree = "<group>"; };
    26942698                0F69CC86193AC60A0045759E /* DFGFrozenValue.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; name = DFGFrozenValue.cpp; path = dfg/DFGFrozenValue.cpp; sourceTree = "<group>"; };
    26952699                0F69CC87193AC60A0045759E /* DFGFrozenValue.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = DFGFrozenValue.h; path = dfg/DFGFrozenValue.h; sourceTree = "<group>"; };
     
    48454849                                0F725CA31C503DED00AD943A /* B3EliminateCommonSubexpressions.cpp */,
    48464850                                0F725CA41C503DED00AD943A /* B3EliminateCommonSubexpressions.h */,
     4851                                0F6971E81D92F42100BA02A5 /* B3FenceValue.cpp */,
     4852                                0F6971E91D92F42100BA02A5 /* B3FenceValue.h */,
    48474853                                0F6B8AE01C4EFE1700969052 /* B3FixSSA.cpp */,
    48484854                                0F6B8AE11C4EFE1700969052 /* B3FixSSA.h */,
     
    49224928                                0FEC84F01BDACDAC0080FF74 /* B3SwitchValue.h */,
    49234929                                0F45703E1BE584CA0062A629 /* B3TimingScope.cpp */,
    4924                                 0FEC84F21BDACDAC0080FF74 /* B3Type.h */,
    49254930                                0F45703F1BE584CA0062A629 /* B3TimingScope.h */,
    49264931                                0FEC84F11BDACDAC0080FF74 /* B3Type.cpp */,
     4932                                0FEC84F21BDACDAC0080FF74 /* B3Type.h */,
    49274933                                DCFDFBD81D1F5D9800FE3D72 /* B3TypeMap.h */,
    49284934                                0FEC84F31BDACDAC0080FF74 /* B3UpsilonValue.cpp */,
     
    78277833                                A1587D761B4DC1C600D69849 /* IntlDateTimeFormatPrototype.lut.h in Headers */,
    78287834                                A55714BE1CD8049F0004D2C6 /* ConsoleObject.h in Headers */,
     7835                                0F6971EA1D92F42400BA02A5 /* B3FenceValue.h in Headers */,
    78297836                                A1D792FD1B43864B004516F5 /* IntlNumberFormat.h in Headers */,
    78307837                                A1D792FF1B43864B004516F5 /* IntlNumberFormatConstructor.h in Headers */,
     
    94799486                                A5AB49DC1BEC8082007020FB /* PerGlobalObjectWrapperWorld.cpp in Sources */,
    94809487                                14B723B212D7DA46003BD5ED /* MachineStackMarker.cpp in Sources */,
     9488                                0F6971EB1D92F42D00BA02A5 /* B3FenceValue.cpp in Sources */,
    94819489                                FE3A06BD1C11040D00390FDD /* JITLeftShiftGenerator.cpp in Sources */,
    94829490                                0FEB3ECF16237F6C00AB67AD /* MacroAssembler.cpp in Sources */,
  • trunk/Source/JavaScriptCore/assembler/MacroAssemblerCodeRef.cpp

    r205462 r206226  
    6060}
    6161
     62bool MacroAssemblerCodeRef::tryToDisassemble(PrintStream& out, const char* prefix) const
     63{
     64    return JSC::tryToDisassemble(m_codePtr, size(), prefix, out);
     65}
     66
     67bool MacroAssemblerCodeRef::tryToDisassemble(const char* prefix) const
     68{
     69    return tryToDisassemble(WTF::dataFile(), prefix);
     70}
     71
     72CString MacroAssemblerCodeRef::disassembly() const
     73{
     74    StringPrintStream out;
     75    if (!tryToDisassemble(out, ""))
     76        return CString();
     77    return out.toCString();
     78}
     79
    6280void MacroAssemblerCodeRef::dump(PrintStream& out) const
    6381{
  • trunk/Source/JavaScriptCore/assembler/MacroAssemblerCodeRef.h

    r205462 r206226  
    11/*
    2  * Copyright (C) 2009, 2012 Apple Inc. All rights reserved.
     2 * Copyright (C) 2009, 2012, 2016 Apple Inc. All rights reserved.
    33 *
    44 * Redistribution and use in source and binary forms, with or without
     
    2727#define MacroAssemblerCodeRef_h
    2828
    29 #include "Disassembler.h"
    3029#include "ExecutableAllocator.h"
    3130#include <wtf/DataLog.h>
     
    3332#include <wtf/PrintStream.h>
    3433#include <wtf/RefPtr.h>
     34#include <wtf/text/CString.h>
    3535
    3636// ASSERT_VALID_CODE_POINTER checks that ptr is a non-null pointer, and that it is a valid
     
    392392        return m_executableMemory->sizeInBytes();
    393393    }
    394    
    395     bool tryToDisassemble(const char* prefix) const
    396     {
    397         return JSC::tryToDisassemble(m_codePtr, size(), prefix, WTF::dataFile());
    398     }
     394
     395    bool tryToDisassemble(PrintStream& out, const char* prefix = "") const;
     396   
     397    bool tryToDisassemble(const char* prefix = "") const;
     398   
     399    JS_EXPORT_PRIVATE CString disassembly() const;
    399400   
    400401    explicit operator bool() const { return !!m_codePtr; }
  • trunk/Source/JavaScriptCore/b3/B3Compilation.h

    r195139 r206226  
    6464
    6565    MacroAssemblerCodePtr code() const { return m_codeRef.code(); }
     66    MacroAssemblerCodeRef codeRef() const { return m_codeRef; }
     67   
     68    CString disassembly() const { return m_codeRef.disassembly(); }
    6669
    6770private:
  • trunk/Source/JavaScriptCore/b3/B3Effects.h

    r203390 r206226  
    5454
    5555    // True if this writes to the local state. Operations that write local state don't write to anything
    56     // in "memory" but they have a side-effect anyway. This is for modeling Upsilons and Sets. You can ignore
    57     // this if you have your own way of modeling Upsilons and Sets or if you intend to just rebuild them
    58     // anyway.
     56    // in "memory" but they have a side-effect anyway. This is for modeling Upsilons, Sets, and Fences.
     57    // This is a way of saying: even though this operation is not a terminal, does not exit sideways,
     58    // and does not write to the heap, you still cannot kill this operation.
    5959    bool writesLocalState { false };
    6060
  • trunk/Source/JavaScriptCore/b3/B3LowerToAir.cpp

    r204920 r206226  
    4141#include "B3Commutativity.h"
    4242#include "B3Dominators.h"
     43#include "B3FenceValue.h"
    4344#include "B3MemoryValue.h"
    4445#include "B3PatchpointSpecial.h"
     
    20462047            return;
    20472048        }
     2049           
     2050        case Fence: {
     2051            FenceValue* fence = m_value->as<FenceValue>();
     2052            if (isX86() && !fence->write)
     2053                return;
     2054            // FIXME: Optimize this on ARM.
     2055            // https://bugs.webkit.org/show_bug.cgi?id=162342
     2056            append(MemoryFence);
     2057            return;
     2058        }
    20482059
    20492060        case Trunc: {
  • trunk/Source/JavaScriptCore/b3/B3MemoryValue.h

    r197366 r206226  
    3333
    3434namespace JSC { namespace B3 {
     35
     36// FIXME: We want to allow fenced memory accesses on ARM.
     37// https://bugs.webkit.org/show_bug.cgi?id=162349
    3538
    3639class JS_EXPORT_PRIVATE MemoryValue : public Value {
  • trunk/Source/JavaScriptCore/b3/B3Opcode.cpp

    r203670 r206226  
    258258        out.print("Store");
    259259        return;
     260    case Fence:
     261        out.print("Fence");
     262        return;
    260263    case CCall:
    261264        out.print("CCall");
  • trunk/Source/JavaScriptCore/b3/B3Opcode.h

    r203670 r206226  
    158158    // This is a polymorphic store for Int32, Int64, Float, and Double.
    159159    Store,
     160   
     161    // This is used to represent standalone fences - i.e. fences that are not part of other
     162    // instructions. It's expressive enough to expose mfence on x86 and dmb ish/ishst on ARM. On
     163    // x86, it also acts as a compiler store-store fence in those cases where it would have been a
     164    // dmb ishst on ARM.
     165    Fence,
    160166
    161167    // This is a regular ordinary C function call, using the system C calling convention. Make sure
  • trunk/Source/JavaScriptCore/b3/B3Validate.cpp

    r204402 r206226  
    131131            switch (value->opcode()) {
    132132            case Nop:
     133            case Fence:
    133134                VALIDATE(!value->numChildren(), ("At ", *value));
    134135                VALIDATE(value->type() == Void, ("At ", *value));
  • trunk/Source/JavaScriptCore/b3/B3Value.cpp

    r203670 r206226  
    3333#include "B3BottomProvider.h"
    3434#include "B3CCallValue.h"
     35#include "B3FenceValue.h"
    3536#include "B3MemoryValue.h"
    3637#include "B3OriginDump.h"
     
    586587        result.controlDependent = true;
    587588        break;
     589    case Fence: {
     590        const FenceValue* fence = as<FenceValue>();
     591        result.reads = fence->read;
     592        result.writes = fence->write;
     593       
     594        // Prevent killing of fences that claim not to write anything. It's a bit weird that we use
     595        // local state as the way to do this, but it happens to work: we must assume that we cannot
     596        // kill writesLocalState unless we understand exactly what the instruction is doing (like
     597        // the way that fixSSA understands Set/Get and the way that reduceStrength and others
     598        // understand Upsilon). This would only become a problem if we had some analysis that was
     599        // looking to use the writesLocalState bit to invalidate a CSE over local state operations.
     600        // Then a Fence would block, say, the elimination of a redundant Get. But it looks like
     601        // that's not at all how our optimizations for Set/Get/Upsilon/Phi work - they grok their
     602        // operations deeply enough that they have no need to check this bit - so this cheat is
     603        // fine.
     604        result.writesLocalState = true;
     605        break;
     606    }
    588607    case CCall:
    589608        result = as<CCallValue>()->effects;
  • trunk/Source/JavaScriptCore/b3/air/AirOpcode.opcodes

    r205656 r206226  
    844844    DoubleCond, Tmp, Tmp, Tmp, Tmp, Tmp
    845845
     846MemoryFence /effects
     847
    846848Jump /branch
    847849
  • trunk/Source/JavaScriptCore/b3/testb3.cpp

    r205656 r206226  
    3838#include "B3ConstPtrValue.h"
    3939#include "B3Effects.h"
     40#include "B3FenceValue.h"
    4041#include "B3Generate.h"
    4142#include "B3LowerToAir.h"
     
    150151   
    151152    Air::validate(proc.code());
     153}
     154
     155void checkUsesInstruction(Compilation& compilation, const char* text)
     156{
     157    CString disassembly = compilation.disassembly();
     158    if (strstr(disassembly.data(), text))
     159        return;
     160
     161    crashLock.lock();
     162    dataLog("Bad lowering!  Expected to find ", text, " but didn't:\n");
     163    dataLog(disassembly);
     164    CRASH();
     165}
     166
     167void checkDoesNotUseInstruction(Compilation& compilation, const char* text)
     168{
     169    CString disassembly = compilation.disassembly();
     170    if (!strstr(disassembly.data(), text))
     171        return;
     172
     173    crashLock.lock();
     174    dataLog("Bad lowering!  Did not expect to find ", text, " but it's there:\n");
     175    dataLog(disassembly);
     176    CRASH();
    152177}
    153178
     
    1302613051    CHECK_EQ(invoke<int>(*code, 43), 666);
    1302713052    CHECK_EQ(invoke<int>(*code, -1), 666);
     13053}
     13054
     13055void testX86MFence()
     13056{
     13057    Procedure proc;
     13058   
     13059    BasicBlock* root = proc.addBlock();
     13060   
     13061    root->appendNew<FenceValue>(proc, Origin());
     13062    root->appendNew<Value>(proc, Return, Origin());
     13063   
     13064    auto code = compile(proc);
     13065    checkUsesInstruction(*code, "mfence");
     13066}
     13067
     13068void testX86CompilerFence()
     13069{
     13070    Procedure proc;
     13071   
     13072    BasicBlock* root = proc.addBlock();
     13073   
     13074    root->appendNew<FenceValue>(proc, Origin(), HeapRange::top(), HeapRange());
     13075    root->appendNew<Value>(proc, Return, Origin());
     13076   
     13077    auto code = compile(proc);
     13078    checkDoesNotUseInstruction(*code, "mfence");
    1302813079}
    1302913080
     
    1445714508        RUN(testBranchBitAndImmFusion(Load, Int32, 1, Air::BranchTest32, Air::Arg::Addr));
    1445814509        RUN(testBranchBitAndImmFusion(Load, Int64, 1, Air::BranchTest32, Air::Arg::Addr));
     14510       
     14511        RUN(testX86MFence());
     14512        RUN(testX86CompilerFence());
    1445914513    }
    1446014514
  • trunk/Websites/webkit.org/ChangeLog

    r204546 r206226  
     12016-09-21  Filip Pizlo  <fpizlo@apple.com>
     2
     3        Add a Fence opcode to B3
     4        https://bugs.webkit.org/show_bug.cgi?id=162343
     5
     6        Reviewed by Geoffrey Garen.
     7
     8        * docs/b3/intermediate-representation.html:
     9
    1102016-08-16  Benjamin Poulain  <bpoulain@apple.com>
    211
  • trunk/Websites/webkit.org/docs/b3/intermediate-representation.html

    r204546 r206226  
    422422        compile-time 32-bit signed integer offset to the second child.  Misaligned stores are
    423423        not penalized.  Must use the MemoryValue class.</dd>
     424     
     425      <dt>Void Fence()</dt>
     426      <dd>Abstracts standalone data fences on x86 and ARM. Must use the FenceValue class, which has
     427        two additional members that configure the precise meaning of the fence:
     428        <code>HeapRange FenceValue::read</code> and <code>HeapRange FenceValue::write</code>. If the
     429        <code>write</code> range is empty, this is taken to be a store-store fence, which leads to
     430        no code generation on x86 and the weaker <code>dmb ishst</code> fence on ARM. If the write
     431        range is non-empty, this produces <code>mfence</code> on x86 and <code>dmb ish</code> on
     432        ARM. Within B3 IR, the Fence also reports the read/write in its effects. This allows you to
     433        scope the fence for the purpose of B3's load elimination. For example, you may use a Fence
     434        to protect a store from being sunk below a specific load. In that case, you can claim to
     435        read just that store's range and write that load's range.</dd>
    424436
    425437      <dt>T1 CCall(IntPtr, [T2, [T3, ...]])</dt>