Changeset 160205 in webkit

Timestamp:

Dec 5, 2013, 5:47:19 PM

Author:

fpizlo@apple.com

Message:

FTL should use cvttsd2si directly for double-to-int32 conversions
https://bugs.webkit.org/show_bug.cgi?id=125275

Source/JavaScriptCore:

Reviewed by Michael Saboff.

Wow. This was an ordeal. Using cvttsd2si was actually easy, but I learned, and
sometimes even fixed, some interesting things:

- The llvm.x86.sse2.cvttsd2si intrinsic can actually result in LLVM emitting a vcvttsd2si. I guess the intrinsic doesn't actually imply the instruction.

- That whole thing about branchTruncateDoubleToUint32? Yeah we don't need that. It's better to use branchTruncateDoubleToInt32 instead. It has the right semantics for all of its callers (err, its one-and-only caller), and it's more likely to take fast path. This patch kills branchTruncateDoubleToUint32.

- "a[i] = v; v = a[i]". Does this change v? OK, assume that 'a[i]' is a pure-ish operation - like an array access with 'i' being an integer index and we're not having a bad time. Now does this change v? CSE assumes that it doesn't. That's wrong. If 'a' is a typed array - the most sensible and pure kind of array - then this can be a truncating cast. For example 'v' could be a double and 'a' could be an integer array.

- "v1 = a[i]; v2 = a[i]". Is v1 === v2 assuming that 'a[i]' is pure-ish? The answer is no. You could have a different arrayMode in each access. I know this sounds weird, but with concurrent JIT that might happen.

This patch adds tests for all of this stuff, except for the first issue (it's weird
but probably doesn't matter) and the last issue (it's too much of a freakshow).
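
The third point above can be illustrated outside of JavaScriptCore. What follows is a minimal C++ sketch, not code from this patch: `Int32ArrayModel` is a hypothetical stand-in for an integer typed array, and `toInt32` is a portable function written from the ECMAScript ToInt32 definition (truncate, then wrap modulo 2^32). It shows that a store followed by a load of the same element can yield a value different from the one stored, which is why forwarding the stored value to the load is unsafe for typed arrays.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// ECMAScript-style ToInt32, written portably for illustration only.
// This is an assumption-level model, not JavaScriptCore's implementation.
inline int32_t toInt32(double value)
{
    if (!std::isfinite(value))
        return 0; // NaN and +/-Infinity convert to 0.
    double wrapped = std::fmod(std::trunc(value), 4294967296.0); // in (-2^32, 2^32)
    if (wrapped < 0)
        wrapped += 4294967296.0; // now in [0, 2^32)
    if (wrapped >= 2147483648.0)
        wrapped -= 4294967296.0; // now in [-2^31, 2^31)
    return static_cast<int32_t>(wrapped);
}

// Hypothetical model of an Int32Array: every store converts the value.
struct Int32ArrayModel {
    int32_t storage[4] = { 0, 0, 0, 0 };
    void put(unsigned index, double value) { storage[index] = toInt32(value); }
    double get(unsigned index) const { return storage[index]; }
};

// Models "a[i] = v; v = a[i]": the load observes the converted value.
inline double storeThenLoad(Int32ArrayModel& array, unsigned index, double value)
{
    array.put(index, value);
    return array.get(index);
}
```

Under this model, `storeThenLoad(a, 0, 1.5)` yields `1.0` rather than `1.5`, so a compiler that replaces the load with the stored value would change program behavior.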

assembler/MacroAssemblerARM64.h:
assembler/MacroAssemblerARMv7.h:
assembler/MacroAssemblerX86Common.h:
dfg/DFGCSEPhase.cpp:

(JSC::DFG::CSEPhase::getByValLoadElimination):
(JSC::DFG::CSEPhase::performNodeCSE):

dfg/DFGSpeculativeJIT.cpp:

(JSC::DFG::SpeculativeJIT::compilePutByValForIntTypedArray):

ftl/FTLAbbreviations.h:

(JSC::FTL::vectorType):
(JSC::FTL::getUndef):
(JSC::FTL::buildInsertElement):

ftl/FTLIntrinsicRepository.h:
ftl/FTLLowerDFGToLLVM.cpp:

(JSC::FTL::LowerDFGToLLVM::doubleToInt32):
(JSC::FTL::LowerDFGToLLVM::doubleToUInt32):
(JSC::FTL::LowerDFGToLLVM::sensibleDoubleToInt32):

ftl/FTLOutput.h:

(JSC::FTL::Output::insertElement):
(JSC::FTL::Output::hasSensibleDoubleToInt):
(JSC::FTL::Output::sensibleDoubleToInt):

LayoutTests:

Reviewed by Michael Saboff.

js/regress/double-to-int32-typed-array-expected.txt: Added.
js/regress/double-to-int32-typed-array-no-inline-expected.txt: Added.
js/regress/double-to-int32-typed-array-no-inline.html: Added.
js/regress/double-to-int32-typed-array.html: Added.
js/regress/double-to-uint32-typed-array-expected.txt: Added.
js/regress/double-to-uint32-typed-array-no-inline-expected.txt: Added.
js/regress/double-to-uint32-typed-array-no-inline.html: Added.
js/regress/double-to-uint32-typed-array.html: Added.
js/regress/script-tests/double-to-int32-typed-array-no-inline.js: Added.

(foo):
(test):

js/regress/script-tests/double-to-int32-typed-array.js: Added.

(foo):
(test):

js/regress/script-tests/double-to-uint32-typed-array-no-inline.js: Added.

(foo):
(test):

js/regress/script-tests/double-to-uint32-typed-array.js: Added.

(foo):
(test):

Location:

trunk

Files:

12 added
11 edited

LayoutTests/ChangeLog (modified) (1 diff)
LayoutTests/js/regress/double-to-int32-typed-array-expected.txt (added)
LayoutTests/js/regress/double-to-int32-typed-array-no-inline-expected.txt (added)
LayoutTests/js/regress/double-to-int32-typed-array-no-inline.html (added)
LayoutTests/js/regress/double-to-int32-typed-array.html (added)
LayoutTests/js/regress/double-to-uint32-typed-array-expected.txt (added)
LayoutTests/js/regress/double-to-uint32-typed-array-no-inline-expected.txt (added)
LayoutTests/js/regress/double-to-uint32-typed-array-no-inline.html (added)
LayoutTests/js/regress/double-to-uint32-typed-array.html (added)
LayoutTests/js/regress/script-tests/double-to-int32-typed-array-no-inline.js (added)
LayoutTests/js/regress/script-tests/double-to-int32-typed-array.js (added)
LayoutTests/js/regress/script-tests/double-to-uint32-typed-array-no-inline.js (added)
LayoutTests/js/regress/script-tests/double-to-uint32-typed-array.js (added)
Source/JavaScriptCore/ChangeLog (modified) (1 diff)
Source/JavaScriptCore/assembler/MacroAssemblerARM64.h (modified) (1 diff)
Source/JavaScriptCore/assembler/MacroAssemblerARMv7.h (modified) (1 diff)
Source/JavaScriptCore/assembler/MacroAssemblerX86Common.h (modified) (1 diff)
Source/JavaScriptCore/dfg/DFGCSEPhase.cpp (modified) (5 diffs)
Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp (modified) (1 diff)
Source/JavaScriptCore/ftl/FTLAbbreviations.h (modified) (3 diffs)
Source/JavaScriptCore/ftl/FTLIntrinsicRepository.h (modified) (1 diff)
Source/JavaScriptCore/ftl/FTLLowerDFGToLLVM.cpp (modified) (2 diffs)
Source/JavaScriptCore/ftl/FTLOutput.h (modified) (2 diffs)

trunk/LayoutTests/ChangeLog

-              r160200
+              r160205
+2013-12-04  Filip Pizlo  <fpizlo@apple.com>
+        FTL should use cvttsd2si directly for double-to-int32 conversions
+        https://bugs.webkit.org/show_bug.cgi?id=125275
+        Reviewed by Michael Saboff.
+        * js/regress/double-to-int32-typed-array-expected.txt: Added.
+        * js/regress/double-to-int32-typed-array-no-inline-expected.txt: Added.
+        * js/regress/double-to-int32-typed-array-no-inline.html: Added.
+        * js/regress/double-to-int32-typed-array.html: Added.
+        * js/regress/double-to-uint32-typed-array-expected.txt: Added.
+        * js/regress/double-to-uint32-typed-array-no-inline-expected.txt: Added.
+        * js/regress/double-to-uint32-typed-array-no-inline.html: Added.
+        * js/regress/double-to-uint32-typed-array.html: Added.
+        * js/regress/script-tests/double-to-int32-typed-array-no-inline.js: Added.
+        (foo):
+        (test):
+        * js/regress/script-tests/double-to-int32-typed-array.js: Added.
+        (foo):
+        (test):
+        * js/regress/script-tests/double-to-uint32-typed-array-no-inline.js: Added.
+        (foo):
+        (test):
+        * js/regress/script-tests/double-to-uint32-typed-array.js: Added.
+        (foo):
+        (test):
 2013-12-05  Bear Travis  <betravis@adobe.com>

trunk/Source/JavaScriptCore/ChangeLog

-              r160204
+              r160205
+2013-12-04  Filip Pizlo  <fpizlo@apple.com>
+        FTL should use cvttsd2si directly for double-to-int32 conversions
+        https://bugs.webkit.org/show_bug.cgi?id=125275
+        Reviewed by Michael Saboff.
+        Wow. This was an ordeal. Using cvttsd2si was actually easy, but I learned, and
+        sometimes even fixed, some interesting things:
+        - The llvm.x86.sse2.cvttsd2si intrinsic can actually result in LLVM emitting a
+          vcvttsd2si. I guess the intrinsic doesn't actually imply the instruction.
+        - That whole thing about branchTruncateDoubleToUint32? Yeah we don't need that. It's
+          better to use branchTruncateDoubleToInt32 instead. It has the right semantics for
+          all of its callers (err, its one-and-only caller), and it's more likely to take
+          fast path. This patch kills branchTruncateDoubleToUint32.
+        - "a[i] = v; v = a[i]". Does this change v? OK, assume that 'a[i]' is a pure-ish
+          operation - like an array access with 'i' being an integer index and we're not
+          having a bad time. Now does this change v? CSE assumes that it doesn't. That's
+          wrong. If 'a' is a typed array - the most sensible and pure kind of array - then
+          this can be a truncating cast. For example 'v' could be a double and 'a' could be
+          an integer array.
+        - "v1 = a[i]; v2 = a[i]". Is v1 === v2 assuming that 'a[i]' is pure-ish? The answer
+          is no. You could have a different arrayMode in each access. I know this sounds
+          weird, but with concurrent JIT that might happen.
+        This patch adds tests for all of this stuff, except for the first issue (it's weird
+        but probably doesn't matter) and the last issue (it's too much of a freakshow).
+        * assembler/MacroAssemblerARM64.h:
+        * assembler/MacroAssemblerARMv7.h:
+        * assembler/MacroAssemblerX86Common.h:
+        * dfg/DFGCSEPhase.cpp:
+        (JSC::DFG::CSEPhase::getByValLoadElimination):
+        (JSC::DFG::CSEPhase::performNodeCSE):
+        * dfg/DFGSpeculativeJIT.cpp:
+        (JSC::DFG::SpeculativeJIT::compilePutByValForIntTypedArray):
+        * ftl/FTLAbbreviations.h:
+        (JSC::FTL::vectorType):
+        (JSC::FTL::getUndef):
+        (JSC::FTL::buildInsertElement):
+        * ftl/FTLIntrinsicRepository.h:
+        * ftl/FTLLowerDFGToLLVM.cpp:
+        (JSC::FTL::LowerDFGToLLVM::doubleToInt32):
+        (JSC::FTL::LowerDFGToLLVM::doubleToUInt32):
+        (JSC::FTL::LowerDFGToLLVM::sensibleDoubleToInt32):
+        * ftl/FTLOutput.h:
+        (JSC::FTL::Output::insertElement):
+        (JSC::FTL::Output::hasSensibleDoubleToInt):
+        (JSC::FTL::Output::sensibleDoubleToInt):
 2013-12-05  Commit Queue  <commit-queue@webkit.org>

trunk/Source/JavaScriptCore/assembler/MacroAssemblerARM64.h

-              r160056
+              r160205
     }
-    Jump branchTruncateDoubleToUint32(FPRegisterID src, RegisterID dest, BranchTruncateType branchType = BranchIfTruncateFailed)
-    {
-        // Truncate to a 64-bit integer in dataTempRegister, copy the low 32-bit to dest.
-        m_assembler.fcvtzs<64, 64>(dest, src);
-        // Check thlow 32-bits zero extend to be equal to the full value.
-        m_assembler.cmp<64>(dest, dest, ARM64Assembler::UXTW, 0);
-        return Jump(makeBranch(branchType == BranchIfTruncateSuccessful ? Equal : NotEqual));
-    }
     void convertDoubleToFloat(FPRegisterID src, FPRegisterID dest)
     {

trunk/Source/JavaScriptCore/assembler/MacroAssemblerARMv7.h

-              r159564
+              r160205
     }
-    Jump branchTruncateDoubleToUint32(FPRegisterID src, RegisterID dest, BranchTruncateType branchType = BranchIfTruncateFailed)
-    {
-        m_assembler.vcvt_floatingPointToSigned(fpTempRegisterAsSingle(), src);
-        m_assembler.vmov(dest, fpTempRegisterAsSingle());
-        Jump overflow = branch32(Equal, dest, TrustedImm32(0x7fffffff));
-        Jump success = branch32(GreaterThanOrEqual, dest, TrustedImm32(0));
-        overflow.link(this);
-        if (branchType == BranchIfTruncateSuccessful)
-            return success;
-        Jump failure = jump();
-        success.link(this);
-        return failure;
-    }
     // Result is undefined if the value is outside of the integer range.
     void truncateDoubleToInt32(FPRegisterID src, RegisterID dest)

trunk/Source/JavaScriptCore/assembler/MacroAssemblerX86Common.h

-              r159855
+              r160205
         m_assembler.cvttsd2si_rr(src, dest);
         return branch32(branchType ? NotEqual : Equal, dest, TrustedImm32(0x80000000));
     }
-    Jump branchTruncateDoubleToUint32(FPRegisterID src, RegisterID dest, BranchTruncateType branchType = BranchIfTruncateFailed)
-    {
-        ASSERT(isSSE2Present());
-        m_assembler.cvttsd2si_rr(src, dest);
-        return branch32(branchType ? GreaterThanOrEqual : LessThan, dest, TrustedImm32(0));
-    }
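
One way to read the claim that branchTruncateDoubleToInt32 "has the right semantics" for an unsigned typed-array store is that ToInt32 and ToUint32 produce the same 32-bit pattern, and only that pattern is written to the array. The sketch below checks that property in portable C++. The functions `toInt32`, `toUint32`, and `sameStoredBits` are hypothetical helpers written from the ECMAScript definitions; they are not part of this patch or of JavaScriptCore.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// ECMAScript-style ToUint32 (illustrative model): truncate, wrap into [0, 2^32).
inline uint32_t toUint32(double value)
{
    if (!std::isfinite(value))
        return 0;
    double wrapped = std::fmod(std::trunc(value), 4294967296.0);
    if (wrapped < 0)
        wrapped += 4294967296.0;
    return static_cast<uint32_t>(wrapped);
}

// ECMAScript-style ToInt32 (illustrative model): same wrap, then shift into
// the signed range [-2^31, 2^31).
inline int32_t toInt32(double value)
{
    if (!std::isfinite(value))
        return 0;
    double wrapped = std::fmod(std::trunc(value), 4294967296.0);
    if (wrapped < 0)
        wrapped += 4294967296.0;
    if (wrapped >= 2147483648.0)
        wrapped -= 4294967296.0;
    return static_cast<int32_t>(wrapped);
}

// True when the signed and unsigned conversions would store identical bits.
inline bool sameStoredBits(double value)
{
    return static_cast<uint32_t>(toInt32(value)) == toUint32(value);
}
```

The removed x86 helper above treated any negative truncation result as a failure, so a value such as `-1.0` always left the fast path. With the signed variant, the same input truncates directly to `-1`, whose bit pattern matches what an unsigned store needs.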

trunk/Source/JavaScriptCore/dfg/DFGCSEPhase.cpp

-              r159886
+              r160205
     }
-    Node* getByValLoadElimination(Node* child1, Node* child2)
+    Node* getByValLoadElimination(Node* child1, Node* child2, ArrayMode arrayMode)
     {
         for (unsigned i = m_indexInBlock; i--;) {
 …
                 if (!m_graph.byValIsPure(node))
                     return 0;
-                if (node->child1() == child1 && node->child2() == child2)
+                if (node->child1() == child1
+                    && node->child2() == child2
+                    && node->arrayMode().type() == arrayMode.type())
                     return node;
                 break;
 …
                 if (!m_graph.byValIsPure(node))
                     return 0;
-                if (m_graph.varArgChild(node, 0) == child1 && m_graph.varArgChild(node, 1) == child2)
+                // Typed arrays
+                if (arrayMode.typedArrayType() != NotTypedArray)
+                    return 0;
+                if (m_graph.varArgChild(node, 0) == child1
+                    && m_graph.varArgChild(node, 1) == child2
+                    && node->arrayMode().type() == arrayMode.type())
                     return m_graph.varArgChild(node, 2).node();
                 // We must assume that the PutByVal will clobber the location we're getting from.
 …
                 break;
             if (m_graph.byValIsPure(node))
-                setReplacement(getByValLoadElimination(node->child1().node(), node->child2().node()));
+                setReplacement(getByValLoadElimination(node->child1().node(), node->child2().node(), node->arrayMode()));
             break;
 …
             Edge child2 = m_graph.varArgChild(node, 1);
             if (node->arrayMode().canCSEStorage()) {
-                Node* replacement = getByValLoadElimination(child1.node(), child2.node());
+                Node* replacement = getByValLoadElimination(child1.node(), child2.node(), node->arrayMode());
                 if (!replacement)
                     break;
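
The shape of the updated load elimination can be sketched in isolation. The following C++ is a simplified model under stated assumptions, not the DFG code: `Node`, `Op`, `ArrayKind`, and `findLoadReplacement` are hypothetical names, values are represented by integer ids, and any store that does not match is conservatively treated as a clobber. It mirrors the two rules visible in the diff: a candidate must have the same array type as the load, and a store is never forwarded when the load is from a typed array.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Op { GetByVal, PutByVal, Other };
enum class ArrayKind { Contiguous, Int32Array, Float64Array }; // hypothetical subset

struct Node {
    Op op;
    int id;          // id of the value this node produces (GetByVal)
    int base;        // id of the array operand
    int index;       // id of the index operand
    ArrayKind kind;  // array type this access was compiled for
    int storedValue; // id of the stored value (PutByVal only)
};

inline bool isTypedArray(ArrayKind kind) { return kind != ArrayKind::Contiguous; }

// Scans backward from the load at loadPosition. Returns the id of a value
// that can replace the load, or -1 if no safe replacement exists.
inline int findLoadReplacement(const std::vector<Node>& block, std::size_t loadPosition)
{
    const Node& load = block[loadPosition];
    for (std::size_t i = loadPosition; i--;) {
        const Node& node = block[i];
        switch (node.op) {
        case Op::GetByVal:
            // An earlier load is reusable only if its array type also matches.
            if (node.base == load.base && node.index == load.index && node.kind == load.kind)
                return node.id;
            break;
        case Op::PutByVal:
            // A typed-array store may convert the value, so never forward it.
            if (isTypedArray(load.kind))
                return -1;
            if (node.base == load.base && node.index == load.index && node.kind == load.kind)
                return node.storedValue;
            return -1; // Assume any other store clobbers the location.
        case Op::Other:
            break;
        }
    }
    return -1;
}
```

In this model, two loads of the same element with different `ArrayKind` values are not merged, which corresponds to the "different arrayMode in each access" case from the commit message.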

trunk/Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp

-              r160150
+              r160205
                 notNaN.link(&m_jit);
-                MacroAssembler::Jump failed;
-                if (isSigned(type))
-                    failed = m_jit.branchTruncateDoubleToInt32(fpr, gpr, MacroAssembler::BranchIfTruncateFailed);
-                else
-                    failed = m_jit.branchTruncateDoubleToUint32(fpr, gpr, MacroAssembler::BranchIfTruncateFailed);
+                MacroAssembler::Jump failed = m_jit.branchTruncateDoubleToInt32(
+                    fpr, gpr, MacroAssembler::BranchIfTruncateFailed);
                 addSlowPathGenerator(slowPathCall(failed, this, toInt32, gpr, fpr));

trunk/Source/JavaScriptCore/ftl/FTLAbbreviations.h

-              r159545
+              r160205
 static inline LType pointerType(LType type) { return llvm->PointerType(type, 0); }
+static inline LType vectorType(LType type, unsigned count) { return llvm->VectorType(type, count); }
 enum PackingMode { NotPacked, Packed };
 …
 static inline LValue getParam(LValue function, unsigned index) { return llvm->GetParam(function, index); }
+static inline LValue getUndef(LType type) { return llvm->GetUndef(type); }
 enum BitExtension { ZeroExtend, SignExtend };
 …
 static inline LValue buildICmp(LBuilder builder, LIntPredicate cond, LValue left, LValue right) { return llvm->BuildICmp(builder, cond, left, right, ""); }
 static inline LValue buildFCmp(LBuilder builder, LRealPredicate cond, LValue left, LValue right) { return llvm->BuildFCmp(builder, cond, left, right, ""); }
+static inline LValue buildInsertElement(LBuilder builder, LValue vector, LValue element, LValue index) { return llvm->BuildInsertElement(builder, vector, element, index, ""); }
 enum SynchronizationScope { SingleThread, CrossThread };

trunk/Source/JavaScriptCore/ftl/FTLIntrinsicRepository.h

-              r159798
+              r160205
     macro(mulWithOverflow32, "llvm.smul.with.overflow.i32", functionType(structType(m_context, int32, boolean), int32, int32)) \
     macro(mulWithOverflow64, "llvm.smul.with.overflow.i64", functionType(structType(m_context, int64, boolean), int64, int64)) \
-    macro(subWithOverflow32, "llvm.ssub.with.overflow.i32", functionType(structType(m_context, int32, boolean), int32, int32)) \
-    macro(subWithOverflow64, "llvm.ssub.with.overflow.i64", functionType(structType(m_context, int64, boolean), int64, int64)) \
     macro(patchpointInt64, "llvm.experimental.patchpoint.i64", functionType(int64, int32, int32, ref8, int32, Variadic)) \
     macro(patchpointVoid, "llvm.experimental.patchpoint.void", functionType(voidType, int32, int32, ref8, int32, Variadic)) \
     macro(stackmap, "llvm.experimental.stackmap", functionType(voidType, int32, int32, Variadic)) \
-    macro(trap, "llvm.trap", functionType(voidType))
+    macro(subWithOverflow32, "llvm.ssub.with.overflow.i32", functionType(structType(m_context, int32, boolean), int32, int32)) \
+    macro(subWithOverflow64, "llvm.ssub.with.overflow.i64", functionType(structType(m_context, int64, boolean), int64, int64)) \
+    macro(trap, "llvm.trap", functionType(voidType)) \
+    macro(x86SSE2CvtTSD2SI, "llvm.x86.sse2.cvttsd2si", functionType(int32, vectorType(doubleType, 2)))
 #define FOR_EACH_FUNCTION_TYPE(macro) \

trunk/Source/JavaScriptCore/ftl/FTLLowerDFGToLLVM.cpp

-              r160150
+              r160205
     LValue doubleToInt32(LValue doubleValue)
     {
+        if (Output::hasSensibleDoubleToInt())
+            return sensibleDoubleToInt32(doubleValue);
         double limit = pow(2, 31) - 1;
         return doubleToInt32(doubleValue, -limit, limit);
 …
     LValue doubleToUInt32(LValue doubleValue)
     {
+        if (Output::hasSensibleDoubleToInt())
+            return sensibleDoubleToInt32(doubleValue);
         return doubleToInt32(doubleValue, 0, pow(2, 32) - 1, false);
+    }
+    LValue sensibleDoubleToInt32(LValue doubleValue)
+    {
+        LBasicBlock slowPath = FTL_NEW_BLOCK(m_out, ("sensible doubleToInt32 slow path"));
+        LBasicBlock continuation = FTL_NEW_BLOCK(m_out, ("sensible doubleToInt32 continuation"));
+        ValueFromBlock fastResult = m_out.anchor(
+            m_out.sensibleDoubleToInt(doubleValue));
+        m_out.branch(
+            m_out.equal(fastResult.value(), m_out.constInt32(0x80000000)),
+            slowPath, continuation);
+        LBasicBlock lastNext = m_out.appendTo(slowPath, continuation);
+        ValueFromBlock slowResult = m_out.anchor(
+            m_out.call(m_out.operation(toInt32), doubleValue));
+        m_out.jump(continuation);
+        m_out.appendTo(continuation, lastNext);
+        return m_out.phi(m_out.int32, fastResult, slowResult);
+    }
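
The control flow of sensibleDoubleToInt32 can be mirrored in plain C++. The sketch below is an illustration under stated assumptions rather than the FTL code: `emulatedCvttsd2si` is a portable emulation of the 32-bit cvttsd2si result (truncation toward zero, or 0x80000000 when the input is NaN or out of int32 range), `toInt32Slow` is a hypothetical stand-in for the `toInt32` operation called on the slow path, and `doubleToInt32WithFallback` is a made-up name for the combined routine.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Portable emulation of cvttsd2si with a 32-bit destination: truncates toward
// zero and yields 0x80000000 (INT32_MIN) for NaN or out-of-range inputs.
inline int32_t emulatedCvttsd2si(double value)
{
    if (std::isnan(value))
        return INT32_MIN;
    double truncated = std::trunc(value);
    if (truncated < -2147483648.0 || truncated > 2147483647.0)
        return INT32_MIN;
    return static_cast<int32_t>(truncated);
}

// Slow path: ECMAScript-style ToInt32 with modulo 2^32 wrapping (illustrative).
inline int32_t toInt32Slow(double value)
{
    if (!std::isfinite(value))
        return 0;
    double wrapped = std::fmod(std::trunc(value), 4294967296.0);
    if (wrapped < 0)
        wrapped += 4294967296.0;
    if (wrapped >= 2147483648.0)
        wrapped -= 4294967296.0;
    return static_cast<int32_t>(wrapped);
}

// Fast path first; the sentinel 0x80000000 routes to the slow path. A genuine
// INT32_MIN input also takes the slow path, which returns the same answer.
inline int32_t doubleToInt32WithFallback(double value, bool* tookSlowPath = nullptr)
{
    int32_t fast = emulatedCvttsd2si(value);
    bool slow = fast == INT32_MIN;
    if (tookSlowPath)
        *tookSlowPath = slow;
    return slow ? toInt32Slow(value) : fast;
}
```

The design trades an occasional redundant slow-path call (for an input that really is -2^31) for a fast path that is a single conversion plus one compare.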

trunk/Source/JavaScriptCore/ftl/FTLOutput.h

-              r159545
+              r160205
     LValue bitNot(LValue value) { return buildNot(m_builder, value); }
+    LValue insertElement(LValue vector, LValue element, LValue index) { return buildInsertElement(m_builder, vector, element, index); }
     LValue addWithOverflow32(LValue left, LValue right)
     {
 …
     {
         return call(doubleAbsIntrinsic(), value);
     }
+    static bool hasSensibleDoubleToInt() { return isX86(); }
+    LValue sensibleDoubleToInt(LValue value)
+    {
+        RELEASE_ASSERT(isX86());
+        return call(
+            x86SSE2CvtTSD2SIIntrinsic(),
+            insertElement(
+                insertElement(getUndef(vectorType(doubleType, 2)), value, int32Zero),
+                doubleZero, int32One));
+    }
