Changeset 243851 in webkit
- Timestamp:
- Apr 3, 2019 8:37:23 PM (5 years ago)
- Location:
- trunk
- Files:
-
- 5 added
- 10 edited
- 2 copied
Legend:
- Unmodified
- Added
- Removed
-
trunk/JSTests/ChangeLog
r243835 r243851 1 2019-04-03 Robin Morisset <rmorisset@apple.com> 2 3 B3 should use associativity to optimize expression trees 4 https://bugs.webkit.org/show_bug.cgi?id=194081 5 6 Reviewed by Filip Pizlo. 7 8 Added three microbenchmarks: 9 - add-tree should be the ideal case, but there is no speedup because we are currently unable to prove that the CheckAdd won't overflow 10 - bit-xor-tree most closely matches the situation where the optimization triggers on the JetStream2 subtests where it triggers: 11 an unbalanced expression tree of size 8 that can be balanced, with no other optimizations being unlocked. 16% speedup 12 - bit-or-tree is an ideal case, where the reassociation also enables a ton of further simplifications. 42% speedup 13 14 * microbenchmarks/add-tree.js: Added. 15 * microbenchmarks/bit-or-tree.js: Added. 16 * microbenchmarks/bit-xor-tree.js: Added. 17 1 18 2019-04-03 Yusuke Suzuki <ysuzuki@apple.com> 2 19 -
trunk/Source/JavaScriptCore/ChangeLog
r243843 r243851 1 2019-04-03 Robin Morisset <rmorisset@apple.com> 2 3 B3 should use associativity to optimize expression trees 4 https://bugs.webkit.org/show_bug.cgi?id=194081 5 6 Reviewed by Filip Pizlo. 7 8 This patch adds a new B3 pass, that tries to find and optimize expression trees made purely of any one associative and commutative operator (Add/Mul/BitOr/BitAnd/BitXor). 9 The pass only runs in O2, and runs once, after lowerMacros and just before a run of B3ReduceStrength (which helps clean up the dead code it tends to leave behind). 10 I had to separate killDeadCode out of B3ReduceStrength (as a new B3EliminateDeadCode pass) to run it before B3OptimizeAssociativeExpressionTrees, as otherwise it is stopped by high use counts 11 inherited from CSE. 12 This extra run of DCE is by itself a win, most notably on microbenchmarks/instanceof-always-hit-two (1.5x faster), and on microbenchmarks/licm-dragons(-out-of-bounds) (both get 1.16x speedup). 13 I suspect it is because it runs between CSE and tail-dedup, and as a result allows a lot more tail-dedup to occur. 14 15 The pass is currently extremely conservative, not trying anything if it would cause _any_ code duplication. 16 For this purpose, it starts by computing use counts for the potentially interesting nodes (those with the right opcodes), and segregate them into expression trees. 17 The root of an expression tree is a node that is either used in multiple places, or is used by a value with a different opcode. 18 The leaves of an expression tree are nodes that are either used in multiple places, or have a different opcode. 19 All constant leaves of a tree are combined, as well as all leaves that are identical. What remains is then laid out into a balanced binary tree, hopefully maximizing ILP. 20 21 This optimization was implemented as a stand-alone pass and not as part of B3ReduceStrength mostly because it needs use counts to avoid code duplication. 22 It also benefits from finding all tree roots first, and not trying to repeatedly optimize subtrees. 23 24 I added several tests to testB3 with varying patterns of trees. It is also tested in a less focused way by lots of older tests. 25 26 In the future this pass could be expanded to allow some bounded amount of code duplication, and merging more leaves (e.g. Mul(a, 3) and a in an Add tree, into Mul(a, 4)) 27 The latter will need exposing the peephole optimizations out of B3ReduceStrength to avoid duplicating code. 28 29 * JavaScriptCore.xcodeproj/project.pbxproj: 30 * Sources.txt: 31 * b3/B3Common.cpp: 32 (JSC::B3::shouldDumpIR): 33 (JSC::B3::shouldDumpIRAtEachPhase): 34 * b3/B3Common.h: 35 * b3/B3EliminateDeadCode.cpp: Added. 36 (JSC::B3::EliminateDeadCode::run): 37 (JSC::B3::eliminateDeadCode): 38 * b3/B3EliminateDeadCode.h: Added. 39 (JSC::B3::EliminateDeadCode::EliminateDeadCode): 40 * b3/B3Generate.cpp: 41 (JSC::B3::generateToAir): 42 * b3/B3OptimizeAssociativeExpressionTrees.cpp: Added. 43 (JSC::B3::OptimizeAssociativeExpressionTrees::OptimizeAssociativeExpressionTrees): 44 (JSC::B3::OptimizeAssociativeExpressionTrees::neutralElement): 45 (JSC::B3::OptimizeAssociativeExpressionTrees::isAbsorbingElement): 46 (JSC::B3::OptimizeAssociativeExpressionTrees::combineConstants): 47 (JSC::B3::OptimizeAssociativeExpressionTrees::emitValue): 48 (JSC::B3::OptimizeAssociativeExpressionTrees::optimizeRootedTree): 49 (JSC::B3::OptimizeAssociativeExpressionTrees::run): 50 (JSC::B3::optimizeAssociativeExpressionTrees): 51 * b3/B3OptimizeAssociativeExpressionTrees.h: Added. 52 * b3/B3ReduceStrength.cpp: 53 * b3/B3Value.cpp: 54 (JSC::B3::Value::replaceWithIdentity): 55 * b3/testb3.cpp: 56 (JSC::B3::testBitXorTreeArgs): 57 (JSC::B3::testBitXorTreeArgsEven): 58 (JSC::B3::testBitXorTreeArgImm): 59 (JSC::B3::testAddTreeArg32): 60 (JSC::B3::testMulTreeArg32): 61 (JSC::B3::testBitAndTreeArg32): 62 (JSC::B3::testBitOrTreeArg32): 63 (JSC::B3::run): 64 1 65 2019-04-03 Yusuke Suzuki <ysuzuki@apple.com> 2 66 -
trunk/Source/JavaScriptCore/JavaScriptCore.xcodeproj/project.pbxproj
r243832 r243851 852 852 2AF7382D18BBBF92008A5A37 /* StructureIDTable.h in Headers */ = {isa = PBXBuildFile; fileRef = 2AF7382B18BBBF92008A5A37 /* StructureIDTable.h */; settings = {ATTRIBUTES = (Private, ); }; }; 853 853 2D342F36F7244096804ADB24 /* SourceOrigin.h in Headers */ = {isa = PBXBuildFile; fileRef = 425BA1337E4344E1B269A671 /* SourceOrigin.h */; settings = {ATTRIBUTES = (Private, ); }; }; 854 3395C70722555F6D00BDBFAD /* B3EliminateDeadCode.h in Headers */ = {isa = PBXBuildFile; fileRef = 3395C70522555F6D00BDBFAD /* B3EliminateDeadCode.h */; }; 854 855 371D842D17C98B6E00ECF994 /* libz.dylib in Frameworks */ = {isa = PBXBuildFile; fileRef = 371D842C17C98B6E00ECF994 /* libz.dylib */; }; 855 856 37C738D21EDB56E4003F2B0B /* ParseInt.h in Headers */ = {isa = PBXBuildFile; fileRef = 37C738D11EDB5672003F2B0B /* ParseInt.h */; settings = {ATTRIBUTES = (Private, ); }; }; … … 3339 3340 3032175DF1AD47D8998B34E1 /* JSSourceCode.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = JSSourceCode.h; sourceTree = "<group>"; }; 3340 3341 30A5F403F11C4F599CD596D5 /* WasmSignatureInlines.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = WasmSignatureInlines.h; sourceTree = "<group>"; }; 3342 33743649224D79EF00C8C227 /* B3OptimizeAssociativeExpressionTrees.cpp */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.cpp.cpp; name = B3OptimizeAssociativeExpressionTrees.cpp; path = b3/B3OptimizeAssociativeExpressionTrees.cpp; sourceTree = "<group>"; }; 3343 3374364A224D79EF00C8C227 /* B3OptimizeAssociativeExpressionTrees.h */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.c.h; name = B3OptimizeAssociativeExpressionTrees.h; path = b3/B3OptimizeAssociativeExpressionTrees.h; sourceTree = "<group>"; }; 3344 3395C70422555F6C00BDBFAD /* B3EliminateDeadCode.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; name = B3EliminateDeadCode.cpp; path = b3/B3EliminateDeadCode.cpp; sourceTree = "<group>"; }; 3345 3395C70522555F6D00BDBFAD /* B3EliminateDeadCode.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = B3EliminateDeadCode.h; path = b3/B3EliminateDeadCode.h; sourceTree = "<group>"; }; 3341 3346 37119A7720CCB5DC002C6DC9 /* WebKitTargetConditionals.xcconfig */ = {isa = PBXFileReference; lastKnownFileType = text.xcconfig; path = WebKitTargetConditionals.xcconfig; sourceTree = "<group>"; }; 3342 3347 371D842C17C98B6E00ECF994 /* libz.dylib */ = {isa = PBXFileReference; lastKnownFileType = "compiled.mach-o.dylib"; name = libz.dylib; path = usr/lib/libz.dylib; sourceTree = SDKROOT; }; … … 5359 5364 0F725CA31C503DED00AD943A /* B3EliminateCommonSubexpressions.cpp */, 5360 5365 0F725CA41C503DED00AD943A /* B3EliminateCommonSubexpressions.h */, 5366 3395C70422555F6C00BDBFAD /* B3EliminateDeadCode.cpp */, 5367 3395C70522555F6D00BDBFAD /* B3EliminateDeadCode.h */, 5361 5368 0F5BF16E1F23A5A10029D91D /* B3EnsureLoopPreHeaders.cpp */, 5362 5369 0F5BF16F1F23A5A10029D91D /* B3EnsureLoopPreHeaders.h */, … … 5406 5413 0FEC84D71BDACDAC0080FF74 /* B3Opcode.cpp */, 5407 5414 0FEC84D81BDACDAC0080FF74 /* B3Opcode.h */, 5415 33743649224D79EF00C8C227 /* B3OptimizeAssociativeExpressionTrees.cpp */, 5416 3374364A224D79EF00C8C227 /* B3OptimizeAssociativeExpressionTrees.h */, 5408 5417 0FEC84D91BDACDAC0080FF74 /* B3Origin.cpp */, 5409 5418 0FEC84DA1BDACDAC0080FF74 /* B3Origin.h */, … … 9071 9080 0FCEFAAC1804C13E00472CE4 /* FTLSaveRestore.h in Headers */, 9072 9081 0F25F1B2181635F300522F39 /* FTLSlowPathCall.h in Headers */, 9082 3395C70722555F6D00BDBFAD /* B3EliminateDeadCode.h in Headers */, 9073 9083 0F25F1B4181635F300522F39 /* FTLSlowPathCallKey.h in Headers */, 9074 9084 E322E5A71DA644A8006E7709 /* FTLSnippetParams.h in Headers */, -
trunk/Source/JavaScriptCore/Sources.txt
r243832 r243851 125 125 b3/B3Effects.cpp 126 126 b3/B3EliminateCommonSubexpressions.cpp 127 b3/B3EliminateDeadCode.cpp 127 128 b3/B3EnsureLoopPreHeaders.cpp 128 129 b3/B3FenceValue.cpp … … 144 145 b3/B3OpaqueByproducts.cpp 145 146 b3/B3Opcode.cpp 147 b3/B3OptimizeAssociativeExpressionTrees.cpp 146 148 b3/B3Origin.cpp 147 149 b3/B3OriginDump.cpp -
trunk/Source/JavaScriptCore/b3/B3Common.cpp
r242776 r243851 36 36 namespace JSC { namespace B3 { 37 37 38 bool shouldDumpIR(B3Comp litationMode mode)38 bool shouldDumpIR(B3CompilationMode mode) 39 39 { 40 40 #if ENABLE(FTL_JIT) … … 45 45 } 46 46 47 bool shouldDumpIRAtEachPhase(B3Comp litationMode mode)47 bool shouldDumpIRAtEachPhase(B3CompilationMode mode) 48 48 { 49 49 if (mode == B3Mode) -
trunk/Source/JavaScriptCore/b3/B3Common.h
r242776 r243851 35 35 namespace JSC { namespace B3 { 36 36 37 enum B3Comp litationMode {37 enum B3CompilationMode { 38 38 B3Mode, 39 39 AirMode 40 40 }; 41 41 42 JS_EXPORT_PRIVATE bool shouldDumpIR(B3Comp litationMode);43 bool shouldDumpIRAtEachPhase(B3Comp litationMode);42 JS_EXPORT_PRIVATE bool shouldDumpIR(B3CompilationMode); 43 bool shouldDumpIRAtEachPhase(B3CompilationMode); 44 44 bool shouldValidateIR(); 45 45 bool shouldValidateIRAtEachPhase(); -
trunk/Source/JavaScriptCore/b3/B3EliminateDeadCode.h
r243850 r243851 1 1 /* 2 * Copyright (C) 201 5Apple Inc. All rights reserved.2 * Copyright (C) 2019 Apple Inc. All rights reserved. 3 3 * 4 4 * Redistribution and use in source and binary forms, with or without … … 24 24 */ 25 25 26 #include "config.h" 27 #include "B3Common.h" 26 #pragma once 28 27 29 28 #if ENABLE(B3_JIT) 30 29 31 #include "DFGCommon.h"32 #include "FTLState.h"33 #include "Options.h"34 #include <wtf/Optional.h>35 36 30 namespace JSC { namespace B3 { 37 31 38 bool shouldDumpIR(B3ComplitationMode mode) 39 { 40 #if ENABLE(FTL_JIT) 41 return FTL::verboseCompilationEnabled() || FTL::shouldDumpDisassembly() || shouldDumpIRAtEachPhase(mode); 42 #else 43 return shouldDumpIRAtEachPhase(mode); 44 #endif 45 } 32 class Procedure; 46 33 47 bool shouldDumpIRAtEachPhase(B3ComplitationMode mode) 48 { 49 if (mode == B3Mode) 50 return Options::dumpGraphAtEachPhase() || Options::dumpB3GraphAtEachPhase(); 51 return Options::dumpGraphAtEachPhase() || Options::dumpAirGraphAtEachPhase(); 52 } 53 54 bool shouldValidateIR() 55 { 56 return DFG::validationEnabled() || shouldValidateIRAtEachPhase(); 57 } 58 59 bool shouldValidateIRAtEachPhase() 60 { 61 return Options::validateGraphAtEachPhase(); 62 } 63 64 bool shouldSaveIRBeforePhase() 65 { 66 return Options::verboseValidationFailure(); 67 } 68 69 Optional<GPRReg> pinnedExtendedOffsetAddrRegister() 70 { 71 #if CPU(ARM64) 72 return static_cast<GPRReg>(+MacroAssembler::dataTempRegister); 73 #elif CPU(X86_64) 74 return WTF::nullopt; 75 #else 76 #error Unhandled architecture. 77 #endif 78 } 34 // The 'Impl' version of this function does not have a scope phase. It is public so that B3ReduceStrength can use it in its fixpoint. 35 bool eliminateDeadCodeImpl(Procedure&); 36 JS_EXPORT_PRIVATE bool eliminateDeadCode(Procedure&); 79 37 80 38 } } // namespace JSC::B3 -
trunk/Source/JavaScriptCore/b3/B3Generate.cpp
r232008 r243851 35 35 #include "B3DuplicateTails.h" 36 36 #include "B3EliminateCommonSubexpressions.h" 37 #include "B3EliminateDeadCode.h" 37 38 #include "B3FixSSA.h" 38 39 #include "B3FoldPathConstants.h" … … 44 45 #include "B3LowerToAir.h" 45 46 #include "B3MoveConstants.h" 47 #include "B3OptimizeAssociativeExpressionTrees.h" 46 48 #include "B3Procedure.h" 47 49 #include "B3PureCSE.h" … … 88 90 if (eliminateCommonSubexpressions(procedure)) 89 91 eliminateCommonSubexpressions(procedure); 92 eliminateDeadCode(procedure); 90 93 inferSwitches(procedure); 91 94 if (Options::useB3TailDup()) … … 93 96 fixSSA(procedure); 94 97 foldPathConstants(procedure); 95 96 98 // FIXME: Add more optimizations here. 97 99 // https://bugs.webkit.org/show_bug.cgi?id=150507 … … 105 107 106 108 if (procedure.optLevel() >= 2) { 109 optimizeAssociativeExpressionTrees(procedure); 107 110 reduceStrength(procedure); 108 111 -
trunk/Source/JavaScriptCore/b3/B3OptimizeAssociativeExpressionTrees.h
r243850 r243851 1 1 /* 2 * Copyright (C) 201 5Apple Inc. All rights reserved.2 * Copyright (C) 2019 Apple Inc. All rights reserved. 3 3 * 4 4 * Redistribution and use in source and binary forms, with or without … … 21 21 * OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT 22 22 * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 23 * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 23 * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 24 24 */ 25 25 26 #include "config.h" 27 #include "B3Common.h" 26 #pragma once 28 27 29 28 #if ENABLE(B3_JIT) 30 29 31 #include "DFGCommon.h"32 #include "FTLState.h"33 #include "Options.h"34 #include <wtf/Optional.h>35 36 30 namespace JSC { namespace B3 { 37 31 38 bool shouldDumpIR(B3ComplitationMode mode) 39 { 40 #if ENABLE(FTL_JIT) 41 return FTL::verboseCompilationEnabled() || FTL::shouldDumpDisassembly() || shouldDumpIRAtEachPhase(mode); 42 #else 43 return shouldDumpIRAtEachPhase(mode); 44 #endif 45 } 32 class Procedure; 46 33 47 bool shouldDumpIRAtEachPhase(B3ComplitationMode mode) 48 { 49 if (mode == B3Mode) 50 return Options::dumpGraphAtEachPhase() || Options::dumpB3GraphAtEachPhase(); 51 return Options::dumpGraphAtEachPhase() || Options::dumpAirGraphAtEachPhase(); 52 } 53 54 bool shouldValidateIR() 55 { 56 return DFG::validationEnabled() || shouldValidateIRAtEachPhase(); 57 } 58 59 bool shouldValidateIRAtEachPhase() 60 { 61 return Options::validateGraphAtEachPhase(); 62 } 63 64 bool shouldSaveIRBeforePhase() 65 { 66 return Options::verboseValidationFailure(); 67 } 68 69 Optional<GPRReg> pinnedExtendedOffsetAddrRegister() 70 { 71 #if CPU(ARM64) 72 return static_cast<GPRReg>(+MacroAssembler::dataTempRegister); 73 #elif CPU(X86_64) 74 return WTF::nullopt; 75 #else 76 #error Unhandled architecture. 77 #endif 78 } 34 // Finds large expression trees using one of the associative and commutative opcodes (Add/Mul/BitAnd/BitOr/BitXor), 35 // then reorder them to eliminate as many operations as possible, then make balanced binary trees out of them 36 // to maximize ILP and minimize the total latency. 37 bool optimizeAssociativeExpressionTrees(Procedure&); 79 38 80 39 } } // namespace JSC::B3 81 40 82 41 #endif // ENABLE(B3_JIT) 83 -
trunk/Source/JavaScriptCore/b3/B3ReduceStrength.cpp
r243670 r243851 34 34 #include "B3ComputeDivisionMagic.h" 35 35 #include "B3Dominators.h" 36 #include "B3EliminateDeadCode.h" 36 37 #include "B3InsertionSetInlines.h" 37 38 #include "B3MemoryValueInlines.h" … … 441 442 // keep @thing. That's better, since we usually want things to stay wherever the client 442 443 // put them. We're not actually smart enough to move things around at random. 443 killDeadCode();444 m_changed |= eliminateDeadCodeImpl(m_proc); 444 445 445 446 simplifySSA(); … … 2733 2734 } 2734 2735 2735 void killDeadCode()2736 {2737 GraphNodeWorklist<Value*, IndexSet<Value*>> worklist;2738 Vector<UpsilonValue*, 64> upsilons;2739 for (BasicBlock* block : m_proc) {2740 for (Value* value : *block) {2741 Effects effects;2742 // We don't care about effects of SSA operations, since we model them more2743 // accurately than the effects() method does.2744 if (value->opcode() != Phi && value->opcode() != Upsilon)2745 effects = value->effects();2746 2747 if (effects.mustExecute())2748 worklist.push(value);2749 2750 if (UpsilonValue* upsilon = value->as<UpsilonValue>())2751 upsilons.append(upsilon);2752 }2753 }2754 for (;;) {2755 while (Value* value = worklist.pop()) {2756 for (Value* child : value->children())2757 worklist.push(child);2758 }2759 2760 bool didPush = false;2761 for (size_t upsilonIndex = 0; upsilonIndex < upsilons.size(); ++upsilonIndex) {2762 UpsilonValue* upsilon = upsilons[upsilonIndex];2763 if (worklist.saw(upsilon->phi())) {2764 worklist.push(upsilon);2765 upsilons[upsilonIndex--] = upsilons.last();2766 upsilons.takeLast();2767 didPush = true;2768 }2769 }2770 if (!didPush)2771 break;2772 }2773 2774 IndexSet<Variable*> liveVariables;2775 2776 for (BasicBlock* block : m_proc) {2777 size_t sourceIndex = 0;2778 size_t targetIndex = 0;2779 while (sourceIndex < block->size()) {2780 Value* value = block->at(sourceIndex++);2781 if (worklist.saw(value)) {2782 if (VariableValue* variableValue = value->as<VariableValue>())2783 liveVariables.add(variableValue->variable());2784 block->at(targetIndex++) = value;2785 } else {2786 m_proc.deleteValue(value);2787 m_changed = true;2788 }2789 }2790 block->values().resize(targetIndex);2791 }2792 2793 for (Variable* variable : m_proc.variables()) {2794 if (!liveVariables.contains(variable))2795 m_proc.deleteVariable(variable);2796 }2797 }2798 2799 2736 void simplifySSA() 2800 2737 { -
trunk/Source/JavaScriptCore/b3/B3Value.cpp
r241335 r243851 64 64 65 65 ASSERT(m_type == value->m_type); 66 ASSERT(value != this); 66 67 67 68 if (m_type == Void) { -
trunk/Source/JavaScriptCore/b3/testb3.cpp
r243670 r243851 39 39 #include "B3ComputeDivisionMagic.h" 40 40 #include "B3Const32Value.h" 41 #include "B3Const64Value.h" 41 42 #include "B3ConstPtrValue.h" 42 43 #include "B3Effects.h" … … 397 398 testLoadWithOffsetImpl(32761, 16381); 398 399 testLoadWithOffsetImpl(32768, 16384); 400 } 401 402 void testBitXorTreeArgs(int64_t a, int64_t b) 403 { 404 Procedure proc; 405 BasicBlock* root = proc.addBlock(); 406 Value* argA = root->appendNew<ArgumentRegValue>(proc, Origin(), GPRInfo::argumentGPR0); 407 Value* argB = root->appendNew<ArgumentRegValue>(proc, Origin(), GPRInfo::argumentGPR1); 408 Value* node = root->appendNew<Value>(proc, BitXor, Origin(), argA, argB); 409 node = root->appendNew<Value>(proc, BitXor, Origin(), node, argB); 410 node = root->appendNew<Value>(proc, BitXor, Origin(), node, argA); 411 node = root->appendNew<Value>(proc, BitXor, Origin(), node, argB); 412 root->appendNew<Value>(proc, Return, Origin(), node); 413 414 CHECK_EQ(compileAndRun<int64_t>(proc, a, b), (((a ^ b) ^ b) ^ a) ^ b); 415 } 416 417 void testBitXorTreeArgsEven(int64_t a, int64_t b) 418 { 419 Procedure proc; 420 BasicBlock* root = proc.addBlock(); 421 Value* argA = root->appendNew<ArgumentRegValue>(proc, Origin(), GPRInfo::argumentGPR0); 422 Value* argB = root->appendNew<ArgumentRegValue>(proc, Origin(), GPRInfo::argumentGPR1); 423 Value* node = root->appendNew<Value>(proc, BitXor, Origin(), argA, argB); 424 node = root->appendNew<Value>(proc, BitXor, Origin(), node, argB); 425 node = root->appendNew<Value>(proc, BitXor, Origin(), node, argA); 426 root->appendNew<Value>(proc, Return, Origin(), node); 427 428 CHECK_EQ(compileAndRun<int64_t>(proc, a, b), ((a ^ b) ^ b) ^ a); 429 } 430 431 void testBitXorTreeArgImm(int64_t a, int64_t b) 432 { 433 Procedure proc; 434 BasicBlock* root = proc.addBlock(); 435 Value* argA = root->appendNew<ArgumentRegValue>(proc, Origin(), GPRInfo::argumentGPR0); 436 Value* immB = root->appendNew<Const64Value>(proc, Origin(), b); 437 Value* node = root->appendNew<Value>(proc, BitXor, Origin(), argA, immB); 438 node = root->appendNew<Value>(proc, BitXor, Origin(), argA, node); 439 node = root->appendNew<Value>(proc, BitXor, Origin(), argA, node); 440 node = root->appendNew<Value>(proc, BitXor, Origin(), immB, node); 441 root->appendNew<Value>(proc, Return, Origin(), node); 442 443 CHECK_EQ(compileAndRun<int64_t>(proc, a), b ^ (a ^ (a ^ (a ^ b)))); 444 } 445 446 void testAddTreeArg32(int32_t a) 447 { 448 Procedure proc; 449 BasicBlock* root = proc.addBlock(); 450 Value* argA = root->appendNew<ArgumentRegValue>(proc, Origin(), GPRInfo::argumentGPR0); 451 argA = root->appendNew<Value>(proc, Trunc, Origin(), argA); 452 Value* node = argA; 453 int32_t expectedResult = a; 454 for (unsigned i = 0; i < 20; ++i) { 455 Value* otherNode; 456 if (!(i % 3)) { 457 otherNode = root->appendNew<Const32Value>(proc, Origin(), i); 458 expectedResult += i; 459 } else { 460 otherNode = argA; 461 expectedResult += a; 462 } 463 node = root->appendNew<Value>(proc, Add, Origin(), node, otherNode); 464 } 465 root->appendNew<Value>(proc, Return, Origin(), node); 466 467 CHECK_EQ(compileAndRun<int32_t>(proc, a), expectedResult); 468 } 469 470 void testMulTreeArg32(int32_t a) 471 { 472 // Fibonacci-like expression tree with multiplication instead of addition. 473 // Verifies that we don't explode on heavily factored graphs. 474 Procedure proc; 475 BasicBlock* root = proc.addBlock(); 476 Value* argA = root->appendNew<ArgumentRegValue>(proc, Origin(), GPRInfo::argumentGPR0); 477 argA = root->appendNew<Value>(proc, Trunc, Origin(), argA); 478 Value* nodeA = argA; 479 Value* nodeB = argA; 480 int32_t expectedA = a, expectedResult = a; 481 for (unsigned i = 0; i < 20; ++i) { 482 Value* newNodeB = root->appendNew<Value>(proc, Mul, Origin(), nodeA, nodeB); 483 nodeA = nodeB; 484 nodeB = newNodeB; 485 int32_t newExpectedResult = expectedA * expectedResult; 486 expectedA = expectedResult; 487 expectedResult = newExpectedResult; 488 } 489 root->appendNew<Value>(proc, Return, Origin(), nodeB); 490 491 CHECK_EQ(compileAndRun<int32_t>(proc, a), expectedResult); 492 } 493 494 void testBitAndTreeArg32(int32_t a) 495 { 496 Procedure proc; 497 BasicBlock* root = proc.addBlock(); 498 Value* argA = root->appendNew<ArgumentRegValue>(proc, Origin(), GPRInfo::argumentGPR0); 499 argA = root->appendNew<Value>(proc, Trunc, Origin(), argA); 500 Value* node = argA; 501 for (unsigned i = 0; i < 8; ++i) { 502 Value* constI = root->appendNew<Const32Value>(proc, Origin(), i | 42); 503 Value* newBitAnd = root->appendNew<Value>(proc, BitAnd, Origin(), argA, constI); 504 node = root->appendNew<Value>(proc, BitAnd, Origin(), node, newBitAnd); 505 } 506 root->appendNew<Value>(proc, Return, Origin(), node); 507 508 CHECK_EQ(compileAndRun<int32_t>(proc, a), a & 42); 509 } 510 511 void testBitOrTreeArg32(int32_t a) 512 { 513 Procedure proc; 514 BasicBlock* root = proc.addBlock(); 515 Value* argA = root->appendNew<ArgumentRegValue>(proc, Origin(), GPRInfo::argumentGPR0); 516 argA = root->appendNew<Value>(proc, Trunc, Origin(), argA); 517 Value* node = argA; 518 for (unsigned i = 0; i < 8; ++i) { 519 Value* constI = root->appendNew<Const32Value>(proc, Origin(), i); 520 Value* newBitAnd = root->appendNew<Value>(proc, BitOr, Origin(), argA, constI); 521 node = root->appendNew<Value>(proc, BitOr, Origin(), node, newBitAnd); 522 } 523 root->appendNew<Value>(proc, Return, Origin(), node); 524 525 CHECK_EQ(compileAndRun<int32_t>(proc, a), a | 7); 399 526 } 400 527 … … 17048 17175 RUN(testReturnConst64(-42)); 17049 17176 RUN(testReturnVoid()); 17177 17178 RUN_BINARY(testBitXorTreeArgs, int64Operands(), int64Operands()); 17179 RUN_BINARY(testBitXorTreeArgsEven, int64Operands(), int64Operands()); 17180 RUN_BINARY(testBitXorTreeArgImm, int64Operands(), int64Operands()); 17181 RUN_UNARY(testAddTreeArg32, int32Operands()); 17182 RUN_UNARY(testMulTreeArg32, int32Operands()); 17183 RUN_UNARY(testBitAndTreeArg32, int32Operands()); 17184 RUN_UNARY(testBitOrTreeArg32, int32Operands()); 17050 17185 17051 17186 RUN(testAddArg(111));
Note: See TracChangeset
for help on using the changeset viewer.