Changeset 256093 in webkit


Ignore:
Timestamp:
Feb 8, 2020 1:50:02 PM (4 years ago)
Author:
ysuzuki@apple.com
Message:

[WTF] Try using 75% load factor for HashTable
https://bugs.webkit.org/show_bug.cgi?id=207183

Reviewed by Mark Lam.

Source/WTF:

We know that hash-table is one of the most memory consuming part in WebKit.
By analyzing many production hash-table implementations[1], I found that many
of them are using 75% load-factor while our one is 50%.

This patch changes the load-factor from 50% to 75%. But we pick 75% only for
small tables which capacity is <= 1024 based on collected data by micro-benchmarking.
The collected data is telling that longer probe-length affects on performance if table
size gets larger.

[1]: LLVM DenseMap, Abseil's, rust's, and so on.

  • wtf/HashTable.h:

(WTF::HashTableCapacityForSize::shouldExpand):
(WTF::HashTableCapacityForSize::capacityForSize):
(WTF::HashTable::shouldExpand const):
(WTF::KeyTraits>::computeBestTableSize):

Tools:

  • TestWebKitAPI/Tests/WTF/HashSet.cpp:

(TestWebKitAPI::testInitialCapacity):

Location:
trunk
Files:
4 edited

Legend:

Unmodified
Added
Removed
  • trunk/Source/WTF/ChangeLog

    r256091 r256093  
     12020-02-08  Yusuke Suzuki  <ysuzuki@apple.com>
     2
     3        [WTF] Try using 75% load factor for HashTable
     4        https://bugs.webkit.org/show_bug.cgi?id=207183
     5
     6        Reviewed by Mark Lam.
     7
     8        We know that hash-table is one of the most memory consuming part in WebKit.
     9        By analyzing many production hash-table implementations[1], I found that many
     10        of them are using 75% load-factor while our one is 50%.
     11
     12        This patch changes the load-factor from 50% to 75%. But we pick 75% only for
     13        small tables which capacity is <= 1024 based on collected data by micro-benchmarking.
     14        The collected data is telling that longer probe-length affects on performance if table
     15        size gets larger.
     16
     17        [1]: LLVM DenseMap, Abseil's, rust's, and so on.
     18
     19        * wtf/HashTable.h:
     20        (WTF::HashTableCapacityForSize::shouldExpand):
     21        (WTF::HashTableCapacityForSize::capacityForSize):
     22        (WTF::HashTable::shouldExpand const):
     23        (WTF::KeyTraits>::computeBestTableSize):
     24
    1252020-02-08  Sam Weinig  <weinig@apple.com>
    226
  • trunk/Source/WTF/wtf/HashTable.h

    r256011 r256093  
    304304    };
    305305
     306    // HashTableCapacityForSize computes the upper power of two capacity to hold the size parameter.
     307    // This is done at compile time to initialize the HashTraits.
     308    template<unsigned size>
     309    struct HashTableCapacityForSize {
     310        // Load-factor for small table is 75%.
     311        static constexpr unsigned smallMaxLoadNumerator = 3;
     312        static constexpr unsigned smallMaxLoadDenominator = 4;
     313        // Load-factor for large table is 50%.
     314        static constexpr unsigned largeMaxLoadNumerator = 1;
     315        static constexpr unsigned largeMaxLoadDenominator = 2;
     316        static constexpr unsigned maxSmallTableCapacity = 1024;
     317        static constexpr unsigned minLoad = 6;
     318
     319        static constexpr bool shouldExpand(uint64_t keyAndDeleteCount, uint64_t tableSize)
     320        {
     321            if (tableSize <= maxSmallTableCapacity)
     322                return keyAndDeleteCount * smallMaxLoadDenominator >= tableSize * smallMaxLoadNumerator;
     323            return keyAndDeleteCount * largeMaxLoadDenominator >= tableSize * largeMaxLoadNumerator;
     324        }
     325
     326        static constexpr unsigned capacityForSize(uint64_t sizeArg)
     327        {
     328            if (!sizeArg)
     329                return 0;
     330            uint64_t capacity = roundUpToPowerOfTwo(sizeArg);
     331            if (shouldExpand(sizeArg, capacity))
     332                return capacity * 2;
     333            return capacity;
     334        }
     335
     336        static constexpr unsigned value = capacityForSize(size);
     337        static_assert(size > 0, "HashTableNonZeroMinimumCapacity");
     338        static_assert(!static_cast<unsigned>(value >> 31), "HashTableNoCapacityOverflow");
     339    };
     340
    306341    template<typename Key, typename Value, typename Extractor, typename HashFunctions, typename Traits, typename KeyTraits>
    307342    class HashTable {
     
    314349        typedef IdentityHashTranslator<ValueTraits, HashFunctions> IdentityTranslatorType;
    315350        typedef HashTableAddResult<iterator> AddResult;
     351
     352        using HashTableSizePolicy = HashTableCapacityForSize<1>;
    316353
    317354#if DUMP_HASHTABLE_STATS_PER_TABLE
     
    488525
    489526        static constexpr unsigned computeBestTableSize(unsigned keyCount);
    490         bool shouldExpand() const { return (keyCount() + deletedCount()) * maxLoad >= tableSize(); }
     527        bool shouldExpand() const { return HashTableSizePolicy::shouldExpand(keyCount() + deletedCount(), tableSize()); }
    491528        bool mustRehashInPlace() const { return keyCount() * minLoad < tableSize() * 2; }
    492529        bool shouldShrink() const { return keyCount() * minLoad < tableSize() && tableSize() > KeyTraits::minimumTableSize; }
     
    523560#endif
    524561
    525         static constexpr unsigned maxLoad = 2;
    526         static constexpr unsigned minLoad = 6;
     562        // Load-factor for small table is 75%.
     563        static constexpr unsigned smallMaxLoadNumerator = HashTableSizePolicy::smallMaxLoadNumerator;
     564        static constexpr unsigned smallMaxLoadDenominator = HashTableSizePolicy::smallMaxLoadDenominator;
     565        // Load-factor for large table is 50%.
     566        static constexpr unsigned largeMaxLoadNumerator = HashTableSizePolicy::largeMaxLoadNumerator;
     567        static constexpr unsigned largeMaxLoadDenominator = HashTableSizePolicy::largeMaxLoadDenominator;
     568        static constexpr unsigned maxSmallTableCapacity = HashTableSizePolicy::maxSmallTableCapacity;
     569        static constexpr unsigned minLoad = HashTableSizePolicy::minLoad;
    527570
    528571        static constexpr int tableSizeOffset = -1;
     
    555598        mutable std::unique_ptr<Stats> m_stats;
    556599#endif
    557     };
    558 
    559     // Set all the bits to one after the most significant bit: 00110101010 -> 00111111111.
    560     template<unsigned size> struct OneifyLowBits;
    561     template<>
    562     struct OneifyLowBits<0> {
    563         static constexpr unsigned value = 0;
    564     };
    565     template<unsigned number>
    566     struct OneifyLowBits {
    567         static constexpr unsigned value = number | OneifyLowBits<(number >> 1)>::value;
    568     };
    569     // Compute the first power of two integer that is an upper bound of the parameter 'number'.
    570     template<unsigned number>
    571     struct UpperPowerOfTwoBound {
    572         static constexpr unsigned value = (OneifyLowBits<number - 1>::value + 1) * 2;
    573     };
    574 
    575     // Because power of two numbers are the limit of maxLoad, their capacity is twice the
    576     // UpperPowerOfTwoBound, or 4 times their values.
    577     template<unsigned size, bool isPowerOfTwo> struct HashTableCapacityForSizeSplitter;
    578     template<unsigned size>
    579     struct HashTableCapacityForSizeSplitter<size, true> {
    580         static constexpr unsigned value = size * 4;
    581     };
    582     template<unsigned size>
    583     struct HashTableCapacityForSizeSplitter<size, false> {
    584         static constexpr unsigned value = UpperPowerOfTwoBound<size>::value;
    585     };
    586 
    587     // HashTableCapacityForSize computes the upper power of two capacity to hold the size parameter.
    588     // This is done at compile time to initialize the HashTraits.
    589     template<unsigned size>
    590     struct HashTableCapacityForSize {
    591         static constexpr unsigned value = HashTableCapacityForSizeSplitter<size, !(size & (size - 1))>::value;
    592         COMPILE_ASSERT(size > 0, HashTableNonZeroMinimumCapacity);
    593         COMPILE_ASSERT(!static_cast<unsigned>(value >> 31), HashTableNoCapacityOverflow);
    594         COMPILE_ASSERT(value > (2 * size), HashTableCapacityHoldsContentSize);
    595600    };
    596601
     
    12371242    constexpr unsigned HashTable<Key, Value, Extractor, HashFunctions, Traits, KeyTraits>::computeBestTableSize(unsigned keyCount)
    12381243    {
    1239         unsigned bestTableSize = WTF::roundUpToPowerOfTwo(keyCount) * 2;
    1240 
    1241         // With maxLoad at 1/2 and minLoad at 1/6, our average load is 2/6.
    1242         // If we are getting halfway between 2/6 and 1/2 (past 5/12), we double the size to avoid being too close to
    1243         // loadMax and bring the ratio close to 2/6. This give us a load in the bounds [3/12, 5/12).
    1244         bool aboveThreeQuarterLoad = keyCount * 12 >= bestTableSize * 5;
    1245         if (aboveThreeQuarterLoad)
     1244        unsigned bestTableSize = WTF::roundUpToPowerOfTwo(keyCount);
     1245
     1246        if (HashTableSizePolicy::shouldExpand(keyCount, bestTableSize))
    12461247            bestTableSize *= 2;
    12471248
     1249        auto aboveThresholdForEagerExpansion = [](double loadFactor, unsigned keyCount, unsigned tableSize)
     1250        {
     1251            // Here is the rationale behind this calculation, using 3/4 load-factor.
     1252            // With maxLoad at 3/4 and minLoad at 1/6, our average load is 11/24.
     1253            // If we are getting half-way between 11/24 and 3/4, we double the size
     1254            // to avoid being too close to loadMax and bring the ratio close to 11/24. This
     1255            // give us a load in the bounds [9/24, 15/24).
     1256            double maxLoadRatio = loadFactor;
     1257            double minLoadRatio = 1.0 / minLoad;
     1258            double averageLoadRatio = (maxLoadRatio + minLoadRatio) / 2;
     1259            double halfWayBetweenAverageAndMaxLoadRatio = (averageLoadRatio + maxLoadRatio) / 2;
     1260            return keyCount >= tableSize * halfWayBetweenAverageAndMaxLoadRatio;
     1261        };
     1262
     1263        if (bestTableSize <= maxSmallTableCapacity) {
     1264            constexpr double smallLoadFactor = static_cast<double>(smallMaxLoadNumerator) / smallMaxLoadDenominator;
     1265            if (aboveThresholdForEagerExpansion(smallLoadFactor, keyCount, bestTableSize))
     1266                bestTableSize *= 2;
     1267        } else {
     1268            constexpr double largeLoadFactor = static_cast<double>(largeMaxLoadNumerator) / largeMaxLoadDenominator;
     1269            if (aboveThresholdForEagerExpansion(largeLoadFactor, keyCount, bestTableSize))
     1270                bestTableSize *= 2;
     1271        }
    12481272        unsigned minimumTableSize = KeyTraits::minimumTableSize;
    12491273        return std::max(bestTableSize, minimumTableSize);
  • trunk/Tools/ChangeLog

    r256091 r256093  
     12020-02-08  Yusuke Suzuki  <ysuzuki@apple.com>
     2
     3        [WTF] Try using 75% load factor for HashTable
     4        https://bugs.webkit.org/show_bug.cgi?id=207183
     5
     6        Reviewed by Mark Lam.
     7
     8        * TestWebKitAPI/Tests/WTF/HashSet.cpp:
     9        (TestWebKitAPI::testInitialCapacity):
     10
    1112020-02-08  Sam Weinig  <weinig@apple.com>
    212
  • trunk/Tools/TestWebKitAPI/Tests/WTF/HashSet.cpp

    r256011 r256093  
    5656    }
    5757
    58     // Adding items up to less than half the capacity should not change the capacity.
    59     unsigned capacityLimit = initialCapacity / 2 - 1;
     58    // Adding items up to less than 3/4 of the capacity should not change the capacity.
     59    unsigned capacityLimit = initialCapacity * 3 / 4 - 1;
    6060    for (size_t i = size; i < capacityLimit; ++i) {
    6161        testSet.add(i);
Note: See TracChangeset for help on using the changeset viewer.