Changeset 159764 in webkit


Timestamp:
Nov 25, 2013 1:53:32 PM
Author:
akling@apple.com
Message:

Deduplicate shortish Text node strings during tree construction.
<https://webkit.org/b/124855>

Let HTMLConstructionSite keep a hash set of already seen strings over
its lifetime. Use this to deduplicate the strings inside Text nodes
for any string up to 64 characters of length.

This optimization already sort-of existed for whitespace-only Texts,
but those are laundered in the AtomicString table which we definitely
don't want to pollute with every single Text. It might be a good idea
to stop using the AtomicString table for all-whitespace Text too.

3.82 MB progression on HTML5-8266 locally.

Reviewed by Anders Carlsson.

Location:
trunk/Source/WebCore
Files:
3 edited

  • trunk/Source/WebCore/ChangeLog

    r159762 r159764
    @@ -1,2 +1,20 @@
    +2013-11-25  Andreas Kling  <akling@apple.com>
    +
    +        Deduplicate shortish Text node strings during tree construction.
    +        <https://webkit.org/b/124855>
    +
    +        Let HTMLConstructionSite keep a hash set of already seen strings over
    +        its lifetime. Use this to deduplicate the strings inside Text nodes
    +        for any string up to 64 characters of length.
    +
    +        This optimization already sort-of existed for whitespace-only Texts,
    +        but those are laundered in the AtomicString table which we definitely
    +        don't want to pollute with every single Text. It might be a good idea
    +        to stop using the AtomicString table for all-whitespace Text too.
    +
    +        3.82 MB progression on HTML5-8266 locally.
    +
    +        Reviewed by Anders Carlsson.
    +
     2013-11-25  Nick Diego Yamane  <nick.yamane@openbossa.org>
     
  • trunk/Source/WebCore/html/parser/HTMLConstructionSite.cpp

    r159618 r159764
    @@ -506,9 +506,9 @@
     
         while (currentPosition < characters.length()) {
    -        RefPtr<Text> textNode = Text::createWithLengthLimit(task.parent->document(), shouldUseAtomicString ? AtomicString(characters).string() : characters, currentPosition, lengthLimit);
    +        RefPtr<Text> textNode = Text::createWithLengthLimit(task.parent->document(), stringForTextNode(characters, shouldUseAtomicString), currentPosition, lengthLimit);
             // If we have a whole string of unbreakable characters the above could lead to an infinite loop. Exceeding the length limit is the lesser evil.
             if (!textNode->length()) {
                 String substring = characters.substring(currentPosition);
    -            textNode = Text::create(task.parent->document(), shouldUseAtomicString ? AtomicString(substring).string() : substring);
    +            textNode = Text::create(task.parent->document(), stringForTextNode(substring, shouldUseAtomicString));
             }
     
    @@ -667,3 +667,13 @@
     }
     
    -}
    +String HTMLConstructionSite::stringForTextNode(const String& string, bool shouldUseAtomicString)
    +{
    +    static const unsigned maximumLengthForDeduplication = 64;
    +    if (shouldUseAtomicString)
    +        return AtomicString(string).string();
    +    if (string.length() > maximumLengthForDeduplication)
    +        return string;
    +    return *m_stringsForDeduplication.add(string).iterator;
    +}
    +
    +}
  • trunk/Source/WebCore/html/parser/HTMLConstructionSite.h

    r156980 r159764
    @@ -171,4 +171,6 @@
         void dispatchDocumentElementAvailableIfNeeded();
     
    +    String stringForTextNode(const String&, bool shouldUseAtomicString);
    +
         Document* m_document;
     
    @@ -197,4 +199,6 @@
     
         bool m_inQuirksMode;
    +
    +    HashSet<String> m_stringsForDeduplication;
     };
     
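
The deduplication done by stringForTextNode can be illustrated outside WebKit. What follows is a minimal standalone sketch under stated assumptions, not the WebKit code: TextDeduplicator, SharedString, and uniqueStringCount are hypothetical names, and std::shared_ptr<const std::string> stands in for WebKit's reference-counted String. Unlike HashSet<String>, the sketch stores each short string twice (once as the map key, once as the shared buffer) to keep the code short. The AtomicString branch for whitespace-only text is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a reference-counted, immutable string buffer.
using SharedString = std::shared_ptr<const std::string>;

// Hypothetical helper: hands out shared buffers for short strings so that
// equal text contents reuse a single allocation.
class TextDeduplicator {
public:
    SharedString stringForTextNode(const std::string& text)
    {
        // Same 64-character cutoff as in the changeset.
        static const std::size_t maximumLengthForDeduplication = 64;

        // Long strings bypass the table and get a fresh buffer every time.
        if (text.length() > maximumLengthForDeduplication)
            return std::make_shared<std::string>(text);

        // Short strings: reuse the buffer if this content was seen before.
        auto it = m_strings.find(text);
        if (it != m_strings.end())
            return it->second;

        // First sighting: allocate once and remember it.
        SharedString shared = std::make_shared<std::string>(text);
        m_strings.emplace(text, shared);
        return shared;
    }

    // Number of distinct short strings recorded so far.
    std::size_t uniqueStringCount() const { return m_strings.size(); }

private:
    // Simplification: the key duplicates the buffer contents.
    std::unordered_map<std::string, SharedString> m_strings;
};
```

In this sketch, two calls with equal text of at most 64 characters return pointers to the same buffer, while longer text gets a new buffer on every call and never enters the table. The table lives as long as the TextDeduplicator object, mirroring how the hash set in the changeset lives as long as its HTMLConstructionSite.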