Changeset 209058 in webkit


Timestamp:
Nov 28, 2016 8:29:55 PM
Author:
Darin Adler
Message:

Streamline and speed up tokenizer and segmented string classes
https://bugs.webkit.org/show_bug.cgi?id=165003

Reviewed by Sam Weinig.

Source/JavaScriptCore:

  • runtime/JSONObject.cpp:

(JSC::Stringifier::appendStringifiedValue): Use viewWithUnderlyingString when calling
StringBuilder::appendQuotedJSONString, since it now takes a StringView and there is
no benefit in creating a String for that function if one doesn't already exist.

Source/WebCore:

Profiling Speedometer on my iMac showed the tokenizer as one of the
hottest functions. This patch streamlines the segmented string class,
removing various unused features, and also improves some other functions
seen on the Speedometer profile. On my iMac I measured a speedup of
about 3%. Changes include:

  • Removed m_pushedChar1, m_pushedChar2, and m_empty data members from the SegmentedString class and all the code that used to handle them.
  • Simplified the SegmentedString advance functions so they are small enough to get inlined in the HTML tokenizer.
  • Updated callers to call the simpler SegmentedString advance functions that don't handle newlines in as many cases as possible.
  • Cut down on allocations of SegmentedString and made code move the segmented string and the strings that are moved into it rather than copying them whenever possible.
  • Simplified segmented string functions, removing some branches, mostly from the non-fast paths.
  • Removed small unused functions and small functions used in only one or two places, made more functions private and renamed for clarity.
  • bindings/js/JSHTMLDocumentCustom.cpp:

(WebCore::documentWrite): Moved a little more of the common code in here
from the two functions below. Removed obsolete comment saying this was not
following the DOM specification because it is. Removed unneeded special
cases for 1 argument and no arguments. Take a reference instead of a pointer.
(WebCore::JSHTMLDocument::write): Updated for above.
(WebCore::JSHTMLDocument::writeln): Ditto.

  • css/parser/CSSTokenizer.cpp: Added now-needed include.
  • css/parser/CSSTokenizer.h: Removed unneeded include.
  • css/parser/CSSTokenizerInputStream.h: Added definition of kEndOfFileMarker
here; this is now separate from the use in the HTMLParser. In the long run,
unclear to me whether it is really needed in either.

  • dom/Document.cpp:

(WebCore::Document::prepareToWrite): Added. Helper function used by the three
different variants of write. Using this may prevent us from having to construct
a SegmentedString just to append one string after future refactoring.
(WebCore::Document::write): Updated to take an rvalue reference and move the
value through.
(WebCore::Document::writeln): Use a single write call instead of two.
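
The "take an rvalue reference and move the value through" pattern used by write and writeln can be sketched roughly as follows. This is a minimal illustration only: it uses std::string in place of SegmentedString, and SketchParser and SketchDocument are hypothetical names, not WebKit classes. Appending the newline to the string before the single write call is an assumption about how writeln could be reduced to one call; the patch itself may do this differently.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a document parser that takes ownership of input.
class SketchParser {
public:
    // Takes an rvalue reference so the caller's buffer is moved, not copied.
    void insert(std::string&& text) { m_pending.push_back(std::move(text)); }
    const std::vector<std::string>& pending() const { return m_pending; }

private:
    std::vector<std::string> m_pending;
};

// Hypothetical stand-in for a document that forwards writes to its parser.
class SketchDocument {
public:
    // The value is moved through each layer; no intermediate copy is made.
    void write(std::string&& text) { m_parser.insert(std::move(text)); }

    // Assumption: one write call with the newline appended, instead of two calls.
    void writeln(std::string&& text)
    {
        text += '\n';
        write(std::move(text));
    }

    const SketchParser& parser() const { return m_parser; }

private:
    SketchParser m_parser;
};
```

A caller would write `document.write(std::move(text))` when it no longer needs `text`, or pass a temporary directly.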

  • dom/Document.h: Changed write to take an rvalue reference to SegmentedString
rather than a const reference.

  • dom/DocumentParser.h: Changed insert to take an rvalue reference to
SegmentedString. In the future, should probably overload to take a single
string since that is the normal case.

  • dom/RawDataDocumentParser.h: Updated for change to DocumentParser.
  • html/FTPDirectoryDocument.cpp:

(WebCore::FTPDirectoryDocumentParser::append): Refactored a bit, just enough
so that we don't need an assignment operator for SegmentedString that can
copy a String.

  • html/parser/HTMLDocumentParser.cpp:

(WebCore::HTMLDocumentParser::insert): Updated to take an rvalue reference,
and move the value through.

  • html/parser/HTMLDocumentParser.h: Updated for the above.
  • html/parser/HTMLEntityParser.cpp:

(WebCore::HTMLEntityParser::consumeNamedEntity): Updated for name changes.
Changed the two calls to advance here to call advancePastNonNewline; no
change in behavior, but asserts what the code was assuming before, that the
character was not a newline.

  • html/parser/HTMLInputStream.h:

(WebCore::HTMLInputStream::appendToEnd): Updated to take an rvalue reference,
and move the value through.
(WebCore::HTMLInputStream::insertAtCurrentInsertionPoint): Ditto.
(WebCore::HTMLInputStream::markEndOfFile): Removed the code to construct a
SegmentedString, overkill since we can just append an individual string.
(WebCore::HTMLInputStream::splitInto): Rewrote the move idiom here to actually
use move, which will reduce reference count churn and other unneeded work.

  • html/parser/HTMLMetaCharsetParser.cpp:

(WebCore::HTMLMetaCharsetParser::checkForMetaCharset): Removed unneeded
construction of a SegmentedString, just to append a string.

  • html/parser/HTMLSourceTracker.cpp:

(WebCore::HTMLSourceTracker::HTMLSourceTracker): Moved to the class definition.
(WebCore::HTMLSourceTracker::source): Updated for function name change.

  • html/parser/HTMLSourceTracker.h: Updated for above.
  • html/parser/HTMLTokenizer.cpp: Added now-needed include.

(WebCore::HTMLTokenizer::emitAndResumeInDataState): Use advancePastNonNewline,
since this function is never called in response to a newline character.
(WebCore::HTMLTokenizer::commitToPartialEndTag): Ditto.
(WebCore::HTMLTokenizer::commitToCompleteEndTag): Ditto.
(WebCore::HTMLTokenizer::processToken): Use ADVANCE_PAST_NON_NEWLINE_TO macro
instead of ADVANCE_TO in cases where the character we are advancing past is
known not to be a newline, so we can use the more efficient advance function
that doesn't check for the newline character.
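
The difference between the general advance function and advancePastNonNewline can be sketched roughly as below. This is an illustration under stated assumptions, not the WebKit code: SketchStream is a hypothetical class over a single std::string, whereas the real SegmentedString holds multiple substrings and dispatches through function pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical stand-in for a character stream; not WebKit's SegmentedString.
class SketchStream {
public:
    explicit SketchStream(std::string text) : m_text(std::move(text)) { }

    bool isEmpty() const { return m_position >= m_text.size(); }
    char currentCharacter() const { return m_text[m_position]; }
    unsigned lineNumber() const { return m_line; }

    // General path: checks for a newline so the line count stays correct.
    void advance()
    {
        assert(!isEmpty());
        if (m_text[m_position] == '\n')
            ++m_line;
        ++m_position;
    }

    // Fast path: the caller promises the current character is not a newline,
    // so the runtime check is reduced to a debug-only assertion.
    void advancePastNonNewline()
    {
        assert(!isEmpty());
        assert(m_text[m_position] != '\n');
        ++m_position;
    }

private:
    std::string m_text;
    std::size_t m_position { 0 };
    unsigned m_line { 1 };
};
```

In a tokenizer state that has just matched a character such as '<' or '>', the caller already knows the character is not a newline, so the fast path is safe there.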

  • html/parser/InputStreamPreprocessor.h: Moved kEndOfFileMarker to
SegmentedString.h; not sure that's a good place for it either. In the long run,
unclear to me whether this is really needed.
(WebCore::InputStreamPreprocessor::peek): Added UNLIKELY for the empty check.
Added LIKELY for the not-special character check.
(WebCore::InputStreamPreprocessor::advance): Updated for the new name of the
advanceAndUpdateLineNumber function.
(WebCore::InputStreamPreprocessor::advancePastNonNewline): Added. More
efficient than advance for cases where the last character is known not to be
a newline character.
(WebCore::InputStreamPreprocessor::skipNextNewLine): Deleted. Was unused.
(WebCore::InputStreamPreprocessor::reset): Deleted. Was unused except in the
constructor; added initial values for the data members to replace.
(WebCore::InputStreamPreprocessor::processNextInputCharacter): Removed long
FIXME comment that didn't really need to be here. Reorganized a bit.
(WebCore::InputStreamPreprocessor::isAtEndOfFile): Renamed and made static.

  • html/track/BufferedLineReader.cpp:

(WebCore::BufferedLineReader::nextLine): Updated to not use the poorly named
scanCharacter function to advance past a newline. Also renamed from getLine
and changed to return Optional<String> instead of using a boolean to indicate
failure and an out argument.
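
The interface change from "boolean result plus out argument" to an optional return value can be sketched roughly as follows. This is a simplified illustration: SketchLineReader is a hypothetical class, it uses std::optional and std::string rather than WTF's Optional and String, and it only treats '\n' as a line terminator.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <utility>

// Hypothetical line reader; not WebKit's BufferedLineReader.
class SketchLineReader {
public:
    // Takes ownership of the incoming data rather than copying it.
    void append(std::string&& data) { m_buffer += std::move(data); }

    // Returns the next complete line, or std::nullopt if no full line is
    // buffered yet. The caller tests the result instead of checking a bool
    // and reading an out argument.
    std::optional<std::string> nextLine()
    {
        std::size_t newline = m_buffer.find('\n', m_position);
        if (newline == std::string::npos)
            return std::nullopt;
        std::string line = m_buffer.substr(m_position, newline - m_position);
        m_position = newline + 1;
        return line;
    }

private:
    std::string m_buffer;
    std::size_t m_position { 0 };
};
```

A caller can then loop with `while (auto line = reader.nextLine())` and use `*line` inside the body.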

  • html/track/BufferedLineReader.h:

(WebCore::BufferedLineReader::BufferedLineReader): Use the default, putting
initial values on each data member below.
(WebCore::BufferedLineReader::append): Updated to take an rvalue reference,
and move the value through.
(WebCore::BufferedLineReader::scanCharacter): Deleted. Was poorly named,
and easy to replace with two lines of code at its two call sites.
(WebCore::BufferedLineReader::reset): Rewrote to correctly clear all the
data members of the class, not just the segmented string.

  • html/track/InbandGenericTextTrack.cpp:

(WebCore::InbandGenericTextTrack::parseWebVTTFileHeader): Updated to take
an rvalue reference and move the value through.

  • html/track/InbandGenericTextTrack.h: Updated for the above.
  • html/track/InbandTextTrack.h: Updated since parseWebVTTFileHeader now
takes an rvalue reference.

  • html/track/WebVTTParser.cpp:

(WebCore::WebVTTParser::parseFileHeader): Updated to take an rvalue reference
and move the value through.
(WebCore::WebVTTParser::parseBytes): Updated to pass ownership of the string
in to the line reader append function.
(WebCore::WebVTTParser::parseCueData): Use auto and WTFMove for WebVTTCueData.
(WebCore::WebVTTParser::flush): More of the same.
(WebCore::WebVTTParser::parse): Changed to use nextLine instead of getLine.

  • html/track/WebVTTParser.h: Updated for the above.
  • html/track/WebVTTTokenizer.cpp:

(WebCore::advanceAndEmitToken): Use advanceAndUpdateLineNumber by its new
name, just advance. No change in behavior.
(WebCore::WebVTTTokenizer::WebVTTTokenizer): Pass a String, not a
SegmentedString, to add the end of file marker.

  • platform/graphics/InbandTextTrackPrivateClient.h: Updated since
parseWebVTTFileHeader takes an rvalue reference.

  • platform/text/SegmentedString.cpp:

(WebCore::SegmentedString::Substring::appendTo): Moved here from the header.
The only caller is SegmentedString::toString, inside this file.
(WebCore::SegmentedString::SegmentedString): Deleted the copy constructor.
No longer needed.
(WebCore::SegmentedString::operator=): Defined a move assignment operator
rather than an ordinary assignment operator, since that's what the call
sites really need.
(WebCore::SegmentedString::length): Simplified since we no longer need to
support pushed characters.
(WebCore::SegmentedString::setExcludeLineNumbers): Simplified, since we
can just iterate m_otherSubstrings without an extra check. Also changed to
write directly to the data member of Substring instead of using a function.
(WebCore::SegmentedString::updateAdvanceFunctionPointersForEmptyString):
Added. Used when we run out of characters.
(WebCore::SegmentedString::clear): Removed code to clear now-deleted members.
Updated for changes to other member names.
(WebCore::SegmentedString::appendSubstring): Renamed from just append to
avoid ambiguity with the public append function. Changed to take an rvalue
reference, and move in, and added code to set m_currentCharacter properly,
so the caller doesn't have to deal with that.
(WebCore::SegmentedString::close): Updated to use m_isClosed by its new name.
Also removed unneeded comment about assertion that fires when trying to close
an already closed string.
(WebCore::SegmentedString::append): Added overloads for rvalue references of
both entire SegmentedString objects and of String. Streamlined to just call
appendSubstring and append to the deque.
(WebCore::SegmentedString::pushBack): Tightened up since we don't allow empty
strings and changed to take just a string, not an entire segmented string.
(WebCore::SegmentedString::advanceSubstring): Moved logic into the
advancePastSingleCharacterSubstringWithoutUpdatingLineNumber function.
(WebCore::SegmentedString::toString): Simplified now that we don't need to
support pushed characters.
(WebCore::SegmentedString::advancePastNonNewlines): Deleted.
(WebCore::SegmentedString::advance8): Deleted.
(WebCore::SegmentedString::advanceWithoutUpdatingLineNumber16): Renamed from
advance16. Simplified now that there are no pushed characters. Also changed to
access data members of m_currentSubstring directly instead of calling a function.
(WebCore::SegmentedString::advanceAndUpdateLineNumber8): Deleted.
(WebCore::SegmentedString::advanceAndUpdateLineNumber16): Ditto.
(WebCore::SegmentedString::advancePastSingleCharacterSubstringWithoutUpdatingLineNumber):
Renamed from advanceSlowCase. Removed unneeded logic to handle pushed characters.
Moved code in here from advanceSubstring.
(WebCore::SegmentedString::advancePastSingleCharacterSubstring): Renamed from
advanceAndUpdateLineNumberSlowCase. Simplified by calling the function above.
(WebCore::SegmentedString::advanceEmpty): Broke assertion up into two.
(WebCore::SegmentedString::updateSlowCaseFunctionPointers): Updated for name changes.
(WebCore::SegmentedString::advancePastSlowCase): Changed name and meaning of
boolean argument. Rewrote to use the String class less; it's now used only when
we fail to match after the first character rather than being used for the actual
comparison with the literal.

  • platform/text/SegmentedString.h: Moved all non-trivial function bodies out of
the class definition to make things easier to read. Moved the SegmentedSubstring
class inside the SegmentedString class, making it a private struct named Substring.
Removed the m_ prefix from data members of the struct, removed many functions from
the struct and made its union be anonymous instead of naming it m_data. Removed
unneeded StringBuilder.h include.
(WebCore::SegmentedString::isEmpty): Changed to use the length of the substring
instead of a separate boolean. We never create an empty substring, nor leave one
in place as the current substring unless the entire segmented string is empty.
(WebCore::SegmentedString::advancePast): Updated to use the new member function
template instead of a non-template member function. The new member function is
entirely rewritten and does the matching directly rather than allocating a string
just to do prefix matching.
(WebCore::SegmentedString::advancePastLettersIgnoringASCIICase): Renamed to make
it clear that the literal must be all non-letters or lowercase letters as with
the other "letters ignoring ASCII case" functions. The three call sites all fit
the bill. Implement by calling the new function template.
(WebCore::SegmentedString::currentCharacter): Renamed from currentChar.
(WebCore::SegmentedString::Substring::Substring): Use an rvalue reference and
move the string in.
(WebCore::SegmentedString::Substring::currentCharacter): Simplified since this
is never used on an empty substring.
(WebCore::SegmentedString::Substring::incrementAndGetCurrentCharacter): Ditto.
(WebCore::SegmentedString::SegmentedString): Overload to take an rvalue reference.
Simplified since there are now fewer data members.
(WebCore::SegmentedString::advanceWithoutUpdatingLineNumber): Renamed from
advance, since this is only safe to use if there is some reason it is OK to skip
updating the line number.
(WebCore::SegmentedString::advance): Renamed from advanceAndUpdateLineNumber,
since doing that is the normal desired behavior and not worth mentioning in the
public function name.
(WebCore::SegmentedString::advancePastNewline): Renamed from
advancePastNewlineAndUpdateLineNumber.
(WebCore::SegmentedString::numberOfCharactersConsumed): Greatly simplified since
pushed characters are no longer supported.
(WebCore::SegmentedString::characterMismatch): Added. Used by advancePast.
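
The idea behind the rewritten advancePast, matching a literal directly against the stored characters instead of first building a string, can be sketched roughly as below. This is an assumption-laden illustration: sketchStartsWith is a hypothetical free function over a std::deque of std::string, and it omits the consuming step and the "not enough characters" result that a real tokenizer needs.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>

// Hypothetical helper: returns true if the characters stored across
// `segments`, read in order, begin with `literal`. No concatenated copy
// of the segments is ever allocated.
inline bool sketchStartsWith(const std::deque<std::string>& segments, const std::string& literal)
{
    std::size_t matched = 0;
    for (const std::string& segment : segments) {
        for (char c : segment) {
            if (matched == literal.size())
                return true; // whole literal already matched
            if (c != literal[matched])
                return false; // first mismatch ends the comparison
            ++matched;
        }
    }
    // Ran out of input: a match only if the literal was fully consumed.
    return matched == literal.size();
}
```

Because the comparison stops at the first mismatching character, the common failing case touches only a character or two.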

  • xml/parser/CharacterReferenceParserInlines.h:

(WebCore::unconsumeCharacters): Use toString rather than toStringPreserveCapacity
because the SegmentedString is going to take ownership of the string.
(WebCore::consumeCharacterReference): Updated to use the pushBack that takes just
a String, not a SegmentedString. Also use advancePastNonNewline.

  • xml/parser/MarkupTokenizerInlines.h: Added ADVANCE_PAST_NON_NEWLINE_TO.
  • xml/parser/XMLDocumentParser.cpp:

(WebCore::XMLDocumentParser::insert): Updated since this takes an rvalue reference.
(WebCore::XMLDocumentParser::append): Removed unnecessary code to create a
SegmentedString.

  • xml/parser/XMLDocumentParser.h: Updated for above. Also fixed indentation
and initialized most data members.

  • xml/parser/XMLDocumentParserLibxml2.cpp:

(WebCore::XMLDocumentParser::XMLDocumentParser): Moved most data member
initialization into the class definition.
(WebCore::XMLDocumentParser::resumeParsing): Removed code that copied a
segmented string, but converted the whole thing into a string before using it.
Now we convert to a string right away.

Source/WTF:

  • wtf/text/StringBuilder.cpp:

(WTF::StringBuilder::bufferCharacters<LChar>): Moved this here from
the header since it is only used inside the class. Also renamed from
getBufferCharacters.
(WTF::StringBuilder::bufferCharacters<UChar>): Ditto.
(WTF::StringBuilder::appendUninitializedUpconvert): Added. Helper
for the upconvert case in the 16-bit overload of StringBuilder::append.
(WTF::StringBuilder::append): Changed to use appendUninitializedUpconvert.
(WTF::quotedJSONStringLength): Added. Used in new appendQuotedJSONString
implementation below that now correctly determines the size of what will
be appended by walking through the string twice.
(WTF::appendQuotedJSONStringInternal): Moved the code that writes the
quote marks in here. Also made a few coding style tweaks.
(WTF::StringBuilder::appendQuotedJSONString): Rewrote to use a much
simpler algorithm that grows the string the same way the append function
does. The old code would use reserveCapacity in a way that was costly when
doing a lot of appends on the same string, and also allocated far too much
memory for normal use cases where characters did not need to be turned
into escape sequences.
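
The two-pass approach described for appendQuotedJSONString (measure first, then grow once and write) can be sketched roughly as follows. This is a simplified illustration using std::string for 8-bit input only; sketchQuotedLength and sketchAppendQuoted are hypothetical names, and the overflow checking and 16-bit handling of the real StringBuilder code are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// First pass: count how many characters the quoted form will need.
inline std::size_t sketchQuotedLength(const std::string& input)
{
    std::size_t length = 2; // opening and closing quote marks
    for (unsigned char c : input) {
        if (c > 0x1F)
            length += (c == '"' || c == '\\') ? 2 : 1;
        else {
            switch (c) {
            case '\t': case '\r': case '\n': case '\f': case '\b':
                length += 2; // two-character escape such as \n
                break;
            default:
                length += 6; // \u00XX escape
            }
        }
    }
    return length;
}

// Second pass: grow the output exactly once, then write into it.
inline void sketchAppendQuoted(std::string& output, const std::string& input)
{
    static const char hexDigits[] = "0123456789abcdef";
    std::size_t start = output.size();
    output.resize(start + sketchQuotedLength(input));
    char* out = &output[start];
    *out++ = '"';
    for (unsigned char c : input) {
        if (c > 0x1F) {
            if (c == '"' || c == '\\')
                *out++ = '\\';
            *out++ = static_cast<char>(c);
            continue;
        }
        *out++ = '\\';
        switch (c) {
        case '\t': *out++ = 't'; break;
        case '\r': *out++ = 'r'; break;
        case '\n': *out++ = 'n'; break;
        case '\f': *out++ = 'f'; break;
        case '\b': *out++ = 'b'; break;
        default:
            *out++ = 'u';
            *out++ = '0';
            *out++ = '0';
            *out++ = hexDigits[c >> 4];
            *out++ = hexDigits[c & 0xF];
        }
    }
    *out++ = '"';
    // The write pass must fill exactly what the length pass predicted.
    assert(out == &output[0] + output.size());
}
```

Walking the input twice costs an extra read, but it avoids reserving six output characters per input character when most characters need no escaping.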

  • wtf/text/StringBuilder.h:

(WTF::StringBuilder::append): Tweaked style a bit, fixed a bug where the
m_is8Bit field wasn't set correctly in one case, optimized the function that
adds substrings for the case where this is the first append and the substring
happens to cover the entire string. Also clarified the assertions and removed
an unneeded check from that substring overload.
(WTF::equal): Reimplemented, using equalCommon.

Location:
trunk/Source
Files:
40 edited

  • trunk/Source/JavaScriptCore/ChangeLog

    r209043 r209058  
@@ -1,2 +1,14 @@
+2016-11-28  Darin Adler  <darin@apple.com>
+
+        Streamline and speed up tokenizer and segmented string classes
+        https://bugs.webkit.org/show_bug.cgi?id=165003
+
+        Reviewed by Sam Weinig.
+
+        * runtime/JSONObject.cpp:
+        (JSC::Stringifier::appendStringifiedValue): Use viewWithUnderlyingString when calling
+        StringBuilder::appendQuotedJSONString, since it now takes a StringView and there is
+        no benefit in creating a String for that function if one doesn't already exist.
+
 2016-11-21  Mark Lam  <mark.lam@apple.com>
 
  • trunk/Source/JavaScriptCore/runtime/JSONObject.cpp

    r208966 r209058  
@@ -1,4 +1,4 @@
 /*
- * Copyright (C) 2009, 2016 Apple Inc. All rights reserved.
+ * Copyright (C) 2009-2016 Apple Inc. All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
     
@@ -355,5 +355,5 @@
 
     if (value.isString()) {
-        builder.appendQuotedJSONString(asString(value)->value(m_exec));
+        builder.appendQuotedJSONString(asString(value)->viewWithUnderlyingString(*m_exec).view);
         return StringifySucceeded;
     }
  • trunk/Source/WTF/ChangeLog

    r208985 r209058  
@@ -1,2 +1,37 @@
+2016-11-28  Darin Adler  <darin@apple.com>
+
+        Streamline and speed up tokenizer and segmented string classes
+        https://bugs.webkit.org/show_bug.cgi?id=165003
+
+        Reviewed by Sam Weinig.
+
+        * wtf/text/StringBuilder.cpp:
+        (WTF::StringBuilder::bufferCharacters<LChar>): Moved this here from
+        the header since it is only used inside the class. Also renamed from
+        getBufferCharacters.
+        (WTF::StringBuilder::bufferCharacters<UChar>): Ditto.
+        (WTF::StringBuilder::appendUninitializedUpconvert): Added. Helper
+        for the upconvert case in the 16-bit overload of StrinBuilder::append.
+        (WTF::StringBuilder::append): Changed to use appendUninitializedUpconvert.
+        (WTF::quotedJSONStringLength): Added. Used in new appendQuotedJSONString
+        implementation below that now correctly determines the size of what will
+        be appended by walking thorugh the string twice.
+        (WTF::appendQuotedJSONStringInternal): Moved the code that writes the
+        quote marks in here. Also made a few coding style tweaks.
+        (WTF::StringBuilder::appendQuotedJSONString): Rewrote to use a much
+        simpler algorithm that grows the string the same way the append function
+        does. The old code would use reserveCapacity in a way that was costly when
+        doing a lot of appends on the same string, and also allocated far too much
+        memory for normal use cases where characters did not need to be turned
+        into escape sequences.
+
+        * wtf/text/StringBuilder.h:
+        (WTF::StringBuilder::append): Tweaked style a bit, fixed a bug where the
+        m_is8Bit field wasn't set correctly in one case, optimized the function that
+        adds substrings for the case where this is the first append and the substring
+        happens to cover the entire string. Also clarified the assertions and removed
+        an unneeded check from that substring overload.
+        (WTF::equal): Reimplemented, using equalCommon.
+
 2016-11-26  Yusuke Suzuki  <utatane.tea@gmail.com>
 
  • trunk/Source/WTF/wtf/text/StringBuilder.cpp

    r205847 r209058  
@@ -1,4 +1,4 @@
 /*
- * Copyright (C) 2010, 2013, 2016 Apple Inc. All rights reserved.
+ * Copyright (C) 2010-2016 Apple Inc. All rights reserved.
  * Copyright (C) 2012 Google Inc. All rights reserved.
  *
     
@@ -30,5 +30,4 @@
 #include "IntegerToStringConversion.h"
 #include "MathExtras.h"
-#include "WTFString.h"
 #include <wtf/dtoa.h>
 
     
@@ -39,4 +38,16 @@
     static const unsigned minimumCapacity = 16;
     return std::max(requiredLength, std::max(minimumCapacity, capacity * 2));
+}
+
+template<> ALWAYS_INLINE LChar* StringBuilder::bufferCharacters<LChar>()
+{
+    ASSERT(m_is8Bit);
+    return m_bufferCharacters8;
+}
+
+template<> ALWAYS_INLINE UChar* StringBuilder::bufferCharacters<UChar>()
+{
+    ASSERT(!m_is8Bit);
+    return m_bufferCharacters16;
 }
 
     
@@ -98,4 +109,5 @@
 {
     ASSERT(m_is8Bit);
+
     // Copy the existing data into a new buffer, set result to point to the end of the existing data.
     auto buffer = StringImpl::createUninitialized(requiredLength, m_bufferCharacters8);
     
@@ -113,4 +125,5 @@
 {
     ASSERT(!m_is8Bit);
+
     // Copy the existing data into a new buffer, set result to point to the end of the existing data.
     auto buffer = StringImpl::createUninitialized(requiredLength, m_bufferCharacters16);
     
@@ -125,8 +138,9 @@
 // Allocate a new 16 bit buffer, copying in currentCharacters (which is 8 bit and may come
 // from either m_string or m_buffer, neither will be reassigned until the copy has completed).
-void StringBuilder::allocateBufferUpConvert(const LChar* currentCharacters, unsigned requiredLength)
+void StringBuilder::allocateBufferUpconvert(const LChar* currentCharacters, unsigned requiredLength)
 {
     ASSERT(m_is8Bit);
     ASSERT(requiredLength >= m_length);
+
     // Copy the existing data into a new buffer, set result to point to the end of the existing data.
     auto buffer = StringImpl::createUninitialized(requiredLength, m_bufferCharacters16);
     
@@ -142,6 +156,5 @@
 }
 
-template <>
-void StringBuilder::reallocateBuffer<LChar>(unsigned requiredLength)
+template<> void StringBuilder::reallocateBuffer<LChar>(unsigned requiredLength)
 {
     // If the buffer has only one ref (by this StringBuilder), reallocate it,
     
@@ -159,6 +172,5 @@
 }
 
-template <>
-void StringBuilder::reallocateBuffer<UChar>(unsigned requiredLength)
+template<> void StringBuilder::reallocateBuffer<UChar>(unsigned requiredLength)
 {
     // If the buffer has only one ref (by this StringBuilder), reallocate it,
     
@@ -167,5 +179,5 @@
 
     if (m_buffer->is8Bit())
-        allocateBufferUpConvert(m_buffer->characters8(), requiredLength);
+        allocateBufferUpconvert(m_buffer->characters8(), requiredLength);
     else if (m_buffer->hasOneRef())
         m_buffer = StringImpl::reallocate(m_buffer.releaseNonNull(), requiredLength, m_bufferCharacters16);
     
@@ -189,5 +201,5 @@
         if (newCapacity > m_length) {
             if (!m_length) {
-                LChar* nullPlaceholder = 0;
+                LChar* nullPlaceholder = nullptr;
                 allocateBuffer(nullPlaceholder, newCapacity);
             } else if (m_string.is8Bit())
     
@@ -202,6 +214,5 @@
 // Make 'length' additional capacity be available in m_buffer, update m_string & m_length,
 // return a pointer to the newly allocated storage.
-template <typename CharType>
-ALWAYS_INLINE CharType* StringBuilder::appendUninitialized(unsigned length)
+template<typename CharacterType> ALWAYS_INLINE CharacterType* StringBuilder::appendUninitialized(unsigned length)
 {
     ASSERT(length);
     
@@ -218,14 +229,13 @@
         m_string = String();
         m_length = requiredLength;
-        return getBufferCharacters<CharType>() + currentLength;
-    }
-    
-    return appendUninitializedSlow<CharType>(requiredLength);
+        return bufferCharacters<CharacterType>() + currentLength;
+    }
+
+    return appendUninitializedSlow<CharacterType>(requiredLength);
 }
 
 // Make 'length' additional capacity be available in m_buffer, update m_string & m_length,
 // return a pointer to the newly allocated storage.
-template <typename CharType>
-CharType* StringBuilder::appendUninitializedSlow(unsigned requiredLength)
+template<typename CharacterType> CharacterType* StringBuilder::appendUninitializedSlow(unsigned requiredLength)
 {
     ASSERT(requiredLength);
     
@@ -234,12 +244,11 @@
         // If the buffer is valid it must be at least as long as the current builder contents!
         ASSERT(m_buffer->length() >= m_length);
-        
-        reallocateBuffer<CharType>(expandedCapacity(capacity(), requiredLength));
+        reallocateBuffer<CharacterType>(expandedCapacity(capacity(), requiredLength));
     } else {
         ASSERT(m_string.length() == m_length);
-        allocateBuffer(m_length ? m_string.characters<CharType>() : 0, expandedCapacity(capacity(), requiredLength));
-    }
-    
-    CharType* result = getBufferCharacters<CharType>() + m_length;
+        allocateBuffer(m_length ? m_string.characters<CharacterType>() : nullptr, expandedCapacity(capacity(), requiredLength));
+    }
+   
+    auto* result = bufferCharacters<CharacterType>() + m_length;
     m_length = requiredLength;
     ASSERT(m_buffer->length() >= m_length);
     
@@ -247,4 +256,24 @@
 }
 
+inline UChar* StringBuilder::appendUninitializedUpconvert(unsigned length)
+{
+    unsigned requiredLength = length + m_length;
+    if (requiredLength < length)
+        CRASH();
+
+    if (m_buffer) {
+        // If the buffer is valid it must be at least as long as the current builder contents!
+        ASSERT(m_buffer->length() >= m_length);
+        allocateBufferUpconvert(m_buffer->characters8(), expandedCapacity(capacity(), requiredLength));
+    } else {
+        ASSERT(m_string.length() == m_length);
+        allocateBufferUpconvert(m_string.isNull() ? nullptr : m_string.characters8(), expandedCapacity(capacity(), requiredLength));
+    }
+
+    auto* result = m_bufferCharacters16 + m_length;
+    m_length = requiredLength;
+    return result;
+}
+
 void StringBuilder::append(const UChar* characters, unsigned length)
 {
     
@@ -255,5 +284,5 @@
 
     if (m_is8Bit) {
-        if (length == 1 && !(*characters & ~0xff)) {
+        if (length == 1 && !(*characters & ~0xFF)) {
             // Append as 8 bit character
             LChar lChar = static_cast<LChar>(*characters);
     
@@ -261,24 +290,8 @@
             return;
         }
-
-        // Calculate the new size of the builder after appending.
-        unsigned requiredLength = length + m_length;
-        if (requiredLength < length)
-            CRASH();
-        
-        if (m_buffer) {
-            // If the buffer is valid it must be at least as long as the current builder contents!
-            ASSERT(m_buffer->length() >= m_length);
-            
-            allocateBufferUpConvert(m_buffer->characters8(), expandedCapacity(capacity(), requiredLength));
-        } else {
-            ASSERT(m_string.length() == m_length);
-            allocateBufferUpConvert(m_string.isNull() ? 0 : m_string.characters8(), expandedCapacity(capacity(), requiredLength));
-        }
-
-        memcpy(m_bufferCharacters16 + m_length, characters, static_cast<size_t>(length) * sizeof(UChar));
-        m_length = requiredLength;
+        memcpy(appendUninitializedUpconvert(length), characters, static_cast<size_t>(length) * sizeof(UChar));
     } else
         memcpy(appendUninitialized<UChar>(length), characters, static_cast<size_t>(length) * sizeof(UChar));
+
     ASSERT(m_buffer->length() >= m_length);
 }
     
@@ -291,17 +304,20 @@
 
     if (m_is8Bit) {
-        LChar* dest = appendUninitialized<LChar>(length);
+        auto* destination = appendUninitialized<LChar>(length);
+        // FIXME: How did we determine a threshold of 8 here was the right one?
+        // Also, this kind of optimization could be useful anywhere else we have a
+        // performance-sensitive code path that calls memcpy.
         if (length > 8)
-            memcpy(dest, characters, static_cast<size_t>(length) * sizeof(LChar));
+            memcpy(destination, characters, length);
         else {
             const LChar* end = characters + length;
             while (characters < end)
-                *(dest++) = *(characters++);
+                *destination++ = *characters++;
         }
     } else {
-        UChar* dest = appendUninitialized<UChar>(length);
+        auto* destination = appendUninitialized<UChar>(length);
         const LChar* end = characters + length;
         while (characters < end)
-            *(dest++) = *(characters++);
+            *destination++ = *characters++;
     }
 }
     
@@ -386,15 +402,56 @@
 }
 
-template <typename OutputCharacterType, typename InputCharacterType>
-static void appendQuotedJSONStringInternal(OutputCharacterType*& output, const InputCharacterType* input, unsigned length)
-{
-    for (const InputCharacterType* end = input + length; input != end; ++input) {
-        if (LIKELY(*input > 0x1F)) {
-            if (*input == '"' || *input == '\\')
+template<typename LengthType, typename CharacterType> static LengthType quotedJSONStringLength(const CharacterType* input, unsigned length)
+{
+    LengthType quotedLength = 2;
+    for (unsigned i = 0; i < length; ++i) {
+        auto character = input[i];
+        if (LIKELY(character > 0x1F)) {
+            switch (character) {
+            case '"':
+            case '\\':
+                quotedLength += 2;
+                break;
+            default:
+                ++quotedLength;
+                break;
+            }
+        } else {
+            switch (character) {
+            case '\t':
+            case '\r':
+            case '\n':
+            case '\f':
+            case '\b':
+                quotedLength += 2;
+                break;
+            default:
+                quotedLength += 6;
+            }
+        }
+    }
+    return quotedLength;
+}
+
+template<typename CharacterType> static inline unsigned quotedJSONStringLength(const CharacterType* input, unsigned length)
+{
+    constexpr auto maxSafeLength = (std::numeric_limits<unsigned>::max() - 2) / 6;
+    if (length <= maxSafeLength)
+        return quotedJSONStringLength<unsigned>(input, length);
+    return quotedJSONStringLength<Checked<unsigned>>(input, length).unsafeGet();
+}
+
+template<typename OutputCharacterType, typename InputCharacterType> static inline void appendQuotedJSONStringInternal(OutputCharacterType* output, const InputCharacterType* input, unsigned length)
+{
+    *output++ = '"';
+    for (unsigned i = 0; i < length; ++i) {
+        auto character = input[i];
+        if (LIKELY(character > 0x1F)) {
+            if (UNLIKELY(character == '"' || character == '\\'))
                 *output++ = '\\';
-            *output++ = *input;
+            *output++ = character;
             continue;
         }
-        switch (*input) {
+        switch (character) {
         case '\t':
             *output++ = '\\';
     
    418475            break;
    419476        default:
    420             ASSERT((*input & 0xFF00) == 0);
    421             static const char hexDigits[] = "0123456789abcdef";
     477            ASSERT(!(character & ~0xFF));
    422478            *output++ = '\\';
    423479            *output++ = 'u';
    424480            *output++ = '0';
    425481            *output++ = '0';
    426             *output++ = static_cast<LChar>(hexDigits[(*input >> 4) & 0xF]);
    427             *output++ = static_cast<LChar>(hexDigits[*input & 0xF]);
     482            *output++ = upperNibbleToLowercaseASCIIHexDigit(character);
     483            *output++ = lowerNibbleToLowercaseASCIIHexDigit(character);
    428484            break;
    429485        }
    430486    }
    431 }
    432 
    433 void StringBuilder::appendQuotedJSONString(const String& string)
    434 {
    435     // Make sure we have enough buffer space to append this string without having
    436     // to worry about reallocating in the middle.
    437     // The 2 is for the '"' quotes on each end.
    438     // The 6 is for characters that need to be \uNNNN encoded.
    439     Checked<unsigned> stringLength = string.length();
    440     Checked<unsigned> maximumCapacityRequired = length();
    441     maximumCapacityRequired += 2 + stringLength * 6;
    442     unsigned allocationSize = maximumCapacityRequired.unsafeGet();
    443     // This max() is here to allow us to allocate sizes between the range [2^31, 2^32 - 2] because roundUpToPowerOfTwo(1<<31 + some int smaller than 1<<31) == 0.
    444     allocationSize = std::max(allocationSize, roundUpToPowerOfTwo(allocationSize));
    445 
    446     if (is8Bit() && !string.is8Bit())
    447         allocateBufferUpConvert(m_bufferCharacters8, allocationSize);
    448     else
    449         reserveCapacity(allocationSize);
    450     ASSERT(m_buffer->length() >= allocationSize);
    451 
    452     if (is8Bit()) {
    453         ASSERT(string.is8Bit());
    454         LChar* output = m_bufferCharacters8 + m_length;
    455         *output++ = '"';
    456         appendQuotedJSONStringInternal(output, string.characters8(), string.length());
    457         *output++ = '"';
    458         m_length = output - m_bufferCharacters8;
     487    *output = '"';
     488}
     489
     490void StringBuilder::appendQuotedJSONString(StringView string)
     491{
     492    unsigned length = string.length();
     493    if (string.is8Bit()) {
     494        auto* characters = string.characters8();
     495        if (m_is8Bit)
     496            appendQuotedJSONStringInternal(appendUninitialized<LChar>(quotedJSONStringLength(characters, length)), characters, length);
     497        else
     498            appendQuotedJSONStringInternal(appendUninitialized<UChar>(quotedJSONStringLength(characters, length)), characters, length);
    459499    } else {
    460         UChar* output = m_bufferCharacters16 + m_length;
    461         *output++ = '"';
    462         if (string.is8Bit())
    463             appendQuotedJSONStringInternal(output, string.characters8(), string.length());
     500        auto* characters = string.characters16();
     501        if (m_is8Bit)
     502            appendQuotedJSONStringInternal(appendUninitializedUpconvert(quotedJSONStringLength(characters, length)), characters, length);
    464503        else
    465             appendQuotedJSONStringInternal(output, string.characters16(), string.length());
    466         *output++ = '"';
    467         m_length = output - m_bufferCharacters16;
    468     }
    469     ASSERT(m_buffer->length() >= m_length);
     504            appendQuotedJSONStringInternal(appendUninitialized<UChar>(quotedJSONStringLength(characters, length)), characters, length);
     505    }
    470506}
    471507
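The StringBuilder.cpp change above replaces the old "reserve 2 + 6 * length, then write" approach with two passes: quotedJSONStringLength computes the exact quoted size, and appendQuotedJSONStringInternal then writes into a buffer of exactly that size. The following is a minimal standalone sketch of that idea, not the WTF code: it uses std::string instead of StringBuilder, and the names quotedJSONLength and quoteJSON are hypothetical. It also omits the overflow guard that the patch adds through Checked<unsigned>.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not a WTF function): first pass, computes the exact
// number of characters needed to write `input` as a quoted JSON string.
static std::size_t quotedJSONLength(const std::string& input)
{
    std::size_t length = 2; // opening and closing quote
    for (unsigned char character : input) {
        if (character > 0x1F) {
            // '"' and '\\' need a backslash; everything else is copied as-is.
            length += (character == '"' || character == '\\') ? 2 : 1;
            continue;
        }
        switch (character) {
        case '\t': case '\r': case '\n': case '\f': case '\b':
            length += 2; // two-character escapes such as \n
            break;
        default:
            length += 6; // \u00XX
        }
    }
    return length;
}

// Hypothetical helper: second pass, writes into a buffer sized by the first
// pass, so no reallocation or capacity check is needed while writing.
static std::string quoteJSON(const std::string& input)
{
    static const char hexDigits[] = "0123456789abcdef";
    std::string output(quotedJSONLength(input), '\0');
    char* out = &output[0];
    *out++ = '"';
    for (unsigned char character : input) {
        if (character > 0x1F) {
            if (character == '"' || character == '\\')
                *out++ = '\\';
            *out++ = static_cast<char>(character);
            continue;
        }
        *out++ = '\\';
        switch (character) {
        case '\t': *out++ = 't'; break;
        case '\r': *out++ = 'r'; break;
        case '\n': *out++ = 'n'; break;
        case '\f': *out++ = 'f'; break;
        case '\b': *out++ = 'b'; break;
        default:
            *out++ = 'u';
            *out++ = '0';
            *out++ = '0';
            *out++ = hexDigits[(character >> 4) & 0xF];
            *out++ = hexDigits[character & 0xF];
        }
    }
    *out++ = '"';
    // Both passes must agree on the size, or the buffer was mis-sized.
    assert(out == &output[0] + output.size());
    return output;
}
```

For example, quoteJSON applied to a string holding the single control character 0x01 yields an 8-character result: the two quotes plus the six-character \u0001 escape.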
  • trunk/Source/WTF/wtf/text/StringBuilder.h

    r205847 r209058  
    11/*
    2  * Copyright (C) 2009-2010, 2012-2013, 2016 Apple Inc. All rights reserved.
     2 * Copyright (C) 2009-2016 Apple Inc. All rights reserved.
    33 * Copyright (C) 2012 Google Inc. All rights reserved.
    44 *
     
    2525 */
    2626
    27 #ifndef StringBuilder_h
    28 #define StringBuilder_h
    29 
    30 #include <wtf/text/AtomicString.h>
     27#pragma once
     28
    3129#include <wtf/text/StringView.h>
    32 #include <wtf/text/WTFString.h>
    3330
    3431namespace WTF {
    3532
    3633class StringBuilder {
    37     // Disallow copying since it's expensive and we don't want code to do it by accident.
     34    // Disallow copying since it's expensive and we don't want anyone to do it by accident.
    3835    WTF_MAKE_NONCOPYABLE(StringBuilder);
    3936
    4037public:
    41     StringBuilder()
    42         : m_length(0)
    43         , m_is8Bit(true)
    44         , m_bufferCharacters8(0)
    45     {
    46     }
     38    StringBuilder() = default;
    4739
    4840    WTF_EXPORT_PRIVATE void append(const UChar*, unsigned);
     
    5143    ALWAYS_INLINE void append(const char* characters, unsigned length) { append(reinterpret_cast<const LChar*>(characters), length); }
    5244
    53     void append(const AtomicString& atomicString)
    54     {
    55         append(atomicString.string());
    56     }
     45    void append(const AtomicString& atomicString) { append(atomicString.string()); }
    5746
    5847    void append(const String& string)
    5948    {
    60         if (!string.length())
    61             return;
    62 
    63         // If we're appending to an empty string, and there is not a buffer (reserveCapacity has not been called)
    64         // then just retain the string.
     49        unsigned length = string.length();
     50        if (!length)
     51            return;
     52
     53        // If we're appending to an empty string, and there is not a buffer
     54        // (reserveCapacity has not been called) then just retain the string.
    6555        if (!m_length && !m_buffer) {
    6656            m_string = string;
    67             m_length = string.length();
    68             m_is8Bit = m_string.is8Bit();
     57            m_length = length;
     58            m_is8Bit = string.is8Bit();
    6959            return;
    7060        }
    7161
    7262        if (string.is8Bit())
    73             append(string.characters8(), string.length());
     63            append(string.characters8(), length);
    7464        else
    75             append(string.characters16(), string.length());
     65            append(string.characters16(), length);
    7666    }
    7767
     
    8171            return;
    8272
    83         // If we're appending to an empty string, and there is not a buffer (reserveCapacity has not been called)
    84         // then just retain the string.
     73        // If we're appending to an empty string, and there is not a buffer
     74        // (reserveCapacity has not been called) then just retain the string.
    8575        if (!m_length && !m_buffer && !other.m_string.isNull()) {
    8676            m_string = other.m_string;
    8777            m_length = other.m_length;
     78            m_is8Bit = other.m_is8Bit;
    8879            return;
    8980        }
     
    10697    WTF_EXPORT_PRIVATE void append(CFStringRef);
    10798#endif
     99
    108100#if USE(CF) && defined(__OBJC__)
    109101    void append(NSString *string) { append((__bridge CFStringRef)string); }
     
    112104    void append(const String& string, unsigned offset, unsigned length)
    113105    {
    114         if (!string.length())
    115             return;
    116 
    117         if ((offset + length) > string.length())
    118             return;
     106        ASSERT(offset <= string.length());
     107        ASSERT(offset + length <= string.length());
     108
     109        if (!length)
     110            return;
     111
     112        // If we're appending to an empty string, and there is not a buffer
     113        // (reserveCapacity has not been called) then just retain the string.
     114        if (!offset && !m_length && !m_buffer && length == string.length()) {
     115            m_string = string;
     116            m_length = length;
     117            m_is8Bit = string.is8Bit();
     118            return;
     119        }
    119120
    120121        if (string.is8Bit())
     
    130131    }
    131132
    132     void append(UChar c)
     133    void append(UChar character)
    133134    {
    134135        if (m_buffer && m_length < m_buffer->length() && m_string.isNull()) {
    135136            if (!m_is8Bit) {
    136                 m_bufferCharacters16[m_length++] = c;
     137                m_bufferCharacters16[m_length++] = character;
    137138                return;
    138139            }
    139 
    140             if (!(c & ~0xff)) {
    141                 m_bufferCharacters8[m_length++] = static_cast<LChar>(c);
     140            if (!(character & ~0xFF)) {
     141                m_bufferCharacters8[m_length++] = static_cast<LChar>(character);
    142142                return;
    143143            }
    144144        }
    145         append(&c, 1);
    146     }
    147 
    148     void append(LChar c)
     145        append(&character, 1);
     146    }
     147
     148    void append(LChar character)
    149149    {
    150150        if (m_buffer && m_length < m_buffer->length() && m_string.isNull()) {
    151151            if (m_is8Bit)
    152                 m_bufferCharacters8[m_length++] = c;
     152                m_bufferCharacters8[m_length++] = character;
    153153            else
    154                 m_bufferCharacters16[m_length++] = c;
     154                m_bufferCharacters16[m_length++] = character;
    155155        } else
    156             append(&c, 1);
    157     }
    158 
    159     void append(char c)
    160     {
    161         append(static_cast<LChar>(c));
    162     }
     156            append(&character, 1);
     157    }
     158
     159    void append(char character) { append(static_cast<LChar>(character)); }
    163160
    164161    void append(UChar32 c)
     
    172169    }
    173170
    174     WTF_EXPORT_PRIVATE void appendQuotedJSONString(const String&);
    175 
    176     template<unsigned charactersCount>
    177     ALWAYS_INLINE void appendLiteral(const char (&characters)[charactersCount]) { append(characters, charactersCount - 1); }
     171    WTF_EXPORT_PRIVATE void appendQuotedJSONString(StringView);
     172
     173    template<unsigned charactersCount> ALWAYS_INLINE void appendLiteral(const char (&characters)[charactersCount]) { append(characters, charactersCount - 1); }
    178174
    179175    WTF_EXPORT_PRIVATE void appendNumber(int);
     
    221217    }
    222218
    223     unsigned length() const
    224     {
    225         return m_length;
    226     }
    227 
     219    unsigned length() const { return m_length; }
    228220    bool isEmpty() const { return !m_length; }
    229221
    230222    WTF_EXPORT_PRIVATE void reserveCapacity(unsigned newCapacity);
    231223
    232     unsigned capacity() const
    233     {
    234         return m_buffer ? m_buffer->length() : m_length;
    235     }
     224    unsigned capacity() const { return m_buffer ? m_buffer->length() : m_length; }
    236225
    237226    WTF_EXPORT_PRIVATE void resize(unsigned newSize);
    238 
    239227    WTF_EXPORT_PRIVATE bool canShrink() const;
    240 
    241228    WTF_EXPORT_PRIVATE void shrinkToFit();
    242229
     
    295282    void allocateBuffer(const LChar* currentCharacters, unsigned requiredLength);
    296283    void allocateBuffer(const UChar* currentCharacters, unsigned requiredLength);
    297     void allocateBufferUpConvert(const LChar* currentCharacters, unsigned requiredLength);
    298     template <typename CharType>
    299     void reallocateBuffer(unsigned requiredLength);
    300     template <typename CharType>
    301     ALWAYS_INLINE CharType* appendUninitialized(unsigned length);
    302     template <typename CharType>
    303     CharType* appendUninitializedSlow(unsigned length);
    304     template <typename CharType>
    305     ALWAYS_INLINE CharType * getBufferCharacters();
     284    void allocateBufferUpconvert(const LChar* currentCharacters, unsigned requiredLength);
     285    template<typename CharacterType> void reallocateBuffer(unsigned requiredLength);
     286    UChar* appendUninitializedUpconvert(unsigned length);
     287    template<typename CharacterType> CharacterType* appendUninitialized(unsigned length);
     288    template<typename CharacterType> CharacterType* appendUninitializedSlow(unsigned length);
     289    template<typename CharacterType> CharacterType* bufferCharacters();
    306290    WTF_EXPORT_PRIVATE void reifyString() const;
    307291
    308     unsigned m_length;
     292    unsigned m_length { 0 };
    309293    mutable String m_string;
    310294    RefPtr<StringImpl> m_buffer;
    311     bool m_is8Bit;
     295    bool m_is8Bit { true };
    312296    union {
    313         LChar* m_bufferCharacters8;
     297        LChar* m_bufferCharacters8 { nullptr };
    314298        UChar* m_bufferCharacters16;
    315299    };
    316300};
    317301
    318 template <>
    319 ALWAYS_INLINE LChar* StringBuilder::getBufferCharacters<LChar>()
    320 {
    321     ASSERT(m_is8Bit);
    322     return m_bufferCharacters8;
    323 }
    324 
    325 template <>
    326 ALWAYS_INLINE UChar* StringBuilder::getBufferCharacters<UChar>()
    327 {
    328     ASSERT(!m_is8Bit);
    329     return m_bufferCharacters16;
    330 }   
    331 
    332 template <typename CharType>
    333 bool equal(const StringBuilder& s, const CharType* buffer, unsigned length)
     302template<typename StringType> bool equal(const StringBuilder&, const StringType&);
     303template<typename CharacterType> bool equal(const StringBuilder&, const CharacterType*, unsigned length);
     304
     305bool operator==(const StringBuilder&, const StringBuilder&);
     306bool operator!=(const StringBuilder&, const StringBuilder&);
     307bool operator==(const StringBuilder&, const String&);
     308bool operator!=(const StringBuilder&, const String&);
     309bool operator==(const String&, const StringBuilder&);
     310bool operator!=(const String&, const StringBuilder&);
     311
     312template<typename CharacterType> inline bool equal(const StringBuilder& s, const CharacterType* buffer, unsigned length)
    334313{
    335314    if (s.length() != length)
     
    342321}
    343322
    344 template <typename StringType>
    345 bool equal(const StringBuilder& a, const StringType& b)
     323template<typename StringType> inline bool equal(const StringBuilder& a, const StringType& b)
    346324{
    347     if (a.length() != b.length())
    348         return false;
    349 
    350     if (!a.length())
    351         return true;
    352 
    353     if (a.is8Bit()) {
    354         if (b.is8Bit())
    355             return equal(a.characters8(), b.characters8(), a.length());
    356         return equal(a.characters8(), b.characters16(), a.length());
    357     }
    358 
    359     if (b.is8Bit())
    360         return equal(a.characters16(), b.characters8(), a.length());
    361     return equal(a.characters16(), b.characters16(), a.length());
     325    return equalCommon(a, b);
    362326}
    363327
     
    372336
    373337using WTF::StringBuilder;
    374 
    375 #endif // StringBuilder_h
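The StringBuilder.h change above drops the hand-written constructor in favor of `StringBuilder() = default;` plus default member initializers, including one on a member of the anonymous union. A minimal sketch of that pattern follows. The class and member names here are hypothetical and are not WTF types; only the initialization style is taken from the diff.

```cpp
// Hypothetical class illustrating default member initializers; it mirrors
// the shape of the StringBuilder data members but is not the WTF class.
class SketchBuilder {
public:
    SketchBuilder() = default; // all state comes from the initializers below

    unsigned length() const { return m_length; }
    bool is8Bit() const { return m_is8Bit; }
    bool hasBuffer() const { return m_characters8 != nullptr; }

private:
    unsigned m_length { 0 };
    bool m_is8Bit { true };
    union {
        // At most one member of a union may carry a default initializer;
        // it becomes the active member after default construction.
        unsigned char* m_characters8 { nullptr };
        char16_t* m_characters16;
    };
};
```

A default-constructed SketchBuilder therefore starts empty, 8-bit, and without a buffer, which matches what the removed constructor's initializer list set up by hand.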
  • trunk/Source/WebCore/ChangeLog

    r209050 r209058  
     12016-11-28  Darin Adler  <darin@apple.com>
     2
     3        Streamline and speed up tokenizer and segmented string classes
     4        https://bugs.webkit.org/show_bug.cgi?id=165003
     5
     6        Reviewed by Sam Weinig.
     7
     8        Profiling Speedometer on my iMac showed the tokenizer as one of the
     9        hottest functions. This patch streamlines the segmented string class,
     10        removing various unused features, and also improves some other functions
     11        seen on the Speedometer profile. On my iMac I measured a speedup of
     12        about 3%. Changes include:
     13
     14        - Removed m_pushedChar1, m_pushedChar2, and m_empty data members from the
     15          SegmentedString class and all the code that used to handle them.
     16
     17        - Simplified the SegmentedString advance functions so they are small
     18          enough to get inlined in the HTML tokenizer.
     19
     20        - Updated callers to call the simpler SegmentedString advance functions
     21          that don't handle newlines in as many cases as possible.
     22
     23        - Cut down on allocations of SegmentedString and made code move the
     24          segmented string and the strings that are moved into it rather than
     25          copying them whenever possible.
     26
     27        - Simplified segmented string functions, removing some branches, mostly
     28          from the non-fast paths.
     29
     30        - Removed small unused functions and small functions used in only one
     31          or two places, made more functions private and renamed for clarity.
     32
     33        * bindings/js/JSHTMLDocumentCustom.cpp:
     34        (WebCore::documentWrite): Moved a little more of the common code in here
     35        from the two functions below. Removed obsolete comment saying this was not
     36        following the DOM specification because it is. Removed unneeded special
     37        cases for 1 argument and no arguments. Take a reference instead of a pointer.
     38        (WebCore::JSHTMLDocument::write): Updated for above.
     39        (WebCore::JSHTMLDocument::writeln): Ditto.
     40
     41        * css/parser/CSSTokenizer.cpp: Added now-needed include.
     42        * css/parser/CSSTokenizer.h: Removed unneeded include.
     43
     44        * css/parser/CSSTokenizerInputStream.h: Added definition of kEndOfFileMarker
     45        here; this is now separate from the use in the HTMLParser. In the long run,
     46        unclear to me whether it is really needed in either.
     47
     48        * dom/Document.cpp:
     49        (WebCore::Document::prepareToWrite): Added. Helper function used by the three
     50        different variants of write. Using this may prevent us from having to construct
     51        a SegmentedString just to append one string after future refactoring.
     52        (WebCore::Document::write): Updated to take an rvalue reference and move the
     53        value through.
     54        (WebCore::Document::writeln): Use a single write call instead of two.
     55
     56        * dom/Document.h: Changed write to take an rvalue reference to SegmentedString
     57        rather than a const reference.
     58
     59        * dom/DocumentParser.h: Changed insert to take an rvalue reference to
     60        SegmentedString. In the future, should probably overload to take a single
     61        string since that is the normal case.
     62
     63        * dom/RawDataDocumentParser.h: Updated for change to DocumentParser.
     64
     65        * html/FTPDirectoryDocument.cpp:
     66        (WebCore::FTPDirectoryDocumentParser::append): Refactored a bit, just enough
     67        so that we don't need an assignment operator for SegmentedString that can
     68        copy a String.
     69
     70        * html/parser/HTMLDocumentParser.cpp:
     71        (WebCore::HTMLDocumentParser::insert): Updated to take an rvalue reference,
     72        and move the value through.
     73        * html/parser/HTMLDocumentParser.h: Updated for the above.
     74
     75        * html/parser/HTMLEntityParser.cpp:
     76        (WebCore::HTMLEntityParser::consumeNamedEntity): Updated for name changes.
     77        Changed the two calls to advance here to call advancePastNonNewline; no
     78        change in behavior, but asserts what the code was assuming before, that the
     79        character was not a newline.
     80
     81        * html/parser/HTMLInputStream.h:
     82        (WebCore::HTMLInputStream::appendToEnd): Updated to take an rvalue reference,
     83        and move the value through.
     84        (WebCore::HTMLInputStream::insertAtCurrentInsertionPoint): Ditto.
     85        (WebCore::HTMLInputStream::markEndOfFile): Removed the code to construct a
     86        SegmentedString, overkill since we can just append an individual string.
     87        (WebCore::HTMLInputStream::splitInto): Rewrote the move idiom here to actually
     88        use move, which will reduce reference count churn and other unneeded work.
     89
     90        * html/parser/HTMLMetaCharsetParser.cpp:
     91        (WebCore::HTMLMetaCharsetParser::checkForMetaCharset): Removed unneeded
     92        construction of a SegmentedString, just to append a string.
     93
     94        * html/parser/HTMLSourceTracker.cpp:
     95        (WebCore::HTMLSourceTracker::HTMLSourceTracker): Moved to the class definition.
     96        (WebCore::HTMLSourceTracker::source): Updated for function name change.
     97        * html/parser/HTMLSourceTracker.h: Updated for above.
     98
     99        * html/parser/HTMLTokenizer.cpp: Added now-needed include.
     100        (WebCore::HTMLTokenizer::emitAndResumeInDataState): Use advancePastNonNewline,
     101        since this function is never called in response to a newline character.
     102        (WebCore::HTMLTokenizer::commitToPartialEndTag): Ditto.
     103        (WebCore::HTMLTokenizer::commitToCompleteEndTag): Ditto.
     104        (WebCore::HTMLTokenizer::processToken): Use ADVANCE_PAST_NON_NEWLINE_TO macro
     105        instead of ADVANCE_TO in cases where the character we are advancing past is
     106        known not to be a newline, so we can use the more efficient advance function
     107        that doesn't check for the newline character.
     108
     109        * html/parser/InputStreamPreprocessor.h: Moved kEndOfFileMarker to
     110        SegmentedString.h; not sure that's a good place for it either. In the long run,
     111        unclear to me whether this is really needed.
     112        (WebCore::InputStreamPreprocessor::peek): Added UNLIKELY for the empty check.
     113        Added LIKELY for the not-special character check.
     114        (WebCore::InputStreamPreprocessor::advance): Updated for the new name of the
     115        advanceAndUpdateLineNumber function.
     116        (WebCore::InputStreamPreprocessor::advancePastNonNewline): Added. More
     117        efficient than advance for cases where the last character is known not to be
     118        a newline character.
     119        (WebCore::InputStreamPreprocessor::skipNextNewLine): Deleted. Was unused.
     120        (WebCore::InputStreamPreprocessor::reset): Deleted. Was unused except in the
     121        constructor; added initial values for the data members to replace.
     122        (WebCore::InputStreamPreprocessor::processNextInputCharacter): Removed long
     123        FIXME comment that didn't really need to be here. Reorganized a bit.
     124        (WebCore::InputStreamPreprocessor::isAtEndOfFile): Renamed and made static.
     125
     126        * html/track/BufferedLineReader.cpp:
     127        (WebCore::BufferedLineReader::nextLine): Updated to not use the poorly named
     128        scanCharacter function to advance past a newline. Also renamed from getLine
     129        and changed to return Optional<String> instead of using a boolean to indicate
     130        failure and an out argument.
     131
     132        * html/track/BufferedLineReader.h:
     133        (WebCore::BufferedLineReader::BufferedLineReader): Use the default, putting
     134        initial values on each data member below.
     135        (WebCore::BufferedLineReader::append): Updated to take an rvalue reference,
     136        and move the value through.
     137        (WebCore::BufferedLineReader::scanCharacter): Deleted. Was poorly named,
     138        and easy to replace with two lines of code at its two call sites.
     139        (WebCore::BufferedLineReader::reset): Rewrote to correctly clear all the
     140        data members of the class, not just the segmented string.
     141
     142        * html/track/InbandGenericTextTrack.cpp:
     143        (WebCore::InbandGenericTextTrack::parseWebVTTFileHeader): Updated to take
     144        an rvalue reference and move the value through.
     145        * html/track/InbandGenericTextTrack.h: Updated for the above.
     146
     147        * html/track/InbandTextTrack.h: Updated since parseWebVTTFileHeader now
     148        takes an rvalue reference.
     149
     150        * html/track/WebVTTParser.cpp:
     151        (WebCore::WebVTTParser::parseFileHeader): Updated to take an rvalue reference
     152        and move the value through.
     153        (WebCore::WebVTTParser::parseBytes): Updated to pass ownership of the string
     154        in to the line reader append function.
     155        (WebCore::WebVTTParser::parseCueData): Use auto and WTFMove for WebVTTCueData.
     156        (WebCore::WebVTTParser::flush): More of the same.
     157        (WebCore::WebVTTParser::parse): Changed to use nextLine instead of getLine.
     158        * html/track/WebVTTParser.h: Updated for the above.
     159
     160        * html/track/WebVTTTokenizer.cpp:
     161        (WebCore::advanceAndEmitToken): Use advanceAndUpdateLineNumber by its new
     162        name, just advance. No change in behavior.
     163        (WebCore::WebVTTTokenizer::WebVTTTokenizer): Pass a String, not a
     164        SegmentedString, to add the end of file marker.
     165
     166        * platform/graphics/InbandTextTrackPrivateClient.h: Updated since
     167        parseWebVTTFileHeader takes an rvalue reference.
     168
     169        * platform/text/SegmentedString.cpp:
     170        (WebCore::SegmentedString::Substring::appendTo): Moved here from the header.
     171        The only caller is SegmentedString::toString, inside this file.
     172        (WebCore::SegmentedString::SegmentedString): Deleted the copy constructor.
     173        No longer needed.
     174        (WebCore::SegmentedString::operator=): Defined a move assignment operator
     175        rather than an ordinary assignment operator, since that's what the call
     176        sites really need.
     177        (WebCore::SegmentedString::length): Simplified since we no longer need to
     178        support pushed characters.
     179        (WebCore::SegmentedString::setExcludeLineNumbers): Simplified, since we
     180        can just iterate m_otherSubstrings without an extra check. Also changed to
     181        write directly to the data member of Substring instead of using a function.
     182        (WebCore::SegmentedString::updateAdvanceFunctionPointersForEmptyString):
     183        Added. Used when we run out of characters.
     184        (WebCore::SegmentedString::clear): Removed code to clear now-deleted members.
     185        Updated for changes to other member names.
     186        (WebCore::SegmentedString::appendSubstring): Renamed from just append to
     187        avoid ambiguity with the public append function. Changed to take an rvalue
     188        reference, and move in, and added code to set m_currentCharacter properly,
     189        so the caller doesn't have to deal with that.
     190        (WebCore::SegmentedString::close): Updated to use m_isClosed by its new name.
     191        Also removed unneeded comment about assertion that fires when trying to close
     192        an already closed string.
     193        (WebCore::SegmentedString::append): Added overloads for rvalue references of
     194        both entire SegmentedString objects and of String. Streamlined to just call
     195        appendSubstring and append to the deque.
     196        (WebCore::SegmentedString::pushBack): Tightened up since we don't allow empty
     197        strings and changed to take just a string, not an entire segmented string.
     198        (WebCore::SegmentedString::advanceSubstring): Moved logic into the
     199        advancePastSingleCharacterSubstringWithoutUpdatingLineNumber function.
     200        (WebCore::SegmentedString::toString): Simplified now that we don't need to
     201        support pushed characters.
     202        (WebCore::SegmentedString::advancePastNonNewlines): Deleted.
     203        (WebCore::SegmentedString::advance8): Deleted.
     204        (WebCore::SegmentedString::advanceWithoutUpdatingLineNumber16): Renamed from
     205        advance16. Simplified now that there are no pushed characters. Also changed to
     206        access data members of m_currentSubstring directly instead of calling a function.
     207        (WebCore::SegmentedString::advanceAndUpdateLineNumber8): Deleted.
     208        (WebCore::SegmentedString::advanceAndUpdateLineNumber16): Ditto.
     209        (WebCore::SegmentedString::advancePastSingleCharacterSubstringWithoutUpdatingLineNumber):
     210        Renamed from advanceSlowCase. Removed unneeded logic to handle pushed characters.
     211        Moved code in here from advanceSubstring.
     212        (WebCore::SegmentedString::advancePastSingleCharacterSubstring): Renamed from
     213        advanceAndUpdateLineNumberSlowCase. Simplified by calling the function above.
     214        (WebCore::SegmentedString::advanceEmpty): Broke assertion up into two.
     215        (WebCore::SegmentedString::updateSlowCaseFunctionPointers): Updated for name changes.
     216        (WebCore::SegmentedString::advancePastSlowCase): Changed name and meaning of
     217        boolean argument. Rewrote to use the String class less; it's now used only when
     218        we fail to match after the first character rather than being used for the actual
     219        comparison with the literal.
     220
     221        * platform/text/SegmentedString.h: Moved all non-trivial function bodies out of
     222        the class definition to make things easier to read. Moved the SegmentedSubstring
     223        class inside the SegmentedString class, making it a private struct named Substring.
     224        Removed the m_ prefix from data members of the struct, removed many functions from
     225        the struct and made its union be anonymous instead of naming it m_data. Removed
     226        unneeded StringBuilder.h include.
     227        (WebCore::SegmentedString::isEmpty): Changed to use the length of the substring
     228        instead of a separate boolean. We never create an empty substring, nor leave one
     229        in place as the current substring unless the entire segmented string is empty.
     230        (WebCore::SegmentedString::advancePast): Updated to use the new member function
     231        template instead of a non-template member function. The new member function is
     232        entirely rewritten and does the matching directly rather than allocating a string
     233        just to do prefix matching.
     234        (WebCore::SegmentedString::advancePastLettersIgnoringASCIICase): Renamed to make
     235        it clear that the literal must be all non-letters or lowercase letters as with
     236        the other "letters ignoring ASCII case" functions. The three call sites all fit
     237        the bill. Implement by calling the new function template.
     238        (WebCore::SegmentedString::currentCharacter): Renamed from currentChar.
     239        (WebCore::SegmentedString::Substring::Substring): Use an rvalue reference and
     240        move the string in.
     241        (WebCore::SegmentedString::Substring::currentCharacter): Simplified since this
     242        is never used on an empty substring.
     243        (WebCore::SegmentedString::Substring::incrementAndGetCurrentCharacter): Ditto.
     244        (WebCore::SegmentedString::SegmentedString): Overload to take an rvalue reference.
     245        Simplified since there are now fewer data members.
     246        (WebCore::SegmentedString::advanceWithoutUpdatingLineNumber): Renamed from
     247        advance, since this is only safe to use if there is some reason it is OK to skip
     248        updating the line number.
     249        (WebCore::SegmentedString::advance): Renamed from advanceAndUpdateLineNumber,
     250        since doing that is the normal desired behavior and not worth mentioning in the
     251        public function name.
     252        (WebCore::SegmentedString::advancePastNewline): Renamed from
     253        advancePastNewlineAndUpdateLineNumber.
     254        (WebCore::SegmentedString::numberOfCharactersConsumed): Greatly simplified since
     255        pushed characters are no longer supported.
     256        (WebCore::SegmentedString::characterMismatch): Added. Used by advancePast.
     257
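The renames above separate a default advance that keeps the line number correct from cheaper variants for callers that already know whether the current character is a newline, and isEmpty is derived from the remaining length rather than a separate flag. The sketch below illustrates that split on a single flat buffer. ToyCursor is a hypothetical class invented for this example; the assertions in it are assumptions about the intended contracts, not a copy of the WebKit code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical single-buffer cursor standing in for SegmentedString.
class ToyCursor {
public:
    explicit ToyCursor(std::string&& text)
        : m_text(std::move(text))
    {
    }

    // Emptiness follows from the remaining length; there is no extra boolean.
    bool isEmpty() const { return m_position == m_text.size(); }

    char currentCharacter() const
    {
        assert(!isEmpty());
        return m_text[m_position];
    }

    int currentLine() const { return m_line; }

    // Default: checks for a newline and keeps the line count right.
    void advance()
    {
        assert(!isEmpty());
        if (m_text[m_position] == '\n')
            ++m_line;
        ++m_position;
    }

    // Cheaper: the caller has already seen that this is not a newline.
    void advancePastNonNewline()
    {
        assert(currentCharacter() != '\n');
        ++m_position;
    }

    // Cheaper: the caller has already seen that this is a newline.
    void advancePastNewline()
    {
        assert(currentCharacter() == '\n');
        ++m_line;
        ++m_position;
    }

private:
    std::string m_text;
    std::size_t m_position { 0 };
    int m_line { 1 };
};
```

A tokenizer state that has just compared the current character with '<' or '&' can use the non-newline variant and skip the newline test, which is the pattern the ADVANCE_PAST_NON_NEWLINE_TO call sites follow.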
     258        * xml/parser/CharacterReferenceParserInlines.h:
     259        (WebCore::unconsumeCharacters): Use toString rather than toStringPreserveCapacity
     260        because the SegmentedString is going to take ownership of the string.
     261        (WebCore::consumeCharacterReference): Updated to use the pushBack that takes just
     262        a String, not a SegmentedString. Also use advancePastNonNewline.
     263
     264        * xml/parser/MarkupTokenizerInlines.h: Added ADVANCE_PAST_NON_NEWLINE_TO.
     265
     266        * xml/parser/XMLDocumentParser.cpp:
     267        (WebCore::XMLDocumentParser::insert): Updated since this takes an rvalue reference.
     268        (WebCore::XMLDocumentParser::append): Removed unnecessary code to create a
     269        SegmentedString.
     270        * xml/parser/XMLDocumentParser.h: Updated for above. Also fixed indentation
     271        and initialized most data members.
     272        * xml/parser/XMLDocumentParserLibxml2.cpp:
     273        (WebCore::XMLDocumentParser::XMLDocumentParser): Moved most data member
     274        initialization into the class definition.
     275        (WebCore::XMLDocumentParser::resumeParsing): Removed code that copied a
     276        segmented string, but converted the whole thing into a string before using it.
     277        Now we convert to a string right away.
     278
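Several entries above switch functions to take rvalue references so that a string is moved into the segmented string instead of being copied. A minimal sketch of that calling convention, using std::string in place of the WebKit string classes, is shown below. ToyParser and this free-standing writeln are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical parser that takes ownership of inserted text.
class ToyParser {
public:
    // An rvalue reference parameter makes the ownership transfer visible at
    // the call site: callers must pass a temporary or use std::move.
    void insert(std::string&& source) { m_pending.push_back(std::move(source)); }

    std::size_t pendingCount() const { return m_pending.size(); }
    const std::string& lastPending() const { return m_pending.back(); }

private:
    std::vector<std::string> m_pending;
};

// Builds the text and trailing newline once, then moves the result in,
// rather than making two separate insert calls.
inline void writeln(ToyParser& parser, const std::string& text)
{
    std::string textWithNewline = text;
    textWithNewline += '\n';
    parser.insert(std::move(textWithNewline));
}
```

The one copy that remains is the copy of the caller's const reference; everything after that point is handed over by move.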
    12792016-11-28  Chris Dumez  <cdumez@apple.com>
    2280
  • trunk/Source/WebCore/bindings/js/JSHTMLDocumentCustom.cpp

    r208112 r209058  
    11/*
    2  * Copyright (C) 2007-2009, 2016 Apple Inc. All rights reserved.
     2 * Copyright (C) 2007-2016 Apple Inc. All rights reserved.
    33 *
    44 * Redistribution and use in source and binary forms, with or without
     
    2727#include "JSHTMLDocument.h"
    2828
    29 #include "Frame.h"
    30 #include "HTMLCollection.h"
    31 #include "HTMLDocument.h"
    32 #include "HTMLElement.h"
    3329#include "HTMLIFrameElement.h"
    34 #include "HTMLNames.h"
    35 #include "JSDOMWindow.h"
    3630#include "JSDOMWindowCustom.h"
    37 #include "JSDOMWindowShell.h"
    38 #include "JSDocumentCustom.h"
    3931#include "JSHTMLCollection.h"
    40 #include "JSMainThreadExecState.h"
    4132#include "SegmentedString.h"
    42 #include "DocumentParser.h"
    43 #include <interpreter/StackVisitor.h>
    44 #include <runtime/Error.h>
    45 #include <runtime/JSCell.h>
    46 #include <wtf/unicode/CharacterNames.h>
    4733
    4834using namespace JSC;
     
    5541{
    5642    auto& document = passedDocument.get();
    57     JSObject* wrapper = createWrapper<HTMLDocument>(globalObject, WTFMove(passedDocument));
    58 
     43    auto* wrapper = createWrapper<HTMLDocument>(globalObject, WTFMove(passedDocument));
    5944    reportMemoryForDocumentIfFrameless(*state, document);
    60 
    6145    return wrapper;
    6246}
     
    6953}
    7054
    71 bool JSHTMLDocument::getOwnPropertySlot(JSObject* object, ExecState* exec, PropertyName propertyName, PropertySlot& slot)
     55bool JSHTMLDocument::getOwnPropertySlot(JSObject* object, ExecState* state, PropertyName propertyName, PropertySlot& slot)
    7256{
    73     JSHTMLDocument* thisObject = jsCast<JSHTMLDocument*>(object);
    74     ASSERT_GC_OBJECT_INHERITS(thisObject, info());
     57    auto& thisObject = *jsCast<JSHTMLDocument*>(object);
     58    ASSERT_GC_OBJECT_INHERITS((&thisObject), info());
    7559
    7660    if (propertyName == "open") {
    77         if (Base::getOwnPropertySlot(thisObject, exec, propertyName, slot))
     61        if (Base::getOwnPropertySlot(&thisObject, state, propertyName, slot))
    7862            return true;
    79 
    80         slot.setCustom(thisObject, ReadOnly | DontDelete | DontEnum, nonCachingStaticFunctionGetter<jsHTMLDocumentPrototypeFunctionOpen, 2>);
     63        slot.setCustom(&thisObject, ReadOnly | DontDelete | DontEnum, nonCachingStaticFunctionGetter<jsHTMLDocumentPrototypeFunctionOpen, 2>);
    8164        return true;
    8265    }
    8366
    8467    JSValue value;
    85     if (thisObject->nameGetter(exec, propertyName, value)) {
    86         slot.setValue(thisObject, ReadOnly | DontDelete | DontEnum, value);
     68    if (thisObject.nameGetter(state, propertyName, value)) {
     69        slot.setValue(&thisObject, ReadOnly | DontDelete | DontEnum, value);
    8770        return true;
    8871    }
    8972
    90     return Base::getOwnPropertySlot(thisObject, exec, propertyName, slot);
     73    return Base::getOwnPropertySlot(&thisObject, state, propertyName, slot);
    9174}
    9275
    93 bool JSHTMLDocument::nameGetter(ExecState* exec, PropertyName propertyName, JSValue& value)
     76bool JSHTMLDocument::nameGetter(ExecState* state, PropertyName propertyName, JSValue& value)
    9477{
    9578    auto& document = wrapped();
    9679
    97     AtomicStringImpl* atomicPropertyName = propertyName.publicName();
     80    auto* atomicPropertyName = propertyName.publicName();
    9881    if (!atomicPropertyName || !document.hasDocumentNamedItem(*atomicPropertyName))
    9982        return false;
    10083
    10184    if (UNLIKELY(document.documentNamedItemContainsMultipleElements(*atomicPropertyName))) {
    102         Ref<HTMLCollection> collection = document.documentNamedItems(atomicPropertyName);
     85        auto collection = document.documentNamedItems(atomicPropertyName);
    10386        ASSERT(collection->length() > 1);
    104         value = toJS(exec, globalObject(), collection);
     87        value = toJS(state, globalObject(), collection);
    10588        return true;
    10689    }
    10790
    108     Element& element = *document.documentNamedItem(*atomicPropertyName);
     91    auto& element = *document.documentNamedItem(*atomicPropertyName);
    10992    if (UNLIKELY(is<HTMLIFrameElement>(element))) {
    110         if (Frame* frame = downcast<HTMLIFrameElement>(element).contentFrame()) {
    111             value = toJS(exec, frame);
     93        if (auto* frame = downcast<HTMLIFrameElement>(element).contentFrame()) {
     94            value = toJS(state, frame);
    11295            return true;
    11396        }
    11497    }
    11598
    116     value = toJS(exec, globalObject(), element);
     99    value = toJS(state, globalObject(), element);
    117100    return true;
    118101}
     
    123106{
    124107    // If "all" has been overwritten, return the overwritten value
    125     JSValue v = getDirect(state.vm(), Identifier::fromString(&state, "all"));
    126     if (v)
    127         return v;
     108    if (auto overwrittenValue = getDirect(state.vm(), Identifier::fromString(&state, "all")))
     109        return overwrittenValue;
    128110
    129111    return toJS(&state, globalObject(), wrapped().all());
     
    136118}
    137119
    138 static Document* findCallingDocument(ExecState& state)
     120static inline Document* findCallingDocument(ExecState& state)
    139121{
    140122    CallerFunctor functor;
    141123    state.iterate(functor);
    142     CallFrame* callerFrame = functor.callerFrame();
     124    auto* callerFrame = functor.callerFrame();
    143125    if (!callerFrame)
    144126        return nullptr;
    145 
    146     return asJSDOMWindow(functor.callerFrame()->lexicalGlobalObject())->wrapped().document();
     127    return asJSDOMWindow(callerFrame->lexicalGlobalObject())->wrapped().document();
    147128}
    148129
     
    156137    // For compatibility with other browsers, pass open calls with more than 2 parameters to the window.
    157138    if (state.argumentCount() > 2) {
    158         if (Frame* frame = wrapped().frame()) {
    159             JSDOMWindowShell* wrapper = toJSDOMWindowShell(frame, currentWorld(&state));
    160             if (wrapper) {
    161                 JSValue function = wrapper->get(&state, Identifier::fromString(&state, "open"));
     139        if (auto* frame = wrapped().frame()) {
     140            if (auto* wrapper = toJSDOMWindowShell(frame, currentWorld(&state))) {
     141                auto function = wrapper->get(&state, Identifier::fromString(&state, "open"));
    162142                CallData callData;
    163                 CallType callType = ::getCallData(function, callData);
     143                auto callType = ::getCallData(function, callData);
    164144                if (callType == CallType::None)
    165145                    return throwTypeError(&state, scope);
     
    170150    }
    171151
    172     // document.open clobbers the security context of the document and
    173     // aliases it with the active security context.
    174     Document* activeDocument = asJSDOMWindow(state.lexicalGlobalObject())->wrapped().document();
    175 
    176     // In the case of two parameters or fewer, do a normal document open.
    177     wrapped().open(activeDocument);
     152    // Calling document.open clobbers the security context of the document and aliases it with the active security context.
     153    // FIXME: Is it correct that this does not use findCallingDocument as the write function below does?
     154    wrapped().open(asJSDOMWindow(state.lexicalGlobalObject())->wrapped().document());
     155    // FIXME: Why do we return the document instead of returning undefined?
    178156    return this;
    179157}
     
    181159enum NewlineRequirement { DoNotAddNewline, DoAddNewline };
    182160
    183 static inline void documentWrite(ExecState& state, JSHTMLDocument* thisDocument, NewlineRequirement addNewline)
     161static inline JSValue documentWrite(ExecState& state, JSHTMLDocument& document, NewlineRequirement addNewline)
    184162{
    185     HTMLDocument* document = &thisDocument->wrapped();
    186     // DOM only specifies single string argument, but browsers allow multiple or no arguments.
     163    VM& vm = state.vm();
     164    auto scope = DECLARE_THROW_SCOPE(vm);
    187165
    188     size_t size = state.argumentCount();
    189 
    190     String firstString = state.argument(0).toString(&state)->value(&state);
    191     SegmentedString segmentedString = firstString;
    192     if (size != 1) {
    193         if (!size)
    194             segmentedString.clear();
    195         else {
    196             for (size_t i = 1; i < size; ++i) {
    197                 String subsequentString = state.uncheckedArgument(i).toString(&state)->value(&state);
    198                 segmentedString.append(SegmentedString(subsequentString));
    199             }
    200         }
     166    SegmentedString segmentedString;
     167    size_t argumentCount = state.argumentCount();
     168    for (size_t i = 0; i < argumentCount; ++i) {
     169        segmentedString.append(state.uncheckedArgument(i).toWTFString(&state));
     170        RETURN_IF_EXCEPTION(scope, { });
    201171    }
    202172    if (addNewline)
    203         segmentedString.append(SegmentedString(String(&newlineCharacter, 1)));
     173        segmentedString.append(String { "\n" });
    204174
    205     Document* activeDocument = findCallingDocument(state);
    206     document->write(segmentedString, activeDocument);
     175    document.wrapped().write(WTFMove(segmentedString), findCallingDocument(state));
     176    return jsUndefined();
    207177}
    208178
    209179JSValue JSHTMLDocument::write(ExecState& state)
    210180{
    211     documentWrite(state, this, DoNotAddNewline);
    212     return jsUndefined();
     181    return documentWrite(state, *this, DoNotAddNewline);
    213182}
    214183
    215184JSValue JSHTMLDocument::writeln(ExecState& state)
    216185{
    217     documentWrite(state, this, DoAddNewline);
    218     return jsUndefined();
     186    return documentWrite(state, *this, DoAddNewline);
    219187}
    220188
  • trunk/Source/WebCore/css/parser/CSSTokenizer.cpp

    r205103 r209058  
    3636#include "CSSTokenizerInputStream.h"
    3737#include "HTMLParserIdioms.h"
     38#include <wtf/text/StringBuilder.h>
    3839#include <wtf/unicode/CharacterNames.h>
    3940
  • trunk/Source/WebCore/css/parser/CSSTokenizer.h

    r208668 r209058  
    3131
    3232#include "CSSParserToken.h"
    33 #include "InputStreamPreprocessor.h"
    3433#include <climits>
    3534#include <wtf/text/StringView.h>
  • trunk/Source/WebCore/css/parser/CSSTokenizerInputStream.h

    r208668 r209058  
    3131
    3232#include <wtf/text/StringView.h>
    33 #include <wtf/text/WTFString.h>
    3433
    3534namespace WebCore {
     35
     36constexpr LChar kEndOfFileMarker = 0;
    3637
    3738class CSSTokenizerInputStream {
  • trunk/Source/WebCore/dom/Document.cpp

    r208991 r209058  
    27922792}
    27932793
    2794 void Document::write(const SegmentedString& text, Document* ownerDocument)
     2794void Document::write(SegmentedString&& text, Document* ownerDocument)
    27952795{
    27962796    NestingLevelIncrementer nestingLevelIncrementer(m_writeRecursionDepth);
     
    28002800
    28012801    if (m_writeRecursionIsTooDeep)
    2802        return;
     2802        return;
    28032803
    28042804    bool hasInsertionPoint = m_parser && m_parser->hasInsertionPoint();
     
    28102810
    28112811    ASSERT(m_parser);
    2812     m_parser->insert(text);
     2812    m_parser->insert(WTFMove(text));
    28132813}
    28142814
    28152815void Document::write(const String& text, Document* ownerDocument)
    28162816{
    2817     write(SegmentedString(text), ownerDocument);
     2817    write(SegmentedString { text }, ownerDocument);
    28182818}
    28192819
    28202820void Document::writeln(const String& text, Document* ownerDocument)
    28212821{
    2822     write(text, ownerDocument);
    2823     write("\n", ownerDocument);
     2822    SegmentedString textWithNewline { text };
     2823    textWithNewline.append(String { "\n" });
     2824    write(WTFMove(textWithNewline), ownerDocument);
    28242825}
    28252826
  • trunk/Source/WebCore/dom/Document.h

    r208982 r209058  
    603603    void cancelParsing();
    604604
    605     void write(const SegmentedString& text, Document* ownerDocument = nullptr);
     605    void write(SegmentedString&& text, Document* ownerDocument = nullptr);
    606606    WEBCORE_EXPORT void write(const String& text, Document* ownerDocument = nullptr);
    607607    WEBCORE_EXPORT void writeln(const String& text, Document* ownerDocument = nullptr);
  • trunk/Source/WebCore/dom/DocumentParser.h

    r208179 r209058  
    4444
    4545    // insert is used by document.write.
    46     virtual void insert(const SegmentedString&) = 0;
     46    virtual void insert(SegmentedString&&) = 0;
    4747
    4848    // appendBytes and flush are used by DocumentWriter (the loader).
  • trunk/Source/WebCore/dom/RawDataDocumentParser.h

    r208179 r209058  
    5050    }
    5151
    52     void insert(const SegmentedString&) override
     52    void insert(SegmentedString&&) override
    5353    {
    5454        // <https://bugs.webkit.org/show_bug.cgi?id=25397>: JS code can always call document.write, we need to handle it.
  • trunk/Source/WebCore/html/FTPDirectoryDocument.cpp

    r208658 r209058  
    345345void FTPDirectoryDocumentParser::append(RefPtr<StringImpl>&& inputSource)
    346346{
    347     String source(WTFMove(inputSource));
    348 
    349347    // Make sure we have the table element to append to by loading the template set in the pref, or
    350348    // creating a very basic document with the appropriate table
     
    358356
    359357    m_dest = m_buffer;
    360     SegmentedString str = source;
    361     while (!str.isEmpty()) {
    362         UChar c = str.currentChar();
     358    SegmentedString string { String { WTFMove(inputSource) } };
     359    while (!string.isEmpty()) {
     360        UChar c = string.currentCharacter();
    363361
    364362        if (c == '\r') {
     
    377375        }
    378376
    379         str.advance();
     377        string.advance();
    380378
    381379        // Maybe enlarge the buffer
  • trunk/Source/WebCore/html/parser/HTMLDocumentParser.cpp

    r208840 r209058  
    329329}
    330330
    331 void HTMLDocumentParser::insert(const SegmentedString& source)
     331void HTMLDocumentParser::insert(SegmentedString&& source)
    332332{
    333333    if (isStopped())
     
    338338    Ref<HTMLDocumentParser> protectedThis(*this);
    339339
    340     SegmentedString excludedLineNumberSource(source);
    341     excludedLineNumberSource.setExcludeLineNumbers();
    342     m_input.insertAtCurrentInsertionPoint(excludedLineNumberSource);
     340    source.setExcludeLineNumbers();
     341    m_input.insertAtCurrentInsertionPoint(WTFMove(source));
    343342    pumpTokenizerIfPossible(ForceSynchronous);
    344343
     
    364363    Ref<HTMLDocumentParser> protectedThis(*this);
    365364
    366     String source(WTFMove(inputSource));
     365    String source { WTFMove(inputSource) };
    367366
    368367    if (m_preloadScanner) {
  • trunk/Source/WebCore/html/parser/HTMLDocumentParser.h

    r208179 r209058  
    6666    explicit HTMLDocumentParser(HTMLDocument&);
    6767
    68     void insert(const SegmentedString&) final;
     68    void insert(SegmentedString&&) final;
    6969    void append(RefPtr<StringImpl>&&) override;
    7070    void finish() override;
  • trunk/Source/WebCore/html/parser/HTMLEntityParser.cpp

    r183552 r209058  
    6262        HTMLEntitySearch entitySearch;
    6363        while (!source.isEmpty()) {
    64             cc = source.currentChar();
     64            cc = source.currentCharacter();
    6565            entitySearch.advance(cc);
    6666            if (!entitySearch.isEntityPrefix())
    6767                break;
    6868            consumedCharacters.append(cc);
    69             source.advance();
     69            source.advancePastNonNewline();
    7070        }
    7171        notEnoughCharacters = source.isEmpty();
     
    8989            const LChar* reference = entitySearch.mostRecentMatch()->entity;
    9090            for (int i = 0; i < length; ++i) {
    91                 cc = source.currentChar();
     91                cc = source.currentCharacter();
    9292                ASSERT_UNUSED(reference, cc == *reference++);
    9393                consumedCharacters.append(cc);
    94                 source.advance();
     94                source.advancePastNonNewline();
    9595                ASSERT(!source.isEmpty());
    9696            }
    97             cc = source.currentChar();
     97            cc = source.currentCharacter();
    9898        }
    9999        if (entitySearch.mostRecentMatch()->lastCharacter() == ';'
  • trunk/Source/WebCore/html/parser/HTMLInputStream.h

    r208179 r209058  
    2626#pragma once
    2727
    28 #include "InputStreamPreprocessor.h"
    2928#include "SegmentedString.h"
    3029#include <wtf/text/TextPosition.h>
     
    5756    }
    5857
    59     void appendToEnd(const SegmentedString& string)
     58    void appendToEnd(SegmentedString&& string)
    6059    {
    61         m_last->append(string);
     60        m_last->append(WTFMove(string));
    6261    }
    6362
    64     void insertAtCurrentInsertionPoint(const SegmentedString& string)
     63    void insertAtCurrentInsertionPoint(SegmentedString&& string)
    6564    {
    66         m_first.append(string);
     65        m_first.append(WTFMove(string));
    6766    }
    6867
     
    7473    void markEndOfFile()
    7574    {
    76         m_last->append(SegmentedString(String(&kEndOfFileMarker, 1)));
     75        m_last->append(String { &kEndOfFileMarker, 1 });
    7776        m_last->close();
    7877    }
     
    9392    void splitInto(SegmentedString& next)
    9493    {
    95         next = m_first;
    96         m_first = SegmentedString();
     94        next = WTFMove(m_first);
    9795        if (m_last == &m_first) {
    9896            // We used to only have one SegmentedString in the InputStream
  • trunk/Source/WebCore/html/parser/HTMLMetaCharsetParser.cpp

    r195452 r209058  
    11/*
    22 * Copyright (C) 2010 Google Inc. All Rights Reserved.
    3  * Copyright (C) 2015 Apple Inc. All Rights Reserved.
     3 * Copyright (C) 2015-2016 Apple Inc. All Rights Reserved.
    44 *
    55 * Redistribution and use in source and binary forms, with or without
     
    152152    // least bytesToCheckUnconditionally bytes of input.
    153153
    154     static const int bytesToCheckUnconditionally = 1024;
     154    constexpr int bytesToCheckUnconditionally = 1024;
    155155
    156     m_input.append(SegmentedString(m_codec->decode(data, length)));
     156    m_input.append(m_codec->decode(data, length));
    157157
    158158    while (auto token = m_tokenizer.nextToken(m_input)) {
  • trunk/Source/WebCore/html/parser/HTMLSourceTracker.cpp

    r207848 r209058  
    3232
    3333namespace WebCore {
    34 
    35 HTMLSourceTracker::HTMLSourceTracker()
    36 {
    37 }
    3834
    3935void HTMLSourceTracker::startToken(SegmentedString& currentInput, HTMLTokenizer& tokenizer)
     
    7975    unsigned i = 0;
    8076    for ( ; i < length && !m_previousSource.isEmpty(); ++i) {
    81         source.append(m_previousSource.currentChar());
     77        source.append(m_previousSource.currentCharacter());
    8278        m_previousSource.advance();
    8379    }
    8480    for ( ; i < length; ++i) {
    8581        ASSERT(!m_currentSource.isEmpty());
    86         source.append(m_currentSource.currentChar());
     82        source.append(m_currentSource.currentCharacter());
    8783        m_currentSource.advance();
    8884    }
  • trunk/Source/WebCore/html/parser/HTMLSourceTracker.h

    r208179 r209058  
    3737    WTF_MAKE_NONCOPYABLE(HTMLSourceTracker);
    3838public:
    39     HTMLSourceTracker();
     39    HTMLSourceTracker() = default;
    4040
    4141    void startToken(SegmentedString&, HTMLTokenizer&);
  • trunk/Source/WebCore/html/parser/HTMLTokenizer.cpp

    r178265 r209058  
    11/*
    2  * Copyright (C) 2008, 2015 Apple Inc. All Rights Reserved.
     2 * Copyright (C) 2008-2016 Apple Inc. All Rights Reserved.
    33 * Copyright (C) 2009 Torch Mobile, Inc. http://www.torchmobile.com/
    44 * Copyright (C) 2010 Google, Inc. All Rights Reserved.
     
    3232#include "HTMLNames.h"
    3333#include "MarkupTokenizerInlines.h"
    34 #include <wtf/ASCIICType.h>
     34#include <wtf/text/StringBuilder.h>
    3535
    3636using namespace WTF;
     
    9797    saveEndTagNameIfNeeded();
    9898    m_state = DataState;
    99     source.advanceAndUpdateLineNumber();
     99    source.advancePastNonNewline();
    100100    return true;
    101101}
     
    158158bool HTMLTokenizer::commitToPartialEndTag(SegmentedString& source, UChar character, State state)
    159159{
    160     ASSERT(source.currentChar() == character);
     160    ASSERT(source.currentCharacter() == character);
    161161    appendToTemporaryBuffer(character);
    162     source.advanceAndUpdateLineNumber();
     162    source.advancePastNonNewline();
    163163
    164164    if (haveBufferedCharacterToken()) {
     
    175175bool HTMLTokenizer::commitToCompleteEndTag(SegmentedString& source)
    176176{
    177     ASSERT(source.currentChar() == '>');
     177    ASSERT(source.currentCharacter() == '>');
    178178    appendToTemporaryBuffer('>');
    179     source.advance();
     179    source.advancePastNonNewline();
    180180
    181181    m_state = DataState;
     
    213213    BEGIN_STATE(DataState)
    214214        if (character == '&')
    215             ADVANCE_TO(CharacterReferenceInDataState);
     215            ADVANCE_PAST_NON_NEWLINE_TO(CharacterReferenceInDataState);
    216216        if (character == '<') {
    217217            if (haveBufferedCharacterToken())
    218218                RETURN_IN_CURRENT_STATE(true);
    219             ADVANCE_TO(TagOpenState);
     219            ADVANCE_PAST_NON_NEWLINE_TO(TagOpenState);
    220220        }
    221221        if (character == kEndOfFileMarker)
     
    233233    BEGIN_STATE(RCDATAState)
    234234        if (character == '&')
    235             ADVANCE_TO(CharacterReferenceInRCDATAState);
     235            ADVANCE_PAST_NON_NEWLINE_TO(CharacterReferenceInRCDATAState);
    236236        if (character == '<')
    237             ADVANCE_TO(RCDATALessThanSignState);
     237            ADVANCE_PAST_NON_NEWLINE_TO(RCDATALessThanSignState);
    238238        if (character == kEndOfFileMarker)
    239239            RECONSUME_IN(DataState);
     
    250250    BEGIN_STATE(RAWTEXTState)
    251251        if (character == '<')
    252             ADVANCE_TO(RAWTEXTLessThanSignState);
     252            ADVANCE_PAST_NON_NEWLINE_TO(RAWTEXTLessThanSignState);
    253253        if (character == kEndOfFileMarker)
    254254            RECONSUME_IN(DataState);
     
    259259    BEGIN_STATE(ScriptDataState)
    260260        if (character == '<')
    261             ADVANCE_TO(ScriptDataLessThanSignState);
     261            ADVANCE_PAST_NON_NEWLINE_TO(ScriptDataLessThanSignState);
    262262        if (character == kEndOfFileMarker)
    263263            RECONSUME_IN(DataState);
     
    275275    BEGIN_STATE(TagOpenState)
    276276        if (character == '!')
    277             ADVANCE_TO(MarkupDeclarationOpenState);
     277            ADVANCE_PAST_NON_NEWLINE_TO(MarkupDeclarationOpenState);
    278278        if (character == '/')
    279             ADVANCE_TO(EndTagOpenState);
     279            ADVANCE_PAST_NON_NEWLINE_TO(EndTagOpenState);
    280280        if (isASCIIAlpha(character)) {
    281281            m_token.beginStartTag(convertASCIIAlphaToLower(character));
    282             ADVANCE_TO(TagNameState);
     282            ADVANCE_PAST_NON_NEWLINE_TO(TagNameState);
    283283        }
    284284        if (character == '?') {
     
    298298            m_token.beginEndTag(convertASCIIAlphaToLower(character));
    299299            m_appropriateEndTagName.clear();
    300             ADVANCE_TO(TagNameState);
    301         }
    302         if (character == '>') {
    303             parseError();
    304             ADVANCE_TO(DataState);
     300            ADVANCE_PAST_NON_NEWLINE_TO(TagNameState);
     301        }
     302        if (character == '>') {
     303            parseError();
     304            ADVANCE_PAST_NON_NEWLINE_TO(DataState);
    305305        }
    306306        if (character == kEndOfFileMarker) {
     
    318318            ADVANCE_TO(BeforeAttributeNameState);
    319319        if (character == '/')
    320             ADVANCE_TO(SelfClosingStartTagState);
     320            ADVANCE_PAST_NON_NEWLINE_TO(SelfClosingStartTagState);
    321321        if (character == '>')
    322322            return emitAndResumeInDataState(source);
     
    328328        }
    329329        m_token.appendToName(toASCIILower(character));
    330         ADVANCE_TO(TagNameState);
     330        ADVANCE_PAST_NON_NEWLINE_TO(TagNameState);
    331331    END_STATE()
    332332
     
    335335            m_temporaryBuffer.clear();
    336336            ASSERT(m_bufferedEndTagName.isEmpty());
    337             ADVANCE_TO(RCDATAEndTagOpenState);
     337            ADVANCE_PAST_NON_NEWLINE_TO(RCDATAEndTagOpenState);
    338338        }
    339339        bufferASCIICharacter('<');
     
    345345            appendToTemporaryBuffer(character);
    346346            appendToPossibleEndTag(convertASCIIAlphaToLower(character));
    347             ADVANCE_TO(RCDATAEndTagNameState);
     347            ADVANCE_PAST_NON_NEWLINE_TO(RCDATAEndTagNameState);
    348348        }
    349349        bufferASCIICharacter('<');
     
    356356            appendToTemporaryBuffer(character);
    357357            appendToPossibleEndTag(convertASCIIAlphaToLower(character));
    358             ADVANCE_TO(RCDATAEndTagNameState);
     358            ADVANCE_PAST_NON_NEWLINE_TO(RCDATAEndTagNameState);
    359359        }
    360360        if (isTokenizerWhitespace(character)) {
     
    386386            m_temporaryBuffer.clear();
    387387            ASSERT(m_bufferedEndTagName.isEmpty());
    388             ADVANCE_TO(RAWTEXTEndTagOpenState);
     388            ADVANCE_PAST_NON_NEWLINE_TO(RAWTEXTEndTagOpenState);
    389389        }
    390390        bufferASCIICharacter('<');
     
    396396            appendToTemporaryBuffer(character);
    397397            appendToPossibleEndTag(convertASCIIAlphaToLower(character));
    398             ADVANCE_TO(RAWTEXTEndTagNameState);
     398            ADVANCE_PAST_NON_NEWLINE_TO(RAWTEXTEndTagNameState);
    399399        }
    400400        bufferASCIICharacter('<');
     
    407407            appendToTemporaryBuffer(character);
    408408            appendToPossibleEndTag(convertASCIIAlphaToLower(character));
    409             ADVANCE_TO(RAWTEXTEndTagNameState);
     409            ADVANCE_PAST_NON_NEWLINE_TO(RAWTEXTEndTagNameState);
    410410        }
    411411        if (isTokenizerWhitespace(character)) {
     
    437437            m_temporaryBuffer.clear();
    438438            ASSERT(m_bufferedEndTagName.isEmpty());
    439             ADVANCE_TO(ScriptDataEndTagOpenState);
     439            ADVANCE_PAST_NON_NEWLINE_TO(ScriptDataEndTagOpenState);
    440440        }
    441441        if (character == '!') {
    442442            bufferASCIICharacter('<');
    443443            bufferASCIICharacter('!');
    444             ADVANCE_TO(ScriptDataEscapeStartState);
     444            ADVANCE_PAST_NON_NEWLINE_TO(ScriptDataEscapeStartState);
    445445        }
    446446        bufferASCIICharacter('<');
     
    452452            appendToTemporaryBuffer(character);
    453453            appendToPossibleEndTag(convertASCIIAlphaToLower(character));
    454             ADVANCE_TO(ScriptDataEndTagNameState);
     454            ADVANCE_PAST_NON_NEWLINE_TO(ScriptDataEndTagNameState);
    455455        }
    456456        bufferASCIICharacter('<');
     
    463463            appendToTemporaryBuffer(character);
    464464            appendToPossibleEndTag(convertASCIIAlphaToLower(character));
    465             ADVANCE_TO(ScriptDataEndTagNameState);
     465            ADVANCE_PAST_NON_NEWLINE_TO(ScriptDataEndTagNameState);
    466466        }
    467467        if (isTokenizerWhitespace(character)) {
     
    492492        if (character == '-') {
    493493            bufferASCIICharacter('-');
    494             ADVANCE_TO(ScriptDataEscapeStartDashState);
     494            ADVANCE_PAST_NON_NEWLINE_TO(ScriptDataEscapeStartDashState);
    495495        } else
    496496            RECONSUME_IN(ScriptDataState);
     
    500500        if (character == '-') {
    501501            bufferASCIICharacter('-');
    502             ADVANCE_TO(ScriptDataEscapedDashDashState);
     502            ADVANCE_PAST_NON_NEWLINE_TO(ScriptDataEscapedDashDashState);
    503503        } else
    504504            RECONSUME_IN(ScriptDataState);
     
    508508        if (character == '-') {
    509509            bufferASCIICharacter('-');
    510             ADVANCE_TO(ScriptDataEscapedDashState);
     510            ADVANCE_PAST_NON_NEWLINE_TO(ScriptDataEscapedDashState);
    511511        }
    512512        if (character == '<')
    513             ADVANCE_TO(ScriptDataEscapedLessThanSignState);
     513            ADVANCE_PAST_NON_NEWLINE_TO(ScriptDataEscapedLessThanSignState);
    514514        if (character == kEndOfFileMarker) {
    515515            parseError();
     
    523523        if (character == '-') {
    524524            bufferASCIICharacter('-');
    525             ADVANCE_TO(ScriptDataEscapedDashDashState);
     525            ADVANCE_PAST_NON_NEWLINE_TO(ScriptDataEscapedDashDashState);
    526526        }
    527527        if (character == '<')
    528             ADVANCE_TO(ScriptDataEscapedLessThanSignState);
     528            ADVANCE_PAST_NON_NEWLINE_TO(ScriptDataEscapedLessThanSignState);
    529529        if (character == kEndOfFileMarker) {
    530530            parseError();
     
    538538        if (character == '-') {
    539539            bufferASCIICharacter('-');
    540             ADVANCE_TO(ScriptDataEscapedDashDashState);
     540            ADVANCE_PAST_NON_NEWLINE_TO(ScriptDataEscapedDashDashState);
    541541        }
    542542        if (character == '<')
    543             ADVANCE_TO(ScriptDataEscapedLessThanSignState);
     543            ADVANCE_PAST_NON_NEWLINE_TO(ScriptDataEscapedLessThanSignState);
    544544        if (character == '>') {
    545545            bufferASCIICharacter('>');
    546             ADVANCE_TO(ScriptDataState);
     546            ADVANCE_PAST_NON_NEWLINE_TO(ScriptDataState);
    547547        }
    548548        if (character == kEndOfFileMarker) {
     
    558558            m_temporaryBuffer.clear();
    559559            ASSERT(m_bufferedEndTagName.isEmpty());
    560             ADVANCE_TO(ScriptDataEscapedEndTagOpenState);
     560            ADVANCE_PAST_NON_NEWLINE_TO(ScriptDataEscapedEndTagOpenState);
    561561        }
    562562        if (isASCIIAlpha(character)) {
     
    565565            m_temporaryBuffer.clear();
    566566            appendToTemporaryBuffer(convertASCIIAlphaToLower(character));
    567             ADVANCE_TO(ScriptDataDoubleEscapeStartState);
     567            ADVANCE_PAST_NON_NEWLINE_TO(ScriptDataDoubleEscapeStartState);
    568568        }
    569569        bufferASCIICharacter('<');
     
    575575            appendToTemporaryBuffer(character);
    576576            appendToPossibleEndTag(convertASCIIAlphaToLower(character));
    577             ADVANCE_TO(ScriptDataEscapedEndTagNameState);
     577            ADVANCE_PAST_NON_NEWLINE_TO(ScriptDataEscapedEndTagNameState);
    578578        }
    579579        bufferASCIICharacter('<');
     
    586586            appendToTemporaryBuffer(character);
    587587            appendToPossibleEndTag(convertASCIIAlphaToLower(character));
    588             ADVANCE_TO(ScriptDataEscapedEndTagNameState);
     588            ADVANCE_PAST_NON_NEWLINE_TO(ScriptDataEscapedEndTagNameState);
    589589        }
    590590        if (isTokenizerWhitespace(character)) {
     
    623623            bufferASCIICharacter(character);
    624624            appendToTemporaryBuffer(convertASCIIAlphaToLower(character));
    625             ADVANCE_TO(ScriptDataDoubleEscapeStartState);
     625            ADVANCE_PAST_NON_NEWLINE_TO(ScriptDataDoubleEscapeStartState);
    626626        }
    627627        RECONSUME_IN(ScriptDataEscapedState);
     
    631631        if (character == '-') {
    632632            bufferASCIICharacter('-');
    633             ADVANCE_TO(ScriptDataDoubleEscapedDashState);
     633            ADVANCE_PAST_NON_NEWLINE_TO(ScriptDataDoubleEscapedDashState);
    634634        }
    635635        if (character == '<') {
    636636            bufferASCIICharacter('<');
    637             ADVANCE_TO(ScriptDataDoubleEscapedLessThanSignState);
     637            ADVANCE_PAST_NON_NEWLINE_TO(ScriptDataDoubleEscapedLessThanSignState);
    638638        }
    639639        if (character == kEndOfFileMarker) {
     
    648648        if (character == '-') {
    649649            bufferASCIICharacter('-');
    650             ADVANCE_TO(ScriptDataDoubleEscapedDashDashState);
     650            ADVANCE_PAST_NON_NEWLINE_TO(ScriptDataDoubleEscapedDashDashState);
    651651        }
    652652        if (character == '<') {
    653653            bufferASCIICharacter('<');
    654             ADVANCE_TO(ScriptDataDoubleEscapedLessThanSignState);
     654            ADVANCE_PAST_NON_NEWLINE_TO(ScriptDataDoubleEscapedLessThanSignState);
    655655        }
    656656        if (character == kEndOfFileMarker) {
     
    665665        if (character == '-') {
    666666            bufferASCIICharacter('-');
    667             ADVANCE_TO(ScriptDataDoubleEscapedDashDashState);
     667            ADVANCE_PAST_NON_NEWLINE_TO(ScriptDataDoubleEscapedDashDashState);
    668668        }
    669669        if (character == '<') {
    670670            bufferASCIICharacter('<');
    671             ADVANCE_TO(ScriptDataDoubleEscapedLessThanSignState);
     671            ADVANCE_PAST_NON_NEWLINE_TO(ScriptDataDoubleEscapedLessThanSignState);
    672672        }
    673673        if (character == '>') {
    674674            bufferASCIICharacter('>');
    675             ADVANCE_TO(ScriptDataState);
     675            ADVANCE_PAST_NON_NEWLINE_TO(ScriptDataState);
    676676        }
    677677        if (character == kEndOfFileMarker) {
     
    687687            bufferASCIICharacter('/');
    688688            m_temporaryBuffer.clear();
    689             ADVANCE_TO(ScriptDataDoubleEscapeEndState);
     689            ADVANCE_PAST_NON_NEWLINE_TO(ScriptDataDoubleEscapeEndState);
    690690        }
    691691        RECONSUME_IN(ScriptDataDoubleEscapedState);
     
    703703            bufferASCIICharacter(character);
    704704            appendToTemporaryBuffer(convertASCIIAlphaToLower(character));
    705             ADVANCE_TO(ScriptDataDoubleEscapeEndState);
     705            ADVANCE_PAST_NON_NEWLINE_TO(ScriptDataDoubleEscapeEndState);
    706706        }
    707707        RECONSUME_IN(ScriptDataDoubleEscapedState);
     
    712712            ADVANCE_TO(BeforeAttributeNameState);
    713713        if (character == '/')
    714             ADVANCE_TO(SelfClosingStartTagState);
     714            ADVANCE_PAST_NON_NEWLINE_TO(SelfClosingStartTagState);
    715715        if (character == '>')
    716716            return emitAndResumeInDataState(source);
     
    725725        m_token.beginAttribute(source.numberOfCharactersConsumed());
    726726        m_token.appendToAttributeName(toASCIILower(character));
    727         ADVANCE_TO(AttributeNameState);
     727        ADVANCE_PAST_NON_NEWLINE_TO(AttributeNameState);
    728728    END_STATE()
    729729
     
    732732            ADVANCE_TO(AfterAttributeNameState);
    733733        if (character == '/')
    734             ADVANCE_TO(SelfClosingStartTagState);
     734            ADVANCE_PAST_NON_NEWLINE_TO(SelfClosingStartTagState);
    735735        if (character == '=')
    736             ADVANCE_TO(BeforeAttributeValueState);
     736            ADVANCE_PAST_NON_NEWLINE_TO(BeforeAttributeValueState);
    737737        if (character == '>')
    738738            return emitAndResumeInDataState(source);
     
    746746            parseError();
    747747        m_token.appendToAttributeName(toASCIILower(character));
    748         ADVANCE_TO(AttributeNameState);
     748        ADVANCE_PAST_NON_NEWLINE_TO(AttributeNameState);
    749749    END_STATE()
    750750
     
    753753            ADVANCE_TO(AfterAttributeNameState);
    754754        if (character == '/')
    755             ADVANCE_TO(SelfClosingStartTagState);
     755            ADVANCE_PAST_NON_NEWLINE_TO(SelfClosingStartTagState);
    756756        if (character == '=')
    757             ADVANCE_TO(BeforeAttributeValueState);
     757            ADVANCE_PAST_NON_NEWLINE_TO(BeforeAttributeValueState);
    758758        if (character == '>')
    759759            return emitAndResumeInDataState(source);
     
    768768        m_token.beginAttribute(source.numberOfCharactersConsumed());
    769769        m_token.appendToAttributeName(toASCIILower(character));
    770         ADVANCE_TO(AttributeNameState);
     770        ADVANCE_PAST_NON_NEWLINE_TO(AttributeNameState);
    771771    END_STATE()
    772772
     
    775775            ADVANCE_TO(BeforeAttributeValueState);
    776776        if (character == '"')
    777             ADVANCE_TO(AttributeValueDoubleQuotedState);
     777            ADVANCE_PAST_NON_NEWLINE_TO(AttributeValueDoubleQuotedState);
    778778        if (character == '&')
    779779            RECONSUME_IN(AttributeValueUnquotedState);
    780780        if (character == '\'')
    781             ADVANCE_TO(AttributeValueSingleQuotedState);
     781            ADVANCE_PAST_NON_NEWLINE_TO(AttributeValueSingleQuotedState);
    782782        if (character == '>') {
    783783            parseError();
     
    791791            parseError();
    792792        m_token.appendToAttributeValue(character);
    793         ADVANCE_TO(AttributeValueUnquotedState);
     793        ADVANCE_PAST_NON_NEWLINE_TO(AttributeValueUnquotedState);
    794794    END_STATE()
    795795
     
    797797        if (character == '"') {
    798798            m_token.endAttribute(source.numberOfCharactersConsumed());
    799             ADVANCE_TO(AfterAttributeValueQuotedState);
     799            ADVANCE_PAST_NON_NEWLINE_TO(AfterAttributeValueQuotedState);
    800800        }
    801801        if (character == '&') {
    802802            m_additionalAllowedCharacter = '"';
    803             ADVANCE_TO(CharacterReferenceInAttributeValueState);
     803            ADVANCE_PAST_NON_NEWLINE_TO(CharacterReferenceInAttributeValueState);
    804804        }
    805805        if (character == kEndOfFileMarker) {
     
    815815        if (character == '\'') {
    816816            m_token.endAttribute(source.numberOfCharactersConsumed());
    817             ADVANCE_TO(AfterAttributeValueQuotedState);
     817            ADVANCE_PAST_NON_NEWLINE_TO(AfterAttributeValueQuotedState);
    818818        }
    819819        if (character == '&') {
    820820            m_additionalAllowedCharacter = '\'';
    821             ADVANCE_TO(CharacterReferenceInAttributeValueState);
     821            ADVANCE_PAST_NON_NEWLINE_TO(CharacterReferenceInAttributeValueState);
    822822        }
    823823        if (character == kEndOfFileMarker) {
     
    837837        if (character == '&') {
    838838            m_additionalAllowedCharacter = '>';
    839             ADVANCE_TO(CharacterReferenceInAttributeValueState);
     839            ADVANCE_PAST_NON_NEWLINE_TO(CharacterReferenceInAttributeValueState);
    840840        }
    841841        if (character == '>') {
     
    851851            parseError();
    852852        m_token.appendToAttributeValue(character);
    853         ADVANCE_TO(AttributeValueUnquotedState);
     853        ADVANCE_PAST_NON_NEWLINE_TO(AttributeValueUnquotedState);
    854854    END_STATE()
    855855
     
    883883            ADVANCE_TO(BeforeAttributeNameState);
    884884        if (character == '/')
    885             ADVANCE_TO(SelfClosingStartTagState);
     885            ADVANCE_PAST_NON_NEWLINE_TO(SelfClosingStartTagState);
    886886        if (character == '>')
    887887            return emitAndResumeInDataState(source);
     
    933933                RETURN_IN_CURRENT_STATE(haveBufferedCharacterToken());
    934934        } else if (isASCIIAlphaCaselessEqual(character, 'd')) {
    935             auto result = source.advancePastIgnoringCase("doctype");
     935            auto result = source.advancePastLettersIgnoringASCIICase("doctype");
    936936            if (result == SegmentedString::DidMatch)
    937937                SWITCH_TO(DOCTYPEState);
     
    951951    BEGIN_STATE(CommentStartState)
    952952        if (character == '-')
    953             ADVANCE_TO(CommentStartDashState);
     953            ADVANCE_PAST_NON_NEWLINE_TO(CommentStartDashState);
    954954        if (character == '>') {
    955955            parseError();
     
    966966    BEGIN_STATE(CommentStartDashState)
    967967        if (character == '-')
    968             ADVANCE_TO(CommentEndState);
     968            ADVANCE_PAST_NON_NEWLINE_TO(CommentEndState);
    969969        if (character == '>') {
    970970            parseError();
     
    982982    BEGIN_STATE(CommentState)
    983983        if (character == '-')
    984             ADVANCE_TO(CommentEndDashState);
     984            ADVANCE_PAST_NON_NEWLINE_TO(CommentEndDashState);
    985985        if (character == kEndOfFileMarker) {
    986986            parseError();
     
    993993    BEGIN_STATE(CommentEndDashState)
    994994        if (character == '-')
    995             ADVANCE_TO(CommentEndState);
     995            ADVANCE_PAST_NON_NEWLINE_TO(CommentEndState);
    996996        if (character == kEndOfFileMarker) {
    997997            parseError();
     
    10081008        if (character == '!') {
    10091009            parseError();
    1010             ADVANCE_TO(CommentEndBangState);
     1010            ADVANCE_PAST_NON_NEWLINE_TO(CommentEndBangState);
    10111011        }
    10121012        if (character == '-') {
    10131013            parseError();
    10141014            m_token.appendToComment('-');
    1015             ADVANCE_TO(CommentEndState);
     1015            ADVANCE_PAST_NON_NEWLINE_TO(CommentEndState);
    10161016        }
    10171017        if (character == kEndOfFileMarker) {
     
    10311031            m_token.appendToComment('-');
    10321032            m_token.appendToComment('!');
    1033             ADVANCE_TO(CommentEndDashState);
     1033            ADVANCE_PAST_NON_NEWLINE_TO(CommentEndDashState);
    10341034        }
    10351035        if (character == '>')
     
    10751075        }
    10761076        m_token.beginDOCTYPE(toASCIILower(character));
    1077         ADVANCE_TO(DOCTYPENameState);
     1077        ADVANCE_PAST_NON_NEWLINE_TO(DOCTYPENameState);
    10781078    END_STATE()
    10791079
     
    10891089        }
    10901090        m_token.appendToName(toASCIILower(character));
    1091         ADVANCE_TO(DOCTYPENameState);
     1091        ADVANCE_PAST_NON_NEWLINE_TO(DOCTYPENameState);
    10921092    END_STATE()
    10931093
     
    11031103        }
    11041104        if (isASCIIAlphaCaselessEqual(character, 'p')) {
    1105             auto result = source.advancePastIgnoringCase("public");
     1105            auto result = source.advancePastLettersIgnoringASCIICase("public");
    11061106            if (result == SegmentedString::DidMatch)
    11071107                SWITCH_TO(AfterDOCTYPEPublicKeywordState);
     
    11091109                RETURN_IN_CURRENT_STATE(haveBufferedCharacterToken());
    11101110        } else if (isASCIIAlphaCaselessEqual(character, 's')) {
    1111             auto result = source.advancePastIgnoringCase("system");
     1111            auto result = source.advancePastLettersIgnoringASCIICase("system");
    11121112            if (result == SegmentedString::DidMatch)
    11131113                SWITCH_TO(AfterDOCTYPESystemKeywordState);
     
    11171117        parseError();
    11181118        m_token.setForceQuirks();
    1119         ADVANCE_TO(BogusDOCTYPEState);
     1119        ADVANCE_PAST_NON_NEWLINE_TO(BogusDOCTYPEState);
    11201120    END_STATE()
    11211121
     
    11261126            parseError();
    11271127            m_token.setPublicIdentifierToEmptyString();
    1128             ADVANCE_TO(DOCTYPEPublicIdentifierDoubleQuotedState);
     1128            ADVANCE_PAST_NON_NEWLINE_TO(DOCTYPEPublicIdentifierDoubleQuotedState);
    11291129        }
    11301130        if (character == '\'') {
    11311131            parseError();
    11321132            m_token.setPublicIdentifierToEmptyString();
    1133             ADVANCE_TO(DOCTYPEPublicIdentifierSingleQuotedState);
     1133            ADVANCE_PAST_NON_NEWLINE_TO(DOCTYPEPublicIdentifierSingleQuotedState);
    11341134        }
    11351135        if (character == '>') {
     
    11451145        parseError();
    11461146        m_token.setForceQuirks();
    1147         ADVANCE_TO(BogusDOCTYPEState);
     1147        ADVANCE_PAST_NON_NEWLINE_TO(BogusDOCTYPEState);
    11481148    END_STATE()
    11491149
     
    11531153        if (character == '"') {
    11541154            m_token.setPublicIdentifierToEmptyString();
    1155             ADVANCE_TO(DOCTYPEPublicIdentifierDoubleQuotedState);
     1155            ADVANCE_PAST_NON_NEWLINE_TO(DOCTYPEPublicIdentifierDoubleQuotedState);
    11561156        }
    11571157        if (character == '\'') {
    11581158            m_token.setPublicIdentifierToEmptyString();
    1159             ADVANCE_TO(DOCTYPEPublicIdentifierSingleQuotedState);
     1159            ADVANCE_PAST_NON_NEWLINE_TO(DOCTYPEPublicIdentifierSingleQuotedState);
    11601160        }
    11611161        if (character == '>') {
     
    11711171        parseError();
    11721172        m_token.setForceQuirks();
    1173         ADVANCE_TO(BogusDOCTYPEState);
     1173        ADVANCE_PAST_NON_NEWLINE_TO(BogusDOCTYPEState);
    11741174    END_STATE()
    11751175
    11761176    BEGIN_STATE(DOCTYPEPublicIdentifierDoubleQuotedState)
    11771177        if (character == '"')
    1178             ADVANCE_TO(AfterDOCTYPEPublicIdentifierState);
     1178            ADVANCE_PAST_NON_NEWLINE_TO(AfterDOCTYPEPublicIdentifierState);
    11791179        if (character == '>') {
    11801180            parseError();
     
    11931193    BEGIN_STATE(DOCTYPEPublicIdentifierSingleQuotedState)
    11941194        if (character == '\'')
    1195             ADVANCE_TO(AfterDOCTYPEPublicIdentifierState);
     1195            ADVANCE_PAST_NON_NEWLINE_TO(AfterDOCTYPEPublicIdentifierState);
    11961196        if (character == '>') {
    11971197            parseError();
     
    12161216            parseError();
    12171217            m_token.setSystemIdentifierToEmptyString();
    1218             ADVANCE_TO(DOCTYPESystemIdentifierDoubleQuotedState);
     1218            ADVANCE_PAST_NON_NEWLINE_TO(DOCTYPESystemIdentifierDoubleQuotedState);
    12191219        }
    12201220        if (character == '\'') {
    12211221            parseError();
    12221222            m_token.setSystemIdentifierToEmptyString();
    1223             ADVANCE_TO(DOCTYPESystemIdentifierSingleQuotedState);
     1223            ADVANCE_PAST_NON_NEWLINE_TO(DOCTYPESystemIdentifierSingleQuotedState);
    12241224        }
    12251225        if (character == kEndOfFileMarker) {
     
    12301230        parseError();
    12311231        m_token.setForceQuirks();
    1232         ADVANCE_TO(BogusDOCTYPEState);
     1232        ADVANCE_PAST_NON_NEWLINE_TO(BogusDOCTYPEState);
    12331233    END_STATE()
    12341234
     
    12401240        if (character == '"') {
    12411241            m_token.setSystemIdentifierToEmptyString();
    1242             ADVANCE_TO(DOCTYPESystemIdentifierDoubleQuotedState);
     1242            ADVANCE_PAST_NON_NEWLINE_TO(DOCTYPESystemIdentifierDoubleQuotedState);
    12431243        }
    12441244        if (character == '\'') {
    12451245            m_token.setSystemIdentifierToEmptyString();
    1246             ADVANCE_TO(DOCTYPESystemIdentifierSingleQuotedState);
     1246            ADVANCE_PAST_NON_NEWLINE_TO(DOCTYPESystemIdentifierSingleQuotedState);
    12471247        }
    12481248        if (character == kEndOfFileMarker) {
     
    12531253        parseError();
    12541254        m_token.setForceQuirks();
    1255         ADVANCE_TO(BogusDOCTYPEState);
     1255        ADVANCE_PAST_NON_NEWLINE_TO(BogusDOCTYPEState);
    12561256    END_STATE()
    12571257
     
    12621262            parseError();
    12631263            m_token.setSystemIdentifierToEmptyString();
    1264             ADVANCE_TO(DOCTYPESystemIdentifierDoubleQuotedState);
     1264            ADVANCE_PAST_NON_NEWLINE_TO(DOCTYPESystemIdentifierDoubleQuotedState);
    12651265        }
    12661266        if (character == '\'') {
    12671267            parseError();
    12681268            m_token.setSystemIdentifierToEmptyString();
    1269             ADVANCE_TO(DOCTYPESystemIdentifierSingleQuotedState);
     1269            ADVANCE_PAST_NON_NEWLINE_TO(DOCTYPESystemIdentifierSingleQuotedState);
    12701270        }
    12711271        if (character == '>') {
     
    12811281        parseError();
    12821282        m_token.setForceQuirks();
    1283         ADVANCE_TO(BogusDOCTYPEState);
     1283        ADVANCE_PAST_NON_NEWLINE_TO(BogusDOCTYPEState);
    12841284    END_STATE()
    12851285
     
    12891289        if (character == '"') {
    12901290            m_token.setSystemIdentifierToEmptyString();
    1291             ADVANCE_TO(DOCTYPESystemIdentifierDoubleQuotedState);
     1291            ADVANCE_PAST_NON_NEWLINE_TO(DOCTYPESystemIdentifierDoubleQuotedState);
    12921292        }
    12931293        if (character == '\'') {
    12941294            m_token.setSystemIdentifierToEmptyString();
    1295             ADVANCE_TO(DOCTYPESystemIdentifierSingleQuotedState);
     1295            ADVANCE_PAST_NON_NEWLINE_TO(DOCTYPESystemIdentifierSingleQuotedState);
    12961296        }
    12971297        if (character == '>') {
     
    13071307        parseError();
    13081308        m_token.setForceQuirks();
    1309         ADVANCE_TO(BogusDOCTYPEState);
     1309        ADVANCE_PAST_NON_NEWLINE_TO(BogusDOCTYPEState);
    13101310    END_STATE()
    13111311
    13121312    BEGIN_STATE(DOCTYPESystemIdentifierDoubleQuotedState)
    13131313        if (character == '"')
    1314             ADVANCE_TO(AfterDOCTYPESystemIdentifierState);
     1314            ADVANCE_PAST_NON_NEWLINE_TO(AfterDOCTYPESystemIdentifierState);
    13151315        if (character == '>') {
    13161316            parseError();
     
    13291329    BEGIN_STATE(DOCTYPESystemIdentifierSingleQuotedState)
    13301330        if (character == '\'')
    1331             ADVANCE_TO(AfterDOCTYPESystemIdentifierState);
     1331            ADVANCE_PAST_NON_NEWLINE_TO(AfterDOCTYPESystemIdentifierState);
    13321332        if (character == '>') {
    13331333            parseError();
     
    13551355        }
    13561356        parseError();
    1357         ADVANCE_TO(BogusDOCTYPEState);
     1357        ADVANCE_PAST_NON_NEWLINE_TO(BogusDOCTYPEState);
    13581358    END_STATE()
    13591359
     
    13681368    BEGIN_STATE(CDATASectionState)
    13691369        if (character == ']')
    1370             ADVANCE_TO(CDATASectionRightSquareBracketState);
     1370            ADVANCE_PAST_NON_NEWLINE_TO(CDATASectionRightSquareBracketState);
    13711371        if (character == kEndOfFileMarker)
    13721372            RECONSUME_IN(DataState);
     
    13771377    BEGIN_STATE(CDATASectionRightSquareBracketState)
    13781378        if (character == ']')
    1379             ADVANCE_TO(CDATASectionDoubleRightSquareBracketState);
     1379            ADVANCE_PAST_NON_NEWLINE_TO(CDATASectionDoubleRightSquareBracketState);
    13801380        bufferASCIICharacter(']');
    13811381        RECONSUME_IN(CDATASectionState);
     
    13841384    BEGIN_STATE(CDATASectionDoubleRightSquareBracketState)
    13851385        if (character == '>')
    1386             ADVANCE_TO(DataState);
     1386            ADVANCE_PAST_NON_NEWLINE_TO(DataState);
    13871387        bufferASCIICharacter(']');
    13881388        bufferASCIICharacter(']');
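
Note on the HTMLTokenizer.cpp hunks above: almost every change swaps ADVANCE_TO for ADVANCE_PAST_NON_NEWLINE_TO in a branch where the current character has just been compared against a specific non-newline character ('-', '<', '"', an ASCII letter, and so on). The following is a minimal sketch of why that can be cheaper. It is not WebKit's SegmentedString; the TinyCursor class and the countLinesWhileSkippingDashes helper are hypothetical names invented for illustration, and only the two function names advance and advancePastNonNewline are taken from the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical stand-in for a string cursor that tracks line numbers.
// It only illustrates the split between a general and a fast advance path.
class TinyCursor {
public:
    explicit TinyCursor(std::string_view text)
        : m_text(text)
    {
    }

    bool isEmpty() const { return m_position >= m_text.size(); }
    char currentCharacter() const { return m_text[m_position]; }
    unsigned line() const { return m_line; }

    // General path: has to test for '\n' so the line count stays correct.
    void advance()
    {
        if (currentCharacter() == '\n')
            ++m_line;
        ++m_position;
    }

    // Fast path: the caller already knows the current character is not a
    // newline, so the comparison and the line bookkeeping are skipped.
    void advancePastNonNewline()
    {
        assert(currentCharacter() != '\n');
        ++m_position;
    }

private:
    std::string_view m_text;
    std::size_t m_position { 0 };
    unsigned m_line { 1 };
};

// Walks the text, taking the fast path whenever the character was just
// matched against '-', and returns the final line number.
unsigned countLinesWhileSkippingDashes(std::string_view text)
{
    TinyCursor cursor(text);
    while (!cursor.isEmpty()) {
        if (cursor.currentCharacter() == '-')
            cursor.advancePastNonNewline();
        else
            cursor.advance();
    }
    return cursor.line();
}
```

In this sketch the saving is one comparison per character. The commit message attributes the benefit in the real tokenizer to the simplified advance functions being small enough to inline.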
  • trunk/Source/WebCore/html/parser/InputStreamPreprocessor.h

    r208179 r209058  
    2929
    3030#include "SegmentedString.h"
    31 #include <wtf/Noncopyable.h>
    3231#include <wtf/unicode/CharacterNames.h>
    3332
    3433namespace WebCore {
    3534
    36 const LChar kEndOfFileMarker = 0;
    37 
    3835// http://www.whatwg.org/specs/web-apps/current-work/#preprocessing-the-input-stream
    3936template <typename Tokenizer>
    4037class InputStreamPreprocessor {
    41     WTF_MAKE_NONCOPYABLE(InputStreamPreprocessor);
    4238public:
    4339    explicit InputStreamPreprocessor(Tokenizer& tokenizer)
    4440        : m_tokenizer(tokenizer)
    4541    {
    46         reset();
    4742    }
    4843
     
    5449    ALWAYS_INLINE bool peek(SegmentedString& source, bool skipNullCharacters = false)
    5550    {
    56         if (source.isEmpty())
     51        if (UNLIKELY(source.isEmpty()))
    5752            return false;
    5853
    59         m_nextInputCharacter = source.currentChar();
     54        m_nextInputCharacter = source.currentCharacter();
    6055
    6156        // Every branch in this function is expensive, so we have a
     
    6358        // handling. Please run the parser benchmark whenever you touch
    6459        // this function. It's very hot.
    65         static const UChar specialCharacterMask = '\n' | '\r' | '\0';
    66         if (m_nextInputCharacter & ~specialCharacterMask) {
     60        constexpr UChar specialCharacterMask = '\n' | '\r' | '\0';
     61        if (LIKELY(m_nextInputCharacter & ~specialCharacterMask)) {
    6762            m_skipNextNewLine = false;
    6863            return true;
    6964        }
     65
    7066        return processNextInputCharacter(source, skipNullCharacters);
    7167    }
     
    7470    ALWAYS_INLINE bool advance(SegmentedString& source, bool skipNullCharacters = false)
    7571    {
    76         source.advanceAndUpdateLineNumber();
     72        source.advance();
    7773        return peek(source, skipNullCharacters);
    7874    }
    79 
    80     bool skipNextNewLine() const { return m_skipNextNewLine; }
    81 
    82     void reset(bool skipNextNewLine = false)
     75    ALWAYS_INLINE bool advancePastNonNewline(SegmentedString& source, bool skipNullCharacters = false)
    8376    {
    84         m_nextInputCharacter = '\0';
    85         m_skipNextNewLine = skipNextNewLine;
     77        source.advancePastNonNewline();
     78        return peek(source, skipNullCharacters);
    8679    }
    8780
     
    9083    {
    9184    ProcessAgain:
    92         ASSERT(m_nextInputCharacter == source.currentChar());
    93 
     85        ASSERT(m_nextInputCharacter == source.currentCharacter());
    9486        if (m_nextInputCharacter == '\n' && m_skipNextNewLine) {
    9587            m_skipNextNewLine = false;
    96             source.advancePastNewlineAndUpdateLineNumber();
     88            source.advancePastNewline();
    9789            if (source.isEmpty())
    9890                return false;
    99             m_nextInputCharacter = source.currentChar();
     91            m_nextInputCharacter = source.currentCharacter();
    10092        }
    10193        if (m_nextInputCharacter == '\r') {
    10294            m_nextInputCharacter = '\n';
    10395            m_skipNextNewLine = true;
    104         } else {
    105             m_skipNextNewLine = false;
    106             // FIXME: The spec indicates that the surrogate pair range as well as
    107             // a number of specific character values are parse errors and should be replaced
    108             // by the replacement character. We suspect this is a problem with the spec as doing
    109             // that filtering breaks surrogate pair handling and causes us not to match Minefield.
    110             if (m_nextInputCharacter == '\0' && !shouldTreatNullAsEndOfFileMarker(source)) {
    111                 if (skipNullCharacters && !m_tokenizer.neverSkipNullCharacters()) {
    112                     source.advancePastNonNewline();
    113                     if (source.isEmpty())
    114                         return false;
    115                     m_nextInputCharacter = source.currentChar();
    116                     goto ProcessAgain;
    117                 }
    118                 m_nextInputCharacter = replacementCharacter;
    119             }
     96            return true;
    12097        }
     98        m_skipNextNewLine = false;
     99        if (m_nextInputCharacter || isAtEndOfFile(source))
     100            return true;
     101        if (skipNullCharacters && !m_tokenizer.neverSkipNullCharacters()) {
     102            source.advancePastNonNewline();
     103            if (source.isEmpty())
     104                return false;
     105            m_nextInputCharacter = source.currentCharacter();
     106            goto ProcessAgain;
     107        }
     108        m_nextInputCharacter = replacementCharacter;
    121109        return true;
    122110    }
    123111
    124     bool shouldTreatNullAsEndOfFileMarker(SegmentedString& source) const
     112    static bool isAtEndOfFile(SegmentedString& source)
    125113    {
    126114        return source.isClosed() && source.length() == 1;
     
    130118
    131119    // http://www.whatwg.org/specs/web-apps/current-work/#next-input-character
    132     UChar m_nextInputCharacter;
    133     bool m_skipNextNewLine;
     120    UChar m_nextInputCharacter { 0 };
     121    bool m_skipNextNewLine { false };
    134122};
    135123
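
Note on the InputStreamPreprocessor.h hunks above: peek relies on specialCharacterMask ('\n' | '\r' | '\0', which is 0x0F) as a one-branch filter. Any character with a bit set outside the low four bits cannot be LF, CR, or NUL, so it returns immediately; everything else, including harmless characters such as tab, falls through to the slow path, which sorts out the real cases. The sketch below illustrates that filter together with the CR/LF and NUL handling visible in processNextInputCharacter. It is a simplification under stated assumptions: preprocess and mayNeedSpecialHandling are hypothetical names, the input is a plain std::u16string_view rather than a SegmentedString, and the end-of-file NUL case and the skipNullCharacters option are left out.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Same construction as in the diff: OR-ing the three special characters
// gives 0x0F, so the mask covers every code unit below 0x10.
constexpr char16_t specialCharacterMask = u'\n' | u'\r' | u'\0';
constexpr char16_t replacementCharacter = 0xFFFD;

// True when the character might be LF, CR, or NUL. False positives
// (for example tab) are acceptable because the slow path handles them.
inline bool mayNeedSpecialHandling(char16_t character)
{
    return !(character & ~specialCharacterMask);
}

// Hypothetical helper: normalizes CR and CRLF to LF and replaces NUL
// with U+FFFD, using the mask to keep the common case to one branch.
std::u16string preprocess(std::u16string_view input)
{
    std::u16string output;
    bool skipNextNewline = false;
    for (char16_t character : input) {
        if (!mayNeedSpecialHandling(character)) {
            // Fast path: ordinary character.
            skipNextNewline = false;
            output.push_back(character);
            continue;
        }
        if (character == u'\n' && skipNextNewline) {
            // Second half of a CRLF pair; the CR already produced an LF.
            skipNextNewline = false;
            continue;
        }
        if (character == u'\r') {
            output.push_back(u'\n');
            skipNextNewline = true;
            continue;
        }
        skipNextNewline = false;
        output.push_back(character ? character : replacementCharacter);
    }
    return output;
}
```

The filter trades precision for speed: characters 0x01 through 0x0F take the slow path even though most of them need no special handling, which is cheap because they are rare in real markup.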
  • trunk/Source/WebCore/html/track/BufferedLineReader.cpp

    r165997 r209058  
    3636namespace WebCore {
    3737
    38 bool BufferedLineReader::getLine(String& line)
     38std::optional<String> BufferedLineReader::nextLine()
    3939{
    4040    if (m_maybeSkipLF) {
     
    4343        // then skip it, and then (unconditionally) return the buffered line.
    4444        if (!m_buffer.isEmpty()) {
    45             scanCharacter(newlineCharacter);
     45            if (m_buffer.currentCharacter() == newlineCharacter)
     46                m_buffer.advancePastNewline();
    4647            m_maybeSkipLF = false;
    4748        }
    4849        // If there was no (new) data available, then keep m_maybeSkipLF set,
    49         // and fall through all the way down to the EOS check at the end of
    50         // the method.
     50        // and fall through all the way down to the EOS check at the end of the function.
    5151    }
    5252
     
    5454    bool checkForLF = false;
    5555    while (!m_buffer.isEmpty()) {
    56         UChar c = m_buffer.currentChar();
     56        UChar character = m_buffer.currentCharacter();
    5757        m_buffer.advance();
    5858
    59         if (c == newlineCharacter || c == carriageReturn) {
     59        if (character == newlineCharacter || character == carriageReturn) {
    6060            // We found a line ending. Return the accumulated line.
    6161            shouldReturnLine = true;
    62             checkForLF = (c == carriageReturn);
     62            checkForLF = (character == carriageReturn);
    6363            break;
    6464        }
     
    6666        // NULs are transformed into U+FFFD (REPLACEMENT CHAR.) in step 1 of
    6767        // the WebVTT parser algorithm.
    68         if (c == '\0')
    69             c = replacementCharacter;
     68        if (character == '\0')
     69            character = replacementCharacter;
    7070
    71         m_lineBuffer.append(c);
     71        m_lineBuffer.append(character);
    7272    }
    7373
     
    7575        // May be in the middle of a CRLF pair.
    7676        if (!m_buffer.isEmpty()) {
    77             // Scan a potential newline character.
    78             scanCharacter(newlineCharacter);
     77            if (m_buffer.currentCharacter() == newlineCharacter)
     78                m_buffer.advancePastNewline();
    7979        } else {
    80             // Check for the LF on the next call (unless we reached EOS, in
     80            // Check for the newline on the next call (unless we reached EOS, in
    8181            // which case we'll return the contents of the line buffer, and
    8282            // reset state for the next line.)
     
    9393
    9494    if (shouldReturnLine) {
    95         line = m_lineBuffer.toString();
     95        auto line = m_lineBuffer.toString();
    9696        m_lineBuffer.clear();
    97         return true;
     97        return WTFMove(line);
    9898    }
    9999
    100100    ASSERT(m_buffer.isEmpty());
    101     return false;
     101    return std::nullopt;
    102102}
    103103
  • trunk/Source/WebCore/html/track/BufferedLineReader.h

    r208179 r209058  
    3939//
    4040// Converts a stream of data (== a sequence of Strings) into a set of
    41 // lines. CR, LR or CRLF are considered linebreaks. Normalizes NULs (U+0000)
    42 // to 'REPLACEMENT CHARACTER' (U+FFFD) and does not return the linebreaks as
     41// lines. CR, LR or CRLF are considered line breaks. Normalizes NULs (U+0000)
     42// to 'REPLACEMENT CHARACTER' (U+FFFD) and does not return the line breaks as
    4343// part of the result.
    4444class BufferedLineReader {
    4545    WTF_MAKE_NONCOPYABLE(BufferedLineReader);
    4646public:
    47     BufferedLineReader()
    48         : m_endOfStream(false)
    49         , m_maybeSkipLF(false) { }
     47    BufferedLineReader() = default;
     48    void reset();
    5049
    51     // Append data to the internal buffer.
    52     void append(const String& data)
     50    void append(String&& data)
    5351    {
    5452        ASSERT(!m_endOfStream);
    55         m_buffer.append(SegmentedString(data));
     53        m_buffer.append(WTFMove(data));
    5654    }
    5755
    58     // Indicate that no more data will be appended. This will cause any
    59     // potentially "unterminated" line to be returned from getLine.
    60     void setEndOfStream() { m_endOfStream = true; }
    61 
    62     // Attempt to read a line from the internal buffer (fed via append).
    63     // If successful, true is returned and |line| is set to the line that was
    64     // read. If no line could be read false is returned.
    65     bool getLine(String& line);
    66 
    67     // Returns true if EOS has been reached proper.
     56    void appendEndOfStream() { m_endOfStream = true; }
    6857    bool isAtEndOfStream() const { return m_endOfStream && m_buffer.isEmpty(); }
    6958
    70     void reset() { m_buffer.clear(); }
     59    std::optional<String> nextLine();
    7160
    7261private:
    73     // Consume the next character the buffer if it is the character |c|.
    74     void scanCharacter(UChar c)
    75     {
    76         ASSERT(!m_buffer.isEmpty());
    77         if (m_buffer.currentChar() == c)
    78             m_buffer.advance();
    79     }
    80 
    8162    SegmentedString m_buffer;
    8263    StringBuilder m_lineBuffer;
    83     bool m_endOfStream;
    84     bool m_maybeSkipLF;
     64    bool m_endOfStream { false };
     65    bool m_maybeSkipLF { false };
    8566};
    8667
     68inline void BufferedLineReader::reset()
     69{
     70    m_buffer.clear();
     71    m_lineBuffer.clear();
     72    m_endOfStream = false;
     73    m_maybeSkipLF = false;
     74}
     75
    8776} // namespace WebCore
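
Note on the BufferedLineReader change above: getLine(String&) returning bool becomes nextLine() returning std::optional<String>, so "no complete line yet" is carried by the return value instead of an out parameter. Below is a minimal standalone sketch of that shape. It is not WebKit code. LineReaderSketch and countLinesExample are hypothetical names, std::string stands in for WTF::String and SegmentedString, '?' stands in for U+FFFD, and the end-of-stream handling of an unterminated last line is an assumption, since that part of the function is elided in the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <utility>

// Hypothetical stand-in for BufferedLineReader, for illustration only.
class LineReaderSketch {
public:
    void append(std::string&& data) { m_buffer += data; }
    void appendEndOfStream() { m_endOfStream = true; }

    // Returns the next complete line, or std::nullopt if more data is needed.
    std::optional<std::string> nextLine()
    {
        // A CR ended the previous line at the end of the buffered data;
        // if the new data starts with LF, it belongs to that CRLF pair.
        if (m_maybeSkipLF && m_position < m_buffer.size()) {
            if (m_buffer[m_position] == '\n')
                ++m_position;
            m_maybeSkipLF = false;
        }

        bool shouldReturnLine = false;
        while (m_position < m_buffer.size()) {
            char character = m_buffer[m_position++];
            if (character == '\n' || character == '\r') {
                shouldReturnLine = true;
                if (character == '\r') {
                    if (m_position == m_buffer.size())
                        m_maybeSkipLF = true; // Check for the LF on the next call.
                    else if (m_buffer[m_position] == '\n')
                        ++m_position;
                }
                break;
            }
            // NUL is replaced ('?' here, U+FFFD in the real parser).
            m_line.push_back(character == '\0' ? '?' : character);
        }

        // Assumption: at end of stream an unterminated line is still returned.
        if (!shouldReturnLine && m_endOfStream && !m_line.empty())
            shouldReturnLine = true;
        if (!shouldReturnLine)
            return std::nullopt;

        std::string line = std::move(m_line);
        m_line.clear();
        return std::optional<std::string>(std::move(line));
    }

private:
    std::string m_buffer;
    std::size_t m_position { 0 };
    std::string m_line;
    bool m_endOfStream { false };
    bool m_maybeSkipLF { false };
};

// Drains the reader with the same loop shape the patch uses in WebVTTParser::parse.
inline unsigned countLinesExample()
{
    LineReaderSketch reader;
    reader.append("one\r");
    reader.append("\ntwo\nthree");
    reader.appendEndOfStream();
    unsigned count = 0;
    while (auto line = reader.nextLine()) {
        assert(!line->empty());
        ++count;
    }
    return count;
}
```

With the optional return type the caller no longer needs a separate isNull() check inside the loop, which matches the lines removed from WebVTTParser::parse in this changeset.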
  • trunk/Source/WebCore/html/track/InbandGenericTextTrack.cpp

    r208658 r209058  
    186186}
    187187
    188 void InbandGenericTextTrack::parseWebVTTFileHeader(InbandTextTrackPrivate* trackPrivate, String header)
     188void InbandGenericTextTrack::parseWebVTTFileHeader(InbandTextTrackPrivate* trackPrivate, String&& header)
    189189{
    190190    ASSERT_UNUSED(trackPrivate, trackPrivate == m_private);
    191     parser().parseFileHeader(header);
     191    parser().parseFileHeader(WTFMove(header));
    192192}
    193193
  • trunk/Source/WebCore/html/track/InbandGenericTextTrack.h

    r207907 r209058  
    7373    WebVTTParser& parser();
    7474    void parseWebVTTCueData(InbandTextTrackPrivate*, const ISOWebVTTCue&) final;
    75     void parseWebVTTFileHeader(InbandTextTrackPrivate*, String) final;
     75    void parseWebVTTFileHeader(InbandTextTrackPrivate*, String&&) final;
    7676
    7777    void newCuesParsed() final;
  • trunk/Source/WebCore/html/track/InbandTextTrack.h

    r200361 r209058  
    8080    void removeGenericCue(InbandTextTrackPrivate*, GenericCueData*) override { ASSERT_NOT_REACHED(); }
    8181
    82     void parseWebVTTFileHeader(InbandTextTrackPrivate*, String) override { ASSERT_NOT_REACHED(); }
     82    void parseWebVTTFileHeader(InbandTextTrackPrivate*, String&&) override { ASSERT_NOT_REACHED(); }
    8383    void parseWebVTTCueData(InbandTextTrackPrivate*, const char*, unsigned) override { ASSERT_NOT_REACHED(); }
    8484    void parseWebVTTCueData(InbandTextTrackPrivate*, const ISOWebVTTCue&) override { ASSERT_NOT_REACHED(); }
  • trunk/Source/WebCore/html/track/WebVTTParser.cpp

    r203302 r209058  
    105105}
    106106
    107 void WebVTTParser::parseFileHeader(const String& data)
     107void WebVTTParser::parseFileHeader(String&& data)
    108108{
    109109    m_state = Initial;
    110110    m_lineReader.reset();
    111     m_lineReader.append(data);
     111    m_lineReader.append(WTFMove(data));
    112112    parse();
    113113}
     
    115115void WebVTTParser::parseBytes(const char* data, unsigned length)
    116116{
    117     String textData = m_decoder->decode(data, length);
    118     m_lineReader.append(textData);
     117    m_lineReader.append(m_decoder->decode(data, length));
    119118    parse();
    120119}
     
    122121void WebVTTParser::parseCueData(const ISOWebVTTCue& data)
    123122{
    124     RefPtr<WebVTTCueData> cue = WebVTTCueData::create();
     123    auto cue = WebVTTCueData::create();
    125124
    126125    MediaTime startTime = data.presentationTime();
     
    136135        cue->setOriginalStartTime(originalStartTime);
    137136
    138     m_cuelist.append(cue);
     137    m_cuelist.append(WTFMove(cue));
    139138    if (m_client)
    140139        m_client->newCuesParsed();
     
    143142void WebVTTParser::flush()
    144143{
    145     String textData = m_decoder->flush();
    146     m_lineReader.append(textData);
    147     m_lineReader.setEndOfStream();
     144    m_lineReader.append(m_decoder->flush());
     145    m_lineReader.appendEndOfStream();
    148146    parse();
    149147    flushPendingCue();
     
    154152    // WebVTT parser algorithm. (5.1 WebVTT file parsing.)
    155153    // Steps 1 - 3 - Initial setup.
    156     String line;
    157     while (m_lineReader.getLine(line)) {
    158         if (line.isNull())
    159             return;
    160 
     154    while (auto line = m_lineReader.nextLine()) {
    161155        switch (m_state) {
    162156        case Initial:
    163157            // Steps 4 - 9 - Check for a valid WebVTT signature.
    164             if (!hasRequiredFileIdentifier(line)) {
     158            if (!hasRequiredFileIdentifier(*line)) {
    165159                if (m_client)
    166160                    m_client->fileFailedToParse();
     
    172166
    173167        case Header:
    174             collectMetadataHeader(line);
    175 
    176             if (line.isEmpty()) {
     168            collectMetadataHeader(*line);
     169
     170            if (line->isEmpty()) {
    177171                // Steps 10-14 - Allow a header (comment area) under the WEBVTT line.
    178172                if (m_client && m_regionList.size())
     
    182176            }
    183177            // Step 15 - Break out of header loop if the line could be a timestamp line.
    184             if (line.contains("-->"))
    185                 m_state = recoverCue(line);
     178            if (line->contains("-->"))
     179                m_state = recoverCue(*line);
    186180
    187181            // Step 16 - Line is not the empty string and does not contain "-->".
     
    190184        case Id:
    191185            // Steps 17 - 20 - Allow any number of line terminators, then initialize new cue values.
    192             if (line.isEmpty())
     186            if (line->isEmpty())
    193187                break;
    194188
     
    197191
    198192            // Steps 22 - 25 - Check if this line contains an optional identifier or timing data.
    199             m_state = collectCueId(line);
     193            m_state = collectCueId(*line);
    200194            break;
    201195
    202196        case TimingsAndSettings:
    203197            // Steps 26 - 27 - Discard current cue if the line is empty.
    204             if (line.isEmpty()) {
     198            if (line->isEmpty()) {
    205199                m_state = Id;
    206200                break;
     
    208202
    209203            // Steps 28 - 29 - Collect cue timings and settings.
    210             m_state = collectTimingsAndSettings(line);
     204            m_state = collectTimingsAndSettings(*line);
    211205            break;
    212206
    213207        case CueText:
    214208            // Steps 31 - 41 - Collect the cue text, create a cue, and add it to the output.
    215             m_state = collectCueText(line);
     209            m_state = collectCueText(*line);
    216210            break;
    217211
    218212        case BadCue:
    219213            // Steps 42 - 48 - Discard lines until an empty line or a potential timing line is seen.
    220             m_state = ignoreBadCue(line);
     214            m_state = ignoreBadCue(*line);
    221215            break;
    222216
  • trunk/Source/WebCore/html/track/WebVTTParser.h

    r208179 r209058  
    134134    // Input data to the parser to parse.
    135135    void parseBytes(const char*, unsigned);
    136     void parseFileHeader(const String&);
     136    void parseFileHeader(String&&);
    137137    void parseCueData(const ISOWebVTTCue&);
    138138    void flush();
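
Note on the String&& signatures above: parseWebVTTFileHeader, parseFileHeader and BufferedLineReader::append now take String&& and forward with WTFMove, so the string is handed down to the buffer rather than copied at each layer. A rough standalone sketch of the same pattern follows, assuming std::string and std::move in place of WTF::String and WTFMove. ChunkBufferSketch and ParserSketch are hypothetical names and do not appear in WebKit.

```cpp
#include <cstddef>
#include <deque>
#include <string>
#include <utility>

// Hypothetical buffer that owns the chunks it is given.
class ChunkBufferSketch {
public:
    void reset() { m_chunks.clear(); }

    // Takes ownership of the chunk; empty chunks are not stored.
    void append(std::string&& chunk)
    {
        if (chunk.empty())
            return;
        m_chunks.push_back(std::move(chunk));
    }

    std::size_t length() const
    {
        std::size_t length = 0;
        for (auto& chunk : m_chunks)
            length += chunk.size();
        return length;
    }

private:
    std::deque<std::string> m_chunks;
};

// Hypothetical parser front end that forwards ownership to the buffer.
class ParserSketch {
public:
    void parseFileHeader(std::string&& data)
    {
        m_buffer.reset();
        m_buffer.append(std::move(data)); // Forwarded, not copied.
    }

    void parseBytes(const char* data, unsigned length)
    {
        // The temporary binds directly to the rvalue-reference parameter.
        m_buffer.append(std::string(data, length));
    }

    std::size_t bufferedLength() const { return m_buffer.length(); }

private:
    ChunkBufferSketch m_buffer;
};
```

A caller that still needs its string afterwards has to make an explicit copy at the call site, which is the trade-off of an rvalue-reference parameter compared with const String&.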
  • trunk/Source/WebCore/html/track/WebVTTTokenizer.cpp

    r178265 r209058  
    3131
    3232#include "config.h"
     33#include "WebVTTTokenizer.h"
    3334
    3435#if ENABLE(VIDEO_TRACK)
    35 
    36 #include "WebVTTTokenizer.h"
    3736
    3837#include "MarkupTokenizerInlines.h"
     
    4948        goto stateName;                                     \
    5049    } while (false)
    51    
     50
    5251template<unsigned charactersCount> ALWAYS_INLINE bool equalLiteral(const StringBuilder& s, const char (&characters)[charactersCount])
    5352{
     
    7069inline bool advanceAndEmitToken(SegmentedString& source, WebVTTToken& resultToken, const WebVTTToken& token)
    7170{
    72     source.advanceAndUpdateLineNumber();
     71    source.advance();
    7372    return emitToken(resultToken, token);
    7473}
     
    8079    // Append an EOF marker and close the input "stream".
    8180    ASSERT(!m_input.isClosed());
    82     m_input.append(SegmentedString(String(&kEndOfFileMarker, 1)));
     81    m_input.append(String { &kEndOfFileMarker, 1 });
    8382    m_input.close();
    8483}
  • trunk/Source/WebCore/platform/graphics/InbandTextTrackPrivateClient.h

    r206538 r209058  
    181181    virtual void removeGenericCue(InbandTextTrackPrivate*, GenericCueData*) = 0;
    182182
    183     virtual void parseWebVTTFileHeader(InbandTextTrackPrivate*, String) { ASSERT_NOT_REACHED(); }
     183    virtual void parseWebVTTFileHeader(InbandTextTrackPrivate*, String&&) { ASSERT_NOT_REACHED(); }
    184184    virtual void parseWebVTTCueData(InbandTextTrackPrivate*, const char* data, unsigned length) = 0;
    185185    virtual void parseWebVTTCueData(InbandTextTrackPrivate*, const ISOWebVTTCue&) = 0;
  • trunk/Source/WebCore/platform/text/SegmentedString.cpp

    r178265 r209058  
    11/*
    2     Copyright (C) 2004, 2005, 2006, 2007, 2008 Apple Inc. All rights reserved.
     2    Copyright (C) 2004-2016 Apple Inc. All rights reserved.
    33
    44    This library is free software; you can redistribute it and/or
     
    2121#include "SegmentedString.h"
    2222
     23#include <wtf/text/StringBuilder.h>
    2324#include <wtf/text/TextPosition.h>
    2425
    2526namespace WebCore {
    2627
    27 SegmentedString::SegmentedString(const SegmentedString& other)
    28     : m_pushedChar1(other.m_pushedChar1)
    29     , m_pushedChar2(other.m_pushedChar2)
    30     , m_currentString(other.m_currentString)
    31     , m_numberOfCharactersConsumedPriorToCurrentString(other.m_numberOfCharactersConsumedPriorToCurrentString)
    32     , m_numberOfCharactersConsumedPriorToCurrentLine(other.m_numberOfCharactersConsumedPriorToCurrentLine)
    33     , m_currentLine(other.m_currentLine)
    34     , m_substrings(other.m_substrings)
    35     , m_closed(other.m_closed)
    36     , m_empty(other.m_empty)
    37     , m_fastPathFlags(other.m_fastPathFlags)
    38     , m_advanceFunc(other.m_advanceFunc)
    39     , m_advanceAndUpdateLineNumberFunc(other.m_advanceAndUpdateLineNumberFunc)
    40 {
    41     if (m_pushedChar2)
    42         m_currentChar = m_pushedChar2;
    43     else if (m_pushedChar1)
    44         m_currentChar = m_pushedChar1;
    45     else
    46         m_currentChar = m_currentString.m_length ? m_currentString.getCurrentChar() : 0;
    47 }
    48 
    49 SegmentedString& SegmentedString::operator=(const SegmentedString& other)
    50 {
    51     m_pushedChar1 = other.m_pushedChar1;
    52     m_pushedChar2 = other.m_pushedChar2;
    53     m_currentString = other.m_currentString;
    54     m_substrings = other.m_substrings;
    55     if (m_pushedChar2)
    56         m_currentChar = m_pushedChar2;
    57     else if (m_pushedChar1)
    58         m_currentChar = m_pushedChar1;
    59     else
    60         m_currentChar = m_currentString.m_length ? m_currentString.getCurrentChar() : 0;
    61 
    62     m_closed = other.m_closed;
    63     m_empty = other.m_empty;
    64     m_fastPathFlags = other.m_fastPathFlags;
    65     m_numberOfCharactersConsumedPriorToCurrentString = other.m_numberOfCharactersConsumedPriorToCurrentString;
     28inline void SegmentedString::Substring::appendTo(StringBuilder& builder) const
     29{
     30    builder.append(string, string.length() - length, length);
     31}
     32
     33SegmentedString& SegmentedString::operator=(SegmentedString&& other)
     34{
     35    m_currentSubstring = WTFMove(other.m_currentSubstring);
     36    m_otherSubstrings = WTFMove(other.m_otherSubstrings);
     37
     38    m_isClosed = other.m_isClosed;
     39
     40    m_currentCharacter = other.m_currentCharacter;
     41
     42    m_numberOfCharactersConsumedPriorToCurrentSubstring = other.m_numberOfCharactersConsumedPriorToCurrentSubstring;
    6643    m_numberOfCharactersConsumedPriorToCurrentLine = other.m_numberOfCharactersConsumedPriorToCurrentLine;
    6744    m_currentLine = other.m_currentLine;
    6845
    69     m_advanceFunc = other.m_advanceFunc;
    70     m_advanceAndUpdateLineNumberFunc = other.m_advanceAndUpdateLineNumberFunc;
     46    m_fastPathFlags = other.m_fastPathFlags;
     47    m_advanceWithoutUpdatingLineNumberFunction = other.m_advanceWithoutUpdatingLineNumberFunction;
     48    m_advanceAndUpdateLineNumberFunction = other.m_advanceAndUpdateLineNumberFunction;
     49
     50    other.clear();
    7151
    7252    return *this;
     
    7555unsigned SegmentedString::length() const
    7656{
    77     unsigned length = m_currentString.m_length;
    78     if (m_pushedChar1) {
    79         ++length;
    80         if (m_pushedChar2)
    81             ++length;
    82     }
    83     if (isComposite()) {
    84         Deque<SegmentedSubstring>::const_iterator it = m_substrings.begin();
    85         Deque<SegmentedSubstring>::const_iterator e = m_substrings.end();
    86         for (; it != e; ++it)
    87             length += it->m_length;
    88     }
     57    unsigned length = m_currentSubstring.length;
     58    for (auto& substring : m_otherSubstrings)
     59        length += substring.length;
    8960    return length;
    9061}
     
    9263void SegmentedString::setExcludeLineNumbers()
    9364{
    94     m_currentString.setExcludeLineNumbers();
    95     if (isComposite()) {
    96         Deque<SegmentedSubstring>::iterator it = m_substrings.begin();
    97         Deque<SegmentedSubstring>::iterator e = m_substrings.end();
    98         for (; it != e; ++it)
    99             it->setExcludeLineNumbers();
    100     }
     65    if (!m_currentSubstring.doNotExcludeLineNumbers)
     66        return;
     67    m_currentSubstring.doNotExcludeLineNumbers = false;
     68    for (auto& substring : m_otherSubstrings)
     69        substring.doNotExcludeLineNumbers = false;
     70    updateAdvanceFunctionPointers();
    10171}
    10272
    10373void SegmentedString::clear()
    10474{
    105     m_pushedChar1 = 0;
    106     m_pushedChar2 = 0;
    107     m_currentChar = 0;
    108     m_currentString.clear();
    109     m_numberOfCharactersConsumedPriorToCurrentString = 0;
     75    m_currentSubstring.length = 0;
     76    m_otherSubstrings.clear();
     77
     78    m_isClosed = false;
     79
     80    m_currentCharacter = 0;
     81
     82    m_numberOfCharactersConsumedPriorToCurrentSubstring = 0;
    11083    m_numberOfCharactersConsumedPriorToCurrentLine = 0;
    11184    m_currentLine = 0;
    112     m_substrings.clear();
    113     m_closed = false;
    114     m_empty = true;
    115     m_fastPathFlags = NoFastPath;
    116     m_advanceFunc = &SegmentedString::advanceEmpty;
    117     m_advanceAndUpdateLineNumberFunc = &SegmentedString::advanceEmpty;
    118 }
    119 
    120 void SegmentedString::append(const SegmentedSubstring& s)
    121 {
    122     ASSERT(!m_closed);
    123     if (!s.m_length)
     85
     86    updateAdvanceFunctionPointersForEmptyString();
     87}
     88
     89inline void SegmentedString::appendSubstring(Substring&& substring)
     90{
     91    ASSERT(!m_isClosed);
     92    if (!substring.length)
    12493        return;
    125 
    126     if (!m_currentString.m_length) {
    127         m_numberOfCharactersConsumedPriorToCurrentString += m_currentString.numberOfCharactersConsumed();
    128         m_currentString = s;
    129         updateAdvanceFunctionPointers();
    130     } else
    131         m_substrings.append(s);
    132     m_empty = false;
    133 }
    134 
    135 void SegmentedString::pushBack(const SegmentedSubstring& s)
    136 {
    137     ASSERT(!m_pushedChar1);
    138     ASSERT(!s.numberOfCharactersConsumed());
    139     if (!s.m_length)
    140         return;
    141 
    142     // FIXME: We're assuming that the characters were originally consumed by
    143     //        this SegmentedString.  We're also ASSERTing that s is a fresh
    144     //        SegmentedSubstring.  These assumptions are sufficient for our
    145     //        current use, but we might need to handle the more elaborate
    146     //        cases in the future.
    147     m_numberOfCharactersConsumedPriorToCurrentString += m_currentString.numberOfCharactersConsumed();
    148     m_numberOfCharactersConsumedPriorToCurrentString -= s.m_length;
    149     if (!m_currentString.m_length) {
    150         m_currentString = s;
    151         updateAdvanceFunctionPointers();
    152     } else {
    153         // Shift our m_currentString into our list.
    154         m_substrings.prepend(m_currentString);
    155         m_currentString = s;
     94    if (m_currentSubstring.length)
     95        m_otherSubstrings.append(WTFMove(substring));
     96    else {
     97        m_numberOfCharactersConsumedPriorToCurrentSubstring += m_currentSubstring.numberOfCharactersConsumed();
     98        m_currentSubstring = WTFMove(substring);
     99        m_currentCharacter = m_currentSubstring.currentCharacter();
    156100        updateAdvanceFunctionPointers();
    157101    }
    158     m_empty = false;
     102}
     103
     104void SegmentedString::pushBack(String&& string)
     105{
     106    // We never create a substring for an empty string.
     107    ASSERT(string.length());
     108
     109    // The new substring we will create won't have the doNotExcludeLineNumbers set appropriately.
     110    // That was lost when the characters were consumed before pushing them back. But this does
     111    // not matter, because clients never use this for newlines. Catch that with this assertion.
     112    ASSERT(!string.contains('\n'));
     113
     114    // The characters in the string must be previously consumed characters from this segmented string.
     115    ASSERT(string.length() <= numberOfCharactersConsumed());
     116
     117    m_numberOfCharactersConsumedPriorToCurrentSubstring += m_currentSubstring.numberOfCharactersConsumed();
     118    if (m_currentSubstring.length)
     119        m_otherSubstrings.prepend(WTFMove(m_currentSubstring));
     120    m_currentSubstring = WTFMove(string);
     121    m_numberOfCharactersConsumedPriorToCurrentSubstring -= m_currentSubstring.length;
     122    m_currentCharacter = m_currentSubstring.currentCharacter();
     123    updateAdvanceFunctionPointers();
    159124}
    160125
    161126void SegmentedString::close()
    162127{
    163     // Closing a stream twice is likely a coding mistake.
    164     ASSERT(!m_closed);
    165     m_closed = true;
    166 }
    167 
    168 void SegmentedString::append(const SegmentedString& s)
    169 {
    170     ASSERT(!m_closed);
    171     ASSERT(!s.m_pushedChar1);
    172     append(s.m_currentString);
    173     if (s.isComposite()) {
    174         Deque<SegmentedSubstring>::const_iterator it = s.m_substrings.begin();
    175         Deque<SegmentedSubstring>::const_iterator e = s.m_substrings.end();
    176         for (; it != e; ++it)
    177             append(*it);
     128    ASSERT(!m_isClosed);
     129    m_isClosed = true;
     130}
     131
     132void SegmentedString::append(const SegmentedString& string)
     133{
     134    appendSubstring(Substring { string.m_currentSubstring });
     135    for (auto& substring : string.m_otherSubstrings)
     136        m_otherSubstrings.append(substring);
     137}
     138
     139void SegmentedString::append(SegmentedString&& string)
     140{
     141    appendSubstring(WTFMove(string.m_currentSubstring));
     142    for (auto& substring : string.m_otherSubstrings)
     143        m_otherSubstrings.append(WTFMove(substring));
     144}
     145
     146void SegmentedString::append(String&& string)
     147{
     148    appendSubstring(WTFMove(string));
     149}
     150
     151void SegmentedString::append(const String& string)
     152{
     153    appendSubstring(String { string });
     154}
     155
     156String SegmentedString::toString() const
     157{
     158    StringBuilder result;
     159    m_currentSubstring.appendTo(result);
     160    for (auto& substring : m_otherSubstrings)
     161        substring.appendTo(result);
     162    return result.toString();
     163}
     164
     165void SegmentedString::advanceWithoutUpdatingLineNumber16()
     166{
     167    m_currentCharacter = *++m_currentSubstring.currentCharacter16;
     168    decrementAndCheckLength();
     169}
     170
     171void SegmentedString::advanceAndUpdateLineNumber16()
     172{
     173    ASSERT(m_currentSubstring.doNotExcludeLineNumbers);
     174    processPossibleNewline();
     175    m_currentCharacter = *++m_currentSubstring.currentCharacter16;
     176    decrementAndCheckLength();
     177}
     178
     179inline void SegmentedString::advancePastSingleCharacterSubstringWithoutUpdatingLineNumber()
     180{
     181    ASSERT(m_currentSubstring.length == 1);
     182    if (m_otherSubstrings.isEmpty()) {
     183        m_currentSubstring.length = 0;
     184        m_currentCharacter = 0;
     185        updateAdvanceFunctionPointersForEmptyString();
     186        return;
    178187    }
    179     m_currentChar = m_pushedChar1 ? m_pushedChar1 : (m_currentString.m_length ? m_currentString.getCurrentChar() : 0);
    180 }
    181 
    182 void SegmentedString::pushBack(const SegmentedString& s)
    183 {
    184     ASSERT(!m_pushedChar1);
    185     ASSERT(!s.m_pushedChar1);
    186     if (s.isComposite()) {
    187         Deque<SegmentedSubstring>::const_reverse_iterator it = s.m_substrings.rbegin();
    188         Deque<SegmentedSubstring>::const_reverse_iterator e = s.m_substrings.rend();
    189         for (; it != e; ++it)
    190             pushBack(*it);
    191     }
    192     pushBack(s.m_currentString);
    193     m_currentChar = m_pushedChar1 ? m_pushedChar1 : (m_currentString.m_length ? m_currentString.getCurrentChar() : 0);
    194 }
    195 
    196 void SegmentedString::advanceSubstring()
    197 {
    198     if (isComposite()) {
    199         m_numberOfCharactersConsumedPriorToCurrentString += m_currentString.numberOfCharactersConsumed();
    200         m_currentString = m_substrings.takeFirst();
    201         // If we've previously consumed some characters of the non-current
    202         // string, we now account for those characters as part of the current
    203         // string, not as part of "prior to current string."
    204         m_numberOfCharactersConsumedPriorToCurrentString -= m_currentString.numberOfCharactersConsumed();
    205         updateAdvanceFunctionPointers();
    206     } else {
    207         m_currentString.clear();
    208         m_empty = true;
    209         m_fastPathFlags = NoFastPath;
    210         m_advanceFunc = &SegmentedString::advanceEmpty;
    211         m_advanceAndUpdateLineNumberFunc = &SegmentedString::advanceEmpty;
    212     }
    213 }
    214 
    215 String SegmentedString::toString() const
    216 {
    217     StringBuilder result;
    218     if (m_pushedChar1) {
    219         result.append(m_pushedChar1);
    220         if (m_pushedChar2)
    221             result.append(m_pushedChar2);
    222     }
    223     m_currentString.appendTo(result);
    224     if (isComposite()) {
    225         Deque<SegmentedSubstring>::const_iterator it = m_substrings.begin();
    226         Deque<SegmentedSubstring>::const_iterator e = m_substrings.end();
    227         for (; it != e; ++it)
    228             it->appendTo(result);
    229     }
    230     return result.toString();
    231 }
    232 
    233 void SegmentedString::advancePastNonNewlines(unsigned count, UChar* consumedCharacters)
    234 {
    235     ASSERT_WITH_SECURITY_IMPLICATION(count <= length());
    236     for (unsigned i = 0; i < count; ++i) {
    237         consumedCharacters[i] = currentChar();
    238         advancePastNonNewline();
    239     }
    240 }
    241 
    242 void SegmentedString::advance8()
    243 {
    244     ASSERT(!m_pushedChar1);
    245     decrementAndCheckLength();
    246     m_currentChar = m_currentString.incrementAndGetCurrentChar8();
    247 }
    248 
    249 void SegmentedString::advance16()
    250 {
    251     ASSERT(!m_pushedChar1);
    252     decrementAndCheckLength();
    253     m_currentChar = m_currentString.incrementAndGetCurrentChar16();
    254 }
    255 
    256 void SegmentedString::advanceAndUpdateLineNumber8()
    257 {
    258     ASSERT(!m_pushedChar1);
    259     ASSERT(m_currentString.getCurrentChar() == m_currentChar);
    260     if (m_currentChar == '\n') {
    261         ++m_currentLine;
    262         m_numberOfCharactersConsumedPriorToCurrentLine = numberOfCharactersConsumed() + 1;
    263     }
    264     decrementAndCheckLength();
    265     m_currentChar = m_currentString.incrementAndGetCurrentChar8();
    266 }
    267 
    268 void SegmentedString::advanceAndUpdateLineNumber16()
    269 {
    270     ASSERT(!m_pushedChar1);
    271     ASSERT(m_currentString.getCurrentChar() == m_currentChar);
    272     if (m_currentChar == '\n') {
    273         ++m_currentLine;
    274         m_numberOfCharactersConsumedPriorToCurrentLine = numberOfCharactersConsumed() + 1;
    275     }
    276     decrementAndCheckLength();
    277     m_currentChar = m_currentString.incrementAndGetCurrentChar16();
    278 }
    279 
    280 void SegmentedString::advanceSlowCase()
    281 {
    282     if (m_pushedChar1) {
    283         m_pushedChar1 = m_pushedChar2;
    284         m_pushedChar2 = 0;
    285 
    286         if (m_pushedChar1) {
    287             m_currentChar = m_pushedChar1;
    288             return;
    289         }
    290 
    291         updateAdvanceFunctionPointers();
    292     } else if (m_currentString.m_length) {
    293         if (--m_currentString.m_length == 0)
    294             advanceSubstring();
    295     } else if (!isComposite()) {
    296         m_currentString.clear();
    297         m_empty = true;
    298         m_fastPathFlags = NoFastPath;
    299         m_advanceFunc = &SegmentedString::advanceEmpty;
    300         m_advanceAndUpdateLineNumberFunc = &SegmentedString::advanceEmpty;
    301     }
    302     m_currentChar = m_currentString.m_length ? m_currentString.getCurrentChar() : 0;
    303 }
    304 
    305 void SegmentedString::advanceAndUpdateLineNumberSlowCase()
    306 {
    307     if (m_pushedChar1) {
    308         m_pushedChar1 = m_pushedChar2;
    309         m_pushedChar2 = 0;
    310 
    311         if (m_pushedChar1) {
    312             m_currentChar = m_pushedChar1;
    313             return;
    314         }
    315 
    316         updateAdvanceFunctionPointers();
    317     } else if (m_currentString.m_length) {
    318         if (m_currentString.getCurrentChar() == '\n' && m_currentString.doNotExcludeLineNumbers()) {
    319             ++m_currentLine;
    320             // Plus 1 because numberOfCharactersConsumed value hasn't incremented yet; it does with m_length decrement below.
    321             m_numberOfCharactersConsumedPriorToCurrentLine = numberOfCharactersConsumed() + 1;
    322         }
    323         if (--m_currentString.m_length == 0)
    324             advanceSubstring();
    325         else
    326             m_currentString.incrementAndGetCurrentChar(); // Only need the ++
    327     } else if (!isComposite()) {
    328         m_currentString.clear();
    329         m_empty = true;
    330         m_fastPathFlags = NoFastPath;
    331         m_advanceFunc = &SegmentedString::advanceEmpty;
    332         m_advanceAndUpdateLineNumberFunc = &SegmentedString::advanceEmpty;
    333     }
    334 
    335     m_currentChar = m_currentString.m_length ? m_currentString.getCurrentChar() : 0;
     188    m_numberOfCharactersConsumedPriorToCurrentSubstring += m_currentSubstring.numberOfCharactersConsumed();
     189    m_currentSubstring = m_otherSubstrings.takeFirst();
     190    // If we've previously consumed some characters of the non-current string, we now account for those
     191    // characters as part of the current string, not as part of "prior to current string."
     192    m_numberOfCharactersConsumedPriorToCurrentSubstring -= m_currentSubstring.numberOfCharactersConsumed();
     193    m_currentCharacter = m_currentSubstring.currentCharacter();
     194    updateAdvanceFunctionPointers();
     195}
     196
     197void SegmentedString::advancePastSingleCharacterSubstring()
     198{
     199    ASSERT(m_currentSubstring.length == 1);
     200    ASSERT(m_currentSubstring.doNotExcludeLineNumbers);
     201    processPossibleNewline();
     202    advancePastSingleCharacterSubstringWithoutUpdatingLineNumber();
    336203}
    337204
    338205void SegmentedString::advanceEmpty()
    339206{
    340     ASSERT(!m_currentString.m_length && !isComposite());
    341     m_currentChar = 0;
    342 }
    343 
    344 void SegmentedString::updateSlowCaseFunctionPointers()
    345 {
     207    ASSERT(!m_currentSubstring.length);
     208    ASSERT(m_otherSubstrings.isEmpty());
     209    ASSERT(!m_currentCharacter);
     210}
     211
     212void SegmentedString::updateAdvanceFunctionPointersForSingleCharacterSubstring()
     213{
     214    ASSERT(m_currentSubstring.length == 1);
    346215    m_fastPathFlags = NoFastPath;
    347     m_advanceFunc = &SegmentedString::advanceSlowCase;
    348     m_advanceAndUpdateLineNumberFunc = &SegmentedString::advanceAndUpdateLineNumberSlowCase;
     216    m_advanceWithoutUpdatingLineNumberFunction = &SegmentedString::advancePastSingleCharacterSubstringWithoutUpdatingLineNumber;
     217    if (m_currentSubstring.doNotExcludeLineNumbers)
     218        m_advanceAndUpdateLineNumberFunction = &SegmentedString::advancePastSingleCharacterSubstring;
     219    else
     220        m_advanceAndUpdateLineNumberFunction = &SegmentedString::advancePastSingleCharacterSubstringWithoutUpdatingLineNumber;
    349221}
    350222
     
    365237}
    366238
    367 SegmentedString::AdvancePastResult SegmentedString::advancePastSlowCase(const char* literal, bool caseSensitive)
    368 {
    369     unsigned length = strlen(literal);
     239SegmentedString::AdvancePastResult SegmentedString::advancePastSlowCase(const char* literal, bool lettersIgnoringASCIICase)
     240{
     241    constexpr unsigned maxLength = 10;
     242    ASSERT(!strchr(literal, '\n'));
     243    auto length = strlen(literal);
     244    ASSERT(length <= maxLength);
    370245    if (length > this->length())
    371246        return NotEnoughCharacters;
    372     UChar* consumedCharacters;
    373     String consumedString = String::createUninitialized(length, consumedCharacters);
    374     advancePastNonNewlines(length, consumedCharacters);
    375     if (consumedString.startsWith(literal, caseSensitive))
    376         return DidMatch;
    377     pushBack(SegmentedString(consumedString));
    378     return DidNotMatch;
    379 }
    380 
    381 }
     247    UChar consumedCharacters[maxLength];
     248    for (unsigned i = 0; i < length; ++i) {
     249        auto character = m_currentCharacter;
     250        if (characterMismatch(character, literal[i], lettersIgnoringASCIICase)) {
     251            if (i)
     252                pushBack(String { consumedCharacters, i });
     253            return DidNotMatch;
     254        }
     255        advancePastNonNewline();
     256        consumedCharacters[i] = character;
     257    }
     258    return DidMatch;
     259}
     260
     261void SegmentedString::updateAdvanceFunctionPointersForEmptyString()
     262{
     263    ASSERT(!m_currentSubstring.length);
     264    ASSERT(m_otherSubstrings.isEmpty());
     265    ASSERT(!m_currentCharacter);
     266    m_fastPathFlags = NoFastPath;
     267    m_advanceWithoutUpdatingLineNumberFunction = &SegmentedString::advanceEmpty;
     268    m_advanceAndUpdateLineNumberFunction = &SegmentedString::advanceEmpty;
     269}
     270
     271}
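
Reviewer's sketch (not part of the changeset): the new advancePastSlowCase compares the literal one character at a time, records what it consumed in a fixed-size stack buffer, and pushes the consumed prefix back on a mismatch, instead of allocating a String up front. The standalone example below illustrates that technique under simplifying assumptions: it uses std::string and std::deque rather than WTF types, is case sensitive only, and the names SegmentedInput, advancePast, and pushBack here are hypothetical stand-ins rather than the WebKit implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <deque>
#include <string>
#include <utility>

// Hypothetical stand-in for a segmented input stream; not WebKit code.
class SegmentedInput {
public:
    enum class Result { DidNotMatch, DidMatch, NotEnoughCharacters };

    void append(std::string segment)
    {
        if (!segment.empty())
            m_segments.push_back(std::move(segment));
    }

    // Number of unread characters across all segments.
    std::size_t length() const
    {
        std::size_t total = 0;
        for (const auto& segment : m_segments)
            total += segment.size();
        return total - m_offset;
    }

    char current() const { return m_segments.empty() ? '\0' : m_segments.front()[m_offset]; }

    void advance()
    {
        assert(!m_segments.empty());
        if (++m_offset == m_segments.front().size()) {
            m_segments.pop_front();
            m_offset = 0;
        }
    }

    // Puts already-consumed characters back in front of the unread input.
    void pushBack(std::string consumed)
    {
        if (consumed.empty())
            return;
        if (!m_segments.empty() && m_offset)
            m_segments.front().erase(0, m_offset); // Drop the part that was already read.
        m_offset = 0;
        m_segments.push_front(std::move(consumed));
    }

    // Matches the literal across segment boundaries; restores the input on mismatch.
    Result advancePast(const char* literal)
    {
        constexpr std::size_t maxLength = 10;
        std::size_t literalLength = std::strlen(literal);
        assert(literalLength <= maxLength);
        if (literalLength > length())
            return Result::NotEnoughCharacters;
        char consumed[maxLength]; // Stack buffer, so the matching path does not allocate.
        for (std::size_t i = 0; i < literalLength; ++i) {
            char character = current();
            if (character != literal[i]) {
                if (i)
                    pushBack(std::string(consumed, i));
                return Result::DidNotMatch;
            }
            advance();
            consumed[i] = character;
        }
        return Result::DidMatch;
    }

    std::string toString() const
    {
        std::string result;
        for (std::size_t i = 0; i < m_segments.size(); ++i)
            result += i ? m_segments[i] : m_segments[i].substr(m_offset);
        return result;
    }

private:
    std::deque<std::string> m_segments;
    std::size_t m_offset { 0 }; // Read position inside the first segment.
};
```

In this sketch a heap allocation happens only on the mismatch path, when the consumed prefix is turned into a string for pushBack. That appears to be the same trade-off the patch makes by replacing String::createUninitialized with a stack array.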
  • trunk/Source/WebCore/platform/text/SegmentedString.h

    r178265 r209058  
    11/*
    2     Copyright (C) 2004-2008, 2015 Apple Inc. All rights reserved.
     2    Copyright (C) 2004-2016 Apple Inc. All rights reserved.
    33
    44    This library is free software; you can redistribute it and/or
     
    1818*/
    1919
    20 #ifndef SegmentedString_h
    21 #define SegmentedString_h
     20#pragma once
    2221
    2322#include <wtf/Deque.h>
    24 #include <wtf/text/StringBuilder.h>
     23#include <wtf/text/WTFString.h>
    2524
    2625namespace WebCore {
    2726
    28 class SegmentedString;
    29 
    30 class SegmentedSubstring {
    31 public:
    32     SegmentedSubstring()
    33         : m_length(0)
    34         , m_doNotExcludeLineNumbers(true)
    35         , m_is8Bit(false)
    36     {
    37         m_data.string16Ptr = 0;
    38     }
    39 
    40     SegmentedSubstring(const String& str)
    41         : m_length(str.length())
    42         , m_doNotExcludeLineNumbers(true)
    43         , m_string(str)
    44     {
    45         if (m_length) {
    46             if (m_string.is8Bit()) {
    47                 m_is8Bit = true;
    48                 m_data.string8Ptr = m_string.characters8();
    49             } else {
    50                 m_is8Bit = false;
    51                 m_data.string16Ptr = m_string.characters16();
    52             }
    53         } else
    54             m_is8Bit = false;
    55     }
    56 
    57     void clear() { m_length = 0; m_data.string16Ptr = 0; m_is8Bit = false;}
    58    
    59     bool is8Bit() { return m_is8Bit; }
    60    
    61     bool excludeLineNumbers() const { return !m_doNotExcludeLineNumbers; }
    62     bool doNotExcludeLineNumbers() const { return m_doNotExcludeLineNumbers; }
    63 
    64     void setExcludeLineNumbers() { m_doNotExcludeLineNumbers = false; }
    65 
    66     int numberOfCharactersConsumed() const { return m_string.length() - m_length; }
    67 
    68     void appendTo(StringBuilder& builder) const
    69     {
    70         int offset = m_string.length() - m_length;
    71 
    72         if (!offset) {
    73             if (m_length)
    74                 builder.append(m_string);
    75         } else
    76             builder.append(m_string.substring(offset, m_length));
    77     }
    78 
    79     UChar getCurrentChar8()
    80     {
    81         return *m_data.string8Ptr;
    82     }
    83 
    84     UChar getCurrentChar16()
    85     {
    86         return m_data.string16Ptr ? *m_data.string16Ptr : 0;
    87     }
    88 
    89     UChar incrementAndGetCurrentChar8()
    90     {
    91         ASSERT(m_data.string8Ptr);
    92         return *++m_data.string8Ptr;
    93     }
    94 
    95     UChar incrementAndGetCurrentChar16()
    96     {
    97         ASSERT(m_data.string16Ptr);
    98         return *++m_data.string16Ptr;
    99     }
    100 
    101     String currentSubString(unsigned length)
    102     {
    103         int offset = m_string.length() - m_length;
    104         return m_string.substring(offset, length);
    105     }
    106 
    107     ALWAYS_INLINE UChar getCurrentChar()
    108     {
    109         ASSERT(m_length);
    110         if (is8Bit())
    111             return getCurrentChar8();
    112         return getCurrentChar16();
    113     }
    114    
    115     ALWAYS_INLINE UChar incrementAndGetCurrentChar()
    116     {
    117         ASSERT(m_length);
    118         if (is8Bit())
    119             return incrementAndGetCurrentChar8();
    120         return incrementAndGetCurrentChar16();
    121     }
    122 
    123 public:
    124     union {
    125         const LChar* string8Ptr;
    126         const UChar* string16Ptr;
    127     } m_data;
    128     int m_length;
    129 
    130 private:
    131     bool m_doNotExcludeLineNumbers;
    132     bool m_is8Bit;
    133     String m_string;
    134 };
     27// FIXME: This should not start with "k".
     28// FIXME: This is a shared tokenizer concept, not a SegmentedString concept, but this is the only common header for now.
     29constexpr LChar kEndOfFileMarker = 0;
    13530
    13631class SegmentedString {
    13732public:
    138     SegmentedString()
    139         : m_pushedChar1(0)
    140         , m_pushedChar2(0)
    141         , m_currentChar(0)
    142         , m_numberOfCharactersConsumedPriorToCurrentString(0)
    143         , m_numberOfCharactersConsumedPriorToCurrentLine(0)
    144         , m_currentLine(0)
    145         , m_closed(false)
    146         , m_empty(true)
    147         , m_fastPathFlags(NoFastPath)
    148         , m_advanceFunc(&SegmentedString::advanceEmpty)
    149         , m_advanceAndUpdateLineNumberFunc(&SegmentedString::advanceEmpty)
    150     {
    151     }
    152 
    153     SegmentedString(const String& str)
    154         : m_pushedChar1(0)
    155         , m_pushedChar2(0)
    156         , m_currentString(str)
    157         , m_currentChar(0)
    158         , m_numberOfCharactersConsumedPriorToCurrentString(0)
    159         , m_numberOfCharactersConsumedPriorToCurrentLine(0)
    160         , m_currentLine(0)
    161         , m_closed(false)
    162         , m_empty(!str.length())
    163         , m_fastPathFlags(NoFastPath)
    164     {
    165         if (m_currentString.m_length)
    166             m_currentChar = m_currentString.getCurrentChar();
    167         updateAdvanceFunctionPointers();
    168     }
    169 
    170     SegmentedString(const SegmentedString&);
    171     SegmentedString& operator=(const SegmentedString&);
     33    SegmentedString() = default;
     34    SegmentedString(String&&);
     35    SegmentedString(const String&);
     36
     37    SegmentedString(SegmentedString&&) = delete;
     38    SegmentedString(const SegmentedString&) = delete;
     39
     40    SegmentedString& operator=(SegmentedString&&);
     41    SegmentedString& operator=(const SegmentedString&) = default;
    17242
    17343    void clear();
    17444    void close();
    17545
     46    void append(SegmentedString&&);
    17647    void append(const SegmentedString&);
    177     void pushBack(const SegmentedString&);
     48
     49    void append(String&&);
     50    void append(const String&);
     51
     52    void pushBack(String&&);
    17853
    17954    void setExcludeLineNumbers();
    18055
    181     void push(UChar c)
    182     {
    183         if (!m_pushedChar1) {
    184             m_pushedChar1 = c;
    185             m_currentChar = m_pushedChar1 ? m_pushedChar1 : m_currentString.getCurrentChar();
    186             updateSlowCaseFunctionPointers();
    187         } else {
    188             ASSERT(!m_pushedChar2);
    189             m_pushedChar2 = c;
    190         }
    191     }
    192 
    193     bool isEmpty() const { return m_empty; }
     56    bool isEmpty() const { return !m_currentSubstring.length; }
    19457    unsigned length() const;
    19558
    196     bool isClosed() const { return m_closed; }
     59    bool isClosed() const { return m_isClosed; }
     60
     61    void advance();
     62    void advancePastNonNewline(); // Faster than calling advance when we know the current character is not a newline.
     63    void advancePastNewline(); // Faster than calling advance when we know the current character is a newline.
    19764
    19865    enum AdvancePastResult { DidNotMatch, DidMatch, NotEnoughCharacters };
    199     template<unsigned length> AdvancePastResult advancePast(const char (&literal)[length]) { return advancePast(literal, length - 1, true); }
    200     template<unsigned length> AdvancePastResult advancePastIgnoringCase(const char (&literal)[length]) { return advancePast(literal, length - 1, false); }
    201 
    202     void advance()
    203     {
    204         if (m_fastPathFlags & Use8BitAdvance) {
    205             ASSERT(!m_pushedChar1);
    206             bool haveOneCharacterLeft = (--m_currentString.m_length == 1);
    207             m_currentChar = m_currentString.incrementAndGetCurrentChar8();
    208 
    209             if (!haveOneCharacterLeft)
    210                 return;
    211 
    212             updateSlowCaseFunctionPointers();
    213 
    214             return;
    215         }
    216 
    217         (this->*m_advanceFunc)();
    218     }
    219 
    220     void advanceAndUpdateLineNumber()
    221     {
    222         if (m_fastPathFlags & Use8BitAdvance) {
    223             ASSERT(!m_pushedChar1);
    224 
    225             bool haveNewLine = (m_currentChar == '\n') & !!(m_fastPathFlags & Use8BitAdvanceAndUpdateLineNumbers);
    226             bool haveOneCharacterLeft = (--m_currentString.m_length == 1);
    227 
    228             m_currentChar = m_currentString.incrementAndGetCurrentChar8();
    229 
    230             if (!(haveNewLine | haveOneCharacterLeft))
    231                 return;
    232 
    233             if (haveNewLine) {
    234                 ++m_currentLine;
    235                 m_numberOfCharactersConsumedPriorToCurrentLine =  m_numberOfCharactersConsumedPriorToCurrentString + m_currentString.numberOfCharactersConsumed();
    236             }
    237 
    238             if (haveOneCharacterLeft)
    239                 updateSlowCaseFunctionPointers();
    240 
    241             return;
    242         }
    243 
    244         (this->*m_advanceAndUpdateLineNumberFunc)();
    245     }
    246 
    247     void advancePastNonNewline()
    248     {
    249         ASSERT(currentChar() != '\n');
    250         advance();
    251     }
    252 
    253     void advancePastNewlineAndUpdateLineNumber()
    254     {
    255         ASSERT(currentChar() == '\n');
    256         if (!m_pushedChar1 && m_currentString.m_length > 1) {
    257             int newLineFlag = m_currentString.doNotExcludeLineNumbers();
    258             m_currentLine += newLineFlag;
    259             if (newLineFlag)
    260                 m_numberOfCharactersConsumedPriorToCurrentLine = numberOfCharactersConsumed() + 1;
    261             decrementAndCheckLength();
    262             m_currentChar = m_currentString.incrementAndGetCurrentChar();
    263             return;
    264         }
    265         advanceAndUpdateLineNumberSlowCase();
    266     }
    267 
    268     int numberOfCharactersConsumed() const
    269     {
    270         int numberOfPushedCharacters = 0;
    271         if (m_pushedChar1) {
    272             ++numberOfPushedCharacters;
    273             if (m_pushedChar2)
    274                 ++numberOfPushedCharacters;
    275         }
    276         return m_numberOfCharactersConsumedPriorToCurrentString + m_currentString.numberOfCharactersConsumed() - numberOfPushedCharacters;
    277     }
     66    template<unsigned length> AdvancePastResult advancePast(const char (&literal)[length]) { return advancePast<length, false>(literal); }
     67    template<unsigned length> AdvancePastResult advancePastLettersIgnoringASCIICase(const char (&literal)[length]) { return advancePast<length, true>(literal); }
     68
     69    unsigned numberOfCharactersConsumed() const;
    27870
    27971    String toString() const;
    28072
    281     UChar currentChar() const { return m_currentChar; }   
     73    UChar currentCharacter() const { return m_currentCharacter; }
    28274
    28375    OrdinalNumber currentColumn() const;
     
    28981
    29082private:
     83    struct Substring {
     84        Substring() = default;
     85        Substring(String&&);
     86
     87        UChar currentCharacter() const;
     88        UChar currentCharacterPreIncrement();
     89
     90        unsigned numberOfCharactersConsumed() const;
     91        void appendTo(StringBuilder&) const;
     92
     93        String string;
     94        unsigned length { 0 };
     95        bool is8Bit;
     96        union {
     97            const LChar* currentCharacter8;
     98            const UChar* currentCharacter16;
     99        };
     100        bool doNotExcludeLineNumbers { true };
     101    };
     102
    291103    enum FastPathFlags {
    292104        NoFastPath = 0,
     
    295107    };
    296108
    297     void append(const SegmentedSubstring&);
    298     void pushBack(const SegmentedSubstring&);
    299 
    300     void advance8();
    301     void advance16();
    302     void advanceAndUpdateLineNumber8();
     109    void appendSubstring(Substring&&);
     110
     111    void processPossibleNewline();
     112    void startNewLine();
     113
     114    void advanceWithoutUpdatingLineNumber();
     115    void advanceWithoutUpdatingLineNumber16();
    303116    void advanceAndUpdateLineNumber16();
    304     void advanceSlowCase();
    305     void advanceAndUpdateLineNumberSlowCase();
     117    void advancePastSingleCharacterSubstringWithoutUpdatingLineNumber();
     118    void advancePastSingleCharacterSubstring();
    306119    void advanceEmpty();
    307     void advanceSubstring();
    308    
    309     void updateSlowCaseFunctionPointers();
    310 
    311     void decrementAndCheckLength()
    312     {
    313         ASSERT(m_currentString.m_length > 1);
    314         if (--m_currentString.m_length == 1)
    315             updateSlowCaseFunctionPointers();
    316     }
    317 
    318     void updateAdvanceFunctionPointers()
    319     {
    320         if ((m_currentString.m_length > 1) && !m_pushedChar1) {
    321             if (m_currentString.is8Bit()) {
    322                 m_advanceFunc = &SegmentedString::advance8;
    323                 m_fastPathFlags = Use8BitAdvance;
    324                 if (m_currentString.doNotExcludeLineNumbers()) {
    325                     m_advanceAndUpdateLineNumberFunc = &SegmentedString::advanceAndUpdateLineNumber8;
    326                     m_fastPathFlags |= Use8BitAdvanceAndUpdateLineNumbers;
    327                 } else
    328                     m_advanceAndUpdateLineNumberFunc = &SegmentedString::advance8;
    329                 return;
     120
     121    void updateAdvanceFunctionPointers();
     122    void updateAdvanceFunctionPointersForEmptyString();
     123    void updateAdvanceFunctionPointersForSingleCharacterSubstring();
     124
     125    void decrementAndCheckLength();
     126
     127    template<typename CharacterType> static bool characterMismatch(CharacterType, char, bool lettersIgnoringASCIICase);
     128    template<unsigned length, bool lettersIgnoringASCIICase> AdvancePastResult advancePast(const char (&literal)[length]);
     129    AdvancePastResult advancePastSlowCase(const char* literal, bool lettersIgnoringASCIICase);
     130
     131    Substring m_currentSubstring;
     132    Deque<Substring> m_otherSubstrings;
     133
     134    bool m_isClosed { false };
     135
     136    UChar m_currentCharacter { 0 };
     137
     138    unsigned m_numberOfCharactersConsumedPriorToCurrentSubstring { 0 };
     139    unsigned m_numberOfCharactersConsumedPriorToCurrentLine { 0 };
     140    int m_currentLine { 0 };
     141
     142    unsigned char m_fastPathFlags { NoFastPath };
     143    void (SegmentedString::*m_advanceWithoutUpdatingLineNumberFunction)() { &SegmentedString::advanceEmpty };
     144    void (SegmentedString::*m_advanceAndUpdateLineNumberFunction)() { &SegmentedString::advanceEmpty };
     145};
     146
     147inline SegmentedString::Substring::Substring(String&& passedString)
     148    : string(WTFMove(passedString))
     149    , length(string.length())
     150{
     151    if (length) {
     152        is8Bit = string.impl()->is8Bit();
     153        if (is8Bit)
     154            currentCharacter8 = string.impl()->characters8();
     155        else
     156            currentCharacter16 = string.impl()->characters16();
     157    }
     158}
     159
     160inline unsigned SegmentedString::Substring::numberOfCharactersConsumed() const
     161{
     162    return string.length() - length;
     163}
     164
     165ALWAYS_INLINE UChar SegmentedString::Substring::currentCharacter() const
     166{
     167    ASSERT(length);
     168    return is8Bit ? *currentCharacter8 : *currentCharacter16;
     169}
     170
     171ALWAYS_INLINE UChar SegmentedString::Substring::currentCharacterPreIncrement()
     172{
     173    ASSERT(length);
     174    return is8Bit ? *++currentCharacter8 : *++currentCharacter16;
     175}
     176
     177inline SegmentedString::SegmentedString(String&& string)
     178    : m_currentSubstring(WTFMove(string))
     179{
     180    if (m_currentSubstring.length) {
     181        m_currentCharacter = m_currentSubstring.currentCharacter();
     182        updateAdvanceFunctionPointers();
     183    }
     184}
     185
     186inline SegmentedString::SegmentedString(const String& string)
     187    : SegmentedString(String { string })
     188{
     189}
     190
     191ALWAYS_INLINE void SegmentedString::decrementAndCheckLength()
     192{
     193    ASSERT(m_currentSubstring.length > 1);
     194    if (UNLIKELY(--m_currentSubstring.length == 1))
     195        updateAdvanceFunctionPointersForSingleCharacterSubstring();
     196}
     197
     198ALWAYS_INLINE void SegmentedString::advanceWithoutUpdatingLineNumber()
     199{
     200    if (LIKELY(m_fastPathFlags & Use8BitAdvance)) {
     201        m_currentCharacter = *++m_currentSubstring.currentCharacter8;
     202        decrementAndCheckLength();
     203        return;
     204    }
     205
     206    (this->*m_advanceWithoutUpdatingLineNumberFunction)();
     207}
     208
     209inline void SegmentedString::startNewLine()
     210{
     211    ++m_currentLine;
     212    m_numberOfCharactersConsumedPriorToCurrentLine = numberOfCharactersConsumed();
     213}
     214
     215inline void SegmentedString::processPossibleNewline()
     216{
     217    if (m_currentCharacter == '\n')
     218        startNewLine();
     219}
     220
     221inline void SegmentedString::advance()
     222{
     223    if (LIKELY(m_fastPathFlags & Use8BitAdvance)) {
     224        ASSERT(m_currentSubstring.length > 1);
     225        bool lastCharacterWasNewline = m_currentCharacter == '\n';
     226        m_currentCharacter = *++m_currentSubstring.currentCharacter8;
     227        bool haveOneCharacterLeft = --m_currentSubstring.length == 1;
     228        if (LIKELY(!(lastCharacterWasNewline | haveOneCharacterLeft)))
     229            return;
     230        if (lastCharacterWasNewline & !!(m_fastPathFlags & Use8BitAdvanceAndUpdateLineNumbers))
     231            startNewLine();
     232        if (haveOneCharacterLeft)
     233            updateAdvanceFunctionPointersForSingleCharacterSubstring();
     234        return;
     235    }
     236
     237    (this->*m_advanceAndUpdateLineNumberFunction)();
     238}
     239
     240ALWAYS_INLINE void SegmentedString::advancePastNonNewline()
     241{
     242    ASSERT(m_currentCharacter != '\n');
     243    advanceWithoutUpdatingLineNumber();
     244}
     245
     246inline void SegmentedString::advancePastNewline()
     247{
     248    ASSERT(m_currentCharacter == '\n');
     249    if (m_currentSubstring.length > 1) {
     250        if (m_currentSubstring.doNotExcludeLineNumbers)
     251            startNewLine();
     252        m_currentCharacter = m_currentSubstring.currentCharacterPreIncrement();
     253        decrementAndCheckLength();
     254        return;
     255    }
     256
     257    (this->*m_advanceAndUpdateLineNumberFunction)();
     258}
     259
     260inline unsigned SegmentedString::numberOfCharactersConsumed() const
     261{
     262    return m_numberOfCharactersConsumedPriorToCurrentSubstring + m_currentSubstring.numberOfCharactersConsumed();
     263}
     264
     265template<typename CharacterType> ALWAYS_INLINE bool SegmentedString::characterMismatch(CharacterType a, char b, bool lettersIgnoringASCIICase)
     266{
     267    return lettersIgnoringASCIICase ? !isASCIIAlphaCaselessEqual(a, b) : a != b;
     268}
     269
     270template<unsigned lengthIncludingTerminator, bool lettersIgnoringASCIICase> SegmentedString::AdvancePastResult SegmentedString::advancePast(const char (&literal)[lengthIncludingTerminator])
     271{
     272    constexpr unsigned length = lengthIncludingTerminator - 1;
     273    ASSERT(!literal[length]);
     274    ASSERT(!strchr(literal, '\n'));
     275    if (length + 1 < m_currentSubstring.length) {
     276        if (m_currentSubstring.is8Bit) {
     277            for (unsigned i = 0; i < length; ++i) {
     278                if (characterMismatch(m_currentSubstring.currentCharacter8[i], literal[i], lettersIgnoringASCIICase))
     279                    return DidNotMatch;
    330280            }
    331 
    332             m_advanceFunc = &SegmentedString::advance16;
    333             m_fastPathFlags = NoFastPath;
    334             if (m_currentString.doNotExcludeLineNumbers())
    335                 m_advanceAndUpdateLineNumberFunc = &SegmentedString::advanceAndUpdateLineNumber16;
    336             else
    337                 m_advanceAndUpdateLineNumberFunc = &SegmentedString::advance16;
     281            m_currentSubstring.currentCharacter8 += length;
     282            m_currentCharacter = *m_currentSubstring.currentCharacter8;
     283        } else {
     284            for (unsigned i = 0; i < length; ++i) {
     285                if (characterMismatch(m_currentSubstring.currentCharacter16[i], literal[i], lettersIgnoringASCIICase))
     286                    return DidNotMatch;
     287            }
     288            m_currentSubstring.currentCharacter16 += length;
     289            m_currentCharacter = *m_currentSubstring.currentCharacter16;
     290        }
     291        m_currentSubstring.length -= length;
     292        return DidMatch;
     293    }
     294    return advancePastSlowCase(literal, lettersIgnoringASCIICase);
     295}
     296
     297inline void SegmentedString::updateAdvanceFunctionPointers()
     298{
     299    if (m_currentSubstring.length > 1) {
     300        if (m_currentSubstring.is8Bit) {
     301            m_fastPathFlags = Use8BitAdvance;
     302            if (m_currentSubstring.doNotExcludeLineNumbers)
     303                m_fastPathFlags |= Use8BitAdvanceAndUpdateLineNumbers;
    338304            return;
    339305        }
    340 
    341         if (!m_currentString.m_length && !isComposite()) {
    342             m_advanceFunc = &SegmentedString::advanceEmpty;
    343             m_fastPathFlags = NoFastPath;
    344             m_advanceAndUpdateLineNumberFunc = &SegmentedString::advanceEmpty;
    345         }
    346 
    347         updateSlowCaseFunctionPointers();
    348     }
    349 
    350     // Writes consumed characters into consumedCharacters, which must have space for at least |count| characters.
    351     void advancePastNonNewlines(unsigned count);
    352     void advancePastNonNewlines(unsigned count, UChar* consumedCharacters);
    353 
    354     AdvancePastResult advancePast(const char* literal, unsigned length, bool caseSensitive);
    355     AdvancePastResult advancePastSlowCase(const char* literal, bool caseSensitive);
    356 
    357     bool isComposite() const { return !m_substrings.isEmpty(); }
    358 
    359     UChar m_pushedChar1;
    360     UChar m_pushedChar2;
    361     SegmentedSubstring m_currentString;
    362     UChar m_currentChar;
    363     int m_numberOfCharactersConsumedPriorToCurrentString;
    364     int m_numberOfCharactersConsumedPriorToCurrentLine;
    365     int m_currentLine;
    366     Deque<SegmentedSubstring> m_substrings;
    367     bool m_closed;
    368     bool m_empty;
    369     unsigned char m_fastPathFlags;
    370     void (SegmentedString::*m_advanceFunc)();
    371     void (SegmentedString::*m_advanceAndUpdateLineNumberFunc)();
    372 };
    373 
    374 inline void SegmentedString::advancePastNonNewlines(unsigned count)
    375 {
    376     for (unsigned i = 0; i < count; ++i)
    377         advancePastNonNewline();
    378 }
    379 
    380 inline SegmentedString::AdvancePastResult SegmentedString::advancePast(const char* literal, unsigned length, bool caseSensitive)
    381 {
    382     ASSERT(strlen(literal) == length);
    383     ASSERT(!strchr(literal, '\n'));
    384     if (!m_pushedChar1) {
    385         if (length <= static_cast<unsigned>(m_currentString.m_length)) {
    386             if (!m_currentString.currentSubString(length).startsWith(literal, caseSensitive))
    387                 return DidNotMatch;
    388             advancePastNonNewlines(length);
    389             return DidMatch;
    390         }
    391     }
    392     return advancePastSlowCase(literal, caseSensitive);
    393 }
    394 
    395 }
    396 
    397 #endif
     306        m_fastPathFlags = NoFastPath;
     307        m_advanceWithoutUpdatingLineNumberFunction = &SegmentedString::advanceWithoutUpdatingLineNumber16;
     308        if (m_currentSubstring.doNotExcludeLineNumbers)
     309            m_advanceAndUpdateLineNumberFunction = &SegmentedString::advanceAndUpdateLineNumber16;
     310        else
     311            m_advanceAndUpdateLineNumberFunction = &SegmentedString::advanceWithoutUpdatingLineNumber16;
     312        return;
     313    }
     314
     315    if (!m_currentSubstring.length) {
     316        updateAdvanceFunctionPointersForEmptyString();
     317        return;
     318    }
     319
     320    updateAdvanceFunctionPointersForSingleCharacterSubstring();
     321}
     322
     323}
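
Reviewer's sketch (not part of the changeset): the header above pairs a fast-path flag, checked inline, with member function pointers that are re-targeted when the current substring shrinks to one character or becomes empty. The example below shows that dispatch pattern on a single std::string, assuming one segment and 8-bit characters only. LineTrackingCursor and all of its members are hypothetical names, and the line and column bookkeeping is a simplification of what SegmentedString does.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical single-segment cursor; illustrates the dispatch idea only.
class LineTrackingCursor {
public:
    explicit LineTrackingCursor(std::string text)
        : m_text(std::move(text))
        , m_remaining(m_text.size())
    {
        updateAdvanceFunction();
    }

    char current() const { return m_remaining ? m_text[m_text.size() - m_remaining] : '\0'; }
    bool isEmpty() const { return !m_remaining; }
    int currentLine() const { return m_line; }
    std::size_t consumed() const { return m_text.size() - m_remaining; }
    std::size_t column() const { return consumed() - m_consumedPriorToCurrentLine; }

    // General advance: has to look at the current character for a newline.
    void advance()
    {
        if (m_useFastPath) {
            bool lastWasNewline = current() == '\n';
            --m_remaining;
            if (lastWasNewline)
                startNewLine();
            if (m_remaining == 1)
                updateAdvanceFunction();
            return;
        }
        (this->*m_advanceFunction)();
    }

    // Cheaper advance for callers that already know the current character is not a newline.
    void advancePastNonNewline()
    {
        assert(current() != '\n');
        if (m_useFastPath) {
            if (--m_remaining == 1)
                updateAdvanceFunction();
            return;
        }
        (this->*m_advanceFunction)();
    }

private:
    void startNewLine()
    {
        ++m_line;
        m_consumedPriorToCurrentLine = consumed(); // Called after the newline itself is consumed.
    }

    void advanceLastCharacter()
    {
        assert(m_remaining == 1);
        bool lastWasNewline = current() == '\n';
        m_remaining = 0;
        if (lastWasNewline)
            startNewLine();
        updateAdvanceFunction();
    }

    void advanceEmpty() { assert(!m_remaining); }

    // Chooses between the inline fast path and one of the out-of-line cases.
    void updateAdvanceFunction()
    {
        m_useFastPath = m_remaining > 1;
        if (m_useFastPath)
            return;
        m_advanceFunction = m_remaining ? &LineTrackingCursor::advanceLastCharacter : &LineTrackingCursor::advanceEmpty;
    }

    std::string m_text;
    std::size_t m_remaining;
    std::size_t m_consumedPriorToCurrentLine { 0 };
    int m_line { 0 };
    bool m_useFastPath { false };
    void (LineTrackingCursor::*m_advanceFunction)() { &LineTrackingCursor::advanceEmpty };
};
```

The point of the split is that the common case costs one flag test plus a decrement, while the rare transitions (last character, empty input) go through the function pointer. A tokenizer state that has just matched a non-newline character can call advancePastNonNewline and skip the newline comparison entirely.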
  • trunk/Source/WebCore/xml/parser/CharacterReferenceParserInlines.h

    r208646 r209058  
    3131namespace WebCore {
    3232
    33 inline void unconsumeCharacters(SegmentedString& source, const StringBuilder& consumedCharacters)
     33inline void unconsumeCharacters(SegmentedString& source, StringBuilder& consumedCharacters)
    3434{
    35     source.pushBack(SegmentedString(consumedCharacters.toStringPreserveCapacity()));
     35    source.pushBack(consumedCharacters.toString());
    3636}
    3737
     
    5757   
    5858    while (!source.isEmpty()) {
    59         UChar character = source.currentChar();
     59        UChar character = source.currentCharacter();
    6060        switch (state) {
    6161        case Initial:
     
    8686                goto Decimal;
    8787            }
    88             source.pushBack(SegmentedString(ASCIILiteral("#")));
     88            source.pushBack(ASCIILiteral("#"));
    8989            return false;
    9090        case MaybeHexLowerCaseX:
     
    9393                goto Hex;
    9494            }
    95             source.pushBack(SegmentedString(ASCIILiteral("#x")));
     95            source.pushBack(ASCIILiteral("#x"));
    9696            return false;
    9797        case MaybeHexUpperCaseX:
     
    100100                goto Hex;
    101101            }
    102             source.pushBack(SegmentedString(ASCIILiteral("#X")));
     102            source.pushBack(ASCIILiteral("#X"));
    103103            return false;
    104104        case Hex:
     
    111111            }
    112112            if (character == ';') {
    113                 source.advance();
     113                source.advancePastNonNewline();
    114114                decodedCharacter.append(ParserFunctions::legalEntityFor(overflow ? 0 : result));
    115115                return true;
     
    130130            }
    131131            if (character == ';') {
    132                 source.advance();
     132                source.advancePastNonNewline();
    133133                decodedCharacter.append(ParserFunctions::legalEntityFor(overflow ? 0 : result));
    134134                return true;
     
    145145        }
    146146        consumedCharacters.append(character);
    147         source.advance();
     147        source.advancePastNonNewline();
    148148    }
    149149    ASSERT(source.isEmpty());
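Reviewer note: the `advance()` to `advancePastNonNewline()` replacements in this file are safe because each call site has already established that the current character is `;` or some other non-newline character. The sketch below illustrates the idea with a hypothetical `InputCursor` class and a hypothetical `consumeDecimalReference` helper. Neither is WebKit code, and the real `SegmentedString` functions may differ in detail.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical stand-in for a segmented input cursor; not WebKit's SegmentedString.
class InputCursor {
public:
    explicit InputCursor(std::string text)
        : m_text(std::move(text))
    {
    }

    bool isEmpty() const { return m_position >= m_text.size(); }
    char currentCharacter() const { return m_text[m_position]; }
    int lineNumber() const { return m_lineNumber; }

    // General path: must check for a newline on every step.
    void advance()
    {
        assert(!isEmpty());
        if (m_text[m_position] == '\n')
            ++m_lineNumber;
        ++m_position;
    }

    // Fast path: the caller guarantees the current character is not a
    // newline, so the line-number branch is skipped entirely.
    void advancePastNonNewline()
    {
        assert(!isEmpty());
        assert(m_text[m_position] != '\n');
        ++m_position;
    }

private:
    std::string m_text;
    std::size_t m_position { 0 };
    int m_lineNumber { 1 };
};

// Consumes decimal digits up to and including a terminating ';'.
// Returns true if a ';' was found; the parsed value is stored in result.
inline bool consumeDecimalReference(InputCursor& source, unsigned& result)
{
    result = 0;
    while (!source.isEmpty()) {
        char character = source.currentCharacter();
        if (character == ';') {
            source.advancePastNonNewline(); // ';' is known not to be '\n'.
            return true;
        }
        if (character < '0' || character > '9')
            return false;
        result = result * 10 + static_cast<unsigned>(character - '0');
        source.advancePastNonNewline(); // A digit is known not to be '\n'.
    }
    return false;
}
```

The debug-only assertion in `advancePastNonNewline()` documents the caller's obligation without adding a branch to release builds.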
  • trunk/Source/WebCore/xml/parser/MarkupTokenizerInlines.h

    r208646 r209058  
    11/*
    2  * Copyright (C) 2008, 2015 Apple Inc. All Rights Reserved.
     2 * Copyright (C) 2008-2016 Apple Inc. All Rights Reserved.
    33 * Copyright (C) 2009 Torch Mobile, Inc. http://www.torchmobile.com/
    44 * Copyright (C) 2010 Google, Inc. All Rights Reserved.
     
    2828#pragma once
    2929
    30 #include "SegmentedString.h"
    31 
    3230#if COMPILER(MSVC)
    3331// Disable the "unreachable code" warning so we can compile the ASSERT_NOT_REACHED in the END_STATE macro.
     
    4543    case stateName:                                             \
    4644    stateName: {                                                \
    47         const auto currentState = stateName;                    \
     45        constexpr auto currentState = stateName;                \
    4846        UNUSED_PARAM(currentState);
    4947
     
    7573        goto newState;                                          \
    7674    } while (false)
     75#define ADVANCE_PAST_NON_NEWLINE_TO(newState)                   \
     76    do {                                                        \
     77        if (!m_preprocessor.advancePastNonNewline(source, isNullCharacterSkippingState(newState))) { \
     78            m_state = newState;                                 \
     79            return haveBufferedCharacterToken();                \
     80        }                                                       \
     81        character = m_preprocessor.nextInputCharacter();        \
     82        goto newState;                                          \
     83    } while (false)
    7784
    7885// For more complex cases, caller consumes the characters first and then uses this macro.
  • trunk/Source/WebCore/xml/parser/XMLDocumentParser.cpp

    r208840 r209058  
    101101}
    102102
    103 void XMLDocumentParser::insert(const SegmentedString&)
     103void XMLDocumentParser::insert(SegmentedString&&)
    104104{
    105105    ASSERT_NOT_REACHED();
     
    108108void XMLDocumentParser::append(RefPtr<StringImpl>&& inputSource)
    109109{
    110     SegmentedString source(WTFMove(inputSource));
     110    String source { WTFMove(inputSource) };
     111
    111112    if (m_sawXSLTransform || !m_sawFirstElement)
    112113        m_originalSourceForTransform.append(source);
     
    120121    }
    121122
    122     doWrite(source.toString());
     123    doWrite(source);
    123124
    124125    // After parsing, dispatch image beforeload events.
     
    153154}
    154155
    155 
    156156bool XMLDocumentParser::updateLeafTextNode()
    157157{
  • trunk/Source/WebCore/xml/parser/XMLDocumentParser.h

    r208840 r209058  
    11/*
    22 * Copyright (C) 2000 Peter Kelly (pmk@post.com)
    3  * Copyright (C) 2005, 2006, 2007 Apple Inc. All rights reserved.
     3 * Copyright (C) 2005-2016 Apple Inc. All rights reserved.
    44 * Copyright (C) 2007 Samuel Weinig (sam@webkit.org)
    55 * Copyright (C) 2008 Nokia Corporation and/or its subsidiary(-ies)
     
    3030#include "SegmentedString.h"
    3131#include "XMLErrors.h"
     32#include <libxml/tree.h>
     33#include <libxml/xmlstring.h>
    3234#include <wtf/HashMap.h>
    3335#include <wtf/text/AtomicStringHash.h>
    3436#include <wtf/text/CString.h>
    35 
    36 #include <libxml/tree.h>
    37 #include <libxml/xmlstring.h>
    3837
    3938namespace WebCore {
     
    4241class CachedResourceLoader;
    4342class DocumentFragment;
    44 class Document;
    4543class Element;
    4644class FrameView;
    4745class PendingCallbacks;
    48 class PendingScript;
    4946class Text;
    5047
    51     class XMLParserContext : public RefCounted<XMLParserContext> {
    52     public:
    53         static RefPtr<XMLParserContext> createMemoryParser(xmlSAXHandlerPtr, void* userData, const CString& chunk);
    54         static Ref<XMLParserContext> createStringParser(xmlSAXHandlerPtr, void* userData);
    55         ~XMLParserContext();
    56         xmlParserCtxtPtr context() const { return m_context; }
     48class XMLParserContext : public RefCounted<XMLParserContext> {
     49public:
     50    static RefPtr<XMLParserContext> createMemoryParser(xmlSAXHandlerPtr, void* userData, const CString& chunk);
     51    static Ref<XMLParserContext> createStringParser(xmlSAXHandlerPtr, void* userData);
     52    ~XMLParserContext();
     53    xmlParserCtxtPtr context() const { return m_context; }
    5754
    58     private:
    59         XMLParserContext(xmlParserCtxtPtr context)
    60             : m_context(context)
    61         {
    62         }
    63         xmlParserCtxtPtr m_context;
    64     };
     55private:
     56    XMLParserContext(xmlParserCtxtPtr context)
     57        : m_context(context)
     58    {
     59    }
     60    xmlParserCtxtPtr m_context;
     61};
    6562
    66     class XMLDocumentParser final : public ScriptableDocumentParser, public PendingScriptClient {
    67         WTF_MAKE_FAST_ALLOCATED;
    68     public:
    69         static Ref<XMLDocumentParser> create(Document& document, FrameView* view)
    70         {
    71             return adoptRef(*new XMLDocumentParser(document, view));
    72         }
    73         static Ref<XMLDocumentParser> create(DocumentFragment& fragment, Element* element, ParserContentPolicy parserContentPolicy)
    74         {
    75             return adoptRef(*new XMLDocumentParser(fragment, element, parserContentPolicy));
    76         }
     63class XMLDocumentParser final : public ScriptableDocumentParser, public PendingScriptClient {
     64    WTF_MAKE_FAST_ALLOCATED;
     65public:
     66    static Ref<XMLDocumentParser> create(Document& document, FrameView* view)
     67    {
     68        return adoptRef(*new XMLDocumentParser(document, view));
     69    }
     70    static Ref<XMLDocumentParser> create(DocumentFragment& fragment, Element* element, ParserContentPolicy parserContentPolicy)
     71    {
     72        return adoptRef(*new XMLDocumentParser(fragment, element, parserContentPolicy));
     73    }
    7774
    78         ~XMLDocumentParser();
     75    ~XMLDocumentParser();
    7976
    80         // Exposed for callbacks:
    81         void handleError(XMLErrors::ErrorType, const char* message, TextPosition);
     77    // Exposed for callbacks:
     78    void handleError(XMLErrors::ErrorType, const char* message, TextPosition);
    8279
    83         void setIsXHTMLDocument(bool isXHTML) { m_isXHTMLDocument = isXHTML; }
    84         bool isXHTMLDocument() const { return m_isXHTMLDocument; }
     80    void setIsXHTMLDocument(bool isXHTML) { m_isXHTMLDocument = isXHTML; }
     81    bool isXHTMLDocument() const { return m_isXHTMLDocument; }
    8582
    86         static bool parseDocumentFragment(const String&, DocumentFragment&, Element* parent = nullptr, ParserContentPolicy = AllowScriptingContent);
     83    static bool parseDocumentFragment(const String&, DocumentFragment&, Element* parent = nullptr, ParserContentPolicy = AllowScriptingContent);
    8784
    88         // Used by the XMLHttpRequest to check if the responseXML was well formed.
    89         bool wellFormed() const override { return !m_sawError; }
     85    // Used by XMLHttpRequest to check if the responseXML was well formed.
     86    bool wellFormed() const final { return !m_sawError; }
    9087
    91         static bool supportsXMLVersion(const String&);
     88    static bool supportsXMLVersion(const String&);
    9289
    93     private:
    94         XMLDocumentParser(Document&, FrameView* = nullptr);
    95         XMLDocumentParser(DocumentFragment&, Element*, ParserContentPolicy);
     90private:
     91    explicit XMLDocumentParser(Document&, FrameView* = nullptr);
     92    XMLDocumentParser(DocumentFragment&, Element*, ParserContentPolicy);
    9693
    97         // From DocumentParser
    98         void insert(const SegmentedString&) override;
    99         void append(RefPtr<StringImpl>&&) override;
    100         void finish() override;
    101         bool isWaitingForScripts() const override;
    102         void stopParsing() override;
    103         void detach() override;
     94    void insert(SegmentedString&&) final;
     95    void append(RefPtr<StringImpl>&&) final;
     96    void finish() final;
     97    bool isWaitingForScripts() const final;
     98    void stopParsing() final;
     99    void detach() final;
    104100
    105         TextPosition textPosition() const override;
    106         bool shouldAssociateConsoleMessagesWithTextPosition() const override;
     101    TextPosition textPosition() const final;
     102    bool shouldAssociateConsoleMessagesWithTextPosition() const final;
    107103
    108         void notifyFinished(PendingScript&) final;
     104    void notifyFinished(PendingScript&) final;
    109105
    110         void end();
     106    void end();
    111107
    112         void pauseParsing();
    113         void resumeParsing();
     108    void pauseParsing();
     109    void resumeParsing();
    114110
    115         bool appendFragmentSource(const String&);
     111    bool appendFragmentSource(const String&);
    116112
    117     public:
    118         // callbacks from parser SAX
    119         void error(XMLErrors::ErrorType, const char* message, va_list args) WTF_ATTRIBUTE_PRINTF(3, 0);
    120         void startElementNs(const xmlChar* xmlLocalName, const xmlChar* xmlPrefix, const xmlChar* xmlURI, int nb_namespaces,
    121                             const xmlChar** namespaces, int nb_attributes, int nb_defaulted, const xmlChar** libxmlAttributes);
    122         void endElementNs();
    123         void characters(const xmlChar* s, int len);
    124         void processingInstruction(const xmlChar* target, const xmlChar* data);
    125         void cdataBlock(const xmlChar* s, int len);
    126         void comment(const xmlChar* s);
    127         void startDocument(const xmlChar* version, const xmlChar* encoding, int standalone);
    128         void internalSubset(const xmlChar* name, const xmlChar* externalID, const xmlChar* systemID);
    129         void endDocument();
     113public:
     114    // Callbacks from parser SAX, and other functions needed inside
     115    // the parser implementation, but outside this class.
    130116
    131         bool isParsingEntityDeclaration() const { return m_isParsingEntityDeclaration; }
    132         void setIsParsingEntityDeclaration(bool value) { m_isParsingEntityDeclaration = value; }
     117    void error(XMLErrors::ErrorType, const char* message, va_list args) WTF_ATTRIBUTE_PRINTF(3, 0);
     118    void startElementNs(const xmlChar* xmlLocalName, const xmlChar* xmlPrefix, const xmlChar* xmlURI,
     119        int numNamespaces, const xmlChar** namespaces,
     120        int numAttributes, int numDefaulted, const xmlChar** libxmlAttributes);
     121    void endElementNs();
     122    void characters(const xmlChar*, int length);
     123    void processingInstruction(const xmlChar* target, const xmlChar* data);
     124    void cdataBlock(const xmlChar*, int length);
     125    void comment(const xmlChar*);
     126    void startDocument(const xmlChar* version, const xmlChar* encoding, int standalone);
     127    void internalSubset(const xmlChar* name, const xmlChar* externalID, const xmlChar* systemID);
     128    void endDocument();
    133129
    134         int depthTriggeringEntityExpansion() const { return m_depthTriggeringEntityExpansion; }
    135         void setDepthTriggeringEntityExpansion(int depth) { m_depthTriggeringEntityExpansion = depth; }
     130    bool isParsingEntityDeclaration() const { return m_isParsingEntityDeclaration; }
     131    void setIsParsingEntityDeclaration(bool value) { m_isParsingEntityDeclaration = value; }
    136132
    137     private:
    138         void initializeParserContext(const CString& chunk = CString());
     133    int depthTriggeringEntityExpansion() const { return m_depthTriggeringEntityExpansion; }
     134    void setDepthTriggeringEntityExpansion(int depth) { m_depthTriggeringEntityExpansion = depth; }
    139135
    140         void pushCurrentNode(ContainerNode*);
    141         void popCurrentNode();
    142         void clearCurrentNodeStack();
     136private:
     137    void initializeParserContext(const CString& chunk = CString());
    143138
    144         void insertErrorMessageBlock();
     139    void pushCurrentNode(ContainerNode*);
     140    void popCurrentNode();
     141    void clearCurrentNodeStack();
    145142
    146         void createLeafTextNode();
    147         bool updateLeafTextNode();
     143    void insertErrorMessageBlock();
    148144
    149         void doWrite(const String&);
    150         void doEnd();
     145    void createLeafTextNode();
     146    bool updateLeafTextNode();
    151147
    152         FrameView* m_view;
     148    void doWrite(const String&);
     149    void doEnd();
    153150
    154         SegmentedString m_originalSourceForTransform;
     151    xmlParserCtxtPtr context() const { return m_context ? m_context->context() : nullptr; };
    155152
    156         xmlParserCtxtPtr context() const { return m_context ? m_context->context() : nullptr; };
    157         RefPtr<XMLParserContext> m_context;
    158         std::unique_ptr<PendingCallbacks> m_pendingCallbacks;
    159         Vector<xmlChar> m_bufferedText;
    160         int m_depthTriggeringEntityExpansion;
    161         bool m_isParsingEntityDeclaration;
     153    FrameView* m_view { nullptr };
    162154
    163         ContainerNode* m_currentNode;
    164         Vector<ContainerNode*> m_currentNodeStack;
     155    SegmentedString m_originalSourceForTransform;
    165156
    166         RefPtr<Text> m_leafTextNode;
     157    RefPtr<XMLParserContext> m_context;
     158    std::unique_ptr<PendingCallbacks> m_pendingCallbacks;
     159    Vector<xmlChar> m_bufferedText;
     160    int m_depthTriggeringEntityExpansion { -1 };
     161    bool m_isParsingEntityDeclaration { false };
    167162
    168         bool m_sawError;
    169         bool m_sawCSS;
    170         bool m_sawXSLTransform;
    171         bool m_sawFirstElement;
    172         bool m_isXHTMLDocument;
    173         bool m_parserPaused;
    174         bool m_requestingScript;
    175         bool m_finishCalled;
     163    ContainerNode* m_currentNode { nullptr };
     164    Vector<ContainerNode*> m_currentNodeStack;
    176165
    177         std::unique_ptr<XMLErrors> m_xmlErrors;
     166    RefPtr<Text> m_leafTextNode;
    178167
    179         RefPtr<PendingScript> m_pendingScript;
    180         TextPosition m_scriptStartPosition;
     168    bool m_sawError { false };
     169    bool m_sawCSS { false };
     170    bool m_sawXSLTransform { false };
     171    bool m_sawFirstElement { false };
     172    bool m_isXHTMLDocument { false };
     173    bool m_parserPaused { false };
     174    bool m_requestingScript { false };
     175    bool m_finishCalled { false };
    181176
    182         bool m_parsingFragment;
    183         AtomicString m_defaultNamespaceURI;
     177    std::unique_ptr<XMLErrors> m_xmlErrors;
    184178
    185         typedef HashMap<AtomicString, AtomicString> PrefixForNamespaceMap;
    186         PrefixForNamespaceMap m_prefixToNamespaceMap;
    187         SegmentedString m_pendingSrc;
    188     };
     179    RefPtr<PendingScript> m_pendingScript;
     180    TextPosition m_scriptStartPosition;
     181
     182    bool m_parsingFragment { false };
     183    AtomicString m_defaultNamespaceURI;
     184
     185    HashMap<AtomicString, AtomicString> m_prefixToNamespaceMap;
     186    SegmentedString m_pendingSrc;
     187};
    189188
    190189#if ENABLE(XSLT)
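Reviewer note: this header moves the default values into the member declarations (`m_sawError { false }`, `m_depthTriggeringEntityExpansion { -1 }`, and so on), which is what lets the two constructors in XMLDocumentParserLibxml2.cpp drop most of their initializer lists. A minimal sketch of the pattern follows. `ParserState` is a hypothetical class that only borrows a few member names from the diff.

```cpp
#include <cassert>

// Hypothetical parser state; illustrative only, not the WebKit class.
class ParserState {
public:
    // Every member takes its in-class default.
    ParserState() = default;

    // Only the member that differs from its default is listed here.
    explicit ParserState(bool parsingFragment)
        : m_parsingFragment(parsingFragment)
    {
    }

    int depthTriggeringEntityExpansion() const { return m_depthTriggeringEntityExpansion; }
    bool sawError() const { return m_sawError; }
    bool parsingFragment() const { return m_parsingFragment; }

private:
    // Default member initializers apply to every constructor that does
    // not explicitly initialize the member.
    int m_depthTriggeringEntityExpansion { -1 };
    bool m_sawError { false };
    bool m_parsingFragment { false };
};
```

Keeping the defaults next to the declarations means a newly added constructor cannot leave one of these flags uninitialized by omission.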
  • trunk/Source/WebCore/xml/parser/XMLDocumentParserLibxml2.cpp

    r208840 r209058  
    11/*
    22 * Copyright (C) 2000 Peter Kelly <pmk@post.com>
    3  * Copyright (C) 2005, 2006, 2008 Apple Inc. All rights reserved.
     3 * Copyright (C) 2005-2016 Apple Inc. All rights reserved.
    44 * Copyright (C) 2006 Alexey Proskuryakov <ap@webkit.org>
    55 * Copyright (C) 2007 Samuel Weinig <sam@webkit.org>
     
    3636#include "DocumentType.h"
    3737#include "Frame.h"
    38 #include "FrameLoader.h"
    39 #include "FrameView.h"
    4038#include "HTMLEntityParser.h"
    4139#include "HTMLHtmlElement.h"
    42 #include "HTMLLinkElement.h"
    43 #include "HTMLNames.h"
    44 #include "HTMLStyleElement.h"
    4540#include "HTMLTemplateElement.h"
    46 #include "LoadableClassicScript.h"
    4741#include "Page.h"
    4842#include "PendingScript.h"
    4943#include "ProcessingInstruction.h"
    5044#include "ResourceError.h"
    51 #include "ResourceRequest.h"
    5245#include "ResourceResponse.h"
    5346#include "ScriptElement.h"
    5447#include "ScriptSourceCode.h"
    55 #include "SecurityOrigin.h"
    5648#include "Settings.h"
    5749#include "StyleScope.h"
    58 #include "TextResourceDecoder.h"
    5950#include "TransformSource.h"
    6051#include "XMLNSNames.h"
    6152#include "XMLDocumentParserScope.h"
    6253#include <libxml/parserInternals.h>
    63 #include <wtf/Ref.h>
    6454#include <wtf/StringExtras.h>
    65 #include <wtf/Threading.h>
    66 #include <wtf/Vector.h>
    6755#include <wtf/unicode/UTF8.h>
    6856
     
    7563
    7664#if ENABLE(XSLT)
    77 static inline bool hasNoStyleInformation(Document* document)
    78 {
    79     if (document->sawElementsInKnownNamespaces())
     65
     66static inline bool shouldRenderInXMLTreeViewerMode(Document& document)
     67{
     68    if (document.sawElementsInKnownNamespaces())
    8069        return false;
    8170
    82     if (document->transformSourceDocument())
     71    if (document.transformSourceDocument())
    8372        return false;
    8473
    85     if (!document->frame() || !document->frame()->page())
     74    auto* frame = document.frame();
     75    if (!frame)
    8676        return false;
    8777
    88     if (!document->frame()->page()->settings().developerExtrasEnabled())
     78    if (!frame->settings().developerExtrasEnabled())
    8979        return false;
    9080
    91     if (document->frame()->tree().parent())
     81    if (frame->tree().parent())
    9282        return false; // This document is not in a top frame
    9383
    9484    return true;
    9585}
     86
    9687#endif
    9788
    9889class PendingCallbacks {
    99     WTF_MAKE_NONCOPYABLE(PendingCallbacks); WTF_MAKE_FAST_ALLOCATED;
     90    WTF_MAKE_FAST_ALLOCATED;
    10091public:
    101     PendingCallbacks() = default;
    102 
    10392    void appendStartElementNSCallback(const xmlChar* xmlLocalName, const xmlChar* xmlPrefix, const xmlChar* xmlURI, int numNamespaces, const xmlChar** namespaces, int numAttributes, int numDefaulted, const xmlChar** attributes)
    10493    {
     
    576565    : ScriptableDocumentParser(document)
    577566    , m_view(frameView)
    578     , m_context(nullptr)
    579567    , m_pendingCallbacks(std::make_unique<PendingCallbacks>())
    580     , m_depthTriggeringEntityExpansion(-1)
    581     , m_isParsingEntityDeclaration(false)
    582568    , m_currentNode(&document)
    583     , m_sawError(false)
    584     , m_sawCSS(false)
    585     , m_sawXSLTransform(false)
    586     , m_sawFirstElement(false)
    587     , m_isXHTMLDocument(false)
    588     , m_parserPaused(false)
    589     , m_requestingScript(false)
    590     , m_finishCalled(false)
    591569    , m_scriptStartPosition(TextPosition::belowRangePosition())
    592     , m_parsingFragment(false)
    593570{
    594571}
     
    596573XMLDocumentParser::XMLDocumentParser(DocumentFragment& fragment, Element* parentElement, ParserContentPolicy parserContentPolicy)
    597574    : ScriptableDocumentParser(fragment.document(), parserContentPolicy)
    598     , m_view(nullptr)
    599     , m_context(nullptr)
    600575    , m_pendingCallbacks(std::make_unique<PendingCallbacks>())
    601     , m_depthTriggeringEntityExpansion(-1)
    602     , m_isParsingEntityDeclaration(false)
    603576    , m_currentNode(&fragment)
    604     , m_sawError(false)
    605     , m_sawCSS(false)
    606     , m_sawXSLTransform(false)
    607     , m_sawFirstElement(false)
    608     , m_isXHTMLDocument(false)
    609     , m_parserPaused(false)
    610     , m_requestingScript(false)
    611     , m_finishCalled(false)
    612577    , m_scriptStartPosition(TextPosition::belowRangePosition())
    613578    , m_parsingFragment(true)
     
    11951160{
    11961161    const char* originalTarget = target;
    1197     WTF::Unicode::ConversionResult conversionResult = WTF::Unicode::convertUTF16ToUTF8(&utf16Entity,
    1198         utf16Entity + numberOfCodeUnits, &target, target + targetSize);
     1162    auto conversionResult = WTF::Unicode::convertUTF16ToUTF8(&utf16Entity, utf16Entity + numberOfCodeUnits, &target, target + targetSize);
    11991163    if (conversionResult != WTF::Unicode::conversionOK)
    12001164        return 0;
     
    13651329
    13661330#if ENABLE(XSLT)
    1367     bool xmlViewerMode = !m_sawError && !m_sawCSS && !m_sawXSLTransform && hasNoStyleInformation(document());
     1331    bool xmlViewerMode = !m_sawError && !m_sawCSS && !m_sawXSLTransform && shouldRenderInXMLTreeViewerMode(*document());
    13681332    if (xmlViewerMode) {
    13691333        XMLTreeViewer xmlTreeViewer(*document());
     
    14511415    }
    14521416
    1453     // Then, write any pending data
    1454     SegmentedString rest = m_pendingSrc;
    1455     m_pendingSrc.clear();
    14561417    // There is normally only one string left, so toString() shouldn't copy.
    14571418    // In any case, the XML parser runs on the main thread and it's OK if
    14581419    // the passed string has more than one reference.
    1459     append(rest.toString().impl());
     1420    auto rest = m_pendingSrc.toString();
     1421    m_pendingSrc.clear();
     1422    append(rest.impl());
    14601423
    14611424    // Finally, if finish() has been called and write() didn't result