Changeset 10867 in webkit

Timestamp:

Oct 17, 2005, 8:15:31 PM (20 years ago)

Author:

mjs

Message:

JavaScriptCore:

Reviewed by Geoff. Code changes by Darin.

some micro-optimizations to FastMalloc to reduce math and branches.

kxmlcore/FastMalloc.cpp: (KXMLCore::TCMalloc_Central_FreeList::Populate): (KXMLCore::fastMallocRegisterThread): (KXMLCore::TCMalloc_ThreadCache::GetCache): (KXMLCore::TCMalloc_ThreadCache::GetCacheIfPresent):
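As an illustration of "reducing math" in a carving loop such as the one in Populate, the sketch below computes the next pointer once per iteration and reuses it for both the bounds check and the advance. It is a minimal stand-alone example, not the TCMalloc code: `carveFreeList` and its parameters are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a free-list carving loop: splits [ptr, limit)
// into objects of `size` bytes and links them into a singly linked list
// whose head is stored in *head. Returns the number of objects carved.
// Assumes ptr is suitably aligned for storing a void*.
inline int carveFreeList(char* ptr, char* limit, std::size_t size, void** head)
{
    assert(size >= sizeof(void*));
    void** tail = head;
    int num = 0;
    char* nptr;
    // ptr + size is computed once, then used for the test and the advance.
    while ((nptr = ptr + size) <= limit) {
        *tail = ptr;
        tail = reinterpret_cast<void**>(ptr);
        ptr = nptr;
        num++;
    }
    *tail = nullptr; // terminate the list
    return num;
}
```

Whether this saves anything in practice depends on the compiler; a modern optimizer may already fold the repeated addition.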

WebCore:

Reviewed by Geoff.

Speed up the tokenizer by keeping more state on the stack instead of in the object,
to avoid load-store traffic. About a .5% speedup.

khtml/html/htmltokenizer.cpp: (khtml::HTMLTokenizer::HTMLTokenizer): (khtml::HTMLTokenizer::reset): (khtml::HTMLTokenizer::begin): (khtml::HTMLTokenizer::setForceSynchronous): (khtml::HTMLTokenizer::processListing): (khtml::HTMLTokenizer::parseSpecial): (khtml::HTMLTokenizer::scriptHandler): (khtml::HTMLTokenizer::scriptExecution): (khtml::HTMLTokenizer::parseComment): (khtml::HTMLTokenizer::parseServer): (khtml::HTMLTokenizer::parseProcessingInstruction): (khtml::HTMLTokenizer::parseText): (khtml::HTMLTokenizer::parseEntity): (khtml::HTMLTokenizer::parseTag): (khtml::HTMLTokenizer::continueProcessing): (khtml::HTMLTokenizer::write): (khtml::HTMLTokenizer::allDataProcessed): (khtml::HTMLTokenizer::end): (khtml::HTMLTokenizer::finish): (khtml::HTMLTokenizer::notifyFinished): (khtml::HTMLTokenizer::isWaitingForScripts):
khtml/html/htmltokenizer.h: (khtml::HTMLTokenizer::): (khtml::HTMLTokenizer::State::State): (khtml::HTMLTokenizer::State::tagState): (khtml::HTMLTokenizer::State::setTagState): (khtml::HTMLTokenizer::State::entityState): (khtml::HTMLTokenizer::State::setEntityState): (khtml::HTMLTokenizer::State::inScript): (khtml::HTMLTokenizer::State::setInScript): (khtml::HTMLTokenizer::State::inStyle): (khtml::HTMLTokenizer::State::setInStyle): (khtml::HTMLTokenizer::State::inSelect): (khtml::HTMLTokenizer::State::setInSelect): (khtml::HTMLTokenizer::State::inXmp): (khtml::HTMLTokenizer::State::setInXmp): (khtml::HTMLTokenizer::State::inTitle): (khtml::HTMLTokenizer::State::setInTitle): (khtml::HTMLTokenizer::State::inPlainText): (khtml::HTMLTokenizer::State::setInPlainText): (khtml::HTMLTokenizer::State::inProcessingInstruction): (khtml::HTMLTokenizer::State::setInProcessingInstruction): (khtml::HTMLTokenizer::State::inComment): (khtml::HTMLTokenizer::State::setInComment): (khtml::HTMLTokenizer::State::inTextArea): (khtml::HTMLTokenizer::State::setInTextArea): (khtml::HTMLTokenizer::State::escaped): (khtml::HTMLTokenizer::State::setEscaped): (khtml::HTMLTokenizer::State::inServer): (khtml::HTMLTokenizer::State::setInServer): (khtml::HTMLTokenizer::State::skipLF): (khtml::HTMLTokenizer::State::setSkipLF): (khtml::HTMLTokenizer::State::startTag): (khtml::HTMLTokenizer::State::setStartTag): (khtml::HTMLTokenizer::State::discardLF): (khtml::HTMLTokenizer::State::setDiscardLF): (khtml::HTMLTokenizer::State::allowYield): (khtml::HTMLTokenizer::State::setAllowYield): (khtml::HTMLTokenizer::State::loadingExtScript): (khtml::HTMLTokenizer::State::setLoadingExtScript): (khtml::HTMLTokenizer::State::forceSynchronous): (khtml::HTMLTokenizer::State::setForceSynchronous): (khtml::HTMLTokenizer::State::inAnySpecial): (khtml::HTMLTokenizer::State::hasTagState): (khtml::HTMLTokenizer::State::hasEntityState): (khtml::HTMLTokenizer::State::): (khtml::HTMLTokenizer::State::setBit): 
(khtml::HTMLTokenizer::State::testBit):
khtml/rendering/bidi.cpp: (khtml::RenderBlock::checkLinesForTextOverflow):
khtml/rendering/render_block.cpp: (khtml::RenderBlock::updateFirstLetter):
khtml/rendering/render_flow.cpp: (RenderFlow::caretRect):
khtml/rendering/render_line.cpp: (khtml::EllipsisBox::paint):
khtml/rendering/render_object.cpp: (RenderObject::firstLineStyle):
khtml/rendering/render_object.h: (khtml::RenderObject::style):
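The tokenizer change described above packs the parsing flags into a small State value that is passed to each parse function and returned from it, so the flags can stay in registers or on the stack instead of being loaded from and stored to the object. The following is a reduced sketch of that shape, assuming a hypothetical three-flag layout; the bit positions and the function `copyWithNormalizedNewlines` are illustrative and not taken from the WebKit source.

```cpp
#include <cassert>
#include <string>

// Reduced, hypothetical version of a packed tokenizer state.
class State {
public:
    State() : m_bits(0) { }

    bool inScript() const { return testBit(InScript); }
    void setInScript(bool v) { setBit(InScript, v); }
    bool skipLF() const { return testBit(SkipLF); }
    void setSkipLF(bool v) { setBit(SkipLF, v); }
    bool discardLF() const { return testBit(DiscardLF); }
    void setDiscardLF(bool v) { setBit(DiscardLF, v); }

private:
    // Illustrative bit assignments only.
    enum StateBits { InScript = 1 << 0, SkipLF = 1 << 1, DiscardLF = 1 << 2 };

    void setBit(StateBits bit, bool value)
    {
        if (value)
            m_bits |= bit;
        else
            m_bits &= ~bit;
    }
    bool testBit(StateBits bit) const { return (m_bits & bit) != 0; }

    unsigned m_bits;
};

// Takes the state by value and returns the updated copy, in the style of the
// new parse functions. Converts CR, LF and CRLF to '\n' while appending to out.
inline State copyWithNormalizedNewlines(const std::string& in, std::string& out, State state)
{
    assert(&in != &out); // input and output must be distinct strings
    for (char c : in) {
        if (state.skipLF()) {
            state.setSkipLF(false);
            if (c == '\n')
                continue; // second half of a CRLF pair
        }
        if (c == '\n' || c == '\r') {
            out += '\n';
            if (c == '\r')
                state.setSkipLF(true);
        } else
            out += c;
    }
    return state;
}
```

Because the state is returned rather than stored, a CR at the end of one chunk is still remembered when the caller passes the returned value into the next call.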

Location:

trunk

Files:

11 edited

JavaScriptCore/ChangeLog (modified) (1 diff)
JavaScriptCore/kxmlcore/FastMalloc.cpp (modified) (3 diffs)
WebCore/ChangeLog-2005-12-19 (modified) (1 diff)
WebCore/khtml/html/htmltokenizer.cpp (modified) (72 diffs)
WebCore/khtml/html/htmltokenizer.h (modified) (7 diffs)
WebCore/khtml/rendering/bidi.cpp (modified) (1 diff)
WebCore/khtml/rendering/render_block.cpp (modified) (2 diffs)
WebCore/khtml/rendering/render_flow.cpp (modified) (1 diff)
WebCore/khtml/rendering/render_line.cpp (modified) (1 diff)
WebCore/khtml/rendering/render_object.cpp (modified) (1 diff)
WebCore/khtml/rendering/render_object.h (modified) (1 diff)

trunk/JavaScriptCore/ChangeLog

-              r10857
+              r10867
+2005-10-17  Maciej Stachowiak  <mjs@apple.com>
+        Reviewed by Geoff. Code changes by Darin.
+        - some micro-optimizations to FastMalloc to reduce math and branches.
+        * kxmlcore/FastMalloc.cpp:
+        (KXMLCore::TCMalloc_Central_FreeList::Populate):
+        (KXMLCore::fastMallocRegisterThread):
+        (KXMLCore::TCMalloc_ThreadCache::GetCache):
+        (KXMLCore::TCMalloc_ThreadCache::GetCacheIfPresent):
 2005-10-15  Maciej Stachowiak  <mjs@apple.com>

trunk/JavaScriptCore/kxmlcore/FastMalloc.cpp

-              r10703
+              r10867
   const size_t size = ByteSizeForClass(size_class_);
   int num = 0;
-  while (ptr + size <= limit) {
+  char* nptr;
+  while ((nptr = ptr + size) <= limit) {
     *tail = ptr;
     tail = reinterpret_cast<void**>(ptr);
-    ptr += size;
+    ptr = nptr;
     num++;
   }
 …
         // And other threads can't get it wrong because they must have gone through
         // this function before allocating so they've synchronized.
+        // Also, mainThreadCache is only set when isMultiThreaded is false,
+        // to save a branch in some cases.
         SpinLockHolder lock(&multiThreadedLock);
         isMultiThreaded = true;
+        mainThreadCache = 0;
     }
 }
-inline TCMalloc_ThreadCache* TCMalloc_ThreadCache::GetCache() {
+ALWAYS_INLINE TCMalloc_ThreadCache* TCMalloc_ThreadCache::GetCache() {
   void* ptr = NULL;
   if (!tsd_inited) {
     InitModule();
   } else {
-      if (!isMultiThreaded)
+      if (mainThreadCache)
           ptr = mainThreadCache;
       else
 …
 // already cleaned up the cache for this thread.
 inline TCMalloc_ThreadCache* TCMalloc_ThreadCache::GetCacheIfPresent() {
-  if (!isMultiThreaded)
+  if (mainThreadCache)
       return mainThreadCache;
   if (!tsd_inited) return NULL;
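The comment added in this diff states that mainThreadCache is only set while isMultiThreaded is false. Under that invariant, the fast path can test the pointer directly instead of testing the flag and then loading the pointer. The sketch below shows the idea in isolation; `Cache`, `perThreadCache`, and `goMultiThreaded` are hypothetical stand-ins, and the real code uses thread-specific data rather than a plain global for the per-thread case.

```cpp
#include <cassert>

struct Cache { int id; };

// Names mirror the diff; the surrounding machinery is simplified.
static Cache* mainThreadCache = nullptr;
static bool isMultiThreaded = false;
static Cache* perThreadCache = nullptr; // stand-in for a thread-specific lookup

// Hypothetical equivalent of the registration step: once a second thread
// exists, the main-thread shortcut is cleared so it can never be taken again.
inline void goMultiThreaded()
{
    isMultiThreaded = true;
    mainThreadCache = nullptr;
}

inline Cache* getCacheIfPresent()
{
    // Invariant from the diff comment: non-null implies single-threaded.
    assert(!mainThreadCache || !isMultiThreaded);
    // A single test on the pointer replaces a flag test plus a pointer load.
    if (mainThreadCache)
        return mainThreadCache;
    return perThreadCache;
}
```

The cost of the shortcut is that every transition to multi-threaded mode must remember to clear the pointer, which is what the added `mainThreadCache = 0;` line does.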

trunk/WebCore/ChangeLog-2005-12-19

-              r10866
+              r10867
+2005-10-17  Maciej Stachowiak  <mjs@apple.com>
+        Reviewed by Geoff.
+        Speed up the tokenizer by keeping more state on the stack instead of in the object,
+        to avoid load-store traffic. About a .5% speedup.
+        * khtml/html/htmltokenizer.cpp:
+        (khtml::HTMLTokenizer::HTMLTokenizer):
+        (khtml::HTMLTokenizer::reset):
+        (khtml::HTMLTokenizer::begin):
+        (khtml::HTMLTokenizer::setForceSynchronous):
+        (khtml::HTMLTokenizer::processListing):
+        (khtml::HTMLTokenizer::parseSpecial):
+        (khtml::HTMLTokenizer::scriptHandler):
+        (khtml::HTMLTokenizer::scriptExecution):
+        (khtml::HTMLTokenizer::parseComment):
+        (khtml::HTMLTokenizer::parseServer):
+        (khtml::HTMLTokenizer::parseProcessingInstruction):
+        (khtml::HTMLTokenizer::parseText):
+        (khtml::HTMLTokenizer::parseEntity):
+        (khtml::HTMLTokenizer::parseTag):
+        (khtml::HTMLTokenizer::continueProcessing):
+        (khtml::HTMLTokenizer::write):
+        (khtml::HTMLTokenizer::allDataProcessed):
+        (khtml::HTMLTokenizer::end):
+        (khtml::HTMLTokenizer::finish):
+        (khtml::HTMLTokenizer::notifyFinished):
+        (khtml::HTMLTokenizer::isWaitingForScripts):
+        * khtml/html/htmltokenizer.h:
+        (khtml::HTMLTokenizer::):
+        (khtml::HTMLTokenizer::State::State):
+        (khtml::HTMLTokenizer::State::tagState):
+        (khtml::HTMLTokenizer::State::setTagState):
+        (khtml::HTMLTokenizer::State::entityState):
+        (khtml::HTMLTokenizer::State::setEntityState):
+        (khtml::HTMLTokenizer::State::inScript):
+        (khtml::HTMLTokenizer::State::setInScript):
+        (khtml::HTMLTokenizer::State::inStyle):
+        (khtml::HTMLTokenizer::State::setInStyle):
+        (khtml::HTMLTokenizer::State::inSelect):
+        (khtml::HTMLTokenizer::State::setInSelect):
+        (khtml::HTMLTokenizer::State::inXmp):
+        (khtml::HTMLTokenizer::State::setInXmp):
+        (khtml::HTMLTokenizer::State::inTitle):
+        (khtml::HTMLTokenizer::State::setInTitle):
+        (khtml::HTMLTokenizer::State::inPlainText):
+        (khtml::HTMLTokenizer::State::setInPlainText):
+        (khtml::HTMLTokenizer::State::inProcessingInstruction):
+        (khtml::HTMLTokenizer::State::setInProcessingInstruction):
+        (khtml::HTMLTokenizer::State::inComment):
+        (khtml::HTMLTokenizer::State::setInComment):
+        (khtml::HTMLTokenizer::State::inTextArea):
+        (khtml::HTMLTokenizer::State::setInTextArea):
+        (khtml::HTMLTokenizer::State::escaped):
+        (khtml::HTMLTokenizer::State::setEscaped):
+        (khtml::HTMLTokenizer::State::inServer):
+        (khtml::HTMLTokenizer::State::setInServer):
+        (khtml::HTMLTokenizer::State::skipLF):
+        (khtml::HTMLTokenizer::State::setSkipLF):
+        (khtml::HTMLTokenizer::State::startTag):
+        (khtml::HTMLTokenizer::State::setStartTag):
+        (khtml::HTMLTokenizer::State::discardLF):
+        (khtml::HTMLTokenizer::State::setDiscardLF):
+        (khtml::HTMLTokenizer::State::allowYield):
+        (khtml::HTMLTokenizer::State::setAllowYield):
+        (khtml::HTMLTokenizer::State::loadingExtScript):
+        (khtml::HTMLTokenizer::State::setLoadingExtScript):
+        (khtml::HTMLTokenizer::State::forceSynchronous):
+        (khtml::HTMLTokenizer::State::setForceSynchronous):
+        (khtml::HTMLTokenizer::State::inAnySpecial):
+        (khtml::HTMLTokenizer::State::hasTagState):
+        (khtml::HTMLTokenizer::State::hasEntityState):
+        (khtml::HTMLTokenizer::State::):
+        (khtml::HTMLTokenizer::State::setBit):
+        (khtml::HTMLTokenizer::State::testBit):
+        * khtml/rendering/bidi.cpp:
+        (khtml::RenderBlock::checkLinesForTextOverflow):
+        * khtml/rendering/render_block.cpp:
+        (khtml::RenderBlock::updateFirstLetter):
+        * khtml/rendering/render_flow.cpp:
+        (RenderFlow::caretRect):
+        * khtml/rendering/render_line.cpp:
+        (khtml::EllipsisBox::paint):
+        * khtml/rendering/render_object.cpp:
+        (RenderObject::firstLineStyle):
+        * khtml/rendering/render_object.h:
+        (khtml::RenderObject::style):
 2005-10-17  Maciej Stachowiak  <mjs@apple.com>
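One consequence of keeping state in a local copy, visible in the htmltokenizer.cpp diff that follows, is that the copy must be written back to the object before any call that may re-enter the tokenizer, and re-read afterwards. The sketch below shows that pattern under simplified assumptions: `Tokenizer`, `handleScript`, and the `callback` member are hypothetical, with the callback standing in for calls such as `cs->ref(this)`.

```cpp
#include <cassert>
#include <functional>

// Simplified, hypothetical state; the real State class packs many more flags.
struct State {
    bool loadingExtScript = false;
    bool inScript = false;
};

class Tokenizer {
public:
    // Stand-in for an external call that may re-enter the tokenizer.
    std::function<void(Tokenizer&)> callback;

    // Re-entrant entry point: modifies the member state directly.
    void notifyFinished() { m_state.loadingExtScript = false; }

    State handleScript(State state)
    {
        state.loadingExtScript = true;
        m_state = state;      // publish the local copy before the call
        if (callback)
            callback(*this);  // may call notifyFinished()
        state = m_state;      // pick up whatever the callee changed
        state.inScript = false;
        return state;
    }

    State m_state;
};
```

Without the write-back and re-read, a change made by the re-entrant call would be lost when the stale local copy is later stored back into the object.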

trunk/WebCore/khtml/html/htmltokenizer.cpp

-              r10854
+              r10867
     parser = new HTMLParser(_view, _doc, includesComments);
     m_executingScript = 0;
-    loadingExtScript = false;
     onHold = false;
     timerId = 0;
 …
     parser = new HTMLParser(i, _doc, includesComments);
     m_executingScript = 0;
-    loadingExtScript = false;
     onHold = false;
     timerId = 0;
 …
+    }
-    if ( buffer )
+    if (buffer)
         KHTML_DELETE_QCHAR_VEC(buffer);
     buffer = dest = 0;
     size = 0;
-    if ( scriptCode )
+    if (scriptCode)
         KHTML_DELETE_QCHAR_VEC(scriptCode);
     scriptCode = 0;
 …
+    }
     timerId = 0;
-    allowYield = false;
-    forceSynchronous = false;
+    m_state.setAllowYield(false);
+    m_state.setForceSynchronous(false);
     currToken.reset();
 …
+{
     m_executingScript = 0;
-    loadingExtScript = false;
+    m_state.setLoadingExtScript(false);
     onHold = false;
     reset();
 …
     buffer = KHTML_ALLOC_QCHAR_VEC( 255 );
     dest = buffer;
-    tag = NoTag;
-    discard = NoneDiscard;
-    plaintext = false;
-    xmp = false;
-    processingInstruction = false;
-    script = false;
-    escaped = false;
-    style = false;
-    skipLF = false;
-    select = false;
-    comment = false;
-    server = false;
-    textarea = false;
-    title = false;
-    startTag = false;
     tquote = NoQuote;
     searchCount = 0;
+    Entity = NoEntity;
+    loadingExtScript = false;
+    m_state.setEntityState(NoEntity);
     scriptSrc = QString::null;
     pendingSrc.clear();
 …
     scriptStartLineno = 0;
     tagStartLineno = 0;
-    forceSynchronous = false;
+    m_state.setForceSynchronous(false);
 }
 void HTMLTokenizer::setForceSynchronous(bool force)
 {
-    forceSynchronous = force;
+    m_state.setForceSynchronous(force);
 }
-void HTMLTokenizer::processListing(TokenizerString list)
+HTMLTokenizer::State HTMLTokenizer::processListing(TokenizerString list, State state)
 {
     // This function adds the listing 'list' as
 …
         checkBuffer();
-        if (skipLF && *list != '\n')
-            skipLF = false;
-        if (skipLF) {
-            skipLF = false;
+        if (state.skipLF() && *list != '\n')
+            state.setSkipLF(false);
+        if (state.skipLF()) {
+            state.setSkipLF(false);
             ++list;
         } else if (*list == '\n' || *list == '\r') {
-            if (discard == LFDiscard)
+            if (state.discardLF())
                 // Ignore this LF
-                discard = NoneDiscard; // We have discarded 1 LF
+                state.setDiscardLF(false); // We have discarded 1 LF
             else
                 *dest++ = '\n';
 …
             /* Check for MS-DOS CRLF sequence */
             if (*list == '\r')
-                skipLF = true;
+                state.setSkipLF(true);
             ++list;
         } else {
-            discard = NoneDiscard;
+            state.setDiscardLF(false);
             *dest++ = *list;
             ++list;
+        }
+    }
+}
+void HTMLTokenizer::parseSpecial(TokenizerString &src)
+{
+    assert( textarea || title || !Entity );
+    assert( !tag );
+    assert( xmp+textarea+title+style+script == 1 );
+    if (script)
+        scriptStartLineno = lineno+src.lineCount();
+    if ( comment ) parseComment( src );
+    return state;
+}
+HTMLTokenizer::State HTMLTokenizer::parseSpecial(TokenizerString &src, State state)
+{
+    assert(state.inTextArea() || state.inTitle() || !state.hasEntityState());
+    assert(!state.hasTagState());
+    assert(state.inXmp() + state.inTextArea() + state.inTitle() + state.inStyle() + state.inScript() == 1 );
+    if (state.inScript())
+        scriptStartLineno = lineno + src.lineCount();
+    if (state.inComment())
+        state = parseComment(src, state);
     while ( !src.isEmpty() ) {
         checkScriptBuffer();
         unsigned char ch = src->latin1();
         if ( !scriptCodeResync && !brokenComments && !textarea && !xmp && !title && ch == '-' && scriptCodeSize >= 3 && !src.escaped() && scriptCode[scriptCodeSize-3] == '<' && scriptCode[scriptCodeSize-2] == '!' && scriptCode[scriptCodeSize-1] == '-' ) {
             comment = true;
             parseComment( src );
+        if (!scriptCodeResync && !brokenComments && !state.inTextArea() && !state.inXmp() && !state.inTitle() && ch == '-' && scriptCodeSize >= 3 && !src.escaped() && scriptCode[scriptCodeSize-3] == '<' && scriptCode[scriptCodeSize-2] == '!' && scriptCode[scriptCodeSize-1] == '-') {
+            state.setInComment(true);
+            state = parseComment(src, state);
             continue;
+        }
 …
             scriptCodeResync = 0;
             scriptCode[ scriptCodeSize ] = scriptCode[ scriptCodeSize + 1 ] = 0;
             if ( script )
                 scriptHandler();
+            if (state.inScript())
+                state = scriptHandler(state);
             else {
                 processListing(TokenizerString(scriptCode, scriptCodeSize));
+                state = processListing(TokenizerString(scriptCode, scriptCodeSize), state);
                 processToken();
+                if ( style )         { currToken.tagName = styleTag.localName(); currToken.beginTag = false; }
+                else if ( textarea ) { currToken.tagName = textareaTag.localName(); currToken.beginTag = false; }
+                else if ( title ) { currToken.tagName = titleTag.localName(); currToken.beginTag = false; }
+                else if ( xmp )  { currToken.tagName = xmpTag.localName(); currToken.beginTag = false; }
+                if (state.inStyle()) {
+                    currToken.tagName = styleTag.localName();
+                    currToken.beginTag = false;
+                } else if (state.inTextArea()) {
+                    currToken.tagName = textareaTag.localName();
+                    currToken.beginTag = false;
+                } else if (state.inTitle()) {
+                    currToken.tagName = titleTag.localName();
+                    currToken.beginTag = false;
+                } else if (state.inXmp()) {
+                    currToken.tagName = xmpTag.localName();
+                    currToken.beginTag = false;
+                }
                 processToken();
+                style = script = style = textarea = title = xmp = false;
+                state.setInStyle(false);
+                state.setInScript(false);
+                state.setInTextArea(false);
+                state.setInTitle(false);
+                state.setInXmp(false);
                 tquote = NoQuote;
                 scriptCodeSize = scriptCodeResync = 0;
+            }
             return;
+            return state;
+        }
         // possible end of tagname, lets check.
         if ( !scriptCodeResync && !escaped && !src.escaped() && ( ch == '>' || ch == '/' || ch <= ' ' ) && ch &&
+        if ( !scriptCodeResync && !state.escaped() && !src.escaped() && ( ch == '>' || ch == '/' || ch <= ' ' ) && ch &&
              scriptCodeSize >= searchStopperLen &&
              tagMatch( searchStopper, scriptCode+scriptCodeSize-searchStopperLen, searchStopperLen )) {
 …
             continue;
+        }
         if ( scriptCodeResync && !escaped ) {
+        if ( scriptCodeResync && !state.escaped() ) {
             if(ch == '\"')
                 tquote = (tquote == NoQuote) ? DoubleQuote : ((tquote == SingleQuote) ? SingleQuote : NoQuote);
 …
                 tquote = NoQuote;
+        }
         escaped = ( !escaped && ch == '\\' );
         if (!scriptCodeResync && (textarea||title) && !src.escaped() && ch == '&') {
+        state.setEscaped(!state.escaped() && ch == '\\');
+        if (!scriptCodeResync && (state.inTextArea() || state.inTitle()) && !src.escaped() && ch == '&') {
             QChar *scriptCodeDest = scriptCode+scriptCodeSize;
             ++src;
             parseEntity(src,scriptCodeDest,true);
+            state = parseEntity(src, scriptCodeDest, state, m_cBufferPos, true, false);
             scriptCodeSize = scriptCodeDest-scriptCode;
+        }
 …
+        }
+    }
+}
+void HTMLTokenizer::scriptHandler()
+    return state;
+}
+HTMLTokenizer::State HTMLTokenizer::scriptHandler(State state)
+{
     // We are inside a <script>
 …
         doScriptExec = true;
+    }
     processListing(TokenizerString(scriptCode, scriptCodeSize));
+    state = processListing(TokenizerString(scriptCode, scriptCodeSize), state);
     QString exScript( buffer, dest-buffer );
     processToken();
 …
             setSrc(TokenizerString());
             scriptCodeSize = scriptCodeResync = 0;
+            // the ref() call below may call notifyFinished if the script is already in cache,
+            // and that mucks with the state directly, so we must write it back to the object.
+            m_state = state;
             cs->ref(this);
+            state = m_state;
             // will be 0 if script was already loaded and ref() executed it
             if (!pendingScripts.isEmpty())
-                loadingExtScript = true;
+                state.setLoadingExtScript(true);
+        }
         else if (view && doScriptExec && javascript ) {
 …
             //QTime dt;
             //dt.start();
             scriptExecution( exScript, QString::null, scriptStartLineno );
+            state = scriptExecution(exScript, state, QString::null, scriptStartLineno);
             //kdDebug( 6036 ) << "script execution time:" << dt.elapsed() << endl;
+        }
+    }
     script = false;
+    state.setInScript(false);
     scriptCodeSize = scriptCodeResync = 0;
     if ( !m_executingScript && !loadingExtScript ) {
+    if (!m_executingScript && !state.loadingExtScript()) {
         // kdDebug( 6036 ) << "adding pending Output to parsed string" << endl;
         src.append(pendingSrc);
 …
         // because we want to prepend to pendingSrc rather than appending
         // if there's no previous prependingSrc
         if (loadingExtScript) {
+        if (state.loadingExtScript()) {
             if (currentPrependingSrc) {
                 currentPrependingSrc->append(prependingSrc);
 …
+            }
         } else {
+            m_state = state;
             write(prependingSrc, false);
+            state = m_state;
+        }
+    }
     currentPrependingSrc = savedPrependingSrc;
+}
+void HTMLTokenizer::scriptExecution( const QString& str, QString scriptURL,
+                                     int baseLine)
+    return state;
+}
+HTMLTokenizer::State HTMLTokenizer::scriptExecution(const QString& str, State state,
+                                                    QString scriptURL, int baseLine)
+{
 #if APPLE_CHANGES
     if (!view || !view->part())
         return;
 #endif
     bool oldscript = script;
+        return state;
+#endif
+    bool oldscript = state.inScript();
     m_executingScript++;
     script = false;
+    state.setInScript(false);
     QString url;
     if (scriptURL.isNull())
 …
 #endif
+    m_state = state;
     view->part()->executeScript(url,baseLine,0,str);
+    allowYield = true;
+    state = m_state;
+    state.setAllowYield(true);
 #ifdef INSTRUMENT_LAYOUT_SCHEDULING
 …
     m_executingScript--;
     script = oldscript;
     if ( !m_executingScript && !loadingExtScript ) {
+    state.setInScript(oldscript);
+    if (!m_executingScript && !state.loadingExtScript()) {
         // kdDebug( 6036 ) << "adding pending Output to parsed string" << endl;
         src.append(pendingSrc);
 …
         // because we want to prepend to pendingSrc rather than appending
         // if there's no previous prependingSrc
         if (loadingExtScript) {
+        if (state.loadingExtScript()) {
             if (currentPrependingSrc) {
                 currentPrependingSrc->append(prependingSrc);
 …
+            }
         } else {
+            m_state = state;
             write(prependingSrc, false);
+            state = m_state;
+        }
+    }
     currentPrependingSrc = savedPrependingSrc;
+}
+void HTMLTokenizer::parseComment(TokenizerString &src)
+    return state;
+}
+HTMLTokenizer::State HTMLTokenizer::parseComment(TokenizerString &src, State state)
+{
     // FIXME: Why does this code even run for comments inside <script> and <style>? This seems bogus.
     bool strict = !parser->doc()->inCompatMode() && !script && !style;
+    bool strict = !parser->doc()->inCompatMode() && !state.inScript() && !state.inStyle();
     int delimiterCount = 0;
     bool canClose = false;
 …
         if ((!strict || canClose) && src->unicode() == '>') {
             bool handleBrokenComments = brokenComments && !(script || style);
+            bool handleBrokenComments = brokenComments && !(state.inScript() || state.inStyle());
             int endCharsCount = 1; // start off with one for the '>' character
             if (!strict) {
 …
             if (canClose || handleBrokenComments || endCharsCount > 1) {
                 ++src;
                 if (!( script || xmp || textarea || style)) {
+                if (!(state.inScript() || state.inXmp() || state.inTextArea() || state.inStyle())) {
                     if (includesCommentsInDOM) {
                         checkScriptBuffer();
 …
                         currToken.tagName = commentAtom;
                         currToken.beginTag = true;
                         processListing(TokenizerString(scriptCode, scriptCodeSize - endCharsCount));
+                        state = processListing(TokenizerString(scriptCode, scriptCodeSize - endCharsCount), state);
                         processToken();
                         currToken.tagName = commentAtom;
 …
                     scriptCodeSize = 0;
+                }
                 comment = false;
                 return; // Finished parsing comment
+                state.setInComment(false);
+                return state; // Finished parsing comment
+            }
+        }
         ++src;
+    }
+}
+void HTMLTokenizer::parseServer(TokenizerString &src)
+    return state;
+}
+HTMLTokenizer::State HTMLTokenizer::parseServer(TokenizerString& src, State state)
+{
     checkScriptBuffer(src.length());
     while ( !src.isEmpty() ) {
         scriptCode[ scriptCodeSize++ ] = *src;
+    while (!src.isEmpty()) {
+        scriptCode[scriptCodeSize++] = *src;
         if (src->unicode() == '>' &&
             scriptCodeSize > 1 && scriptCode[scriptCodeSize-2] == '%') {
             ++src;
             server = false;
+            state.setInServer(false);
             scriptCodeSize = 0;
             return; // Finished parsing server include
+            return state; // Finished parsing server include
+        }
         ++src;
+    }
+}
+void HTMLTokenizer::parseProcessingInstruction(TokenizerString &src)
+    return state;
+}
+HTMLTokenizer::State HTMLTokenizer::parseProcessingInstruction(TokenizerString &src, State state)
+{
     char oldchar = 0;
 …
+        {
             // We got a '?>' sequence
             processingInstruction = false;
+            state.setInProcessingInstruction(false);
             ++src;
             discard = LFDiscard;
             return; // Finished parsing comment!
+            state.setDiscardLF(true);
+            return state; // Finished parsing comment!
+        }
         ++src;
         oldchar = chbegin;
+    }
+}
+void HTMLTokenizer::parseText(TokenizerString &src)
+{
+    while ( !src.isEmpty() )
+    return state;
+}
+HTMLTokenizer::State HTMLTokenizer::parseText(TokenizerString &src, State state)
+{
+    while (!src.isEmpty())
+    {
         // do we need to enlarge the buffer?
 …
         unsigned char chbegin = src->latin1();
+        if (skipLF && ( chbegin != '\n' ))
+        if (state.skipLF() && (chbegin != '\n' ))
+            state.setSkipLF(false);
+        if (state.skipLF())
+        {
+            skipLF = false;
+        }
+        if (skipLF)
+        {
+            skipLF = false;
+            state.setSkipLF(false);
             ++src;
+        }
+        else if (( chbegin == '\n' ) || ( chbegin == '\r' ))
+        } else if ((chbegin == '\n') || (chbegin == '\r'))
+        {
             if (chbegin == '\r')
                 skipLF = true;
+                state.setSkipLF(true);
             *dest++ = '\n';
 …
+        }
+    }
+}
+void HTMLTokenizer::parseEntity(TokenizerString &src, QChar *&dest, bool start)
+{
+    if( start )
+    return state;
+}
+HTMLTokenizer::State HTMLTokenizer::parseEntity(TokenizerString &src, QChar *&dest, State state, unsigned &cBufferPos, bool start, bool parsingTag)
+{
+    if (start)
+    {
         cBufferPos = 0;
         Entity = SearchEntity;
+        state.setEntityState(SearchEntity);
         EntityUnicodeValue = 0;
+    }
     while( !src.isEmpty() )
+    while(!src.isEmpty())
+    {
         ushort cc = src->unicode();
         switch(Entity) {
+        switch(state.entityState()) {
         case NoEntity:
             assert(Entity != NoEntity);
             return;
+            assert(state.entityState() != NoEntity);
+            return state;
         case SearchEntity:
 …
                 cBuffer[cBufferPos++] = cc;
                 ++src;
                 Entity = NumericSearch;
+                state.setEntityState(NumericSearch);
+            }
             else
                 Entity = EntityName;
+                state.setEntityState(EntityName);
             break;
 …
                 cBuffer[cBufferPos++] = cc;
                 ++src;
                 Entity = Hexadecimal;
+                state.setEntityState(Hexadecimal);
+            }
             else if(cc >= '0' && cc <= '9')
                 Entity = Decimal;
+                state.setEntityState(Decimal);
             else
                 Entity = SearchSemicolon;
+                state.setEntityState(SearchSemicolon);
             break;
 …
                 if(csrc.row() || !((cc >= '0' && cc <= '9') || (cc >= 'a' && cc <= 'f'))) {
                     Entity = SearchSemicolon;
+                    state.setEntityState(SearchSemicolon);
                     break;
+                }
 …
                 ++src;
+            }
+            if(cBufferPos == 10)  Entity = SearchSemicolon;
+            if (cBufferPos == 10)
+                state.setEntityState(SearchSemicolon);
             break;
+        }
 …
                 if(src->row() || !(cc >= '0' && cc <= '9')) {
                     Entity = SearchSemicolon;
+                    state.setEntityState(SearchSemicolon);
                     break;
+                }
 …
                 ++src;
+            }
+            if(cBufferPos == 9)  Entity = SearchSemicolon;
+            if (cBufferPos == 9)
+                state.setEntityState(SearchSemicolon);
             break;
+        }
 …
                 if(csrc.row() || !((cc >= 'a' && cc <= 'z') ||
                                    (cc >= '0' && cc <= '9') || (cc >= 'A' && cc <= 'Z'))) {
                     Entity = SearchSemicolon;
+                    state.setEntityState(SearchSemicolon);
                     break;
+                }
 …
                 ++src;
+            }
+            if(cBufferPos == 9) Entity = SearchSemicolon;
+            if(Entity == SearchSemicolon) {
+            if (cBufferPos == 9)
+                state.setEntityState(SearchSemicolon);
+            if (state.entityState() == SearchSemicolon) {
                 if(cBufferPos > 1) {
                     const entity *e = findEntity(cBuffer, cBufferPos);
 …
                     // be IE compatible
                     if(tag && EntityUnicodeValue > 255 && *src != ';')
+                    if(parsingTag && EntityUnicodeValue > 255 && *src != ';')
                         EntityUnicodeValue = 0;
+                }
 …
+            }
+            Entity = NoEntity;
+            return;
+        }
+    }
+}
+void HTMLTokenizer::parseTag(TokenizerString &src)
+{
+    assert(!Entity );
+    while ( !src.isEmpty() )
+            state.setEntityState(NoEntity);
+            return state;
+        }
+    }
+    return state;
+}
+HTMLTokenizer::State HTMLTokenizer::parseTag(TokenizerString &src, State state)
+{
+    assert(!state.hasEntityState());
+    unsigned cBufferPos = m_cBufferPos;
+    while (!src.isEmpty())
+    {
         checkBuffer();
 …
                QConstString((QChar*)src.operator->(), l).qstring().latin1(), tquote);
 #endif
         switch(tag) {
+        switch(state.tagState()) {
         case NoTag:
+        {
+            return;
+            m_cBufferPos = cBufferPos;
+            return state;
+        }
         case TagName:
 …
                         ++src;
                         dest = buffer; // ignore the previous part of this tag
                         comment = true;
                         tag = NoTag;
+                        state.setInComment(true);
+                        state.setTagState(NoTag);
                         // Fix bug 34302 at kde.bugs.org.  Go ahead and treat
 …
                         // can handle this case.  Only do this in quirks mode. -dwh
                         if (!src.isEmpty() && *src == '>' && parser->doc()->inCompatMode()) {
                           comment = false;
+                          state.setInComment(false);
                           ++src;
                           if (!src.isEmpty())
 …
+                        }
                         else
+                          parseComment(src);
+                        return; // Finished parsing tag!
+                          state = parseComment(src, state);
+                        m_cBufferPos = cBufferPos;
+                        return state; // Finished parsing tag!
+                    }
                     // cuts of high part, is okay
 …
+                }
                 dest = buffer;
                 tag = SearchAttribute;
+                state.setTagState(SearchAttribute);
                 cBufferPos = 0;
+            }
 …
                 if (curchar > ' ' && curchar != '\'' && curchar != '"') {
                     if (curchar == '<' || curchar == '>')
                         tag = SearchEnd;
+                        state.setTagState(SearchEnd);
                     else
                         tag = AttributeName;
+                        state.setTagState(AttributeName);
                     cBufferPos = 0;
 …
                     dest = buffer;
                     *dest++ = 0;
                     tag = SearchEqual;
+                    state.setTagState(SearchEqual);
                     // This is a deliberate quirk to match Mozilla and Opera.  We have to do this
                     // since sites that use the "standards-compliant" path sometimes send
 …
                 dest = buffer;
                 *dest++ = 0;
                 tag = SearchEqual;
+                state.setTagState(SearchEqual);
+            }
             break;
 …
                         kdDebug(6036) << "found equal" << endl;
 #endif
                         tag = SearchValue;
+                        state.setTagState(SearchValue);
                         ++src;
+                    }
 …
                         currToken.addAttribute(parser->docPtr()->document(), attrName, emptyAtom);
                         dest = buffer;
                         tag = SearchAttribute;
+                        state.setTagState(SearchAttribute);
+                    }
                     break;
 …
                     if(( curchar == '\'' || curchar == '\"' )) {
                         tquote = curchar == '\"' ? DoubleQuote : SingleQuote;
                         tag = QuotedValue;
+                        state.setTagState(QuotedValue);
                         ++src;
                     } else
                         tag = Value;
+                        state.setTagState(Value);
                     break;
 …
                     attrName = v; // Just make the name/value match. (FIXME: Is this some WinIE quirk?)
                     currToken.addAttribute(parser->docPtr()->document(), attrName, v);
                     tag = SearchAttribute;
+                    state.setTagState(SearchAttribute);
                     dest = buffer;
                     tquote = NoQuote;
 …
+                    {
                         ++src;
                         parseEntity(src, dest, true);
+                        state = parseEntity(src, dest, state, cBufferPos, true, true);
                         break;
+                    }
 …
                         dest = buffer;
                         tag = SearchAttribute;
+                        state.setTagState(SearchAttribute);
                         tquote = NoQuote;
                         ++src;
 …
+                    {
                         ++src;
                         parseEntity(src, dest, true);
+                        state = parseEntity(src, dest, state, cBufferPos, true, true);
                         break;
+                    }
 …
                         currToken.addAttribute(parser->docPtr()->document(), attrName, v);
                         dest = buffer;
                         tag = SearchAttribute;
+                        state.setTagState(SearchAttribute);
                         break;
+                    }
 …
             searchCount = 0; // Stop looking for '<!--' sequence
             tag = NoTag;
+            state.setTagState(NoTag);
             tquote = NoQuote;
 …
                 ++src;
+            if (currToken.tagName == nullAtom) //stop if tag is unknown
+                return;
+            if (currToken.tagName == nullAtom) { //stop if tag is unknown
+                m_cBufferPos = cBufferPos;
+                return state;
+            }
             AtomicString tagName = currToken.tagName;
 …
             if (tagName == preTag) {
                 discard = LFDiscard; // Discard the first LF after we open a pre.
+                state.setDiscardLF(true); // Discard the first LF after we open a pre.
             } else if (tagName == scriptTag) {
                 if (beginTag) {
                     searchStopper = scriptEnd;
                     searchStopperLen = 8;
                     script = true;
                     parseSpecial(src);
+                    state.setInScript(true);
+                    state = parseSpecial(src, state);
                 } else if (isSelfClosingScript) { // Handle <script src="foo"/>
                     script = true;
                     scriptHandler();
+                    state.setInScript(true);
+                    state = scriptHandler(state);
+                }
             } else if (tagName == styleTag) {
 …
                     searchStopper = styleEnd;
                     searchStopperLen = 7;
                     style = true;
                     parseSpecial(src);
+                    state.setInStyle(true);
+                    state = parseSpecial(src, state);
+                }
             } else if (tagName == textareaTag) {
 …
                     searchStopper = textareaEnd;
                     searchStopperLen = 10;
                     textarea = true;
                     parseSpecial(src);
+                    state.setInTextArea(true);
+                    state = parseSpecial(src, state);
+                }
             } else if (tagName == titleTag) {
 …
                     searchStopper = titleEnd;
                     searchStopperLen = 7;
                     title = true;
                     parseSpecial(src);
+                    state.setInTitle(true);
+                    state = parseSpecial(src, state);
+                }
             } else if (tagName == xmpTag) {
 …
                     searchStopper = xmpEnd;
                     searchStopperLen = 5;
                     xmp = true;
                     parseSpecial(src);
+                    state.setInXmp(true);
+                    state = parseSpecial(src, state);
+                }
             } else if (tagName == selectTag)
                 select = beginTag;
+                state.setInSelect(beginTag);
             else if (tagName == plaintextTag)
+                plaintext = beginTag;
+            return; // Finished parsing tag!
+                state.setInPlainText(beginTag);
+            m_cBufferPos = cBufferPos;
+            return state; // Finished parsing tag!
+        }
         } // end switch
+    }
+    return;
+}
+void HTMLTokenizer::write(const TokenizerString &str, bool appendData)
+{
+#ifdef TOKEN_DEBUG
+    kdDebug( 6036 ) << this << " Tokenizer::write(\"" << str.toString() << "\"," << appendData << ")" << endl;
+#endif
+    if (!buffer)
+        return;
+    if (loadStopped)
+        return;
+    if ( ( m_executingScript && appendData ) || !pendingScripts.isEmpty() ) {
+        // don't parse; we will do this later
+        if (currentPrependingSrc) {
+            currentPrependingSrc->append(str);
+        } else {
+            pendingSrc.append(str);
+        }
+        return;
+    }
+    if ( onHold ) {
+        src.append(str);
+        return;
+    }
+    if (!src.isEmpty())
+        src.append(str);
+    else
+        setSrc(str);
+    // Once a timer is set, it has control of when the tokenizer continues.
+    if (timerId)
+        return;
+    bool wasInWrite = inWrite;
+    inWrite = true;
+#ifdef INSTRUMENT_LAYOUT_SCHEDULING
+    if (!parser->doc()->ownerElement())
+        printf("Beginning write at time %d\n", parser->doc()->elapsedTime());
+#endif
+//     if (Entity)
+//         parseEntity(src, dest);
+    int processedCount = 0;
+    QTime startTime;
+    startTime.start();
+    KWQUIEventTime eventTime;
+    KHTMLPart* part = parser->doc()->part();
+    while (!src.isEmpty() && (!part || !part->isScheduledLocationChangePending())) {
+        if (!continueProcessing(processedCount, startTime, eventTime))
+            break;
+        // do we need to enlarge the buffer?
+        checkBuffer();
+        ushort cc = src->unicode();
+        if (skipLF && (cc != '\n'))
+            skipLF = false;
+        if (skipLF) {
+            skipLF = false;
+            ++src;
+        }
+        else if ( Entity )
+            parseEntity( src, dest );
+        else if ( plaintext )
+            parseText( src );
+        else if (script)
+            parseSpecial(src);
+        else if (style)
+            parseSpecial(src);
+        else if (xmp)
+            parseSpecial(src);
+        else if (textarea)
+            parseSpecial(src);
+        else if (title)
+            parseSpecial(src);
+        else if (comment)
+            parseComment(src);
+        else if (server)
+            parseServer(src);
+        else if (processingInstruction)
+            parseProcessingInstruction(src);
+        else if (tag)
+            parseTag(src);
+        else if ( startTag )
+        {
+            startTag = false;
+            switch(cc) {
+            case '/':
+                break;
+            case '!':
+            {
+                // <!-- comment -->
+                searchCount = 1; // Look for '<!--' sequence to start comment
+                break;
+            }
+            case '?':
+            {
+                // xml processing instruction
+                processingInstruction = true;
+                tquote = NoQuote;
+                parseProcessingInstruction(src);
+                continue;
+                break;
+            }
+            case '%':
+                if (!brokenServer) {
+                    // <% server stuff, handle as comment %>
+                    server = true;
+                    tquote = NoQuote;
+                    parseServer(src);
+                    continue;
+                }
+                // else fall through
+            default:
+            {
+                if( ((cc >= 'a') && (cc <= 'z')) || ((cc >= 'A') && (cc <= 'Z')))
+                {
+                    // Start of a Start-Tag
+                }
+                else
+                {
+                    // Invalid tag
+                    // Add as is
+                    *dest = '<';
+                    dest++;
+                    continue;
+                }
+            }
+            }; // end case
+            processToken();
+            cBufferPos = 0;
+            tag = TagName;
+            parseTag(src);
+        }
+        else if ( cc == '&' && !src.escaped())
+        {
+            ++src;
+            parseEntity(src, dest, true);
+        }
+        else if ( cc == '<' && !src.escaped())
+        {
+            tagStartLineno = lineno+src.lineCount();
+            ++src;
+            startTag = true;
+        }
+        else if (cc == '\n' || cc == '\r') {
+            if (discard == LFDiscard)
+                // Ignore this LF
+                discard = NoneDiscard; // We have discarded 1 LF
+            else
+                // Process this LF
+                *dest++ = '\n';
+            /* Check for MS-DOS CRLF sequence */
+            if (cc == '\r')
+                skipLF = true;
+            ++src;
+        } else {
+            discard = NoneDiscard;
+#if QT_VERSION < 300
+            unsigned char row = src->row();
+            if ( row > 0x05 && row < 0x10 || row > 0xfd )
+                    currToken.complexText = true;
+#endif
+            *dest = *src;
+            fixUpChar(*dest);
+            ++dest;
+            ++src;
+        }
+    }
+#ifdef INSTRUMENT_LAYOUT_SCHEDULING
+    if (!parser->doc()->ownerElement())
+        printf("Ending write at time %d\n", parser->doc()->elapsedTime());
+#endif
+    inWrite = wasInWrite;
+    if (noMoreData && !inWrite && !loadingExtScript && !m_executingScript && !timerId)
+        end(); // this actually causes us to be deleted
+}
+void HTMLTokenizer::stopped()
+{
+    if (timerId) {
+        killTimer(timerId);
+        timerId = 0;
+    }
+}
+bool HTMLTokenizer::processingData() const
+{
+    return timerId != 0;
+}
+bool HTMLTokenizer::continueProcessing(int& processedCount, const QTime& startTime, const KWQUIEventTime& eventTime)
+    m_cBufferPos = cBufferPos;
+    return state;
+}
+inline bool HTMLTokenizer::continueProcessing(int& processedCount, const QTime& startTime, const KWQUIEventTime& eventTime, State &state)
+{
     // We don't want to be checking elapsed time with every character, so we only check after we've
     // processed a certain number of characters.
     bool allowedYield = allowYield;
     allowYield = false;
     if (!loadingExtScript && !forceSynchronous && !m_executingScript && (processedCount > TOKENIZER_CHUNK_SIZE || allowedYield)) {
+    bool allowedYield = state.allowYield();
+    state.setAllowYield(false);
+    if (!state.loadingExtScript() && !state.forceSynchronous() && !m_executingScript && (processedCount > TOKENIZER_CHUNK_SIZE || allowedYield)) {
         processedCount = 0;
         if (startTime.elapsed() > TOKENIZER_TIME_DELAY) {
 …
+}
+void HTMLTokenizer::write(const TokenizerString &str, bool appendData)
+{
+#ifdef TOKEN_DEBUG
+    kdDebug( 6036 ) << this << " Tokenizer::write(\"" << str.toString() << "\"," << appendData << ")" << endl;
+#endif
+    if (!buffer)
+        return;
+    if (loadStopped)
+        return;
+    if ( ( m_executingScript && appendData ) || !pendingScripts.isEmpty() ) {
+        // don't parse; we will do this later
+        if (currentPrependingSrc) {
+            currentPrependingSrc->append(str);
+        } else {
+            pendingSrc.append(str);
+        }
+        return;
+    }
+    if (onHold) {
+        src.append(str);
+        return;
+    }
+    if (!src.isEmpty())
+        src.append(str);
+    else
+        setSrc(str);
+    // Once a timer is set, it has control of when the tokenizer continues.
+    if (timerId)
+        return;
+    bool wasInWrite = inWrite;
+    inWrite = true;
+#ifdef INSTRUMENT_LAYOUT_SCHEDULING
+    if (!parser->doc()->ownerElement())
+        printf("Beginning write at time %d\n", parser->doc()->elapsedTime());
+#endif
+    int processedCount = 0;
+    QTime startTime;
+    startTime.start();
+    KWQUIEventTime eventTime;
+    KHTMLPart* part = parser->doc()->part();
+    State state = m_state;
+    while (!src.isEmpty() && (!part || !part->isScheduledLocationChangePending())) {
+        if (!continueProcessing(processedCount, startTime, eventTime, state))
+            break;
+        // do we need to enlarge the buffer?
+        checkBuffer();
+        ushort cc = src->unicode();
+        bool wasSkipLF = state.skipLF();
+        if (wasSkipLF)
+            state.setSkipLF(false);
+        if (wasSkipLF && (cc == '\n'))
+            ++src;
+        else if (state.needsSpecialWriteHandling()) {
+            // it's important to keep needsSpecialWriteHandling with the flags this block tests
+            if (state.hasEntityState())
+                state = parseEntity(src, dest, state, m_cBufferPos, false, state.hasTagState());
+            else if (state.inPlainText())
+                state = parseText(src, state);
+            else if (state.inAnySpecial())
+                state = parseSpecial(src, state);
+            else if (state.inComment())
+                state = parseComment(src, state);
+            else if (state.inServer())
+                state = parseServer(src, state);
+            else if (state.inProcessingInstruction())
+                state = parseProcessingInstruction(src, state);
+            else if (state.hasTagState())
+                state = parseTag(src, state);
+            else if (state.startTag()) {
+                state.setStartTag(false);
+                switch(cc) {
+                case '/':
+                    break;
+                case '!': {
+                    // <!-- comment -->
+                    searchCount = 1; // Look for '<!--' sequence to start comment
+                    break;
+                }
+                case '?': {
+                    // xml processing instruction
+                    state.setInProcessingInstruction(true);
+                    tquote = NoQuote;
+                    state = parseProcessingInstruction(src, state);
+                    continue;
+                    break;
+                }
+                case '%':
+                    if (!brokenServer) {
+                        // <% server stuff, handle as comment %>
+                        state.setInServer(true);
+                        tquote = NoQuote;
+                        state = parseServer(src, state);
+                        continue;
+                    }
+                    // else fall through
+                default: {
+                    if( ((cc >= 'a') && (cc <= 'z')) || ((cc >= 'A') && (cc <= 'Z'))) {
+                        // Start of a Start-Tag
+                    } else {
+                        // Invalid tag
+                        // Add as is
+                        *dest = '<';
+                        dest++;
+                        continue;
+                    }
+                }
+                }; // end case
+                processToken();
+                m_cBufferPos = 0;
+                state.setTagState(TagName);
+                state = parseTag(src, state);
+            }
+        } else if (cc == '&' && !src.escaped()) {
+            ++src;
+            state = parseEntity(src, dest, state, m_cBufferPos, true, state.hasTagState());
+        } else if (cc == '<' && !src.escaped()) {
+            tagStartLineno = lineno+src.lineCount();
+            ++src;
+            state.setStartTag(true);
+        } else if (cc == '\n' || cc == '\r') {
+            if (state.discardLF())
+                // Ignore this LF
+                state.setDiscardLF(false); // We have discarded 1 LF
+            else
+                // Process this LF
+                *dest++ = '\n';
+            /* Check for MS-DOS CRLF sequence */
+            if (cc == '\r')
+                state.setSkipLF(true);
+            ++src;
+        } else {
+            state.setDiscardLF(false);
+#if QT_VERSION < 300
+            unsigned char row = src->row();
+            if ( row > 0x05 && row < 0x10 || row > 0xfd )
+                    currToken.complexText = true;
+#endif
+            *dest = *src;
+            fixUpChar(*dest);
+            ++dest;
+            ++src;
+        }
+    }
+#ifdef INSTRUMENT_LAYOUT_SCHEDULING
+    if (!parser->doc()->ownerElement())
+        printf("Ending write at time %d\n", parser->doc()->elapsedTime());
+#endif
+    inWrite = wasInWrite;
+    m_state = state;
+    if (noMoreData && !inWrite && !state.loadingExtScript() && !m_executingScript && !timerId)
+        end(); // this actually causes us to be deleted
+}
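
Note on the line-break handling in the new write() loop above: the skipLF and discardLF flags together collapse CR, LF and CRLF into a single '\n' and optionally swallow the first line break (as after a pre tag). The sketch below isolates that logic on a plain std::string. It is an illustration only: normalizeLineBreaks and its discardFirstBreak parameter are hypothetical names, not part of the changeset.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not part of the changeset: applies the same
// skipLF / discardLF rules as the loop in write() to a plain std::string.
std::string normalizeLineBreaks(const std::string& input, bool discardFirstBreak)
{
    bool skipLF = false;                 // previous character was '\r'
    bool discardLF = discardFirstBreak;  // e.g. set right after an opening <pre>
    std::string out;
    for (std::string::size_type i = 0; i < input.size(); ++i) {
        char cc = input[i];
        bool wasSkipLF = skipLF;
        skipLF = false;
        if (wasSkipLF && cc == '\n')
            continue;                    // LF half of a CRLF pair, already handled
        if (cc == '\n' || cc == '\r') {
            if (discardLF)
                discardLF = false;       // swallow exactly one line break
            else
                out += '\n';             // every kind of break becomes '\n'
            if (cc == '\r')
                skipLF = true;           // a following '\n' belongs to this break
        } else {
            discardLF = false;           // any other character cancels the discard
            out += cc;
        }
    }
    return out;
}
```

Calling normalizeLineBreaks("a\r\nb", false) yields "a\nb"; with discardFirstBreak set, a leading "\r\n" is dropped entirely.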
+void HTMLTokenizer::stopped()
+{
+    if (timerId) {
+        killTimer(timerId);
+        timerId = 0;
+    }
+}
+bool HTMLTokenizer::processingData() const
+{
+    return timerId != 0;
+}
 void HTMLTokenizer::timerEvent(QTimerEvent* e)
+{
 …
 void HTMLTokenizer::allDataProcessed()
+{
     if (noMoreData && !inWrite && !loadingExtScript && !m_executingScript && !onHold && !timerId) {
+    if (noMoreData && !inWrite && !m_state.loadingExtScript() && !m_executingScript && !onHold && !timerId) {
         QGuardedPtr<KHTMLView> savedView = view;
         end();
 …
     // parseTag is using the buffer for different matters
     if ( !tag )
+    if (!m_state.hasTagState())
         processToken();
 …
+{
     // do this as long as we don't find matching comment ends
+    while((comment || server) && scriptCode && scriptCodeSize)
+    {
+    while((m_state.inComment() || m_state.inServer()) && scriptCode && scriptCodeSize) {
         // we've found an unmatched comment start
         if (comment)
+        if (m_state.inComment())
             brokenComments = true;
         else
             brokenServer = true;
         checkScriptBuffer();
         scriptCode[ scriptCodeSize ] = 0;
         scriptCode[ scriptCodeSize + 1 ] = 0;
+        scriptCode[scriptCodeSize] = 0;
+        scriptCode[scriptCodeSize + 1] = 0;
         int pos;
         QString food;
         if (script || style) {
+        if (m_state.inScript() || m_state.inStyle())
             food.setUnicode(scriptCode, scriptCodeSize);
+        }
+        else if (server) {
+        else if (m_state.inServer()) {
             food = "<";
             food += QString(scriptCode, scriptCodeSize);
+        }
+        else {
+        } else {
             pos = QConstString(scriptCode, scriptCodeSize).string().find('>');
             food.setUnicode(scriptCode+pos+1, scriptCodeSize-pos-1); // deep copy
 …
         scriptCode = 0;
         scriptCodeSize = scriptCodeMaxSize = scriptCodeResync = 0;
+        comment = server = false;
+        if ( !food.isEmpty() )
+        m_state.setInComment(false);
+        m_state.setInServer(false);
+        if (!food.isEmpty())
             write(food, true);
+    }
 …
     // an external script to load, we can't finish parsing until that is done
     noMoreData = true;
     if (!inWrite && !loadingExtScript && !m_executingScript && !onHold && !timerId)
+    if (!inWrite && !m_state.loadingExtScript() && !m_executingScript && !onHold && !timerId)
         end(); // this actually causes us to be deleted
+}
 …
 #endif
         scriptExecution( scriptSource.qstring(), cachedScriptUrl );
+        m_state = scriptExecution(scriptSource.qstring(), m_state, cachedScriptUrl);
         // The state of pendingScripts.isEmpty() can change inside the scriptExecution()
 …
         finished = pendingScripts.isEmpty();
         if (finished) {
             loadingExtScript = false;
+            m_state.setLoadingExtScript(false);
 #ifdef INSTRUMENT_LAYOUT_SCHEDULING
             if (!parser->doc()->ownerElement())
 …
+        }
         // 'script' is true when we are called synchronously from
+        // 'inScript' is true when we are called synchronously from
         // parseScript(). In that case parseScript() will take care
         // of 'scriptOutput'.
         if ( !script ) {
+        if (!m_state.inScript()) {
             TokenizerString rest = pendingSrc;
             pendingSrc.clear();
 …
 bool HTMLTokenizer::isWaitingForScripts() const
+{
     return loadingExtScript;
+    return m_state.loadingExtScript();
+}
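
Note on the pattern used throughout htmltokenizer.cpp in this changeset: write() copies m_state into a local, each parse helper takes the state by value and returns the updated copy, and the member is stored back once when the loop ends. A minimal sketch of that shape follows, assuming a much smaller state. LoopState, MiniTokenizer and step are hypothetical names chosen for illustration and do not appear in WebKit.

```cpp
#include <cassert>
#include <string>

// Hypothetical miniature of the pattern; none of these names come from WebKit.
struct LoopState {
    LoopState() : inTag(false) {}
    bool inTag;
};

class MiniTokenizer {
public:
    MiniTokenizer() : m_tagCount(0) {}

    void write(const std::string& input)
    {
        LoopState state = m_state;          // load the member once
        for (std::string::size_type i = 0; i < input.size(); ++i)
            state = step(input[i], state);  // helpers take and return the state by value
        m_state = state;                    // store it back once
    }

    bool inTag() const { return m_state.inTag; }
    int tagCount() const { return m_tagCount; }

private:
    LoopState step(char c, LoopState state)
    {
        if (c == '<')
            state.inTag = true;
        else if (c == '>' && state.inTag) {
            state.inTag = false;
            ++m_tagCount;
        }
        return state;
    }

    LoopState m_state;
    int m_tagCount;
};
```

The cost of this style is that every exit path has to write the local copy back, which is why parseTag in the diff assigns m_cBufferPos = cBufferPos before each return.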

trunk/WebCore/khtml/html/htmltokenizer.h

-              r10843
+              r10867
 protected:
+    class State;
+    // Where we are in parsing a tag
     void begin();
     void end();
 …
     void reset();
     void processToken();
-    void processListing(TokenizerString list);
-    void parseComment(TokenizerString &str);
-    void parseServer(TokenizerString &str);
-    void parseText(TokenizerString &str);
-    void parseListing(TokenizerString &str);
-    void parseSpecial(TokenizerString &str);
-    void parseTag(TokenizerString &str);
-    void parseEntity(TokenizerString &str, QChar *&dest, bool start = false);
-    void parseProcessingInstruction(TokenizerString &str);
-    void scriptHandler();
-    void scriptExecution(const QString& script, QString scriptURL = QString(),
-                         int baseLine = 0);
+    State processListing(TokenizerString, State);
+    State parseComment(TokenizerString&, State);
+    State parseServer(TokenizerString&, State);
+    State parseText(TokenizerString&, State);
+    State parseSpecial(TokenizerString&, State);
+    State parseTag(TokenizerString&, State);
+    State parseEntity(TokenizerString &, QChar*& dest, State, unsigned& _cBufferPos, bool start, bool parsingTag);
+    State parseProcessingInstruction(TokenizerString&, State);
+    State scriptHandler(State);
+    State scriptExecution(const QString& script, State state, QString scriptURL = QString(), int baseLine = 0);
     void setSrc(const TokenizerString &source);
 …
     void enlargeScriptBuffer(int len);
-    bool continueProcessing(int& processedCount, const QTime& startTime, const KWQUIEventTime& eventTime);
+    bool continueProcessing(int& processedCount, const QTime& startTime, const KWQUIEventTime& eventTime, State &state);
     void timerEvent(QTimerEvent*);
     void allDataProcessed();
 …
     } tquote;
+    // Discard line breaks immediately after <pre> tags
+    enum
+    {
+        NoneDiscard = 0,
+        LFDiscard
+    } discard;
+    // Discard the LF part of CRLF sequence
+    bool skipLF;
+    // Flag to say that we have the '<' but not the character following it.
+    bool startTag;
+    // Flag to say, we are just parsing a tag, meaning, we are in the middle
+    // of <tag...
+    enum {
+    // Are we in a &... character entity description?
+    enum EntityState {
+        NoEntity = 0,
+        SearchEntity = 1,
+        NumericSearch = 2,
+        Hexadecimal = 3,
+        Decimal = 4,
+        EntityName = 5,
+        SearchSemicolon = 6
+    };
+    unsigned EntityUnicodeValue;
+    enum TagState {
         NoTag = 0,
+        TagName,
+        SearchAttribute,
+        AttributeName,
+        SearchEqual,
+        SearchValue,
+        QuotedValue,
+        Value,
+        SearchEnd
+    } tag;
+    // Are we in a &... character entity description?
+    enum {
+        NoEntity = 0,
+        SearchEntity,
+        NumericSearch,
+        Hexadecimal,
+        Decimal,
+        EntityName,
+        SearchSemicolon
+    } Entity;
+    unsigned EntityUnicodeValue;
+    // are we in a <script> ... </script block
+    bool script;
+    // Are we in a <style> ... </style> block
+    bool style;
+    // Are we in a <select> ... </select> block
+    bool select;
+    // Are we in a <xmp> ... </xmp> block
+    bool xmp;
+    // Are we in a <title> ... </title> block
+    bool title;
+    // Are we in plain textmode ?
+    bool plaintext;
+    // XML processing instructions. Ignored at the moment
+    bool processingInstruction;
+    // Area we in a <!-- comment --> block
+    bool comment;
+    // Are we in a <textarea> ... </textarea> block
+    bool textarea;
+    // was the previous character escaped ?
+    bool escaped;
+    // are we in a server includes statement?
+    bool server;
+        TagName = 1,
+        SearchAttribute = 2,
+        AttributeName = 3,
+        SearchEqual = 4,
+        SearchValue = 5,
+        QuotedValue = 6,
+        Value = 7,
+        SearchEnd = 8
+    };
+    class State {
+    public:
+        State() : m_bits(0) {}
+        TagState tagState() const { return static_cast<TagState>(m_bits & TagMask); }
+        void setTagState(TagState t) { m_bits = (m_bits & ~TagMask) | t; }
+        EntityState entityState() const { return static_cast<EntityState>((m_bits & EntityMask) >> EntityShift); }
+        void setEntityState(EntityState e) { m_bits = (m_bits & ~EntityMask) | (e << EntityShift); }
+        bool inScript() const { return testBit(InScript); }
+        void setInScript(bool v) { setBit(InScript, v); }
+        bool inStyle() const { return testBit(InStyle); }
+        void setInStyle(bool v) { setBit(InStyle, v); }
+        bool inSelect() const { return testBit(InSelect); }
+        void setInSelect(bool v) { setBit(InSelect, v); }
+        bool inXmp() const { return testBit(InXmp); }
+        void setInXmp(bool v) { setBit(InXmp, v); }
+        bool inTitle() const { return testBit(InTitle); }
+        void setInTitle(bool v) { setBit(InTitle, v); }
+        bool inPlainText() const { return testBit(InPlainText); }
+        void setInPlainText(bool v) { setBit(InPlainText, v); }
+        bool inProcessingInstruction() const { return testBit(InProcessingInstruction); }
+        void setInProcessingInstruction(bool v) { return setBit(InProcessingInstruction, v); }
+        bool inComment() const { return testBit(InComment); }
+        void setInComment(bool v) { setBit(InComment, v); }
+        bool inTextArea() const { return testBit(InTextArea); }
+        void setInTextArea(bool v) { setBit(InTextArea, v); }
+        bool escaped() const { return testBit(Escaped); }
+        void setEscaped(bool v) { setBit(Escaped, v); }
+        bool inServer() const { return testBit(InServer); }
+        void setInServer(bool v) { setBit(InServer, v); }
+        bool skipLF() const { return testBit(SkipLF); }
+        void setSkipLF(bool v) { setBit(SkipLF, v); }
+        bool startTag() const { return testBit(StartTag); }
+        void setStartTag(bool v) { setBit(StartTag, v); }
+        bool discardLF() const { return testBit(DiscardLF); }
+        void setDiscardLF(bool v) { setBit(DiscardLF, v); }
+        bool allowYield() const { return testBit(AllowYield); }
+        void setAllowYield(bool v) { setBit(AllowYield, v); }
+        bool loadingExtScript() const { return testBit(LoadingExtScript); }
+        void setLoadingExtScript(bool v) { setBit(LoadingExtScript, v); }
+        bool forceSynchronous() const { return testBit(ForceSynchronous); }
+        void setForceSynchronous(bool v) { setBit(ForceSynchronous, v); }
+        bool inAnySpecial() const { return m_bits & (InScript | InStyle | InXmp | InTextArea | InTitle); }
+        bool hasTagState() const { return m_bits & TagMask; }
+        bool hasEntityState() const { return m_bits & EntityMask; }
+        bool needsSpecialWriteHandling() const { return m_bits & (InScript | InStyle | InXmp | InTextArea | InTitle | TagMask | EntityMask | InPlainText | InComment | InServer | InProcessingInstruction | StartTag); }
+    private:
+        static const int EntityShift = 4;
+        enum StateBits {
+            TagMask = (1 << 4) - 1,
+            EntityMask = (1 << 7) - (1 << 4),
+            InScript = 1 << 7,
+            InStyle = 1 << 8,
+            InSelect = 1 << 9,
+            InXmp = 1 << 10,
+            InTitle = 1 << 11,
+            InPlainText = 1 << 12,
+            InProcessingInstruction = 1 << 13,
+            InComment = 1 << 14,
+            InTextArea = 1 << 15,
+            Escaped = 1 << 16,
+            InServer = 1 << 17,
+            SkipLF = 1 << 18,
+            StartTag = 1 << 19,
+            DiscardLF = 1 << 20, // FIXME: should clarify difference between skip and discard
+            AllowYield = 1 << 21,
+            LoadingExtScript = 1 << 22,
+            ForceSynchronous = 1 << 23,
+        };
+        void setBit(StateBits bit, bool value)
+        {
+            if (value)
+                m_bits |= bit;
+            else
+                m_bits &= ~bit;
+        }
+        bool testBit(StateBits bit) const { return m_bits & bit; }
+        unsigned m_bits;
+    };
+    State m_state;
     bool brokenServer;
 …
     // the stopper len
     int searchStopperLen;
-    // true if we are waiting for an external script (<SCRIPT SRC=...) to load, i.e.
-    // we don't do any parsing while this is true
-    bool loadingExtScript;
     // if no more data is coming, just parse what we have (including ext scripts that
     // may be still downloading) and finish
 …
     // The timer for continued processing.
     int timerId;
-    bool allowYield;
-    bool forceSynchronous;  // disables yielding
     bool includesCommentsInDOM;
 …
 #define CBUFLEN 1024
     char cBuffer[CBUFLEN+2];
-    unsigned int cBufferPos;
+    unsigned int m_cBufferPos;
     TokenizerString src;
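
Note on the State class introduced in htmltokenizer.h above: it packs the tag state (low 4 bits), the entity state (next 3 bits) and the boolean flags into a single unsigned, so the whole state fits in one word and a check such as needsSpecialWriteHandling() is a single mask test. The reduced sketch below shows the same layout with only two flags. PackedState, needsSpecialHandling and bits() are hypothetical names for illustration; the enum values mirror those in the diff.

```cpp
#include <cassert>

// Reduced, hypothetical version of the packed state: a 4-bit tag field,
// a 3-bit entity field, and two single-bit flags.
enum TagState { NoTag = 0, TagName = 1, SearchAttribute = 2 };
enum EntityState { NoEntity = 0, SearchEntity = 1, SearchSemicolon = 6 };

class PackedState {
public:
    PackedState() : m_bits(0) {}

    TagState tagState() const { return static_cast<TagState>(m_bits & TagMask); }
    void setTagState(TagState t) { m_bits = (m_bits & ~TagMask) | static_cast<unsigned>(t); }

    EntityState entityState() const { return static_cast<EntityState>((m_bits & EntityMask) >> EntityShift); }
    void setEntityState(EntityState e) { m_bits = (m_bits & ~EntityMask) | (static_cast<unsigned>(e) << EntityShift); }

    bool inScript() const { return (m_bits & InScript) != 0; }
    void setInScript(bool v) { setBit(InScript, v); }
    bool inComment() const { return (m_bits & InComment) != 0; }
    void setInComment(bool v) { setBit(InComment, v); }

    // One mask test replaces a chain of separate flag checks.
    bool needsSpecialHandling() const { return (m_bits & (TagMask | EntityMask | InScript | InComment)) != 0; }

    unsigned bits() const { return m_bits; }  // exposed only so the layout can be inspected

private:
    static const unsigned EntityShift = 4;
    static const unsigned TagMask = (1u << 4) - 1;             // bits 0..3
    static const unsigned EntityMask = (1u << 7) - (1u << 4);  // bits 4..6
    static const unsigned InScript = 1u << 7;
    static const unsigned InComment = 1u << 8;

    void setBit(unsigned bit, bool value)
    {
        if (value)
            m_bits |= bit;
        else
            m_bits &= ~bit;
    }

    unsigned m_bits;
};
```

Setting one field leaves the others untouched because each setter first clears only its own mask. This is what lets the tokenizer change the tag state without disturbing flags such as inScript.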

trunk/WebCore/khtml/rendering/bidi.cpp

-              r10845
+              r10867
     QChar ellipsis = 0x2026; // FIXME: CSS3 says this is configurable, also need to use 0x002E (FULL STOP) if 0x2026 not renderable
     static AtomicString ellipsisStr(ellipsis);
-    const Font& firstLineFont = style(true)->htmlFont();
+    const Font& firstLineFont = firstLineStyle()->htmlFont();
     const Font& font = style()->htmlFont();
     int firstLineEllipsisWidth = firstLineFont.width(&ellipsis, 1, 0, 0);

trunk/WebCore/khtml/rendering/render_block.cpp

-              r10774
+              r10867
     if (currChild->style()->styleType() == RenderStyle::FIRST_LETTER) {
         RenderStyle* pseudo = firstLetterBlock->getPseudoStyle(RenderStyle::FIRST_LETTER,
-                                                                    firstLetterContainer->style(true));
+                                                               firstLetterContainer->firstLineStyle());
         currChild->setStyle(pseudo);
         for (RenderObject* genChild = currChild->firstChild(); genChild; genChild = genChild->nextSibling()) {
 …
         // Create our pseudo style now that we have our firstLetterContainer determined.
         RenderStyle* pseudoStyle = firstLetterBlock->getPseudoStyle(RenderStyle::FIRST_LETTER,
-                                                                    firstLetterContainer->style(true));
+                                                                    firstLetterContainer->firstLineStyle());
         // Force inline display (except for floating first-letters)

trunk/WebCore/khtml/rendering/render_flow.cpp

-              r10866
+              r10867
     // the caret size of an empty :first-line'd block is wrong, but I think we
     // can live with that.
-    RenderStyle *currentStyle = style(true);
+    RenderStyle *currentStyle = firstLineStyle();
     //height = currentStyle->fontMetrics().height();
     height = lineHeight(true);

trunk/WebCore/khtml/rendering/render_line.cpp

-              r10755
+              r10867
     {
     QPainter* p = i.p;
-    RenderStyle* _style = m_firstLine ? m_object->style(true) : m_object->style();
+    RenderStyle* _style = m_firstLine ? m_object->firstLineStyle() : m_object->style();
     if (_style->font() != p->font())
     p->setFont(_style->font());

trunk/WebCore/khtml/rendering/render_object.cpp

-              r10755
+              r10867
+}
-RenderStyle* RenderObject::style(bool firstLine) const {
-    RenderStyle *s = m_style;
-    if (firstLine) {
-        const RenderObject* obj = isText() ? parent() : this;
-        if (obj->isBlockFlow()) {
-            RenderBlock* firstLineBlock = obj->firstLineBlock();
-            if (firstLineBlock)
-                s = firstLineBlock->getPseudoStyle(RenderStyle::FIRST_LINE, style());
-        }
-        else if (!obj->isAnonymous() && obj->isInlineFlow()) {
-            RenderStyle* parentStyle = obj->parent()->style(true);
-            if (parentStyle != obj->parent()->style()) {
-                // A first-line style is in effect. We need to cache a first-line style
-                // for ourselves.
-                style()->setHasPseudoStyle(RenderStyle::FIRST_LINE_INHERITED);
-                s = obj->getPseudoStyle(RenderStyle::FIRST_LINE_INHERITED, parentStyle);
-            }
+RenderStyle* RenderObject::firstLineStyle() const
+{
+    RenderStyle *s = m_style;
+    const RenderObject* obj = isText() ? parent() : this;
+    if (obj->isBlockFlow()) {
+        RenderBlock* firstLineBlock = obj->firstLineBlock();
+        if (firstLineBlock)
+            s = firstLineBlock->getPseudoStyle(RenderStyle::FIRST_LINE, style());
+    } else if (!obj->isAnonymous() && obj->isInlineFlow()) {
+        RenderStyle* parentStyle = obj->parent()->firstLineStyle();
+        if (parentStyle != obj->parent()->style()) {
+            // A first-line style is in effect. We need to cache a first-line style
+            // for ourselves.
+            style()->setHasPseudoStyle(RenderStyle::FIRST_LINE_INHERITED);
+            s = obj->getPseudoStyle(RenderStyle::FIRST_LINE_INHERITED, parentStyle);
+        }
+    }

trunk/WebCore/khtml/rendering/render_object.h

-              r10755
+              r10867
     RenderStyle* style() const { return m_style; }
-    RenderStyle* style( bool firstLine ) const;
+    RenderStyle* firstLineStyle() const;
+    RenderStyle* style(bool firstLine) const { return firstLine ? firstLineStyle() : style(); }
     void getTextDecorationColors(int decorations, QColor& underline, QColor& overline,
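
The render_object.h hunk above keeps style() as a trivial inline accessor, adds a dedicated firstLineStyle(), and turns style(bool) into an inline wrapper that forwards to one or the other. A minimal sketch of that overload arrangement follows. SketchStyle and SketchRenderer are hypothetical stand-ins, and the first-line lookup is reduced to a stored pointer with a fallback, which is an assumption made only to keep the example self-contained. The real lookup is the pseudo-style logic in render_object.cpp.

```cpp
#include <cassert>

// Hypothetical stand-in for RenderStyle.
struct SketchStyle {
    const char* name;
};

// Hypothetical stand-in for RenderObject, reduced to the three accessors.
class SketchRenderer {
public:
    SketchRenderer(SketchStyle* normal, SketchStyle* firstLine)
        : m_style(normal), m_firstLineStyle(firstLine) { }

    // Common case: no argument, no branch.
    SketchStyle* style() const { return m_style; }

    // Callers that always want the first-line style call this directly.
    // Assumption: fall back to the normal style when none is set.
    SketchStyle* firstLineStyle() const
    {
        return m_firstLineStyle ? m_firstLineStyle : m_style;
    }

    // Callers that only know at run time keep using the bool overload.
    SketchStyle* style(bool firstLine) const
    {
        return firstLine ? firstLineStyle() : style();
    }

private:
    SketchStyle* m_style;
    SketchStyle* m_firstLineStyle;
};
```

With this split, call sites that previously passed a literal true, such as the ones changed in bidi.cpp and render_flow.cpp, can name the function they want instead of passing a constant through a branch.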
