Changeset 142829 in webkit


Timestamp:
Feb 13, 2013 5:25:19 PM
Author:
tonyg@chromium.org
Message:

Fix svg/in-html/script-write.html with threaded HTML parser
https://bugs.webkit.org/show_bug.cgi?id=109495

Reviewed by Eric Seidel.

Source/WebCore:

This patch makes the background parser's simulateTreeBuilder() more realistic.

  1. The HTMLTreeBuilder does not make the setState() calls from updateStateFor() while in foreign content mode, so we shouldn't make them when simulating the tree builder.
  2. HTMLTreeBuilder::processTokenInForeignContent has a list of tags which exit foreign content mode. We need to respect those.
  3. Support the <foreignObject> tag which enters and leaves foreign content mode.
  4. The tree builder sets state to DataState upon a </script> tag when not in foreign content mode. We need to do the same.

This involved creating a namespace stack where we push upon entering each namespace and pop upon leaving.
We are in foreign content if the topmost namespace is SVG or MathML.
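
The namespace stack described above can be sketched in standalone C++ as follows. This is a simplified illustration under stated assumptions, not the patch's code: `NamespaceTracker`, `startTag`, `endTag` and the three-tag `exitsForeignContent` list are hypothetical, `std::vector` stands in for WTF's `Vector`, and tag names are assumed to arrive lower-cased.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for the Namespace enum this patch adds.
enum class Namespace { HTML, SVG, MathML };

// Assumption: a three-tag excerpt of the real exit list, kept short for brevity.
inline bool exitsForeignContent(const std::string& tag)
{
    return tag == "div" || tag == "p" || tag == "table";
}

class NamespaceTracker {
public:
    // The stack always starts with HTML at the bottom.
    NamespaceTracker() { m_stack.push_back(Namespace::HTML); }

    // Foreign content means the topmost namespace is SVG or MathML.
    bool inForeignContent() const { return m_stack.back() != Namespace::HTML; }

    void startTag(const std::string& tag)
    {
        if (tag == "svg")
            m_stack.push_back(Namespace::SVG);
        if (tag == "math")
            m_stack.push_back(Namespace::MathML);
        // Certain HTML tags break out of foreign content.
        if (inForeignContent() && exitsForeignContent(tag))
            m_stack.pop_back();
        // <foreignObject> inside SVG re-enters HTML.
        if (m_stack.back() == Namespace::SVG && tag == "foreignobject")
            m_stack.push_back(Namespace::HTML);
    }

    void endTag(const std::string& tag)
    {
        const bool insideSVG = std::find(m_stack.begin(), m_stack.end(), Namespace::SVG) != m_stack.end();
        if ((m_stack.back() == Namespace::SVG && tag == "svg")
            || (m_stack.back() == Namespace::MathML && tag == "math")
            || (insideSVG && m_stack.back() == Namespace::HTML && tag == "foreignobject"))
            m_stack.pop_back();
    }

private:
    std::vector<Namespace> m_stack;
};
```

Because the bottom entry is always HTML and every pop is guarded by a condition that implies at least two entries, the stack never becomes empty, so an unmatched `</foreignObject>` is simply ignored.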

This fixes svg/in-html/script-write.html and likely others.

  • html/parser/BackgroundHTMLParser.cpp:

(WebCore::BackgroundHTMLParser::simulateTreeBuilder):

  • html/parser/BackgroundHTMLParser.h:

(BackgroundHTMLParser):

  • html/parser/CompactHTMLToken.cpp:

(WebCore::CompactHTMLToken::getAttributeItem): Returns the attribute of the given name. Necessary to test for <font> attributes in simulateTreeBuilder.
(WebCore):

  • html/parser/CompactHTMLToken.h:

(WebCore):
(CompactHTMLToken):
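
To illustrate why getAttributeItem is needed: a <font> start tag exits foreign content only when it carries a color, face, or size attribute. A minimal standalone sketch, assuming hypothetical `Token` and `Attribute` structs with plain `std::string` names in place of CompactHTMLToken, CompactAttribute and QualifiedName:

```cpp
#include <string>
#include <vector>

// Hypothetical simplified attribute; the real class is CompactAttribute.
struct Attribute {
    std::string name;
    std::string value;
};

// Hypothetical simplified token; the real class is CompactHTMLToken.
struct Token {
    std::string tagName;
    std::vector<Attribute> attributes;

    // Linear scan that returns the first attribute with the given name,
    // or nullptr if the token has no such attribute.
    const Attribute* getAttributeItem(const std::string& name) const
    {
        for (const Attribute& attribute : attributes) {
            if (attribute.name == name)
                return &attribute;
        }
        return nullptr;
    }
};

// Mirrors the <font> clause of tokenExitsForeignContent: the tag name alone
// is not enough, one of the three presentational attributes must be present.
inline bool fontTagExitsForeignContent(const Token& token)
{
    return token.tagName == "font"
        && (token.getAttributeItem("color") || token.getAttributeItem("face") || token.getAttributeItem("size"));
}
```

Returning a pointer rather than a bool lets the caller both test for presence and read the value, which matches the signature added in this changeset.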

LayoutTests:

Added 3 new test cases:

  1. Test the behavior of a plaintext tag inside an svg foreignObject. It applies to the remainder of the document. This behavior seems a little wonky, but it matches our current behavior and Firefox's behavior.
  2. Test that we don't blindly go into HTML mode after </foreignObject>.
  3. Test that unmatched </foreignObject>s are ignored.

  • html5lib/resources/webkit02.dat:

Location:
trunk
Files:
7 edited

  • trunk/LayoutTests/ChangeLog

    r142824 r142829  
    +2013-02-13  Tony Gentilcore  <tonyg@chromium.org>
    +
    +        Fix svg/in-html/script-write.html with threaded HTML parser
    +        https://bugs.webkit.org/show_bug.cgi?id=109495
    +
    +        Reviewed by Eric Seidel.
    +
    +        Added 3 new test cases:
    +        1. Test the behavior of a plaintext tag inside an svg foreignObject. It applies to the remainder of the document. This behavior seems a little wonky, but it matches our current behavior and Firefox's behavior.
    +        2. Test that we don't blindly go into HTML mode after </foreignObject>.
    +        3. Test that unmatched </foreignObject>s are ignored.
    +
    +        * html5lib/resources/webkit02.dat:
    +
     2013-02-13  Emil A Eklund  <eae@chromium.org>
     
  • trunk/LayoutTests/html5lib/resources/webkit02.dat

    r115763 r142829  
     #document
     | <option>
    +
    +#data
    +<svg><foreignObject><div>foo</div><plaintext></foreignObject></svg><div>bar</div>
    +#errors
    +#document
    +| <html>
    +|   <head>
    +|   <body>
    +|     <svg svg>
    +|       <svg foreignObject>
    +|         <div>
    +|           "foo"
    +|         <plaintext>
    +|           "</foreignObject></svg><div>bar</div>"
    +
    +#data
    +<svg><foreignObject></foreignObject><title></svg>foo
    +#errors
    +#document
    +| <html>
    +|   <head>
    +|   <body>
    +|     <svg svg>
    +|       <svg foreignObject>
    +|       <svg title>
    +|     "foo"
    +
    +#data
    +</foreignObject><plaintext><div>foo</div>
    +#errors
    +#document
    +| <html>
    +|   <head>
    +|   <body>
    +|     <plaintext>
    +|       "<div>foo</div>"
  • trunk/Source/WebCore/ChangeLog

    r142827 r142829  
    +2013-02-13  Tony Gentilcore  <tonyg@chromium.org>
    +
    +        Fix svg/in-html/script-write.html with threaded HTML parser
    +        https://bugs.webkit.org/show_bug.cgi?id=109495
    +
    +        Reviewed by Eric Seidel.
    +
    +        This patch makes the background parser's simulateTreeBuilder() more realistic.
    +        1. The HTMLTreeBuilder does not call the updateStateFor() setState()s when in foreign content mode so we shouldn't do it when simulating the tree builder.
    +        2. HTMLTreeBuilder::processTokenInForeignContent has a list of tags which exit foreign content mode. We need to respect those.
    +        3. Support the <foreignObject> tag which enters and leaves foreign content mode.
    +        4. The tree builder sets state to DataState upon a </script> tag when not in foreign content mode. We need to do the same.
    +
    +        This involved creating a namespace stack where we push upon entering each namespace and pop upon leaving.
    +        We are in foreign content if the topmost namespace is SVG or MathML.
    +
    +        This fixes svg/in-html/script-write.html and likely others.
    +
    +        * html/parser/BackgroundHTMLParser.cpp:
    +        (WebCore::BackgroundHTMLParser::simulateTreeBuilder):
    +        * html/parser/BackgroundHTMLParser.h:
    +        (BackgroundHTMLParser):
    +        * html/parser/CompactHTMLToken.cpp:
    +        (WebCore::CompactHTMLToken::getAttributeItem): Returns the attribute of the given name. Necessary to test for <font> attributes in simulateTreeBuilder.
    +        (WebCore):
    +        * html/parser/CompactHTMLToken.h:
    +        (WebCore):
    +        (CompactHTMLToken):
    +
     2013-02-13  Andreas Kling  <akling@apple.com>
     
  • trunk/Source/WebCore/html/parser/BackgroundHTMLParser.cpp

    r142673 r142829  
     #endif
     
    +static inline bool tokenExitsForeignContent(const CompactHTMLToken& token)
    +{
    +    // FIXME: This is copied from HTMLTreeBuilder::processTokenInForeignContent and changed to use threadSafeMatch.
    +    const String& tagName = token.data();
    +    return threadSafeMatch(tagName, bTag)
    +        || threadSafeMatch(tagName, bigTag)
    +        || threadSafeMatch(tagName, blockquoteTag)
    +        || threadSafeMatch(tagName, bodyTag)
    +        || threadSafeMatch(tagName, brTag)
    +        || threadSafeMatch(tagName, centerTag)
    +        || threadSafeMatch(tagName, codeTag)
    +        || threadSafeMatch(tagName, ddTag)
    +        || threadSafeMatch(tagName, divTag)
    +        || threadSafeMatch(tagName, dlTag)
    +        || threadSafeMatch(tagName, dtTag)
    +        || threadSafeMatch(tagName, emTag)
    +        || threadSafeMatch(tagName, embedTag)
    +        || threadSafeMatch(tagName, h1Tag)
    +        || threadSafeMatch(tagName, h2Tag)
    +        || threadSafeMatch(tagName, h3Tag)
    +        || threadSafeMatch(tagName, h4Tag)
    +        || threadSafeMatch(tagName, h5Tag)
    +        || threadSafeMatch(tagName, h6Tag)
    +        || threadSafeMatch(tagName, headTag)
    +        || threadSafeMatch(tagName, hrTag)
    +        || threadSafeMatch(tagName, iTag)
    +        || threadSafeMatch(tagName, imgTag)
    +        || threadSafeMatch(tagName, liTag)
    +        || threadSafeMatch(tagName, listingTag)
    +        || threadSafeMatch(tagName, menuTag)
    +        || threadSafeMatch(tagName, metaTag)
    +        || threadSafeMatch(tagName, nobrTag)
    +        || threadSafeMatch(tagName, olTag)
    +        || threadSafeMatch(tagName, pTag)
    +        || threadSafeMatch(tagName, preTag)
    +        || threadSafeMatch(tagName, rubyTag)
    +        || threadSafeMatch(tagName, sTag)
    +        || threadSafeMatch(tagName, smallTag)
    +        || threadSafeMatch(tagName, spanTag)
    +        || threadSafeMatch(tagName, strongTag)
    +        || threadSafeMatch(tagName, strikeTag)
    +        || threadSafeMatch(tagName, subTag)
    +        || threadSafeMatch(tagName, supTag)
    +        || threadSafeMatch(tagName, tableTag)
    +        || threadSafeMatch(tagName, ttTag)
    +        || threadSafeMatch(tagName, uTag)
    +        || threadSafeMatch(tagName, ulTag)
    +        || threadSafeMatch(tagName, varTag)
    +        || (threadSafeMatch(tagName, fontTag) && (token.getAttributeItem(colorAttr) || token.getAttributeItem(faceAttr) || token.getAttributeItem(sizeAttr)));
    +}
    +
     // FIXME: Tune this constant based on a benchmark. The current value was choosen arbitrarily.
     static const size_t pendingTokenLimit = 4000;
     
     BackgroundHTMLParser::BackgroundHTMLParser(PassRefPtr<WeakReference<BackgroundHTMLParser> > reference, const HTMLParserOptions& options, const WeakPtr<HTMLDocumentParser>& parser, PassOwnPtr<XSSAuditor> xssAuditor)
    -    : m_inForeignContent(false)
    -    , m_weakFactory(reference, this)
    +    : m_weakFactory(reference, this)
         , m_token(adoptPtr(new HTMLToken))
         , m_tokenizer(HTMLTokenizer::create(options))
    …
         , m_xssAuditor(xssAuditor)
     {
    +    m_namespaceStack.append(HTML);
     }
     
    …
         if (token.type() == HTMLToken::StartTag) {
             const String& tagName = token.data();
    -        if (threadSafeMatch(tagName, SVGNames::svgTag)
    -            || threadSafeMatch(tagName, MathMLNames::mathTag))
    -            m_inForeignContent = true;
    -
    -        // FIXME: This is just a copy of Tokenizer::updateStateFor which uses threadSafeMatches.
    -        if (threadSafeMatch(tagName, textareaTag) || threadSafeMatch(tagName, titleTag))
    -            m_tokenizer->setState(HTMLTokenizer::RCDATAState);
    -        else if (threadSafeMatch(tagName, plaintextTag))
    -            m_tokenizer->setState(HTMLTokenizer::PLAINTEXTState);
    -        else if (threadSafeMatch(tagName, scriptTag))
    -            m_tokenizer->setState(HTMLTokenizer::ScriptDataState);
    -        else if (threadSafeMatch(tagName, styleTag)
    -            || threadSafeMatch(tagName, iframeTag)
    -            || threadSafeMatch(tagName, xmpTag)
    -            || (threadSafeMatch(tagName, noembedTag) && m_options.pluginsEnabled)
    -            || threadSafeMatch(tagName, noframesTag)
    -            || (threadSafeMatch(tagName, noscriptTag) && m_options.scriptEnabled))
    -            m_tokenizer->setState(HTMLTokenizer::RAWTEXTState);
    +        if (threadSafeMatch(tagName, SVGNames::svgTag))
    +            m_namespaceStack.append(SVG);
    +        if (threadSafeMatch(tagName, MathMLNames::mathTag))
    +            m_namespaceStack.append(MathML);
    +        if (inForeignContent() && tokenExitsForeignContent(token))
    +            m_namespaceStack.removeLast();
    +        // FIXME: Support tags that exit MathML.
    +        if (m_namespaceStack.last() == SVG && equalIgnoringCase(tagName, SVGNames::foreignObjectTag.localName()))
    +            m_namespaceStack.append(HTML);
    +        if (!inForeignContent()) {
    +            // FIXME: This is just a copy of Tokenizer::updateStateFor which uses threadSafeMatches.
    +            if (threadSafeMatch(tagName, textareaTag) || threadSafeMatch(tagName, titleTag))
    +                m_tokenizer->setState(HTMLTokenizer::RCDATAState);
    +            else if (threadSafeMatch(tagName, plaintextTag))
    +                m_tokenizer->setState(HTMLTokenizer::PLAINTEXTState);
    +            else if (threadSafeMatch(tagName, scriptTag))
    +                m_tokenizer->setState(HTMLTokenizer::ScriptDataState);
    +            else if (threadSafeMatch(tagName, styleTag)
    +                || threadSafeMatch(tagName, iframeTag)
    +                || threadSafeMatch(tagName, xmpTag)
    +                || (threadSafeMatch(tagName, noembedTag) && m_options.pluginsEnabled)
    +                || threadSafeMatch(tagName, noframesTag)
    +                || (threadSafeMatch(tagName, noscriptTag) && m_options.scriptEnabled))
    +                m_tokenizer->setState(HTMLTokenizer::RAWTEXTState);
    +        }
         }
     
         if (token.type() == HTMLToken::EndTag) {
             const String& tagName = token.data();
    -        if (threadSafeMatch(tagName, SVGNames::svgTag) || threadSafeMatch(tagName, MathMLNames::mathTag))
    -            m_inForeignContent = false;
    -        if (threadSafeMatch(tagName, scriptTag))
    +        // FIXME: Support tags that exit MathML.
    +        if ((m_namespaceStack.last() == SVG && threadSafeMatch(tagName, SVGNames::svgTag))
    +            || (m_namespaceStack.last() == MathML && threadSafeMatch(tagName, MathMLNames::mathTag))
    +            || (m_namespaceStack.contains(SVG) && m_namespaceStack.last() == HTML && equalIgnoringCase(tagName, SVGNames::foreignObjectTag.localName())))
    +            m_namespaceStack.removeLast();
    +        if (threadSafeMatch(tagName, scriptTag)) {
    +            if (!inForeignContent())
    +                m_tokenizer->setState(HTMLTokenizer::DataState);
                 return false;
    +        }
         }
     
         // FIXME: Need to set setForceNullCharacterReplacement based on m_inForeignContent as well.
    -    m_tokenizer->setShouldAllowCDATA(m_inForeignContent);
    +    m_tokenizer->setShouldAllowCDATA(inForeignContent());
         return true;
     }
  • trunk/Source/WebCore/html/parser/BackgroundHTMLParser.h

    r142673 r142829  
     #include <wtf/PassOwnPtr.h>
     #include <wtf/RefPtr.h>
    +#include <wtf/Vector.h>
     #include <wtf/WeakPtr.h>
     
    …
     
     private:
    +    enum Namespace {
    +        HTML,
    +        SVG,
    +        MathML
    +    };
    +
         BackgroundHTMLParser(PassRefPtr<WeakReference<BackgroundHTMLParser> >, const HTMLParserOptions&, const WeakPtr<HTMLDocumentParser>&, PassOwnPtr<XSSAuditor>);
     
    …
     
         void sendTokensToMainThread();
    +    bool inForeignContent() const { return m_namespaceStack.last() != HTML; }
     
    -    bool m_inForeignContent; // FIXME: We need a stack of foreign content markers.
    +    Vector<Namespace, 1> m_namespaceStack;
         WeakPtrFactory<BackgroundHTMLParser> m_weakFactory;
         BackgroundHTMLInputStream m_input;
  • trunk/Source/WebCore/html/parser/CompactHTMLToken.cpp

    r142712 r142829  
     #include "CompactHTMLToken.h"
     
    +#include "HTMLParserIdioms.h"
     #include "HTMLToken.h"
    +#include "QualifiedName.h"
     #include "XSSAuditorDelegate.h"
     
    …
     }
     
    +const CompactAttribute* CompactHTMLToken::getAttributeItem(const QualifiedName& name) const
    +{
    +    for (unsigned i = 0; i < m_attributes.size(); ++i) {
    +        if (threadSafeMatch(m_attributes.at(i).name(), name))
    +            return &m_attributes.at(i);
    +    }
    +    return 0;
    +}
    +
     bool CompactHTMLToken::isSafeToSendToAnotherThread() const
     {
  • trunk/Source/WebCore/html/parser/CompactHTMLToken.h

    r142641 r142829  
     namespace WebCore {
     
    +class QualifiedName;
     class XSSInfo;
     
    …
         bool isAll8BitData() const { return m_isAll8BitData; }
         const Vector<CompactAttribute>& attributes() const { return m_attributes; }
    +    const CompactAttribute* getAttributeItem(const QualifiedName&) const;
         const TextPosition& textPosition() const { return m_textPosition; }
     