Changeset 228352 in webkit


Timestamp:
Feb 9, 2018 9:07:56 PM
Author:
rniwa@webkit.org
Message:

REGRESSION (r223440): Copying & pasting a list from Microsoft Word to TinyMCE fails
https://bugs.webkit.org/show_bug.cgi?id=182564

Reviewed by Wenson Hsieh.

Source/WebCore:

Microsoft Word generates p and span elements with special styles instead of standard ul and ol
elements when copying list items, and TinyMCE has a specialized code path to process this
proprietary Microsoft Word format. The regression was caused by WebKit's sanitization code
stripping away these non-standard CSS rules and inline styles.

To preserve the pre-r223440 behavior in TinyMCE, we keep the following in the HTML markup:

  1. The "html" element at the beginning with xmlns content attributes
  2. @list rules in a style element starting with "/* List Definitions */" comment
  3. inline style content attribute with "mso-list" property
  4. conditional comment sections with "[if !supportLists]" and "[endif]"

(1) is needed for TinyMCE to trigger the specialized code path for Microsoft Word. (2) contains
the information about the structure of list items. (3) is needed to associate each p element with
a rule in (2). (4) is needed to strip away the content generated as list markers (e.g. dots).
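
As an illustration of item (2), the following is a minimal standalone sketch of how the list definitions could be sliced out of a style element's text. It uses std::string instead of WebKit's String, and the helper name extractListDefinitions is hypothetical. The markers and slice bounds follow the logic this change adds in StyledMarkupAccumulator::appendNodeToPreserveMSOList.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical standalone helper, not a WebKit function. Returns the part of a
// style element's text that holds Word's list definitions, or std::nullopt
// when the expected markers are missing.
std::optional<std::string> extractListDefinitions(const std::string& styleContent)
{
    const auto start = styleContent.find("/* List Definitions */");
    const auto lastListRule = styleContent.rfind("\n@list");
    if (start == std::string::npos || lastListRule == std::string::npos)
        return std::nullopt;

    // Assumption carried over from the change: the last @list rule ends with
    // ";}" followed by a newline.
    const auto end = styleContent.find(";}\n", lastListRule);
    if (end == std::string::npos || start >= end)
        return std::nullopt;

    // Same bounds as the change: from the marker comment through the ';'
    // matched by the search above.
    return styleContent.substr(start, end - start + 1);
}
```

Rules that follow the last @list rule, and anything before the marker comment, are left out of the result.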

We enable these "MSO list quirks" when the content comes from a non-WebKit client or a WebKit client
that doesn't enable custom pasteboard data (detected by the content origin being null), and the HTML
markup starts with a specific sequence of characters generated by Microsoft Word.
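
The activation rule can be sketched as two small decisions. This is a simplified standalone illustration using std::string_view rather than WebKit's String. The function names quirksForOrigin and modeForMarkup and the bool parameter contentOriginIsNull are hypothetical. The enum names and the markup prefix follow this changeset.

```cpp
#include <cassert>
#include <string_view>

enum class MSOListQuirks { CheckIfNeeded, Disabled };
enum class MSOListMode { Preserve, DoNotPreserve };

// Stage 1 (hypothetical name): a null content origin means the content did not
// come from a WebKit client with custom pasteboard data enabled, so the markup
// is worth checking.
MSOListQuirks quirksForOrigin(bool contentOriginIsNull)
{
    return contentOriginIsNull ? MSOListQuirks::CheckIfNeeded : MSOListQuirks::Disabled;
}

// Stage 2 (hypothetical name): preserve the list markup only when stage 1
// allows it and the original markup starts with Word's "html" element.
MSOListMode modeForMarkup(MSOListQuirks quirks, std::string_view originalMarkup)
{
    constexpr std::string_view wordPrefix = "<html xmlns:o=\"urn:schemas-microsoft-com:office:office\"";
    const bool startsWithWordPrefix = originalMarkup.substr(0, wordPrefix.size()) == wordPrefix;
    if (quirks == MSOListQuirks::CheckIfNeeded && startsWithWordPrefix)
        return MSOListMode::Preserve;
    return MSOListMode::DoNotPreserve;
}
```

Keeping the two stages separate means the origin check can live in the pasteboard reader while the prefix check stays next to the serializer.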

Test: http/tests/security/clipboard/copy-paste-html-across-origin-strips-mso-list.html

PasteHTML.PreservesMSOList
PasteHTML.StripsMSOListWhenMissingMSOHTMLElement
PasteWebArchive.PreservesMSOList
PasteWebArchive.StripsMSOListWhenMissingMSOHTMLElement

  • editing/MarkupAccumulator.cpp:

(WebCore::MarkupAccumulator::appendTextSubstring): Added.

  • editing/MarkupAccumulator.h:
  • editing/WebContentReader.cpp:

(WebCore::FrameWebContentReader::msoListQuirksForMarkup const): Added. Enables the MSO list quirks
if the content origin is null. The content origin specifies the pasteboard content's origin if it's
copied in WebKit with custom pasteboard data types enabled. In all other applications, it would be
set to null.

  • editing/WebContentReader.h:
  • editing/cocoa/WebContentReaderCocoa.mm:

(WebCore::markupForFragmentInDocument): Moved to markup.cpp as sanitizedMarkupForFragmentInDocument.
(WebCore::sanitizeMarkupWithArchive):
(WebCore::WebContentReader::readWebArchive): Always disables MSO list quirks since this code path is
only used by WebKit's native code to paste content.
(WebCore::WebContentMarkupReader::readWebArchive): Calls msoListQuirksForMarkup since this is the code
path used by DataTransfer.
(WebCore::WebContentReader::readHTML): Always disables MSO list quirks since this code path is only
used by WebKit's native code to paste content.
(WebCore::WebContentMarkupReader::readHTML): Calls msoListQuirksForMarkup since this is the code path
used by DataTransfer.

  • editing/markup.cpp:

(WebCore::sanitizeMarkup): Use sanitizedMarkupForFragmentInDocument to share code.
(WebCore::MSOListMode): Added. Set to Preserve if the sanitized markup is the one generated by
Microsoft Word, and MSO list quirks should actually kick in. This is unlike MSOListQuirks, which is
set to CheckIfNeeded whenever the content COULD be the one generated by Microsoft Word.
(WebCore::StyledMarkupAccumulator): Added a special MSO list preservation mode enabled by MSOListMode.
(WebCore::StyledMarkupAccumulator::StyledMarkupAccumulator):
(WebCore::StyledMarkupAccumulator::appendElement): Preserves (3). Unfortunately, TinyMCE recognizes
mso-list and related properties only if they appear on their own. But we also need to preserve
the inline style generated from the computed style, since we would otherwise lose the inline styles of
the text (e.g. red text and bold font). To work around this, we generate two style content attributes,
one containing the computed styles and another containing mso-list. Luckily, the HTML parsing algorithm
dictates that the first attribute always wins when more than one attribute of the same name appears,
so we place the computed style's style attribute first so that pasted content in non-TinyMCE
environments continues to work.
(WebCore::StyledMarkupAccumulator::traverseNodesForSerialization):
(WebCore::StyledMarkupAccumulator::appendNodeToPreserveMSOList): Added. Generates special markup for
the conditional statements and the special style element with @list rules.
(WebCore::createMarkupInternal):
(WebCore::createMarkup):
(WebCore::sanitizedMarkupForFragmentInDocument): Moved from WebContentReaderCocoa.mm. If MSOListQuirks
is set to CheckIfNeeded and the markup starts with a specific sequence of characters, generates the
markup with the newly added quirks code in StyledMarkupAccumulator and wraps it in a special "html"
element that TinyMCE recognizes.
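
The two-attribute approach described for appendElement can be illustrated with a toy serializer and a toy first-wins lookup. Both functions below are hypothetical and handle only double-quoted attribute values without escaping. They are not WebKit's serializer or the HTML parser.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical toy serializer: emits the computed style first, then the
// original style attribute that carries mso-list.
std::string openTagWithTwoStyles(const std::string& tagName, const std::string& computedStyle, const std::string& originalStyle)
{
    return "<" + tagName + " style=\"" + computedStyle + "\" style=\"" + originalStyle + "\">";
}

// Hypothetical toy lookup with first-wins semantics: returns the value of the
// first attribute with the given name. Assumes values contain no quotes.
std::optional<std::string> firstAttributeValue(const std::string& tag, const std::string& name)
{
    const std::string needle = " " + name + "=\"";
    const auto position = tag.find(needle);
    if (position == std::string::npos)
        return std::nullopt;
    const auto valueStart = position + needle.size();
    const auto valueEnd = tag.find('"', valueStart);
    if (valueEnd == std::string::npos)
        return std::nullopt;
    return tag.substr(valueStart, valueEnd - valueStart);
}
```

A consumer that keeps only the first style attribute sees the computed style, while the mso-list declaration is still present in the raw markup string.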

  • editing/markup.h:

(WebCore::MSOListQuirks): Added. Set to CheckIfNeeded if the content COULD require MSO list quirks.
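
The handling of the conditional comments in (4) amounts to a two-state tracker. The class below is a hypothetical standalone sketch of the comment branch of appendNodeToPreserveMSOList, using std::string for the comment data. The class name and method names are illustrative only.

```cpp
#include <cassert>
#include <string>

// Hypothetical sketch: decides which comments are kept so that a consumer can
// find and strip the list markers Word places between them.
class MSOListCommentTracker {
public:
    // Returns true when the comment should be written to the output.
    bool shouldPreserveComment(const std::string& data)
    {
        if (!m_inMSOList && data == "[if !supportLists]")
            m_inMSOList = true;
        else if (m_inMSOList && data == "[endif]")
            m_inMSOList = false;
        else
            return false; // Any other comment is dropped as before.
        return true;
    }

    bool inMSOList() const { return m_inMSOList; }

private:
    bool m_inMSOList { false };
};
```

An "[endif]" comment is kept only when it closes a section that was opened earlier, so unrelated conditional comments are still stripped.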

Tools:

Added tests for pasting HTML with list items generated by Microsoft Word, as well as HTML that looks like
the markup generated by Microsoft Word but is missing a proper "html" element at the beginning.

  • TestWebKitAPI/TestWebKitAPI.xcodeproj/project.pbxproj:
  • TestWebKitAPI/Tests/WebKitCocoa/PasteHTML.mm: Added test cases.
  • TestWebKitAPI/Tests/WebKitCocoa/PasteWebArchive.mm: Added test cases.

(msoListMarkupWithoutProperHTMLElement): Added.

  • TestWebKitAPI/Tests/WebKitCocoa/mso-list.html: Added.

LayoutTests:

Added a test to make sure special Microsoft Word quirks would not get triggered
when pasting content copied within WebKit.

  • http/tests/security/clipboard/copy-paste-html-across-origin-strips-mso-list-expected.txt: Added.
  • http/tests/security/clipboard/copy-paste-html-across-origin-strips-mso-list.html: Added.
  • http/tests/security/clipboard/resources/copy-mso-list.html: Added.
Location:
trunk
Files:
4 added
14 edited

  • trunk/LayoutTests/ChangeLog

    r228348 r228352  
@@ -1,2 +1,16 @@
+2018-02-08  Ryosuke Niwa  <rniwa@webkit.org>
+
+        REGRESSION (r223440): Copying & pasting a list from Microsoft Word to TinyMCE fails
+        https://bugs.webkit.org/show_bug.cgi?id=182564
+
+        Reviewed by Wenson Hsieh.
+
+        Added a test to make sure special Microsoft Word quirks would not get triggered
+        when pasting content copied within WebKit.
+
+        * http/tests/security/clipboard/copy-paste-html-across-origin-strips-mso-list-expected.txt: Added.
+        * http/tests/security/clipboard/copy-paste-html-across-origin-strips-mso-list.html: Added.
+        * http/tests/security/clipboard/resources/copy-mso-list.html: Added.
+
 2018-02-09  Ryan Haddad  <ryanhaddad@apple.com>
 
  • trunk/LayoutTests/platform/win/TestExpectations

    r228295 r228352  
@@ -1159,4 +1159,5 @@
 http/tests/security/clipboard/copy-paste-html-cross-origin-iframe-across-origin.html [ Skip ]
 http/tests/security/clipboard/copy-paste-html-cross-origin-iframe-in-same-origin.html [ Skip ]
+http/tests/security/clipboard/copy-paste-html-across-origin-strips-mso-list.html [ Skip ]
 
 webkit.org/b/140783 [ Release ] editing/pasteboard/copy-standalone-image.html [ Failure ImageOnlyFailure ]
  • trunk/Source/WebCore/ChangeLog

    r228349 r228352  
@@ -1,2 +1,82 @@
+2018-02-08  Ryosuke Niwa  <rniwa@webkit.org>
+
+        REGRESSION (r223440): Copying & pasting a list from Microsoft Word to TinyMCE fails
+        https://bugs.webkit.org/show_bug.cgi?id=182564
+
+        Reviewed by Wenson Hsieh.
+
+        Turns out that Microsoft Word generates p and span elements with special styles instead of standard
+        ul and ol elements when copying a list items, and TinyMCE has a specialized code path to process
+        this proprietary format of Microsoft Word. The regression was caused by WebKit's sanitization code
+        stripping away these non-standard CSS rules and inline styles.
+
+        To preseve pre-r223440 behavior in TinyMCE, we preserve the following in a HTML markup:
+
+        1. The "html" element at the beginning with xmlns content attributes
+        2. @list rules in a style element starting with "/* List Definitions */" comment
+        3. inline style content attribute with "mso-list" property
+        4. comments conditional sections with "[if !supportLists]" and "[endif]"
+
+        (1) is needed for TinyMCE to trigger the specialized code path for Microsoft Word. (2) contains
+        the information about the structure of list items. (3) is needed to associate each p element with
+        a rule in (2). (4) is needed to strip away the content generated as list markers (e.g. dots).
+
+        We enable this "MSO list quirks" when the content comes from a non-WebKit client or a WebKit client
+        that doesn't enable custom pasteboard data (detected by the content origin being null), and the HTML
+        markup starts with a specific sequence of characters generated by Microsoft Word.
+
+        Test: http/tests/security/clipboard/copy-paste-html-across-origin-strips-mso-list.html
+              PasteHTML.PreservesMSOList
+              PasteHTML.StripsMSOListWhenMissingMSOHTMLElement
+              PasteWebArchive.PreservesMSOList
+              PasteWebArchive.StripsMSOListWhenMissingMSOHTMLElement
+
+        * editing/MarkupAccumulator.cpp:
+        (WebCore::MarkupAccumulator::appendTextSubstring): Added.
+        * editing/MarkupAccumulator.h:
+        * editing/WebContentReader.cpp:
+        (WebCore::FrameWebContentReader::msoListQuirksForMarkup const): Added. Enables the MSO list quirks
+        if the content origin is null. The content origin specifies the pasteboard content's origin if it's
+        copied in WebKit with custom pasteboard data types enabled. In all other applications, it would be
+        set to null.
+        * editing/WebContentReader.h:
+        * editing/cocoa/WebContentReaderCocoa.mm:
+        (WebCore::markupForFragmentInDocument): Moved to markup.cpp as sanitizedMarkupForFragmentInDocument.
+        (WebCore::sanitizeMarkupWithArchive):
+        (WebCore::WebContentReader::readWebArchive): Always disables MSO list quirks since this code path is
+        only used by WebKit's native code to paste content.
+        (WebCore::WebContentMarkupReader::readWebArchive): Calls msoListQuirksForMarkup since this is the code
+        path used by DataTransfer.
+        (WebCore::WebContentReader::readHTML): Always disables MSO list quirks since this code path is only
+        used by WebKit's native code to paste content.
+        (WebCore::WebContentMarkupReader::readHTML): Calls msoListQuirksForMarkup since this is the code path
+        used by DataTransfer.
+        * editing/markup.cpp:
+        (WebCore::sanitizeMarkup): Use sanitizedMarkupForFragmentInDocument to share code.
+        (WebCore::MSOListMode): Added. Set to Preserve if the sanitized markup is the one generated by
+        Microsoft Word, and MSO list quirks should actually kick in. This is unlike MSOListQuirks, which is
+        set to Enable whenever the content COULD be the one generated by Microsoft Word.
+        (WebCore::StyledMarkupAccumulator): Added a special MSO list preservation mode enabled by MSOListMode.
+        (WebCore::StyledMarkupAccumulator::StyledMarkupAccumulator):
+        (WebCore::StyledMarkupAccumulator::appendElement): Preseve (3). Unfortunately, TinyMCE only recognizes
+        mso-list and related properties only if they appear on their own. But we also need to preserve
+        the inline style generated using the computed style since we would lose the inline styles of the text
+        otherwise (e.g. red text and bold font). To workaround this, we generate two style content attributes,
+        one containing computed styles and another one containing mso-list. Luckily, the HTML parsing algorithm
+        dictates that the first attribute always wins when more than one attributes of the same name appears,
+        so we place the computed style's style attribute first so that the pasted content in non-TinyMCE
+        environment will continue to work.
+        (WebCore::StyledMarkupAccumulator::traverseNodesForSerialization):
+        (WebCore::StyledMarkupAccumulator::appendNodeToPreserveMSOList): Added. Generates special markup for
+        the conditional statements and the special style element with @list rules.
+        (WebCore::createMarkupInternal):
+        (WebCore::createMarkup):
+        (WebCore::sanitizedMarkupForFragmentInDocument): Moved from WebContentReaderCocoa.mm. If MSOListQuirks
+        is set to Enable, and the markup starts with a specific sequence of characters, generate the markup
+        with the newly added quirks code in StyledMarkupAccumulator, and wrap it in a special "html" element
+        TinyMCE recognizes.
+        * editing/markup.h:
+        (WebCore::MSOListQuirks): Added. Set to CheckIfNeeded if the content COULD require MSO list quirks.
+
 2018-02-09  Dean Jackson  <dino@apple.com>
 
  • trunk/Source/WebCore/editing/MarkupAccumulator.cpp

    r224213 r228352  
@@ -200,4 +200,10 @@
 }
 
+void MarkupAccumulator::appendTextSubstring(const Text& text, unsigned start, unsigned length)
+{
+    ASSERT(start + length <= text.data().length());
+    appendCharactersReplacingEntities(m_markup, text.data(), start, length, entityMaskForText(text));
+}
+
 size_t MarkupAccumulator::totalLength(const Vector<String>& strings)
 {
  • trunk/Source/WebCore/editing/MarkupAccumulator.h

    r215648 r228352  
@@ -87,4 +87,6 @@
     void appendStartTag(const Node&, Namespaces* = nullptr);
 
+    void appendTextSubstring(const Text&, unsigned start, unsigned length);
+
     void appendOpenTag(StringBuilder&, const Element&, Namespaces*);
     void appendCloseTag(StringBuilder&, const Element&);
  • trunk/Source/WebCore/editing/WebContentReader.cpp

    r228240 r228352  
@@ -46,4 +46,9 @@
 }
 
+MSOListQuirks FrameWebContentReader::msoListQuirksForMarkup() const
+{
+    return contentOrigin.isNull() ? MSOListQuirks::CheckIfNeeded : MSOListQuirks::Disabled;
 }
 
+}
+
  • trunk/Source/WebCore/editing/WebContentReader.h

    r228240 r228352  
@@ -29,4 +29,5 @@
 #include "Pasteboard.h"
 #include "Range.h"
+#include "markup.h"
 
 namespace WebCore {
@@ -46,4 +47,5 @@
 protected:
     bool shouldSanitize() const;
+    MSOListQuirks msoListQuirksForMarkup() const;
 };
 
  • trunk/Source/WebCore/editing/cocoa/WebContentReaderCocoa.mm

    r228240 r228352  
@@ -376,16 +376,5 @@
 }
 
-static String markupForFragmentInDocument(Ref<DocumentFragment>&& fragment, Document& document)
-{
-    auto* bodyElement = document.body();
-    ASSERT(bodyElement);
-    bodyElement->appendChild(WTFMove(fragment));
-
-    auto range = Range::create(document);
-    range->selectNodeContents(*bodyElement);
-    return createMarkup(range.get(), nullptr, AnnotateForInterchange, false, ResolveNonLocalURLs);
-}
-
-static String sanitizeMarkupWithArchive(Document& destinationDocument, MarkupAndArchive& markupAndArchive, const std::function<bool(const String)>& canShowMIMETypeAsHTML)
+static String sanitizeMarkupWithArchive(Document& destinationDocument, MarkupAndArchive& markupAndArchive, MSOListQuirks msoListQuirks, const std::function<bool(const String)>& canShowMIMETypeAsHTML)
 {
     auto page = createPageForSanitizingWebContent();
@@ -399,5 +388,5 @@
     if (shouldReplaceRichContentWithAttachments()) {
         replaceRichContentWithAttachments(fragment, unreplacedResources);
-        return markupForFragmentInDocument(WTFMove(fragment), *stagingDocument);
+        return sanitizedMarkupForFragmentInDocument(WTFMove(fragment), *stagingDocument, msoListQuirks, markupAndArchive.markup);
     }
 
@@ -428,5 +417,5 @@
         MarkupAndArchive subframeContent = { String::fromUTF8(subframeMainResource->data().data(), subframeMainResource->data().size()),
             subframeMainResource.releaseNonNull(), subframeArchive.copyRef() };
-        auto subframeMarkup = sanitizeMarkupWithArchive(destinationDocument, subframeContent, canShowMIMETypeAsHTML);
+        auto subframeMarkup = sanitizeMarkupWithArchive(destinationDocument, subframeContent, MSOListQuirks::Disabled, canShowMIMETypeAsHTML);
 
         CString utf8 = subframeMarkup.utf8();
@@ -442,5 +431,5 @@
     replaceSubresourceURLs(fragment.get(), WTFMove(blobURLMap));
 
-    return markupForFragmentInDocument(WTFMove(fragment), *stagingDocument);
+    return sanitizedMarkupForFragmentInDocument(WTFMove(fragment), *stagingDocument, msoListQuirks, markupAndArchive.markup);
 }
 
@@ -469,5 +458,5 @@
     }
 
-    String sanitizedMarkup = sanitizeMarkupWithArchive(*frame.document(), *result, [&] (const String& type) {
+    String sanitizedMarkup = sanitizeMarkupWithArchive(*frame.document(), *result, MSOListQuirks::Disabled, [&] (const String& type) {
         return frame.loader().client().canShowMIMETypeAsHTML(type);
     });
@@ -496,5 +485,5 @@
     }
 
-    markup = sanitizeMarkupWithArchive(*frame.document(), *result, [&] (const String& type) {
+    markup = sanitizeMarkupWithArchive(*frame.document(), *result, msoListQuirksForMarkup(), [&] (const String& type) {
         return frame.loader().client().canShowMIMETypeAsHTML(type);
     });
@@ -530,5 +519,5 @@
     String markup;
     if (RuntimeEnabledFeatures::sharedFeatures().customPasteboardDataEnabled() && shouldSanitize()) {
-        markup = sanitizeMarkup(stringOmittingMicrosoftPrefix, WTF::Function<void (DocumentFragment&)> { [] (DocumentFragment& fragment) {
+        markup = sanitizeMarkup(stringOmittingMicrosoftPrefix, MSOListQuirks::Disabled, WTF::Function<void (DocumentFragment&)> { [] (DocumentFragment& fragment) {
             removeSubresourceURLAttributes(fragment, [] (const URL& url) {
                 return shouldReplaceSubresourceURL(url);
@@ -549,5 +538,5 @@
     String rawHTML = stripMicrosoftPrefix(string);
     if (shouldSanitize()) {
-        markup = sanitizeMarkup(rawHTML, WTF::Function<void (DocumentFragment&)> { [] (DocumentFragment& fragment) {
+        markup = sanitizeMarkup(rawHTML, msoListQuirksForMarkup(), WTF::Function<void (DocumentFragment&)> { [] (DocumentFragment& fragment) {
             removeSubresourceURLAttributes(fragment, [] (const URL& url) {
                 return shouldReplaceSubresourceURL(url);
  • trunk/Source/WebCore/editing/markup.cpp

    r227351 r228352  
@@ -37,4 +37,5 @@
 #include "CacheStorageProvider.h"
 #include "ChildListMutationScope.h"
+#include "Comment.h"
 #include "DocumentFragment.h"
 #include "DocumentLoader.h"
@@ -56,4 +57,5 @@
 #include "HTMLImageElement.h"
 #include "HTMLNames.h"
+#include "HTMLStyleElement.h"
 #include "HTMLTableElement.h"
 #include "HTMLTextAreaElement.h"
@@ -197,12 +199,9 @@
 }
 
-
-String sanitizeMarkup(const String& rawHTML, std::optional<WTF::Function<void(DocumentFragment&)>> fragmentSanitizer)
+String sanitizeMarkup(const String& rawHTML, MSOListQuirks msoListQuirks, std::optional<WTF::Function<void(DocumentFragment&)>> fragmentSanitizer)
 {
     auto page = createPageForSanitizingWebContent();
     Document* stagingDocument = page->mainFrame().document();
     ASSERT(stagingDocument);
-    auto* bodyElement = stagingDocument->body();
-    ASSERT(bodyElement);
 
     auto fragment = createFragmentFromMarkup(*stagingDocument, rawHTML, emptyString(), DisallowScriptingAndPluginContent);
@@ -211,17 +210,13 @@
         (*fragmentSanitizer)(fragment);
 
-    bodyElement->appendChild(fragment.get());
-
-    auto range = Range::create(*stagingDocument);
-    range->selectNodeContents(*bodyElement);
-    return createMarkup(range.get(), nullptr, AnnotateForInterchange, false, ResolveNonLocalURLs);
-}
-
-   
+    return sanitizedMarkupForFragmentInDocument(WTFMove(fragment), *stagingDocument, msoListQuirks, rawHTML);
+}
+
+enum class MSOListMode { Preserve, DoNotPreserve };
 class StyledMarkupAccumulator final : public MarkupAccumulator {
 public:
     enum RangeFullySelectsNode { DoesFullySelectNode, DoesNotFullySelectNode };
 
-    StyledMarkupAccumulator(Vector<Node*>* nodes, EAbsoluteURLs, EAnnotateForInterchange, const Range*, bool needsPositionStyleConversion, Node* highestNodeToBeSerialized = nullptr);
+    StyledMarkupAccumulator(Vector<Node*>* nodes, EAbsoluteURLs, EAnnotateForInterchange, MSOListMode, const Range*, bool needsPositionStyleConversion, Node* highestNodeToBeSerialized = nullptr);
 
     Node* serializeNodes(Node* startNode, Node* pastEnd);
@@ -254,4 +249,6 @@
     Node* traverseNodesForSerialization(Node* startNode, Node* pastEnd, NodeTraversalMode);
 
+    bool appendNodeToPreserveMSOList(Node&);
+
     bool shouldAnnotate()
     {
@@ -271,7 +268,9 @@
     bool m_needsPositionStyleConversion;
     bool m_needClearingDiv;
+    bool m_shouldPreserveMSOList;
+    bool m_inMSOList { false };
 };
 
-inline StyledMarkupAccumulator::StyledMarkupAccumulator(Vector<Node*>* nodes, EAbsoluteURLs shouldResolveURLs, EAnnotateForInterchange shouldAnnotate, const Range* range, bool needsPositionStyleConversion, Node* highestNodeToBeSerialized)
+inline StyledMarkupAccumulator::StyledMarkupAccumulator(Vector<Node*>* nodes, EAbsoluteURLs shouldResolveURLs, EAnnotateForInterchange shouldAnnotate, MSOListMode msoListMode, const Range* range, bool needsPositionStyleConversion, Node* highestNodeToBeSerialized)
     : MarkupAccumulator(nodes, shouldResolveURLs, range)
     , m_shouldAnnotate(shouldAnnotate)
@@ -280,4 +279,5 @@
     , m_needsPositionStyleConversion(needsPositionStyleConversion)
     , m_needClearingDiv(false)
+    , m_shouldPreserveMSOList(msoListMode == MSOListMode::Preserve)
 {
 }
@@ -431,9 +431,13 @@
     const bool shouldAnnotateOrForceInline = element.isHTMLElement() && (shouldAnnotate() || addDisplayInline);
     const bool shouldOverrideStyleAttr = shouldAnnotateOrForceInline || shouldApplyWrappingStyle(element);
+    bool containsMSOList = false;
     if (element.hasAttributes()) {
         for (const Attribute& attribute : element.attributesIterator()) {
             // We'll handle the style attribute separately, below.
-            if (attribute.name() == styleAttr && shouldOverrideStyleAttr)
+            if (attribute.name() == styleAttr && shouldOverrideStyleAttr) {
+                if (m_shouldPreserveMSOList && attribute.value().contains(";mso-list:"))
+                    containsMSOList = true;
                 continue;
+            }
             if (element.isEventHandlerAttribute(attribute) || element.isJavaScriptURLAttribute(attribute))
                 continue;
@@ -478,4 +482,14 @@
             out.append('\"');
         }
+
+        if (containsMSOList) {
+            ASSERT(m_shouldPreserveMSOList);
+            // Unfortunately, TinyMCE doesn't recognize mso-list inline style if the style attribute contains properties.
+            // Generate a separate, second style attribute if newInlineStyle is not empty above.
+            // The inline style is preserved because the first attribute always wins but TinyMCE can still recognize the list.
+            out.appendLiteral(" style=\"");
+            appendAttributeValue(out, element.getAttribute(styleAttr), documentIsHTML);
+            out.append('\"');
+        }
     }
 
@@ -502,4 +516,5 @@
     Node* next;
     Node* lastClosed = nullptr;
+    m_inMSOList = false;
     for (Node* n = startNode; n != pastEnd; n = next) {
         // According to <rdar://problem/5730668>, it is possible for n to blow
@@ -519,5 +534,9 @@
         }
 
-        if (!n->renderer() && !enclosingElementWithTag(firstPositionInOrBeforeNode(n), selectTag)) {
+        bool shouldSkipNode = !n->renderer() && !enclosingElementWithTag(firstPositionInOrBeforeNode(n), selectTag);
+        if (UNLIKELY(m_shouldPreserveMSOList) && shouldEmit)
+            shouldSkipNode = appendNodeToPreserveMSOList(*n) || shouldSkipNode;
+
+        if (shouldSkipNode) {
             next = NodeTraversal::nextSkippingChildren(*n);
             // Don't skip over pastEnd.
@@ -576,4 +595,42 @@
 }
 
+bool StyledMarkupAccumulator::appendNodeToPreserveMSOList(Node& node)
+{
+    if (is<Comment>(node)) {
+        auto& commentNode = downcast<Comment>(node);
+        if (!m_inMSOList && commentNode.data() == "[if !supportLists]")
+            m_inMSOList = true;
+        else if (m_inMSOList && commentNode.data() == "[endif]")
+            m_inMSOList = false;
+        else
+            return false;
+        appendStartTag(commentNode);
+        return true;
+    }
+    if (is<HTMLStyleElement>(node)) {
+        auto* firstChild = node.firstChild();
+        if (!is<Text>(firstChild))
+            return false;
+
+        auto& textChild = downcast<Text>(*firstChild);
+        auto& styleContent = textChild.data();
+
+        const auto msoListDefinitionsStart = styleContent.find("/* List Definitions */");
+        const auto lastListItem = styleContent.reverseFind("\n@list");
+        if (msoListDefinitionsStart == notFound || lastListItem == notFound)
+            return false;
+
+        const auto msoListDefinitionsEnd = styleContent.find(";}\n", lastListItem);
+        if (msoListDefinitionsEnd == notFound || msoListDefinitionsStart >= msoListDefinitionsEnd)
+            return false;
+
+        appendStartTag(node);
+        appendTextSubstring(textChild, msoListDefinitionsStart, msoListDefinitionsEnd - msoListDefinitionsStart + 1);
+        appendEndTag(node);
+        return true;
+    }
+    return false;
+}
+
 static Node* ancestorToRetainStructureAndAppearanceForBlock(Node* commonAncestorBlock)
 {
@@ -686,5 +743,5 @@
 // FIXME: At least, annotation and style info should probably not be included in range.markupString()
 static String createMarkupInternal(Document& document, const Range& range, Vector<Node*>* nodes,
-    EAnnotateForInterchange shouldAnnotate, bool convertBlocksToInlines, EAbsoluteURLs shouldResolveURLs)
+    EAnnotateForInterchange shouldAnnotate, bool convertBlocksToInlines, EAbsoluteURLs shouldResolveURLs, MSOListMode msoListMode)
 {
     static NeverDestroyed<const String> interchangeNewlineString(MAKE_STATIC_STRING_IMPL("<br class=\"" AppleInterchangeNewline "\">"));
@@ -709,5 +766,5 @@
     bool needsPositionStyleConversion = body && fullySelectedRoot == body
         && document.settings().shouldConvertPositionStyleOnCopy();
-    StyledMarkupAccumulator accumulator(nodes, shouldResolveURLs, shouldAnnotate, &range, needsPositionStyleConversion, specialCommonAncestor);
+    StyledMarkupAccumulator accumulator(nodes, shouldResolveURLs, shouldAnnotate, msoListMode, &range, needsPositionStyleConversion, specialCommonAncestor);
     Node* pastEnd = range.pastLastNode();
 
@@ -780,5 +837,33 @@
 String createMarkup(const Range& range, Vector<Node*>* nodes, EAnnotateForInterchange shouldAnnotate, bool convertBlocksToInlines, EAbsoluteURLs shouldResolveURLs)
 {
-    return createMarkupInternal(range.ownerDocument(), range, nodes, shouldAnnotate, convertBlocksToInlines, shouldResolveURLs);
+    return createMarkupInternal(range.ownerDocument(), range, nodes, shouldAnnotate, convertBlocksToInlines, shouldResolveURLs, MSOListMode::DoNotPreserve);
+}
+
+String sanitizedMarkupForFragmentInDocument(Ref<DocumentFragment>&& fragment, Document& document, MSOListQuirks msoListQuirks, const String& originalMarkup)
+{
+    MSOListMode msoListMode = MSOListMode::DoNotPreserve;
+    if (msoListQuirks == MSOListQuirks::CheckIfNeeded && originalMarkup.startsWith("<html xmlns:o=\"urn:schemas-microsoft-com:office:office\""))
+        msoListMode = MSOListMode::Preserve;
+
+    auto bodyElement = makeRefPtr(document.body());
+    ASSERT(bodyElement);
+    bodyElement->appendChild(WTFMove(fragment));
+
+    auto range = Range::create(document);
+    range->selectNodeContents(*bodyElement);
+    auto result = createMarkupInternal(document, range.get(), nullptr, AnnotateForInterchange, false, ResolveNonLocalURLs, msoListMode);
+
+    if (msoListMode == MSOListMode::Preserve) {
+        StringBuilder builder;
+        builder.appendLiteral("<html xmlns:o=\"urn:schemas-microsoft-com:office:office\"\n"
+            "xmlns:w=\"urn:schemas-microsoft-com:office:word\"\n"
+            "xmlns:m=\"http://schemas.microsoft.com/office/2004/12/omml\"\n"
+            "xmlns=\"http://www.w3.org/TR/REC-html40\">");
+        builder.append(result);
+        builder.appendLiteral("</html>");
+        return builder.toString();
+    }
+
+    return result;
 }
 
  • trunk/Source/WebCore/editing/markup.h

    r227351 r228352  
@@ -51,6 +51,8 @@
 void removeSubresourceURLAttributes(Ref<DocumentFragment>&&, WTF::Function<bool(const URL&)> shouldRemoveURL);
 
+enum class MSOListQuirks { CheckIfNeeded, Disabled };
 std::unique_ptr<Page> createPageForSanitizingWebContent();
-String sanitizeMarkup(const String&, std::optional<WTF::Function<void(DocumentFragment&)>> fragmentSanitizer = std::nullopt);
+String sanitizeMarkup(const String&, MSOListQuirks = MSOListQuirks::Disabled, std::optional<WTF::Function<void(DocumentFragment&)>> fragmentSanitizer = std::nullopt);
+String sanitizedMarkupForFragmentInDocument(Ref<DocumentFragment>&&, Document&, MSOListQuirks, const String& originalMarkup);
 
 enum EChildrenOnly { IncludeNode, ChildrenOnly };
  • trunk/Tools/ChangeLog

    r228347 r228352  
     12018-02-08  Ryosuke Niwa  <rniwa@webkit.org>
     2
     3        REGRESSION (r223440): Copying & pasting a list from Microsoft Word to TinyMCE fails
     4        https://bugs.webkit.org/show_bug.cgi?id=182564
     5
     6        Reviewed by Wenson Hsieh.
     7
     8        Added tests for pasting HTML with list items generated by Microsoft Word, as well as HTML that looks like
     9        the markup generated by Microsoft Word but is missing a proper "html" element at the beginning.
     10
     11        * TestWebKitAPI/TestWebKitAPI.xcodeproj/project.pbxproj:
     12        * TestWebKitAPI/Tests/WebKitCocoa/PasteHTML.mm: Added test cases.
     13        * TestWebKitAPI/Tests/WebKitCocoa/PasteWebArchive.mm: Added test cases.
     14        (msoListMarkupWithoutProperHTMLElement): Added.
     15        * TestWebKitAPI/Tests/WebKitCocoa/mso-list.html: Added.
     16
    1172018-02-09  Don Olmstead  <don.olmstead@sony.com>
    218
  • trunk/Tools/TestWebKitAPI/TestWebKitAPI.xcodeproj/project.pbxproj

    r228340 r228352  
    594594                9BDCCD871F7D0B0700009A18 /* PasteImage.mm in Sources */ = {isa = PBXBuildFile; fileRef = 9BDCCD851F7D0B0700009A18 /* PasteImage.mm */; };
    595595                9BDD95581F83683600D20C60 /* PasteRTFD.mm in Sources */ = {isa = PBXBuildFile; fileRef = 9BDD95561F83683600D20C60 /* PasteRTFD.mm */; };
     596                9BF356CD202D458500F71160 /* mso-list.html in Copy Resources */ = {isa = PBXBuildFile; fileRef = 9BF356CC202D44F200F71160 /* mso-list.html */; };
    596597                9C64DC321D76198A004B598E /* YouTubePluginReplacement.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 9C64DC311D76198A004B598E /* YouTubePluginReplacement.cpp */; };
    597598                A10F047E1E3AD29C00C95E19 /* NSFileManagerExtras.mm in Sources */ = {isa = PBXBuildFile; fileRef = A10F047C1E3AD29C00C95E19 /* NSFileManagerExtras.mm */; };
     
    10211022                                7A1458FC1AD5C07000E06772 /* mouse-button-listener.html in Copy Resources */,
    10221023                                33E79E06137B5FD900E32D99 /* mouse-move-listener.html in Copy Resources */,
     1024                                9BF356CD202D458500F71160 /* mso-list.html in Copy Resources */,
    10231025                                5797FE331EB15AB100B2F4A0 /* navigation-client-default-crypto.html in Copy Resources */,
    10241026                                C99B675F1E39736F00FC6C80 /* no-autoplay-with-controls.html in Copy Resources */,
     
    16161618                9BDCCD851F7D0B0700009A18 /* PasteImage.mm */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.cpp.objcpp; path = PasteImage.mm; sourceTree = "<group>"; };
    16171619                9BDD95561F83683600D20C60 /* PasteRTFD.mm */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.cpp.objcpp; path = PasteRTFD.mm; sourceTree = "<group>"; };
     1620                9BF356CC202D44F200F71160 /* mso-list.html */ = {isa = PBXFileReference; lastKnownFileType = text.html; path = "mso-list.html"; sourceTree = "<group>"; };
    16181621                9C64DC311D76198A004B598E /* YouTubePluginReplacement.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = YouTubePluginReplacement.cpp; sourceTree = "<group>"; };
    16191622                A10F047C1E3AD29C00C95E19 /* NSFileManagerExtras.mm */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.objcpp; path = NSFileManagerExtras.mm; sourceTree = "<group>"; };
     
    24482451                                46C519E41D35629600DAA51A /* LocalStorageNullEntries.localstorage-shm */,
    24492452                                7A6A2C711DCCFB0200C0D085 /* LocalStorageQuirkEnabled.html */,
     2453                                9BF356CC202D44F200F71160 /* mso-list.html */,
    24502454                                93E2D2751ED7D51700FA76F6 /* offscreen-iframe-of-media-document.html */,
    24512455                                7CCB99221D3B44E7003922F6 /* open-multiple-external-url.html */,
  • trunk/Tools/TestWebKitAPI/Tests/WebKitCocoa/PasteHTML.mm

    r227351 r228352  
    157157}
    158158
     159TEST(PasteHTML, PreservesMSOList)
     160{
     161    writeHTMLToPasteboard([NSString stringWithContentsOfFile:[[NSBundle mainBundle] pathForResource:@"mso-list" ofType:@"html" inDirectory:@"TestWebKitAPI.resources"]
     162        encoding:NSUTF8StringEncoding error:NULL]);
     163
     164    auto webView = createWebViewWithCustomPasteboardDataSetting(true);
     165    [webView synchronouslyLoadTestPageNamed:@"paste-rtfd"];
     166    [webView paste:nil];
     167
     168    EXPECT_WK_STREQ("[\"text/html\"]", [webView stringByEvaluatingJavaScript:@"JSON.stringify(clipboardData.types)"]);
     169    [webView stringByEvaluatingJavaScript:@"window.htmlInDataTransfer = clipboardData.values[0]"];
     170    [webView stringByEvaluatingJavaScript:@"window.pastedHTML = editor.innerHTML"];
     171
     172    EXPECT_FALSE([webView stringByEvaluatingJavaScript:@"pastedHTML.startsWith('<html xmlns:o=\"urn:schemas-microsoft-com:office:office\"')"].boolValue);
     173    EXPECT_FALSE([webView stringByEvaluatingJavaScript:@"pastedHTML.includes('/* List Definitions */')"].boolValue);
     174    EXPECT_FALSE([webView stringByEvaluatingJavaScript:@"pastedHTML.includes('@list l0:level1')"].boolValue);
     175    EXPECT_FALSE([webView stringByEvaluatingJavaScript:@"pastedHTML.includes('[if !supportLists]')"].boolValue);
     176    EXPECT_FALSE([webView stringByEvaluatingJavaScript:@"pastedHTML.includes('[endif]')"].boolValue);
     177    EXPECT_FALSE([webView stringByEvaluatingJavaScript:@"pastedHTML.includes(' style=\"text-indent:-.25in;mso-list:l0 level1 lfo1\">')"].boolValue);
     178    EXPECT_FALSE([webView stringByEvaluatingJavaScript:@"pastedHTML.includes('/* Style Definitions */')"].boolValue);
     179    EXPECT_FALSE([webView stringByEvaluatingJavaScript:@"pastedHTML.includes('/Users/webkitten/Library/')"].boolValue);
     180    [webView stringByEvaluatingJavaScript:@"getSelection().setPosition(document.querySelector('.MsoListParagraphCxSpLast'));"];
     181    [webView stringByEvaluatingJavaScript:@"getSelection().modify('move', 'forward', 'lineboundary');"];
     182    EXPECT_WK_STREQ("rgb(255, 0, 0)", [webView stringByEvaluatingJavaScript:@"document.queryCommandValue('foreColor')"]);
     183
     184    EXPECT_TRUE([webView stringByEvaluatingJavaScript:@"htmlInDataTransfer.startsWith('<html xmlns:o=\"urn:schemas-microsoft-com:office:office\"')"].boolValue);
     185    EXPECT_TRUE([webView stringByEvaluatingJavaScript:@"htmlInDataTransfer.includes('/* List Definitions */')"].boolValue);
     186    EXPECT_TRUE([webView stringByEvaluatingJavaScript:@"htmlInDataTransfer.includes('@list l0:level1')"].boolValue);
     187    EXPECT_TRUE([webView stringByEvaluatingJavaScript:@"htmlInDataTransfer.includes('[if !supportLists]')"].boolValue);
     188    EXPECT_TRUE([webView stringByEvaluatingJavaScript:@"htmlInDataTransfer.includes('[endif]')"].boolValue);
     189    EXPECT_TRUE([webView stringByEvaluatingJavaScript:@"htmlInDataTransfer.includes(' style=\"text-indent:-.25in;mso-list:l0 level1 lfo1\">')"].boolValue);
     190    EXPECT_FALSE([webView stringByEvaluatingJavaScript:@"htmlInDataTransfer.includes('/* Style Definitions */')"].boolValue);
     191    EXPECT_FALSE([webView stringByEvaluatingJavaScript:@"htmlInDataTransfer.includes('/Users/webkitten/Library/')"].boolValue);
     192
     193    [webView stringByEvaluatingJavaScript:@"editor.innerHTML = htmlInDataTransfer"];
     194    [webView stringByEvaluatingJavaScript:@"getSelection().setPosition(document.querySelector('.MsoListParagraphCxSpLast'));"];
     195    [webView stringByEvaluatingJavaScript:@"getSelection().modify('move', 'forward', 'lineboundary');"];
     196    EXPECT_WK_STREQ("rgb(255, 0, 0)", [webView stringByEvaluatingJavaScript:@"document.queryCommandValue('foreColor')"]);
     197}
     198
     199TEST(PasteHTML, StripsMSOListWhenMissingMSOHTMLElement)
     200{
     201    auto *markup = [NSString stringWithContentsOfFile:[[NSBundle mainBundle] pathForResource:@"mso-list" ofType:@"html" inDirectory:@"TestWebKitAPI.resources"] encoding:NSUTF8StringEncoding error:NULL];
     202
     203    writeHTMLToPasteboard([markup substringFromIndex:[markup rangeOfString:@">"].location + 1]);
     204
     205    auto webView = createWebViewWithCustomPasteboardDataSetting(true);
     206    [webView synchronouslyLoadTestPageNamed:@"paste-rtfd"];
     207    [webView paste:nil];
     208
     209    EXPECT_WK_STREQ("[\"text/html\"]", [webView stringByEvaluatingJavaScript:@"JSON.stringify(clipboardData.types)"]);
     210    [webView stringByEvaluatingJavaScript:@"window.htmlInDataTransfer = clipboardData.values[0]"];
     211    [webView stringByEvaluatingJavaScript:@"window.pastedHTML = editor.innerHTML"];
     212
     213    EXPECT_FALSE([webView stringByEvaluatingJavaScript:@"pastedHTML.startsWith('<html xmlns:o=\"urn:schemas-microsoft-com:office:office\"')"].boolValue);
     214    EXPECT_FALSE([webView stringByEvaluatingJavaScript:@"pastedHTML.includes('/* List Definitions */')"].boolValue);
     215    EXPECT_FALSE([webView stringByEvaluatingJavaScript:@"pastedHTML.includes('@list l0:level1')"].boolValue);
     216    EXPECT_FALSE([webView stringByEvaluatingJavaScript:@"pastedHTML.includes('[if !supportLists]')"].boolValue);
     217    EXPECT_FALSE([webView stringByEvaluatingJavaScript:@"pastedHTML.includes('[endif]')"].boolValue);
     218    EXPECT_FALSE([webView stringByEvaluatingJavaScript:@"pastedHTML.includes(' style=\"text-indent:-.25in;mso-list:l0 level1 lfo1\">')"].boolValue);
     219    EXPECT_FALSE([webView stringByEvaluatingJavaScript:@"pastedHTML.includes('/* Style Definitions */')"].boolValue);
     220    EXPECT_FALSE([webView stringByEvaluatingJavaScript:@"pastedHTML.includes('/Users/webkitten/Library/')"].boolValue);
     221    [webView stringByEvaluatingJavaScript:@"getSelection().setPosition(document.querySelector('.MsoListParagraphCxSpLast'));"];
     222    [webView stringByEvaluatingJavaScript:@"getSelection().modify('move', 'forward', 'lineboundary');"];
     223    EXPECT_WK_STREQ("rgb(255, 0, 0)", [webView stringByEvaluatingJavaScript:@"document.queryCommandValue('foreColor')"]);
     224
     225    EXPECT_FALSE([webView stringByEvaluatingJavaScript:@"htmlInDataTransfer.includes('/* List Definitions */')"].boolValue);
     226    EXPECT_FALSE([webView stringByEvaluatingJavaScript:@"htmlInDataTransfer.includes('@list l0:level1')"].boolValue);
     227    EXPECT_FALSE([webView stringByEvaluatingJavaScript:@"htmlInDataTransfer.includes('[if !supportLists]')"].boolValue);
     228    EXPECT_FALSE([webView stringByEvaluatingJavaScript:@"htmlInDataTransfer.includes('[endif]')"].boolValue);
     229    EXPECT_FALSE([webView stringByEvaluatingJavaScript:@"htmlInDataTransfer.includes(' style=\"text-indent:-.25in;mso-list:l0 level1 lfo1\">')"].boolValue);
     230    EXPECT_FALSE([webView stringByEvaluatingJavaScript:@"htmlInDataTransfer.includes('/* Style Definitions */')"].boolValue);
     231    EXPECT_FALSE([webView stringByEvaluatingJavaScript:@"htmlInDataTransfer.includes('/Users/webkitten/Library/')"].boolValue);
     232
     233    [webView stringByEvaluatingJavaScript:@"editor.innerHTML = htmlInDataTransfer"];
     234    [webView stringByEvaluatingJavaScript:@"getSelection().setPosition(document.querySelector('.MsoListParagraphCxSpLast'));"];
     235    [webView stringByEvaluatingJavaScript:@"getSelection().modify('move', 'forward', 'lineboundary');"];
     236    EXPECT_WK_STREQ("rgb(255, 0, 0)", [webView stringByEvaluatingJavaScript:@"document.queryCommandValue('foreColor')"]);
     237}
     238
     239
    159240#endif // WK_API_ENABLED && PLATFORM(COCOA)
  • trunk/Tools/TestWebKitAPI/Tests/WebKitCocoa/PasteWebArchive.mm

    r223678 r228352  
    102102}
    103103
     104TEST(PasteWebArchive, PreservesMSOList)
     105{
     106    auto *url = [NSURL URLWithString:@"file:///some-file.html"];
     107    auto *markup = [NSData dataWithContentsOfFile:[[NSBundle mainBundle] pathForResource:@"mso-list" ofType:@"html" inDirectory:@"TestWebKitAPI.resources"]];
     108    auto mainResource = adoptNS([[WebResource alloc] initWithData:markup URL:url MIMEType:@"text/html" textEncodingName:@"utf-8" frameName:nil]);
     109    auto archive = adoptNS([[WebArchive alloc] initWithMainResource:mainResource.get() subresources:nil subframeArchives:nil]);
     110
     111    [[NSPasteboard generalPasteboard] declareTypes:@[WebArchivePboardType] owner:nil];
     112    [[NSPasteboard generalPasteboard] setData:[archive data] forType:WebArchivePboardType];
     113
     114    auto webView = createWebViewWithCustomPasteboardDataEnabled();
     115    [webView synchronouslyLoadTestPageNamed:@"paste-rtfd"];
     116    [webView paste:nil];
     117
     118    EXPECT_WK_STREQ("[\"text/html\"]", [webView stringByEvaluatingJavaScript:@"JSON.stringify(clipboardData.types)"]);
     119    [webView stringByEvaluatingJavaScript:@"window.htmlInDataTransfer = clipboardData.values[0]"];
     120    [webView stringByEvaluatingJavaScript:@"window.pastedHTML = editor.innerHTML"];
     121
     122    EXPECT_FALSE([webView stringByEvaluatingJavaScript:@"pastedHTML.startsWith('<html xmlns:o=\"urn:schemas-microsoft-com:office:office\"')"].boolValue);
     123    EXPECT_FALSE([webView stringByEvaluatingJavaScript:@"pastedHTML.includes('/* List Definitions */')"].boolValue);
     124    EXPECT_FALSE([webView stringByEvaluatingJavaScript:@"pastedHTML.includes('@list l0:level1')"].boolValue);
     125    EXPECT_FALSE([webView stringByEvaluatingJavaScript:@"pastedHTML.includes('[if !supportLists]')"].boolValue);
     126    EXPECT_FALSE([webView stringByEvaluatingJavaScript:@"pastedHTML.includes('[endif]')"].boolValue);
     127    EXPECT_FALSE([webView stringByEvaluatingJavaScript:@"pastedHTML.includes(' style=\"text-indent:-.25in;mso-list:l0 level1 lfo1\">')"].boolValue);
     128    EXPECT_FALSE([webView stringByEvaluatingJavaScript:@"pastedHTML.includes('/* Style Definitions */')"].boolValue);
     129    EXPECT_FALSE([webView stringByEvaluatingJavaScript:@"pastedHTML.includes('/Users/webkitten/Library/')"].boolValue);
     130    [webView stringByEvaluatingJavaScript:@"getSelection().setPosition(document.querySelector('.MsoListParagraphCxSpLast'));"];
     131    [webView stringByEvaluatingJavaScript:@"getSelection().modify('move', 'forward', 'lineboundary');"];
     132    EXPECT_WK_STREQ("rgb(255, 0, 0)", [webView stringByEvaluatingJavaScript:@"document.queryCommandValue('foreColor')"]);
     133
     134    EXPECT_TRUE([webView stringByEvaluatingJavaScript:@"htmlInDataTransfer.startsWith('<html xmlns:o=\"urn:schemas-microsoft-com:office:office\"')"].boolValue);
     135    EXPECT_TRUE([webView stringByEvaluatingJavaScript:@"htmlInDataTransfer.includes('/* List Definitions */')"].boolValue);
     136    EXPECT_TRUE([webView stringByEvaluatingJavaScript:@"htmlInDataTransfer.includes('@list l0:level1')"].boolValue);
     137    EXPECT_TRUE([webView stringByEvaluatingJavaScript:@"htmlInDataTransfer.includes('[if !supportLists]')"].boolValue);
     138    EXPECT_TRUE([webView stringByEvaluatingJavaScript:@"htmlInDataTransfer.includes('[endif]')"].boolValue);
     139    EXPECT_TRUE([webView stringByEvaluatingJavaScript:@"htmlInDataTransfer.includes(' style=\"text-indent:-.25in;mso-list:l0 level1 lfo1\">')"].boolValue);
     140    EXPECT_FALSE([webView stringByEvaluatingJavaScript:@"htmlInDataTransfer.includes('/* Style Definitions */')"].boolValue);
     141    EXPECT_FALSE([webView stringByEvaluatingJavaScript:@"htmlInDataTransfer.includes('/Users/webkitten/Library/')"].boolValue);
     142
     143    [webView stringByEvaluatingJavaScript:@"editor.innerHTML = htmlInDataTransfer"];
     144    [webView stringByEvaluatingJavaScript:@"getSelection().setPosition(document.querySelector('.MsoListParagraphCxSpLast'));"];
     145    [webView stringByEvaluatingJavaScript:@"getSelection().modify('move', 'forward', 'lineboundary');"];
     146    EXPECT_WK_STREQ("rgb(255, 0, 0)", [webView stringByEvaluatingJavaScript:@"document.queryCommandValue('foreColor')"]);
     147
     148}
     149
     150static NSData *msoListMarkupWithoutProperHTMLElement()
     151{
     152    auto *markup = [NSData dataWithContentsOfFile:[[NSBundle mainBundle] pathForResource:@"mso-list" ofType:@"html" inDirectory:@"TestWebKitAPI.resources"]];
     153    auto *markupBytes = (uint8_t *)markup.bytes;
     154    unsigned length = markup.length;
     155    for (unsigned i = 0; i < length; i++) {
     156        if (markupBytes[i] == '>')
     157            return [markup subdataWithRange:NSMakeRange(i + 1, length - i - 1)];
     158    }
     159    return nil;
     160}
     161
     162TEST(PasteWebArchive, StripsMSOListWhenMissingMSOHTMLElement)
     163{
     164    auto *url = [NSURL URLWithString:@"file:///some-file.html"];
     165    auto *markup = msoListMarkupWithoutProperHTMLElement();
     166
     167    auto mainResource = adoptNS([[WebResource alloc] initWithData:markup URL:url MIMEType:@"text/html" textEncodingName:@"utf-8" frameName:nil]);
     168    auto archive = adoptNS([[WebArchive alloc] initWithMainResource:mainResource.get() subresources:nil subframeArchives:nil]);
     169
     170    [[NSPasteboard generalPasteboard] declareTypes:@[WebArchivePboardType] owner:nil];
     171    [[NSPasteboard generalPasteboard] setData:[archive data] forType:WebArchivePboardType];
     172
     173    auto webView = createWebViewWithCustomPasteboardDataEnabled();
     174    [webView synchronouslyLoadTestPageNamed:@"paste-rtfd"];
     175    [webView paste:nil];
     176
     177    EXPECT_WK_STREQ("[\"text/html\"]", [webView stringByEvaluatingJavaScript:@"JSON.stringify(clipboardData.types)"]);
     178    [webView stringByEvaluatingJavaScript:@"window.htmlInDataTransfer = clipboardData.values[0]"];
     179    EXPECT_FALSE([webView stringByEvaluatingJavaScript:@"htmlInDataTransfer.startsWith('<html xmlns:o=\"urn:schemas-microsoft-com:office:office\"')"].boolValue);
     180    EXPECT_FALSE([webView stringByEvaluatingJavaScript:@"htmlInDataTransfer.includes('/* List Definitions */')"].boolValue);
     181    EXPECT_FALSE([webView stringByEvaluatingJavaScript:@"htmlInDataTransfer.includes('@list l0:level1')"].boolValue);
     182    EXPECT_FALSE([webView stringByEvaluatingJavaScript:@"htmlInDataTransfer.includes('[if !supportLists]')"].boolValue);
     183    EXPECT_FALSE([webView stringByEvaluatingJavaScript:@"htmlInDataTransfer.includes('[endif]')"].boolValue);
     184    EXPECT_FALSE([webView stringByEvaluatingJavaScript:@"htmlInDataTransfer.includes(' style=\"text-indent:-.25in;mso-list:l0 level1 lfo1\">')"].boolValue);
     185    EXPECT_FALSE([webView stringByEvaluatingJavaScript:@"htmlInDataTransfer.includes('/* Style Definitions */')"].boolValue);
     186    EXPECT_FALSE([webView stringByEvaluatingJavaScript:@"htmlInDataTransfer.includes('/Users/webkitten/Library/')"].boolValue);
     187
     188    [webView stringByEvaluatingJavaScript:@"editor.innerHTML = htmlInDataTransfer"];
     189    [webView stringByEvaluatingJavaScript:@"getSelection().setPosition(document.querySelector('.MsoListParagraphCxSpLast'));"];
     190    [webView stringByEvaluatingJavaScript:@"getSelection().modify('move', 'forward', 'lineboundary');"];
     191    EXPECT_WK_STREQ("rgb(255, 0, 0)", [webView stringByEvaluatingJavaScript:@"document.queryCommandValue('foreColor')"]);
     192}
     193
    104194#endif // WK_API_ENABLED && PLATFORM(MAC)
    105195
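
Both StripsMSOListWhenMissingMSOHTMLElement tests build their input by dropping everything up to and including the first '>' of mso-list.html, which removes the opening "html" tag that would otherwise trigger the quirk. A minimal sketch of that trimming step follows, assuming std::string instead of NSString/NSData. The function name markupAfterFirstTag is hypothetical, and the std::nullopt case mirrors the nil return of msoListMarkupWithoutProperHTMLElement.

```cpp
#include <cassert> // only needed for the checks mentioned in the note below
#include <optional>
#include <string>

// Returns the markup that follows the first '>' character, or std::nullopt when
// the input contains no '>' at all (the counterpart of returning nil above).
std::optional<std::string> markupAfterFirstTag(const std::string& markup)
{
    const auto position = markup.find('>');
    if (position == std::string::npos)
        return std::nullopt;
    return markup.substr(position + 1);
}
```

For an input such as "<html a=\"b\"><body>x</body>" the sketch yields "<body>x</body>", so the result no longer starts with the Word-specific "html" element and the sanitizer is expected to strip the list-related styles and comments.
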