Changeset 227351 in webkit


Ignore:
Timestamp:
Jan 22, 2018 1:13:37 PM (6 years ago)
Author:
rniwa@webkit.org
Message:

Blob conversion and sanitization doesn't work with Microsoft Word for Mac 2011
https://bugs.webkit.org/show_bug.cgi?id=181616
<rdar://problem/36484908>

Reviewed by Wenson Hsieh.

Source/WebCore:

The bug was caused by WebContentReader::readHTML and WebContentMarkupReader::readHTML not sanitizing plain HTML string
as done for web archives even when custom pasteboard data is enabled. Fixed the bug by doing the sanitization.

Unfortunately, we can't make file URLs available in this case because WebContent process doesn't have sandbox extensions
to access local files referenced by the HTML source in the clipboard, and we can't make WebContent process request for
a sandbox extension¸on an arbitrary local file, as it would defeat the whole point of sandboxing.

Instead, we strip away all HTML attributes referencing a URL whose scheme is not HTTP, HTTPS, or data when sanitizing
text/html from the clipboard to avoid exposing local file paths, which can reveal privacy & security sensitive data
such as the user's full name, and the location of private containers of other applications in the system.

Tests: PasteHTML.DoesNotSanitizeHTMLWhenCustomPasteboardDataIsDisabled

PasteHTML.DoesNotStripFileURLsWhenCustomPasteboardDataIsDisabled
PasteHTML.ExposesHTMLTypeInDataTransfer
PasteHTML.KeepsHTTPURLs
PasteHTML.SanitizesHTML
PasteHTML.StripsFileURLs

  • editing/cocoa/WebContentReaderCocoa.mm:

(WebCore::WebContentReader::readHTML): Fixed the bug by sanitizing the markup, and stripping away file URLs.
(WebCore::WebContentMarkupReader::readHTML): Ditto.

  • editing/markup.cpp:

(WebCore::removeSubresourceURLAttributes): Added.
(WebCore::sanitizeMarkup): Added.

  • editing/markup.h:

Tools:

Added tests to make sure we sanitize plain HTML, not just web archives,
when and only when custom pasteboard data is enabled.

  • TestWebKitAPI/TestWebKitAPI.xcodeproj/project.pbxproj:
  • TestWebKitAPI/Tests/WebKitCocoa/PasteHTML.mm: Added.

(writeHTMLToPasteboard): Added.
(createWebViewWithCustomPasteboardDataSetting): Added.

Location:
trunk
Files:
1 added
6 edited

Legend:

Unmodified
Added
Removed
  • trunk/Source/WebCore/ChangeLog

    r227350 r227351  
     12018-01-22  Ryosuke Niwa  <rniwa@webkit.org>
     2
     3        Blob conversion and sanitization doesn't work with Microsoft Word for Mac 2011
     4        https://bugs.webkit.org/show_bug.cgi?id=181616
     5        <rdar://problem/36484908>
     6
     7        Reviewed by Wenson Hsieh.
     8
     9        The bug was caused by WebContentReader::readHTML and WebContentMarkupReader::readHTML not sanitizing plain HTML string
     10        as done for web archives even when custom pasteboard data is enabled. Fixed the bug by doing the sanitization.
     11
     12        Unfortunately, we can't make file URLs available in this case because WebContent process doesn't have sandbox extensions
     13        to access local files referenced by the HTML source in the clipboard, and we can't make WebContent process request for
     14        a sandbox extension¸on an arbitrary local file, as it would defeat the whole point of sandboxing.
     15
     16        Instead, we strip away all HTML attributes referencing a URL whose scheme is not HTTP, HTTPS, or data when sanitizing
     17        text/html from the clipboard to avoid exposing local file paths, which can reveal privacy & security sensitive data
     18        such as the user's full name, and the location of private containers of other applications in the system.
     19
     20        Tests: PasteHTML.DoesNotSanitizeHTMLWhenCustomPasteboardDataIsDisabled
     21               PasteHTML.DoesNotStripFileURLsWhenCustomPasteboardDataIsDisabled
     22               PasteHTML.ExposesHTMLTypeInDataTransfer
     23               PasteHTML.KeepsHTTPURLs
     24               PasteHTML.SanitizesHTML
     25               PasteHTML.StripsFileURLs
     26
     27        * editing/cocoa/WebContentReaderCocoa.mm:
     28        (WebCore::WebContentReader::readHTML): Fixed the bug by sanitizing the markup, and stripping away file URLs.
     29        (WebCore::WebContentMarkupReader::readHTML): Ditto.
     30        * editing/markup.cpp:
     31        (WebCore::removeSubresourceURLAttributes): Added.
     32        (WebCore::sanitizeMarkup): Added.
     33        * editing/markup.h:
     34
    1352018-01-22  Chris Dumez  <cdumez@apple.com>
    236
  • trunk/Source/WebCore/editing/cocoa/WebContentReaderCocoa.mm

    r227282 r227351  
    528528        return false;
    529529
    530     addFragment(createFragmentFromMarkup(document, stringOmittingMicrosoftPrefix, emptyString(), DisallowScriptingAndPluginContent));
     530    String markup;
     531    if (RuntimeEnabledFeatures::sharedFeatures().customPasteboardDataEnabled() && shouldSanitize()) {
     532        markup = sanitizeMarkup(stringOmittingMicrosoftPrefix, WTF::Function<void (DocumentFragment&)> { [] (DocumentFragment& fragment) {
     533            removeSubresourceURLAttributes(fragment, [] (const URL& url) {
     534                return shouldReplaceSubresourceURL(url);
     535            });
     536        } });
     537    } else
     538        markup = stringOmittingMicrosoftPrefix;
     539
     540    addFragment(createFragmentFromMarkup(document, markup, emptyString(), DisallowScriptingAndPluginContent));
    531541    return true;
    532542}
     
    538548
    539549    String rawHTML = stripMicrosoftPrefix(string);
    540     if (shouldSanitize())
    541         markup = sanitizeMarkup(rawHTML);
    542     else
     550    if (shouldSanitize()) {
     551        markup = sanitizeMarkup(rawHTML, WTF::Function<void (DocumentFragment&)> { [] (DocumentFragment& fragment) {
     552            removeSubresourceURLAttributes(fragment, [] (const URL& url) {
     553                return shouldReplaceSubresourceURL(url);
     554            });
     555        } });
     556    } else
    543557        markup = rawHTML;
    544558
  • trunk/Source/WebCore/editing/markup.cpp

    r226539 r227351  
    7373#include "TypedElementDescendantIterator.h"
    7474#include "URL.h"
     75#include "URLParser.h"
    7576#include "VisibleSelection.h"
    7677#include "VisibleUnits.h"
     
    145146}
    146147
     148struct ElementAttribute {
     149    Ref<Element> element;
     150    QualifiedName attributeName;
     151};
     152
     153void removeSubresourceURLAttributes(Ref<DocumentFragment>&& fragment, WTF::Function<bool(const URL&)> shouldRemoveURL)
     154{
     155    Vector<ElementAttribute> attributesToRemove;
     156    for (auto& element : descendantsOfType<Element>(fragment)) {
     157        if (!element.hasAttributes())
     158            continue;
     159        for (const Attribute& attribute : element.attributesIterator()) {
     160            // FIXME: This won't work for srcset.
     161            if (element.attributeContainsURL(attribute) && !attribute.value().isEmpty()) {
     162                URL url = URLParser { attribute.value() }.result();
     163                if (shouldRemoveURL(url))
     164                    attributesToRemove.append({ element, attribute.name() });
     165            }
     166        }
     167    }
     168    for (auto& item : attributesToRemove)
     169        item.element->removeAttribute(item.attributeName);
     170}
     171
    147172std::unique_ptr<Page> createPageForSanitizingWebContent()
    148173{
     
    173198
    174199
    175 String sanitizeMarkup(const String& rawHTML)
     200String sanitizeMarkup(const String& rawHTML, std::optional<WTF::Function<void(DocumentFragment&)>> fragmentSanitizer)
    176201{
    177202    auto page = createPageForSanitizingWebContent();
     
    182207
    183208    auto fragment = createFragmentFromMarkup(*stagingDocument, rawHTML, emptyString(), DisallowScriptingAndPluginContent);
     209
     210    if (fragmentSanitizer)
     211        (*fragmentSanitizer)(fragment);
     212
    184213    bodyElement->appendChild(fragment.get());
    185214
  • trunk/Source/WebCore/editing/markup.h

    r223440 r227351  
    3030#include "HTMLInterchange.h"
    3131#include <wtf/Forward.h>
     32#include <wtf/Function.h>
    3233#include <wtf/HashMap.h>
    3334
     
    4849
    4950void replaceSubresourceURLs(Ref<DocumentFragment>&&, HashMap<AtomicString, AtomicString>&&);
     51void removeSubresourceURLAttributes(Ref<DocumentFragment>&&, WTF::Function<bool(const URL&)> shouldRemoveURL);
     52
    5053std::unique_ptr<Page> createPageForSanitizingWebContent();
    51 String sanitizeMarkup(const String&);
     54String sanitizeMarkup(const String&, std::optional<WTF::Function<void(DocumentFragment&)>> fragmentSanitizer = std::nullopt);
    5255
    5356enum EChildrenOnly { IncludeNode, ChildrenOnly };
  • trunk/Tools/ChangeLog

    r227342 r227351  
     12018-01-22  Ryosuke Niwa  <rniwa@webkit.org>
     2
     3        Blob conversion and sanitization doesn't work with Microsoft Word for Mac 2011
     4        https://bugs.webkit.org/show_bug.cgi?id=181616
     5        <rdar://problem/36484908>
     6
     7        Reviewed by Wenson Hsieh.
     8
     9        Added tests to make sure we sanitize plain HTML, not just web archives,
     10        when and only when custom pasteboard data is enabled.
     11
     12        * TestWebKitAPI/TestWebKitAPI.xcodeproj/project.pbxproj:
     13        * TestWebKitAPI/Tests/WebKitCocoa/PasteHTML.mm: Added.
     14        (writeHTMLToPasteboard): Added.
     15        (createWebViewWithCustomPasteboardDataSetting): Added.
     16
    1172018-01-22  Alexey Proskuryakov  <ap@apple.com>
    218
  • trunk/Tools/TestWebKitAPI/TestWebKitAPI.xcodeproj/project.pbxproj

    r227242 r227351  
    578578                9B7A37C41F8AEBA5004AA228 /* CopyURL.mm in Sources */ = {isa = PBXBuildFile; fileRef = 9B7A37C21F8AEBA5004AA228 /* CopyURL.mm */; };
    579579                9B7D740F1F8378770006C432 /* paste-rtfd.html in Copy Resources */ = {isa = PBXBuildFile; fileRef = 9B7D740E1F8377E60006C432 /* paste-rtfd.html */; };
     580                9BCB7C2820130600003E7C0C /* PasteHTML.mm in Sources */ = {isa = PBXBuildFile; fileRef = 9BCB7C2620130600003E7C0C /* PasteHTML.mm */; };
    580581                9BD4239A1E04BD9800200395 /* AttributedSubstringForProposedRangeWithImage.mm in Sources */ = {isa = PBXBuildFile; fileRef = 9BD423991E04BD9800200395 /* AttributedSubstringForProposedRangeWithImage.mm */; };
    581582                9BD4239C1E04C01C00200395 /* chinese-character-with-image.html in Copy Resources */ = {isa = PBXBuildFile; fileRef = 9BD4239B1E04BFD000200395 /* chinese-character-with-image.html */; };
     
    15901591                9B7A37C21F8AEBA5004AA228 /* CopyURL.mm */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.cpp.objcpp; path = CopyURL.mm; sourceTree = "<group>"; };
    15911592                9B7D740E1F8377E60006C432 /* paste-rtfd.html */ = {isa = PBXFileReference; lastKnownFileType = text.html; path = "paste-rtfd.html"; sourceTree = "<group>"; };
     1593                9BCB7C2620130600003E7C0C /* PasteHTML.mm */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.cpp.objcpp; path = PasteHTML.mm; sourceTree = "<group>"; };
    15921594                9BD423991E04BD9800200395 /* AttributedSubstringForProposedRangeWithImage.mm */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.objcpp; path = AttributedSubstringForProposedRangeWithImage.mm; sourceTree = "<group>"; };
    15931595                9BD4239B1E04BFD000200395 /* chinese-character-with-image.html */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = text.html; path = "chinese-character-with-image.html"; sourceTree = "<group>"; };
     
    21162118                                9BDCCD851F7D0B0700009A18 /* PasteImage.mm */,
    21172119                                9BDD95561F83683600D20C60 /* PasteRTFD.mm */,
     2120                                9BCB7C2620130600003E7C0C /* PasteHTML.mm */,
    21182121                                9B2346411F943A2400DB1D23 /* PasteWebArchive.mm */,
    21192122                                3FCC4FE41EC4E8520076E37C /* PictureInPictureDelegate.mm */,
     
    33973400                                2D51A0C71C8BF00C00765C45 /* DOMHTMLVideoElementWrapper.mm in Sources */,
    33983401                                46397B951DC2C850009A78AE /* DOMNode.mm in Sources */,
     3402                                9BCB7C2820130600003E7C0C /* PasteHTML.mm in Sources */,
    33993403                                7CCE7EBC1A411A7E00447C4C /* DOMNodeFromJSObject.mm in Sources */,
    34003404                                7CCE7EBD1A411A7E00447C4C /* DOMRangeOfString.mm in Sources */,
Note: See TracChangeset for help on using the changeset viewer.