Changeset 64799 in webkit
- Timestamp:
- Aug 5, 2010 5:12:15 PM (14 years ago)
- Location:
- trunk
- Files:
-
- 4 added
- 9 edited
Legend:
- Unmodified
- Added
- Removed
-
trunk/LayoutTests/ChangeLog
r64796 r64799 1 2010-08-05 Adam Barth <abarth@webkit.org> 2 3 Reviewed by Eric Seidel. 4 5 U+0000 is turned to U+FFFD (replacement character) 6 https://bugs.webkit.org/show_bug.cgi?id=42112 7 8 Update test results to show null stripping. These changes are mostly 9 going back to the old results we had before we added the FFFD 10 replacement. 11 12 * fast/dom/stripNullFromTextNodes-expected.txt: 13 * fast/tokenizer/null-in-text-expected.txt: Added. 14 * fast/tokenizer/null-in-text.html: Added. 15 * fast/tokenizer/null-xss-expected.txt: Added. 16 * fast/tokenizer/null-xss.html: Added. 17 - The main risk with stripping null characters is that they'll be 18 used in XSS attacks. This test shows that we don't strip null 19 characters from tag names. 20 * platform/mac/fast/text/stripNullFromText-expected.txt: 21 * svg/dom/fuzz-path-parser-expected.txt: 22 * svg/dom/rgb-color-parser-expected.txt: 23 1 24 2010-08-05 Adam Barth <abarth@webkit.org> 2 25 -
trunk/LayoutTests/fast/dom/stripNullFromTextNodes-expected.txt
r61282 r64799 1 ���������hell�����o2 The null characters should be stripped out of the string above and it should have a length of 5. And the DOM thinks the length is... 19 :-(1 hello 2 The null characters should be stripped out of the string above and it should have a length of 5. And the DOM thinks the length is...5! -
trunk/LayoutTests/platform/mac/fast/text/stripNullFromText-expected.txt
r61234 r64799 4 4 RenderBlock {HTML} at (0,0) size 800x600 5 5 RenderBody {BODY} at (8,8) size 784x584 6 RenderBlock {DIV} at (0,0) size 784x21 [border: (1px solid #FF0000)] 7 RenderText {#text} at (1,2) size 16x18 8 text run at (1,2) width 16: "\x{FFFD}" 6 RenderBlock {DIV} at (0,0) size 784x2 [border: (1px solid #FF0000)] -
trunk/LayoutTests/svg/dom/fuzz-path-parser-expected.txt
r61637 r64799 460 460 Could not parse: 461 461 Could not parse: M 462 Could not parse: M �462 Could not parse: M 463 463 Parsed as 2 command(s) [MZ]: M1,1Z0 464 464 PASS successfullyParsed is true -
trunk/LayoutTests/svg/dom/rgb-color-parser-expected.txt
r61637 r64799 255 255 Threw exception Error: SVG_INVALID_VALUE_ERR: DOM SVG Exception 1: rgb( 256 256 Threw exception Error: SVG_INVALID_VALUE_ERR: DOM SVG Exception 1: 257 Threw exception Error: SVG_INVALID_VALUE_ERR: DOM SVG Exception 1: �258 Threw exception Error: SVG_INVALID_VALUE_ERR: DOM SVG Exception 1: rgb( �)257 Threw exception Error: SVG_INVALID_VALUE_ERR: DOM SVG Exception 1: 258 Threw exception Error: SVG_INVALID_VALUE_ERR: DOM SVG Exception 1: rgb() 259 259 PASS successfullyParsed is true 260 260 -
trunk/WebCore/ChangeLog
r64798 r64799 1 2010-08-05 Adam Barth <abarth@webkit.org> 2 3 Reviewed by Eric Seidel. 4 5 U+0000 is turned to U+FFFD (replacement character) 6 https://bugs.webkit.org/show_bug.cgi?id=42112 7 8 This patch introduces an intentional parsing difference from the HTML5 9 parsing specificiation. The spec requires us to convert NULL 10 characters to U+FFFD, but doing so causes compatibility issues with a 11 number of sites, including US Bank. 12 13 In this patch, we strip the null characters instead in certain cases. 14 Firefox has made a corresponding change. After gathering compatability 15 data, we hope to convince the HTML WG to adopt this change. 16 17 Tests: fast/tokenizer/null-in-text.html 18 fast/tokenizer/null-xss.html 19 20 * html/HTMLTokenizer.cpp: 21 (WebCore::HTMLTokenizer::HTMLTokenizer): 22 (WebCore::HTMLTokenizer::reset): 23 * html/HTMLTokenizer.h: 24 (WebCore::HTMLTokenizer::setSkipLeadingNewLineForListing): 25 (WebCore::HTMLTokenizer::forceNullCharacterReplacement): 26 (WebCore::HTMLTokenizer::setForceNullCharacterReplacement): 27 (WebCore::HTMLTokenizer::shouldSkipNullCharacters): 28 (WebCore::HTMLTokenizer::InputStreamPreprocessor::InputStreamPreprocessor): 29 (WebCore::HTMLTokenizer::InputStreamPreprocessor::peek): 30 * html/HTMLTreeBuilder.cpp: 31 (WebCore::HTMLTreeBuilder::passTokenToLegacyParser): 32 (WebCore::HTMLTreeBuilder::constructTreeFromToken): 33 (WebCore::HTMLTreeBuilder::processStartTagForInBody): 34 1 35 2010-08-05 Andy Estes <aestes@apple.com> 2 36 -
trunk/WebCore/html/HTMLTokenizer.cpp
r64012 r64799 98 98 99 99 HTMLTokenizer::HTMLTokenizer() 100 : m_inputStreamPreprocessor(this) 100 101 { 101 102 reset(); … … 112 113 m_lineNumber = 0; 113 114 m_skipLeadingNewLineForListing = false; 115 m_forceNullCharacterReplacement = false; 114 116 m_additionalAllowedCharacter = '\0'; 115 117 } -
trunk/WebCore/html/HTMLTokenizer.h
r63987 r64799 133 133 // Hack to skip leading newline in <pre>/<listing> for authoring ease. 134 134 // http://www.whatwg.org/specs/web-apps/current-work/multipage/tokenization.html#parsing-main-inbody 135 void skipLeadingNewLineForListing() { m_skipLeadingNewLineForListing = true; } 135 void setSkipLeadingNewLineForListing(bool value) { m_skipLeadingNewLineForListing = value; } 136 137 bool forceNullCharacterReplacement() const { return m_forceNullCharacterReplacement; } 138 void setForceNullCharacterReplacement(bool value) { m_forceNullCharacterReplacement = value; } 139 140 bool shouldSkipNullCharacters() const 141 { 142 return !m_forceNullCharacterReplacement 143 && (m_state == DataState 144 || m_state == RCDATAState 145 || m_state == RAWTEXTState 146 || m_state == PLAINTEXTState); 147 } 136 148 137 149 private: … … 139 151 class InputStreamPreprocessor : public Noncopyable { 140 152 public: 141 InputStreamPreprocessor() 142 : m_nextInputCharacter('\0') 153 InputStreamPreprocessor(HTMLTokenizer* tokenizer) 154 : m_tokenizer(tokenizer) 155 , m_nextInputCharacter('\0') 143 156 , m_skipNextNewLine(false) 144 157 { … … 152 165 ALWAYS_INLINE bool peek(SegmentedString& source, int& lineNumber) 153 166 { 167 PeekAgain: 154 168 m_nextInputCharacter = *source; 155 169 … … 180 194 // by the replacement character. We suspect this is a problem with the spec as doing 181 195 // that filtering breaks surrogate pair handling and causes us not to match Minefield. 182 if (m_nextInputCharacter == '\0' && !shouldTreatNullAsEndOfFileMarker(source)) 196 if (m_nextInputCharacter == '\0' && !shouldTreatNullAsEndOfFileMarker(source)) { 197 if (m_tokenizer->shouldSkipNullCharacters()) { 198 source.advancePastNonNewline(); 199 if (source.isEmpty()) 200 return false; 201 goto PeekAgain; 202 } 183 203 m_nextInputCharacter = 0xFFFD; 204 } 184 205 } 185 206 return true; … … 202 223 return source.isClosed() && source.length() == 1; 203 224 } 225 226 HTMLTokenizer* m_tokenizer; 204 227 205 228 // http://www.whatwg.org/specs/web-apps/current-work/#next-input-character … … 243 266 244 267 bool m_skipLeadingNewLineForListing; 268 bool m_forceNullCharacterReplacement; 245 269 246 270 // http://www.whatwg.org/specs/web-apps/current-work/#temporary-buffer -
trunk/WebCore/html/HTMLTreeBuilder.cpp
r64712 r64799 470 470 m_lastScriptElementStartLine = m_tokenizer->lineNumber(); 471 471 } else if (oldStyleToken.tagName == preTag || oldStyleToken.tagName == listingTag) 472 m_tokenizer->s kipLeadingNewLineForListing();472 m_tokenizer->setSkipLeadingNewLineForListing(true); 473 473 else 474 474 m_tokenizer->setState(adjustedLexerState(m_tokenizer->state(), oldStyleToken.tagName, m_document->frame())); … … 510 510 AtomicHTMLToken token(rawToken); 511 511 processToken(token); 512 513 // Swallowing U+0000 characters isn't in the HTML5 spec, but turning all 514 // the U+0000 characters into replacement characters has compatibility 515 // problems. 516 m_tokenizer->setForceNullCharacterReplacement(m_insertionMode == TextMode || m_insertionMode == InForeignContentMode); 512 517 } 513 518 … … 849 854 processFakePEndTagIfPInButtonScope(); 850 855 m_tree.insertHTMLElement(token); 851 m_tokenizer->s kipLeadingNewLineForListing();856 m_tokenizer->setSkipLeadingNewLineForListing(true); 852 857 m_framesetOk = false; 853 858 return; … … 969 974 if (token.name() == textareaTag) { 970 975 m_tree.insertHTMLElement(token); 971 m_tokenizer->s kipLeadingNewLineForListing();976 m_tokenizer->setSkipLeadingNewLineForListing(true); 972 977 m_tokenizer->setState(HTMLTokenizer::RCDATAState); 973 978 m_originalInsertionMode = m_insertionMode;
Note: See TracChangeset
for help on using the changeset viewer.