Changeset 65351 in webkit
- Timestamp:
- Aug 13, 2010 8:18:16 PM (14 years ago)
- Location:
- trunk
- Files:
- 4 added
- 1 deleted
- 16 edited
trunk/LayoutTests/ChangeLog
r65348 r65351
+2010-08-12 Adam Barth <abarth@webkit.org>
+
+        Reviewed by Eric Seidel.
+
+        Add support for MathML entities
+        https://bugs.webkit.org/show_bug.cgi?id=43949
+
+        Test progression for proper entity support.
+
+        * html5lib/runner-expected-html5.txt:
+        * html5lib/runner-expected.txt:
+
 2010-08-13 Mihai Parparita <mihaip@chromium.org>

trunk/LayoutTests/html5lib/runner-expected-html5.txt
r65213 r65351 119 119 resources/scriptdata01.dat: PASS 120 120 121 resources/html5test-com.dat: 122 7 123 9 124 10 125 11 126 127 Test 7 of 24 in resources/html5test-com.dat failed. Input: 128 ⟨⟩ 129 Got: 130 | <html> 131 | <head> 132 | <body> 133 | "〈〉" 134 Expected: 135 | <html> 136 | <head> 137 | <body> 138 | "⟨⟩" 139 140 Test 9 of 24 in resources/html5test-com.dat failed. Input: 141 ⅈ 142 Got: 143 | <html> 144 | <head> 145 | <body> 146 | "ⅈ" 147 Expected: 148 | <html> 149 | <head> 150 | <body> 151 | "ⅈ" 152 153 Test 10 of 24 in resources/html5test-com.dat failed. Input: 154 𝕂 155 Got: 156 | <html> 157 | <head> 158 | <body> 159 | "𝕂" 160 Expected: 161 | <html> 162 | <head> 163 | <body> 164 | "𝕂" 165 166 Test 11 of 24 in resources/html5test-com.dat failed. Input: 167 ∉ 168 Got: 169 | <html> 170 | <head> 171 | <body> 172 | "∉" 173 Expected: 174 | <html> 175 | <head> 176 | <body> 177 | "∉" 178 resources/entities01.dat: 179 2 180 5 181 182 Test 2 of 68 in resources/entities01.dat failed. Input: 183 FOO>BAR 184 Got: 185 | <html> 186 | <head> 187 | <body> 188 | "FOO>BAR" 189 Expected: 190 | <html> 191 | <head> 192 | <body> 193 | "FOO>BAR" 194 195 Test 5 of 68 in resources/entities01.dat failed. Input: 196 I'm ¬it; I tell you 197 Got: 198 | <html> 199 | <head> 200 | <body> 201 | "I'm ¬it; I tell you" 202 Expected: 203 | <html> 204 | <head> 205 | <body> 206 | "I'm ¬it; I tell you" 121 resources/html5test-com.dat: PASS 122 123 resources/entities01.dat: PASS 124 207 125 resources/entities02.dat: PASS 208 126 -
trunk/LayoutTests/html5lib/runner-expected.txt
r65006 r65351 192 192 resources/scriptdata01.dat: PASS 193 193 194 resources/html5test-com.dat: 195 7 196 9 197 10 198 11 199 200 Test 7 of 24 in resources/html5test-com.dat failed. Input: 201 ⟨⟩ 202 Got: 203 | <html> 204 | <head> 205 | <body> 206 | "〈〉" 207 Expected: 208 | <html> 209 | <head> 210 | <body> 211 | "⟨⟩" 212 213 Test 9 of 24 in resources/html5test-com.dat failed. Input: 214 ⅈ 215 Got: 216 | <html> 217 | <head> 218 | <body> 219 | "ⅈ" 220 Expected: 221 | <html> 222 | <head> 223 | <body> 224 | "ⅈ" 225 226 Test 10 of 24 in resources/html5test-com.dat failed. Input: 227 𝕂 228 Got: 229 | <html> 230 | <head> 231 | <body> 232 | "𝕂" 233 Expected: 234 | <html> 235 | <head> 236 | <body> 237 | "𝕂" 238 239 Test 11 of 24 in resources/html5test-com.dat failed. Input: 240 ∉ 241 Got: 242 | <html> 243 | <head> 244 | <body> 245 | "∉" 246 Expected: 247 | <html> 248 | <head> 249 | <body> 250 | "∉" 251 resources/entities01.dat: 252 2 253 5 254 255 Test 2 of 68 in resources/entities01.dat failed. Input: 256 FOO>BAR 257 Got: 258 | <html> 259 | <head> 260 | <body> 261 | "FOO>BAR" 262 Expected: 263 | <html> 264 | <head> 265 | <body> 266 | "FOO>BAR" 267 268 Test 5 of 68 in resources/entities01.dat failed. Input: 269 I'm ¬it; I tell you 270 Got: 271 | <html> 272 | <head> 273 | <body> 274 | "I'm ¬it; I tell you" 275 Expected: 276 | <html> 277 | <head> 278 | <body> 279 | "I'm ¬it; I tell you" 194 resources/html5test-com.dat: PASS 195 196 resources/entities01.dat: PASS 197 280 198 resources/entities02.dat: PASS 281 199 -
trunk/WebCore/CMakeLists.txt
r65336 r65351
     html/HTMLElement.cpp
     html/HTMLElementStack.cpp
+    html/HTMLEntitySearch.cpp
     html/HTMLEmbedElement.cpp
     html/HTMLFieldSetElement.cpp
trunk/WebCore/ChangeLog
r65350 r65351
+2010-08-09 Adam Barth <abarth@webkit.org>
+
+        Reviewed by Eric Seidel.
+
+        Add support for MathML entities
+        https://bugs.webkit.org/show_bug.cgi?id=43949
+
+        Implementing the HTML5 entity parsing algorithm requires refactoring how
+        we search for entity names. Instead of using a perfect hash, we now
+        use a sorted list. As we advance through the input, we walk down a
+        binary search of the table looking for an entity.
+
+        Using this data structure lets us keep track of whether the current
+        string is a prefix of an existing entity, which we need for the
+        algorithm. In a future patch, I plan to add some indices to the
+        table, which should let us narrow down the range of interesting entries
+        more quickly.
+
+        The one nasty piece of the algorithm is if we walk too far down the
+        input and we need to back up to a previous match. In this patch, we
+        accomplish this by rewinding the input and consuming a known number of
+        characters to resync the source.
+
+        * WebCore.xcodeproj/project.pbxproj:
+        * html/HTMLEntityParser.cpp:
+        (WebCore::consumeHTMLEntity):
+        * html/HTMLEntitySearch.cpp: Added.
+        (WebCore::):
+        (WebCore::HTMLEntitySearch::HTMLEntitySearch):
+        (WebCore::HTMLEntitySearch::compare):
+        (WebCore::HTMLEntitySearch::findStart):
+        (WebCore::HTMLEntitySearch::findEnd):
+        (WebCore::HTMLEntitySearch::advance):
+        * html/HTMLEntitySearch.h: Added.
+        (WebCore::HTMLEntitySearch::isEntityPrefix):
+        (WebCore::HTMLEntitySearch::currentValue):
+        (WebCore::HTMLEntitySearch::lastMatch):
+        (WebCore::HTMLEntitySearch::):
+        (WebCore::HTMLEntitySearch::fail):
+        * html/HTMLEntityTable.h: Added.
+        (WebCore::HTMLEntityTableEntry::lastCharacter):
+
 2010-08-13 Tony Gentilcore <tonyg@chromium.org>

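The search strategy described in this ChangeLog entry can be sketched roughly as follows. This is a minimal illustration, not the code in HTMLEntitySearch.cpp, which this view does not show: the five-entry table, the `Entry` struct, and the `PrefixSearch` class are hypothetical names invented for the example. Each call to `advance` binary-searches the current range for the sub-range of entries that share the next character, so a non-empty range means the characters consumed so far are still a prefix of at least one entity.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical miniature entity table. It must stay sorted by name
// (plain byte order) for the range narrowing below to work.
struct Entry {
    std::string_view name;  // entity name without the leading '&'
    char32_t value;         // code point the entity decodes to
};

constexpr Entry kTable[] = {
    {"amp", 0x0026}, {"amp;", 0x0026},
    {"not", 0x00AC}, {"not;", 0x00AC}, {"notin;", 0x2209},
};
constexpr std::size_t kTableSize = sizeof(kTable) / sizeof(kTable[0]);

class PrefixSearch {
public:
    // Feed one more input character and narrow [first_, last_) to the
    // entries whose name starts with everything consumed so far.
    void advance(char c) {
        if (!isPrefix())
            return;  // the search has already failed
        const std::size_t pos = length_++;
        const int target = static_cast<unsigned char>(c);
        const std::size_t newFirst = bound(pos, target, false);
        const std::size_t newLast = bound(pos, target, true);
        first_ = newFirst;
        last_ = newLast;
        // Within a range that shares a prefix, the shortest name sorts
        // first, so an exact match can only be the first entry.
        if (isPrefix() && kTable[first_].name.size() == length_)
            lastMatch_ = &kTable[first_];
    }

    bool isPrefix() const { return first_ < last_; }
    const Entry* lastMatch() const { return lastMatch_; }

private:
    // Character of entry `index` at `pos`, or -1 if the name is shorter.
    static int charAt(std::size_t index, std::size_t pos) {
        const std::string_view name = kTable[index].name;
        return pos < name.size() ? static_cast<unsigned char>(name[pos]) : -1;
    }

    // Binary search inside the current range: first index whose character
    // at `pos` is >= target (upper == false) or > target (upper == true).
    std::size_t bound(std::size_t pos, int target, bool upper) const {
        std::size_t lo = first_, hi = last_;
        while (lo < hi) {
            const std::size_t mid = lo + (hi - lo) / 2;
            const int ch = charAt(mid, pos);
            if (ch < target || (upper && ch == target))
                lo = mid + 1;
            else
                hi = mid;
        }
        return lo;
    }

    std::size_t first_ = 0;
    std::size_t last_ = kTableSize;
    std::size_t length_ = 0;
    const Entry* lastMatch_ = nullptr;
};
```

Feeding `n`, `o`, `t`, `i` leaves the range non-empty (only `notin;` remains) while the last exact match is still `not`. A perfect hash cannot answer that "is this still a prefix?" question, which is presumably why the ChangeLog describes switching to a sorted table.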
trunk/WebCore/DerivedSources.make
r65218 r65351
     DocTypeStrings.cpp \
     HTMLElementFactory.cpp \
-    HTMLEntityNames.cpp \
+    HTMLEntityTable.cpp \
     HTMLNames.cpp \
     WMLElementFactory.cpp \
…
 # HTML entity names

-HTMLEntityNames.cpp : html/HTMLEntityNames.gperf $(WebCore)/make-hash-tools.pl
-    perl $(WebCore)/make-hash-tools.pl . $(WebCore)/html/HTMLEntityNames.gperf
+HTMLEntityTable.cpp : html/HTMLEntityNames.json $(WebCore)/../WebKitTools/Scripts/create-html-entity-table
+    python $(WebCore)/../WebKitTools/Scripts/create-html-entity-table -o HTMLEntityTable.cpp $(WebCore)/html/HTMLEntityNames.json

 # --------
trunk/WebCore/GNUmakefile.am
r65312 r65351
     DerivedSources/WebCore/HTMLElementFactory.cpp \
     DerivedSources/WebCore/HTMLElementFactory.h \
-    DerivedSources/WebCore/HTMLEntityNames.cpp \
+    DerivedSources/WebCore/HTMLEntityTable.cpp \
     DerivedSources/WebCore/HTMLNames.cpp \
     DerivedSources/WebCore/HTMLNames.h \
…
     WebCore/html/HTMLElementStack.cpp \
     WebCore/html/HTMLElementStack.h \
+    WebCore/html/HTMLEntitySearch.cpp \
+    WebCore/html/HTMLEntitySearch.h \
     WebCore/html/HTMLEmbedElement.cpp \
     WebCore/html/HTMLEmbedElement.h \
…

 # HTML entity names
-DerivedSources/WebCore/HTMLEntityNames.cpp : $(WebCore)/html/HTMLEntityNames.gperf $(WebCore)/make-hash-tools.pl
-    $(PERL) $(WebCore)/make-hash-tools.pl $(GENSOURCES_WEBCORE) $(WebCore)/html/HTMLEntityNames.gperf
+DerivedSources/WebCore/HTMLEntityTable.cpp : $(WebCore)/html/HTMLEntityNames.json $(WebCore)/../WebKitTools/Scripts/create-html-entity-table
+    $(PYTHON) $(WebCore)/../WebKitTools/Scripts/create-html-entity-table -o $(GENSOURCES_WEBCORE)/HTMLEntityTable.cpp $(WebCore)/html/HTMLEntityNames.json

 # color names
trunk/WebCore/WebCore.gyp/WebCore.gyp
r64680 r65351
         # gperf rule
         '../html/DocTypeStrings.gperf',
-        '../html/HTMLEntityNames.gperf',
         '../platform/ColorData.gperf',
+
+        # json rule
+        '../html/HTMLEntityNames.json',

         # idl rules
…
             '<(SHARED_INTERMEDIATE_DIR)/webkit/<(RULE_INPUT_ROOT).cpp',
           ],
-          'dependencies': [
+          'inputs': [
             '../make-hash-tools.pl',
           ],
…
           ],
           'process_outputs_as_sources': 0,
+        },
+        {
+          'rule_name': 'json',
+          'extension': 'json',
+          #
+          # json outputs are generated by WebKitTools/Scripts/create-html-entity-table
+          #
+          'outputs': [
+            '<(SHARED_INTERMEDIATE_DIR)/webkit/HTMLEntityTable.cpp',
+          ],
+          'inputs': [
+            '../../WebKitTools/Scripts/create-html-entity-table',
+          ],
+          'action': [
+            'python',
+            '../../WebKitTools/Scripts/create-html-entity-table',
+            '-o',
+            '<(SHARED_INTERMEDIATE_DIR)/webkit/HTMLEntityTable.cpp',
+            '<(RULE_INPUT_PATH)',
+          ],
         },
         # Rule to build generated JavaScript (V8) bindings from .idl source.
trunk/WebCore/WebCore.gypi
r65312 r65351
             'html/HTMLElementStack.cpp',
             'html/HTMLElementStack.h',
+            'html/HTMLEntitySearch.cpp',
+            'html/HTMLEntitySearch.h',
             'html/HTMLEmbedElement.cpp',
             'html/HTMLEmbedElement.h',
trunk/WebCore/WebCore.pri
r65070 r65351
 XMLNS_NAMES = $$PWD/xml/xmlnsattrs.in

-ENTITIES_GPERF = $$PWD/html/HTMLEntityNames.gperf
+HTML_ENTITIES = $$PWD/html/HTMLEntityNames.json

 COLORDATA_GPERF = $$PWD/platform/ColorData.gperf
…

 # GENERATOR 8-A:
-entities.output = $${WC_GENERATED_SOURCES_DIR}/HTMLEntityNames.cpp
-entities.input = ENTITIES_GPERF
-entities.wkScript = $$PWD/make-hash-tools.pl
-entities.commands = perl $$entities.wkScript $${WC_GENERATED_SOURCES_DIR} $$ENTITIES_GPERF
+entities.output = $${WC_GENERATED_SOURCES_DIR}/HTMLEntityTable.cpp
+entities.input = HTML_ENTITIES
+entities.wkScript = $$PWD/../WebKitTools/Scripts/create-html-entity-table
+entities.commands = python $$entities.wkScript -o $${WC_GENERATED_SOURCES_DIR}/HTMLEntityTable.cpp $$HTML_ENTITIES
 entities.clean = ${QMAKE_FILE_OUT}
-entities.depends = $$PWD/make-hash-tools.pl
+entities.depends = $$PWD/../WebKitTools/Scripts/create-html-entity-table
 addExtraCompiler(entities)

trunk/WebCore/WebCore.pro
r65321 r65351
     html/HTMLElement.cpp \
     html/HTMLElementStack.cpp \
+    html/HTMLEntitySearch.cpp \
     html/HTMLEmbedElement.cpp \
     html/HTMLFieldSetElement.cpp \
trunk/WebCore/WebCore.vcproj/WebCore.vcproj
r65312 r65351
             </File>
             <File
+                RelativePath="..\html\HTMLEntitySearch.cpp"
+                >
+            </File>
+            <File
+                RelativePath="..\html\HTMLEntitySearch.h"
+                >
+            </File>
+            <File
                 RelativePath="..\html\HTMLEmbedElement.cpp"
                 >
trunk/WebCore/WebCore.xcodeproj/project.pbxproj
r65349 r65351 3184 3184 A8A909AC0CBCD6B50029B807 /* RenderSVGTransformableContainer.h in Headers */ = {isa = PBXBuildFile; fileRef = A8A909AA0CBCD6B50029B807 /* RenderSVGTransformableContainer.h */; }; 3185 3185 A8A909AD0CBCD6B50029B807 /* RenderSVGTransformableContainer.cpp in Sources */ = {isa = PBXBuildFile; fileRef = A8A909AB0CBCD6B50029B807 /* RenderSVGTransformableContainer.cpp */; }; 3186 A8BC044E1214EB2A00B5F122 /* HTMLEntitySearch.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 970C4FDF1211266200C3D393 /* HTMLEntitySearch.cpp */; }; 3187 A8BC044F1214EB2B00B5F122 /* HTMLEntitySearch.h in Headers */ = {isa = PBXBuildFile; fileRef = 970C4FE01211266200C3D393 /* HTMLEntitySearch.h */; }; 3188 A8BC04921214F69600B5F122 /* HTMLEntityTable.cpp in Sources */ = {isa = PBXBuildFile; fileRef = A8BC04911214F69600B5F122 /* HTMLEntityTable.cpp */; }; 3186 3189 A8BCFD05120A046100B5F122 /* SVGPathSeg.cpp in Sources */ = {isa = PBXBuildFile; fileRef = A8BCFD04120A046100B5F122 /* SVGPathSeg.cpp */; }; 3187 3190 A8C2280E11D4A59700D5A7D3 /* DocumentParser.cpp in Sources */ = {isa = PBXBuildFile; fileRef = A8C2280D11D4A59700D5A7D3 /* DocumentParser.cpp */; }; … … 8481 8484 97059975107D975200A50A7C /* PolicyChecker.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = PolicyChecker.cpp; sourceTree = "<group>"; }; 8482 8485 97059976107D975200A50A7C /* PolicyChecker.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = PolicyChecker.h; sourceTree = "<group>"; }; 8486 970C4FDF1211266200C3D393 /* HTMLEntitySearch.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = HTMLEntitySearch.cpp; sourceTree = "<group>"; }; 8487 970C4FE01211266200C3D393 /* HTMLEntitySearch.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = HTMLEntitySearch.h; sourceTree = "<group>"; }; 8488 970C4FE11211266200C3D393 /* HTMLEntityTable.cpp 
*/ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = HTMLEntityTable.cpp; sourceTree = "<group>"; }; 8489 970C4FE21211266200C3D393 /* HTMLEntityTable.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = HTMLEntityTable.h; sourceTree = "<group>"; }; 8483 8490 9719AEFF11D09F2C00D45831 /* HTMLInputStream.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = HTMLInputStream.h; sourceTree = "<group>"; }; 8484 8491 9738899E116EA9DC00ADF313 /* DocumentWriter.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = DocumentWriter.cpp; sourceTree = "<group>"; }; … … 8866 8873 A8A909AA0CBCD6B50029B807 /* RenderSVGTransformableContainer.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = RenderSVGTransformableContainer.h; sourceTree = "<group>"; }; 8867 8874 A8A909AB0CBCD6B50029B807 /* RenderSVGTransformableContainer.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = RenderSVGTransformableContainer.cpp; sourceTree = "<group>"; }; 8875 A8BC04911214F69600B5F122 /* HTMLEntityTable.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = HTMLEntityTable.cpp; sourceTree = "<group>"; }; 8868 8876 A8BCFD04120A046100B5F122 /* SVGPathSeg.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = SVGPathSeg.cpp; sourceTree = "<group>"; }; 8869 8877 A8C2280D11D4A59700D5A7D3 /* DocumentParser.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = DocumentParser.cpp; sourceTree = "<group>"; }; … … 10946 10954 E406F3FA1198304D009D59D6 /* DocTypeStrings.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = DocTypeStrings.cpp; sourceTree = "<group>"; }; 10947 10955 
E406F3FB1198307D009D59D6 /* ColorData.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = ColorData.cpp; sourceTree = "<group>"; }; 10948 E406F4021198329A009D59D6 /* HTMLEntityNames.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = HTMLEntityNames.cpp; sourceTree = "<group>"; };10949 10956 E415F10C0D9A05870033CE97 /* ElementTimeControl.idl */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = text; path = ElementTimeControl.idl; sourceTree = "<group>"; }; 10950 10957 E415F1680D9A165D0033CE97 /* DOMElementTimeControl.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = DOMElementTimeControl.h; sourceTree = "<group>"; }; … … 12293 12300 A17C81200F2A5CF7005DAAEB /* HTMLElementFactory.cpp */, 12294 12301 A17C81210F2A5CF7005DAAEB /* HTMLElementFactory.h */, 12295 E406F4021198329A009D59D6 /* HTMLEntityNames.cpp */,12302 A8BC04911214F69600B5F122 /* HTMLEntityTable.cpp */, 12296 12303 A8D06B380A265DCD005E7203 /* HTMLNames.cpp */, 12297 12304 A8D06B370A265DCD005E7203 /* HTMLNames.h */, … … 13991 13998 976E895E11C0CA3A00EA9CA9 /* HTMLEntityParser.cpp */, 13992 13999 976E895F11C0CA3A00EA9CA9 /* HTMLEntityParser.h */, 14000 970C4FDF1211266200C3D393 /* HTMLEntitySearch.cpp */, 14001 970C4FE01211266200C3D393 /* HTMLEntitySearch.h */, 14002 970C4FE11211266200C3D393 /* HTMLEntityTable.cpp */, 14003 970C4FE21211266200C3D393 /* HTMLEntityTable.h */, 13993 14004 A81369B9097374F500D74463 /* HTMLFieldSetElement.cpp */, 13994 14005 A81369B8097374F500D74463 /* HTMLFieldSetElement.h */, … … 20164 20175 CE172E011136E8CE0062A533 /* ZoomMode.h in Headers */, 20165 20176 2EED57FE1214A9C2007656BB /* ThreadableBlobRegistry.h in Headers */, 20177 A8BC044F1214EB2B00B5F122 /* HTMLEntitySearch.h in Headers */, 20166 20178 ); 20167 20179 runOnlyForDeploymentPostprocessing = 0; … … 22591 22603 97DD4D860FDF4D6E00ECF9A4 /* XSSAuditor.cpp in Sources */, 
22592 22604 2EED57FD1214A9C2007656BB /* ThreadableBlobRegistry.cpp in Sources */, 22605 A8BC044E1214EB2A00B5F122 /* HTMLEntitySearch.cpp in Sources */, 22606 A8BC04921214F69600B5F122 /* HTMLEntityTable.cpp in Sources */, 22593 22607 ); 22594 22608 runOnlyForDeploymentPostprocessing = 0; -
trunk/WebCore/html/HTMLEntityParser.cpp
r65171 r65351
 #include "HTMLEntityParser.h"

+#include "HTMLEntitySearch.h"
+#include "HTMLEntityTable.h"
 #include <wtf/Vector.h>
-
-#include "HTMLEntityNames.cpp"

 using namespace WTF;
…
     unsigned result = 0;
     Vector<UChar, 10> consumedCharacters;
-    Vector<char, 10> entityName;

     while (!source.isEmpty()) {
…
                 source.advancePastNonNewline();
                 return legalEntityFor(result);
-            } else
+            } else
                 return legalEntityFor(result);
             break;
…
         }
         case Named: {
-            // FIXME: This code is wrong. We need to find the longest matching entity.
-            // The examples from the spec are:
-            //     I'm &notit; I tell you
-            //     I'm &notin; I tell you
-            // In the first case, "&not" is the entity. In the second
-            // case, "&notin;" is the entity.
-            // FIXME: Our list of HTML entities is incomplete.
-            // FIXME: The number 8 below is bogus.
-            while (!source.isEmpty() && entityName.size() <= 8) {
+            HTMLEntitySearch entitySearch;
+            while (!source.isEmpty()) {
                 cc = *source;
-                if (cc == ';') {
-                    const Entity* entity = findEntity(entityName.data(), entityName.size());
-                    if (entity) {
-                        source.advanceAndASSERT(';');
-                        return entity->code;
-                    }
+                entitySearch.advance(cc);
+                if (!entitySearch.isEntityPrefix())
                     break;
-                }
-                if (!isAlphaNumeric(cc)) {
-                    const Entity* entity = findEntity(entityName.data(), entityName.size());
-                    if (entity) {
-                        // HTML5 tells us to ignore this entity, for historical reasons,
-                        // if the lookhead character is '='.
-                        if (additionalAllowedCharacter && cc == '=')
-                            break;
-                        // Some entities require a terminating semicolon, whereas other
-                        // entities do not. The HTML5 spec has a giant list:
-                        //
-                        // http://www.whatwg.org/specs/web-apps/current-work/multipage/named-character-references.html#named-character-references
-                        //
-                        // However, the list seems to boil down to this branch:
-                        if (entity->code > 255)
-                            break;
-                        return entity->code;
-                    }
-                    break;
-                }
-                entityName.append(cc);
                 consumedCharacters.append(cc);
                 source.advanceAndASSERT(cc);
             }
             notEnoughCharacters = source.isEmpty();
+            if (notEnoughCharacters) {
+                // We can't return an entity because there might be a longer entity
+                // that we could match if we had more data.
+                unconsumeCharacters(source, consumedCharacters);
+                return 0;
+            }
+            if (!entitySearch.lastMatch()) {
+                ASSERT(!entitySearch.currentValue());
+                unconsumeCharacters(source, consumedCharacters);
+                return 0;
+            }
+            if (entitySearch.lastMatch()->length != entitySearch.currentLength()) {
+                // We've consumed too many characters. We need to walk the
+                // source back to the point at which we had consumed an
+                // actual entity.
+                unconsumeCharacters(source, consumedCharacters);
+                consumedCharacters.clear();
+                const int length = entitySearch.lastMatch()->length;
+                const UChar* reference = entitySearch.lastMatch()->entity;
+                for (int i = 0; i < length; ++i) {
+                    cc = *source;
+                    ASSERT_UNUSED(reference, cc == *reference++);
+                    consumedCharacters.append(cc);
+                    source.advanceAndASSERT(cc);
+                    ASSERT(!source.isEmpty());
+                }
+                cc = *source;
+            }
+            if (entitySearch.lastMatch()->lastCharacter() == ';')
+                return entitySearch.lastMatch()->value;
+            if (!additionalAllowedCharacter || !(isAlphaNumeric(cc) || cc == '='))
+                return entitySearch.lastMatch()->value;
             unconsumeCharacters(source, consumedCharacters);
             return 0;
…
 UChar decodeNamedEntity(const char* name)
 {
-    const Entity* e = findEntity(name, strlen(name));
-    return e ? e->code : 0;
+    HTMLEntitySearch search;
+    while (name && search.isEntityPrefix())
+        search.advance(*name++);
+    search.advance(';');
+    UChar32 entityValue = search.currentValue();
+    if (U16_LENGTH(entityValue) != 1) {
+        // Callers need to move off this API if the entity table has values
+        // which do not fit in a 16 bit UChar!
+        ASSERT_NOT_REACHED();
+        return 0;
+    }
+    return static_cast<UChar>(entityValue);
 }

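The control flow of the new `Named` case (walk forward while the input is still a prefix, remember the last exact match, then back up to it) can be sketched in isolation. The version below is a simplified illustration under stated assumptions, not WebKit code: it takes a `std::string_view` instead of a segmented source, uses a linear scan over a hypothetical five-entry table in place of `HTMLEntitySearch`, and the names `consumeNamedEntity`, `Result`, and `inAttribute` (standing in for `additionalAllowedCharacter`) are invented for the example.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string_view>

struct Entry {
    std::string_view name;  // entity name without the leading '&'
    char32_t value;
};

// Hypothetical sample entries; the real table is generated from JSON.
constexpr Entry kEntities[] = {
    {"amp", 0x0026}, {"amp;", 0x0026},
    {"not", 0x00AC}, {"not;", 0x00AC}, {"notin;", 0x2209},
};

struct Result {
    char32_t value = 0;               // 0 means "no entity decoded"
    std::size_t consumed = 0;         // characters consumed after '&'
    bool notEnoughCharacters = false; // input ended while still a prefix
};

Result consumeNamedEntity(std::string_view input, bool inAttribute) {
    const Entry* lastMatch = nullptr;
    std::size_t length = 0;
    // Walk forward while the consumed text is a prefix of some entity,
    // remembering the most recent exact match on the way.
    while (length < input.size()) {
        const std::string_view candidate = input.substr(0, length + 1);
        bool isPrefix = false;
        for (const Entry& e : kEntities) {
            if (e.name.substr(0, candidate.size()) == candidate) {
                isPrefix = true;
                if (e.name.size() == candidate.size())
                    lastMatch = &e;
            }
        }
        if (!isPrefix)
            break;
        ++length;
    }
    Result result;
    if (length == input.size()) {
        // A longer entity might still match once more data arrives.
        result.notEnoughCharacters = true;
        return result;
    }
    if (!lastMatch)
        return result;
    // "Rewind": only the characters of the last exact match count as
    // consumed, and cc is the character right after that match.
    const std::size_t matchLength = lastMatch->name.size();
    const unsigned char cc = static_cast<unsigned char>(input[matchLength]);
    const bool endsWithSemicolon = lastMatch->name.back() == ';';
    if (endsWithSemicolon || !inAttribute || !(std::isalnum(cc) || cc == '=')) {
        result.value = lastMatch->value;
        result.consumed = matchLength;
    }
    return result;
}
```

With this sketch, `notit; I tell you` decodes `not` and consumes three characters, `notin; I tell you` decodes `notin;`, and `not=1` inside an attribute value decodes nothing, matching the final two `if` statements in the diff above.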
trunk/WebCore/make-hash-tools.pl
r61091 r65351
 switch ($option) {

-case "HTMLEntityNames" {
-
-    my $htmlEntityNamesGenerated = "$outdir/HTMLEntityNames.cpp";
-    my $htmlEntityNamesGperf = $ARGV[0];
-    shift;
-
-    system("gperf --key-positions=\"*\" -D -s 2 $htmlEntityNamesGperf > $htmlEntityNamesGenerated") == 0 || die "calling gperf failed: $?";
-
-} # case "HTMLEntityNames"
-
 case "DocTypeStrings" {

trunk/WebKitTools/ChangeLog
r65343 r65351
+2010-08-12 Adam Barth <abarth@webkit.org>
+
+        Reviewed by Eric Seidel.
+
+        Add support for MathML entities
+        https://bugs.webkit.org/show_bug.cgi?id=43949
+
+        A script for generating the C++ state data structure describing all the
+        entities from a JSON description.
+
+        * Scripts/create-html-entity-table: Added.
+
 2010-08-13 Dirk Pranke <dpranke@chromium.org>

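Neither the generated HTMLEntityTable.cpp nor HTMLEntityTable.h is shown in this view, so the following is only a guess at the general shape of the data the script emits. The field names `entity`, `length`, and `value` and the `lastCharacter()` accessor are taken from their uses in HTMLEntityParser.cpp and the ChangeLog; the type aliases, array names, and the three sample rows are assumptions made for illustration.

```cpp
#include <cassert>

// Stand-ins for the ICU/WTF character types (assumption for this sketch).
using UChar = char16_t;
using UChar32 = char32_t;

struct HTMLEntityTableEntry {
    // Last character of the name, used to tell "not" apart from "not;".
    UChar lastCharacter() const { return entity[length - 1]; }

    const UChar* entity;  // name characters, not NUL-terminated
    int length;           // number of characters in the name
    UChar32 value;        // decoded code point
};

// Hypothetical generated rows, sorted by name so they can be binary-searched.
static const UChar kNotName[] = {u'n', u'o', u't'};
static const UChar kNotSemicolonName[] = {u'n', u'o', u't', u';'};
static const UChar kNotinName[] = {u'n', u'o', u't', u'i', u'n', u';'};

static const HTMLEntityTableEntry kSampleTable[] = {
    {kNotName, 3, 0x00AC},
    {kNotSemicolonName, 4, 0x00AC},
    {kNotinName, 6, 0x2209},
};
```

Storing the trailing semicolon as part of the name lets one table hold both the legacy semicolon-free forms and the forms that require a semicolon, which appears to be what the `lastCharacter() == ';'` check in the parser relies on.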