Changeset 98624 in webkit
- Timestamp: Oct 27, 2011 1:16:20 PM (12 years ago)
- Location: trunk/Source
- Files: 29 edited
trunk/Source/JavaScriptCore/ChangeLog
r98606 r98624
+2011-10-27 Michael Saboff <msaboff@apple.com>
+
+        Investigate storing strings in 8-bit buffers when possible
+        https://bugs.webkit.org/show_bug.cgi?id=66161
+
+        Added support for 8 bit string data in StringImpl. Changed
+        (UChar*) m_data to m_data16. Added char* m_data8 as a union
+        with m_data16. Added UChar* m_copyData16 to the other union
+        to store a 16 bit copy of an 8 bit string when needed.
+        Added characters8() and characters16() accessor methods
+        that assume the caller has checked the underlying string type
+        via the new is8Bit() method. The characters() method will
+        return a UChar* of the string, materializing a 16 bit copy if the
+        string is an 8 bit string. Added two flags, one for 8 bit buffer
+        and a second for a 16 bit copy for an 8 bit string.
+
+        Fixed method name typo (StringHasher::defaultCoverter()).
+
+        Over time the goal is to eliminate calls to characters() and
+        use the characters8() and characters16() accessors.
+
+        This patch does not include changes that actually create 8 bit
+        strings. This is the first of at least 8 patches. Subsequent
+        patches will cover the JIT, the JSC lexer, parser and literal
+        parser, JavaScript strings, and then WebCore changes that take
+        advantage of the 8 bit strings.
+
+        This change is performance neutral for SunSpider and V8 when
+        run from the command line with "jsc".
+
+        Reviewed by Geoffrey Garen.
+
+        * JavaScriptCore.exp:
+        * JavaScriptCore.vcproj/JavaScriptCore/JavaScriptCore.def
+        * interpreter/Interpreter.cpp:
+        (JSC::Interpreter::callEval):
+        * parser/SourceProvider.h:
+        (JSC::UStringSourceProvider::data):
+        (JSC::UStringSourceProvider::UStringSourceProvider):
+        * runtime/Identifier.cpp:
+        (JSC::IdentifierCStringTranslator::hash):
+        (JSC::IdentifierCStringTranslator::equal):
+        (JSC::IdentifierCStringTranslator::translate):
+        (JSC::Identifier::add):
+        (JSC::Identifier::toUInt32):
+        * runtime/Identifier.h:
+        (JSC::Identifier::equal):
+        (JSC::operator==):
+        (JSC::operator!=):
+        * runtime/JSString.cpp:
+        (JSC::JSString::resolveRope):
+        (JSC::JSString::resolveRopeSlowCase):
+        * runtime/RegExp.cpp:
+        (JSC::RegExp::match):
+        * runtime/StringPrototype.cpp:
+        (JSC::jsSpliceSubstringsWithSeparators):
+        * runtime/UString.cpp:
+        (JSC::UString::UString):
+        (JSC::equalSlowCase):
+        (JSC::UString::utf8):
+        * runtime/UString.h:
+        (JSC::UString::characters):
+        (JSC::UString::characters8):
+        (JSC::UString::characters16):
+        (JSC::UString::is8Bit):
+        (JSC::UString::operator[]):
+        (JSC::UString::find):
+        (JSC::operator==):
+        * wtf/StringHasher.h:
+        (WTF::StringHasher::computeHash):
+        (WTF::StringHasher::defaultConverter):
+        * wtf/text/AtomicString.cpp:
+        (WTF::CStringTranslator::hash):
+        (WTF::CStringTranslator::equal):
+        (WTF::CStringTranslator::translate):
+        (WTF::AtomicString::add):
+        * wtf/text/AtomicString.h:
+        (WTF::AtomicString::AtomicString):
+        (WTF::AtomicString::contains):
+        (WTF::AtomicString::find):
+        (WTF::AtomicString::add):
+        (WTF::operator==):
+        (WTF::operator!=):
+        (WTF::equalIgnoringCase):
+        * wtf/text/StringConcatenate.h:
+        * wtf/text/StringHash.h:
+        (WTF::StringHash::equal):
+        (WTF::CaseFoldingHash::hash):
+        * wtf/text/StringImpl.cpp:
+        (WTF::StringImpl::~StringImpl):
+        (WTF::StringImpl::createUninitialized):
+        (WTF::StringImpl::create):
+        (WTF::StringImpl::getData16SlowCase):
+        (WTF::StringImpl::containsOnlyWhitespace):
+        (WTF::StringImpl::substring):
+        (WTF::StringImpl::characterStartingAt):
+        (WTF::StringImpl::lower):
+        (WTF::StringImpl::upper):
+        (WTF::StringImpl::fill):
+        (WTF::StringImpl::foldCase):
+        (WTF::StringImpl::stripMatchedCharacters):
+        (WTF::StringImpl::removeCharacters):
+        (WTF::StringImpl::simplifyMatchedCharactersToSpace):
+        (WTF::StringImpl::toIntStrict):
+        (WTF::StringImpl::toUIntStrict):
+        (WTF::StringImpl::toInt64Strict):
+        (WTF::StringImpl::toUInt64Strict):
+        (WTF::StringImpl::toIntPtrStrict):
+        (WTF::StringImpl::toInt):
+        (WTF::StringImpl::toUInt):
+        (WTF::StringImpl::toInt64):
+        (WTF::StringImpl::toUInt64):
+        (WTF::StringImpl::toIntPtr):
+        (WTF::StringImpl::toDouble):
+        (WTF::StringImpl::toFloat):
+        (WTF::equal):
+        (WTF::equalIgnoringCase):
+        (WTF::StringImpl::find):
+        (WTF::StringImpl::findIgnoringCase):
+        (WTF::StringImpl::reverseFind):
+        (WTF::StringImpl::replace):
+        (WTF::StringImpl::defaultWritingDirection):
+        (WTF::StringImpl::adopt):
+        (WTF::StringImpl::createWithTerminatingNullCharacter):
+        * wtf/text/StringImpl.h:
+        (WTF::StringImpl::StringImpl):
+        (WTF::StringImpl::create):
+        (WTF::StringImpl::create8):
+        (WTF::StringImpl::tryCreateUninitialized):
+        (WTF::StringImpl::flagsOffset):
+        (WTF::StringImpl::flagIs8Bit):
+        (WTF::StringImpl::dataOffset):
+        (WTF::StringImpl::is8Bit):
+        (WTF::StringImpl::characters8):
+        (WTF::StringImpl::characters16):
+        (WTF::StringImpl::characters):
+        (WTF::StringImpl::has16BitShadow):
+        (WTF::StringImpl::setHash):
+        (WTF::StringImpl::hash):
+        (WTF::StringImpl::copyChars):
+        (WTF::StringImpl::operator[]):
+        (WTF::StringImpl::find):
+        (WTF::StringImpl::findIgnoringCase):
+        (WTF::equal):
+        (WTF::equalIgnoringCase):
+        (WTF::StringImpl::isolatedCopy):
+        * wtf/text/WTFString.cpp:
+        (WTF::String::String):
+        (WTF::String::append):
+        (WTF::String::format):
+        (WTF::String::fromUTF8):
+        (WTF::String::fromUTF8WithLatin1Fallback):
+        * wtf/text/WTFString.h:
+        (WTF::String::find):
+        (WTF::String::findIgnoringCase):
+        (WTF::String::contains):
+        (WTF::String::append):
+        (WTF::String::fromUTF8):
+        (WTF::String::fromUTF8WithLatin1Fallback):
+        (WTF::operator==):
+        (WTF::operator!=):
+        (WTF::equalIgnoringCase):
+        * wtf/unicode/Unicode.h:
+        * yarr/YarrJIT.cpp:
+        (JSC::Yarr::execute):
+        * yarr/YarrJIT.h:
+        (JSC::Yarr::YarrCodeBlock::execute):
+        * yarr/YarrParser.h:
+        (JSC::Yarr::Parser::Parser):
+
 2011-10-27 Mark Hahnenberg <mhahnenberg@apple.com>
 
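The accessor scheme described in the ChangeLog entry can be sketched roughly as follows. This is a minimal illustration, not the real WTF::StringImpl: the class name DualWidthString and its vector-based storage are hypothetical, and the real class uses a union of raw pointers plus flag bits rather than separate containers.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Stand-ins for the WTF typedefs (assumed: 8-bit Latin-1 and 16-bit UTF-16 units).
using LChar = unsigned char;
using UChar = char16_t;

// Hypothetical illustration of a string that is stored as 8-bit or 16-bit data.
class DualWidthString {
public:
    explicit DualWidthString(std::vector<LChar> latin1)
        : m_is8Bit(true), m_data8(std::move(latin1)) {}
    explicit DualWidthString(std::vector<UChar> utf16)
        : m_is8Bit(false), m_data16(std::move(utf16)) {}

    bool is8Bit() const { return m_is8Bit; }
    std::size_t length() const { return m_is8Bit ? m_data8.size() : m_data16.size(); }

    // Typed accessors: the caller is expected to have checked is8Bit() first.
    const LChar* characters8() const { assert(m_is8Bit); return m_data8.data(); }
    const UChar* characters16() const { assert(!m_is8Bit); return m_data16.data(); }

    // Legacy accessor: always yields 16-bit data. For an 8-bit string it
    // builds a widened shadow copy on first use and reuses it afterwards.
    const UChar* characters() const
    {
        if (!m_is8Bit)
            return m_data16.data();
        if (!m_hasShadow) {
            m_shadow16.assign(m_data8.begin(), m_data8.end()); // zero-extends each byte
            m_hasShadow = true;
        }
        return m_shadow16.data();
    }

    bool has16BitShadow() const { return m_hasShadow; }

private:
    bool m_is8Bit;
    mutable bool m_hasShadow = false;
    std::vector<LChar> m_data8;
    std::vector<UChar> m_data16;
    mutable std::vector<UChar> m_shadow16;
};
```

The point of the design is that callers who check is8Bit() and use the typed accessors never pay for the shadow copy; only code still calling characters() on an 8-bit string does.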
trunk/Source/JavaScriptCore/JavaScriptCore.exp
r98593 r98624 353 353 __ZN3WTF10StringImpl11reverseFindEPS0_j 354 354 __ZN3WTF10StringImpl11reverseFindEtj 355 __ZN3WTF10StringImpl16findIgnoringCaseEPKcj356 355 __ZN3WTF10StringImpl16findIgnoringCaseEPS0_j 357 356 __ZN3WTF10StringImpl18simplifyWhiteSpaceEv … … 363 362 __ZN3WTF10StringImpl4fillEt 364 363 __ZN3WTF10StringImpl4findEPFbtEj 365 __ZN3WTF10StringImpl4findEPKcj366 364 __ZN3WTF10StringImpl4findEPS0_j 367 365 __ZN3WTF10StringImpl4findEtj … … 371 369 __ZN3WTF10StringImpl5toIntEPb 372 370 __ZN3WTF10StringImpl5upperEv 373 __ZN3WTF10StringImpl6createEPKc 374 __ZN3WTF10StringImpl6createEPKcj 371 __ZN3WTF10StringImpl6createEPKh 375 372 __ZN3WTF10StringImpl6createEPKtj 376 373 __ZN3WTF10StringImpl7replaceEPS0_S1_ … … 392 389 __ZN3WTF12AtomicString11addSlowCaseEPNS_10StringImplE 393 390 __ZN3WTF12AtomicString16fromUTF8InternalEPKcS2_ 394 __ZN3WTF12AtomicString3addEPK c391 __ZN3WTF12AtomicString3addEPKh 395 392 __ZN3WTF12AtomicString3addEPKt 396 393 __ZN3WTF12AtomicString3addEPKtj … … 434 431 __ZN3WTF16fastZeroedMallocEm 435 432 __ZN3WTF17charactersToFloatEPKtmPbS2_ 436 __ZN3WTF17equalIgnoringCaseEPKtPKcj 437 __ZN3WTF17equalIgnoringCaseEPNS_10StringImplEPKc 433 __ZN3WTF17equalIgnoringCaseEPKtPKhj 438 434 __ZN3WTF17equalIgnoringCaseEPNS_10StringImplES1_ 435 __ZN3WTF17equalIgnoringCaseEPNS_10StringImplEPKh 439 436 __ZN3WTF18calculateDSTOffsetEdd 440 437 __ZN3WTF18calculateUTCOffsetEv … … 481 478 __ZN3WTF5MutexC1Ev 482 479 __ZN3WTF5MutexD1Ev 483 __ZN3WTF5equalEPKNS_10StringImplEPK c480 __ZN3WTF5equalEPKNS_10StringImplEPKh 484 481 __ZN3WTF5equalEPKNS_10StringImplEPKtj 485 482 __ZN3WTF5equalEPKNS_10StringImplES2_ 486 483 __ZN3WTF5yieldEv 487 __ZN3WTF6String26fromUTF8WithLatin1FallbackEPK cm484 __ZN3WTF6String26fromUTF8WithLatin1FallbackEPKhm 488 485 __ZN3WTF6String29charactersWithNullTerminationEv 486 __ZN3WTF6StringC1EPKcj 487 __ZN3WTF6String6appendEh 489 488 __ZN3WTF6String6appendEPKtj 490 489 __ZN3WTF6String6appendERKS0_ 491 __ZN3WTF6String6appendEc492 490 
__ZN3WTF6String6appendEt 493 491 __ZN3WTF6String6formatEPKcz … … 502 500 __ZN3WTF6String6numberEy 503 501 __ZN3WTF6String6removeEji 504 __ZN3WTF6String8fromUTF8EPK c505 __ZN3WTF6String8fromUTF8EPK cm502 __ZN3WTF6String8fromUTF8EPKh 503 __ZN3WTF6String8fromUTF8EPKhm 506 504 __ZN3WTF6String8truncateEj 507 505 __ZN3WTF6StringC1EPKc 508 __ZN3WTF6StringC1EPKcj509 506 __ZN3WTF6StringC1EPKt 510 507 __ZN3WTF6StringC1EPKtj … … 578 575 __ZNK3JSC9HashTable11createTableEPNS_12JSGlobalDataE 579 576 __ZNK3JSC9HashTable11deleteTableEv 577 __ZNK3WTF10StringImpl17getData16SlowCaseEv 580 578 __ZNK3WTF12AtomicString5lowerEv 581 579 __ZNK3WTF13DecimalNumber15toStringDecimalEPtj -
trunk/Source/JavaScriptCore/JavaScriptCore.vcproj/JavaScriptCore/JavaScriptCore.def
r98606 r98624
     ?absoluteTimeToWaitTimeoutInterval@WTF@@YAKN@Z
     ?activityCallback@Heap@JSC@@QAEPAVGCActivityCallback@2@XZ
+    ?add@AtomicString@WTF@@CA?AV?$PassRefPtr@VStringImpl@WTF@@@2@PBD@Z
     ?add@Identifier@JSC@@SA?AV?$PassRefPtr@VStringImpl@WTF@@@WTF@@PAVExecState@2@PBD@Z
     ?add@PropertyNameArray@JSC@@QAEXPAVStringImpl@WTF@@@Z
…
     ?empty@StringImpl@WTF@@SAPAV12@XZ
     ?enumerable@PropertyDescriptor@JSC@@QBE_NXZ
-    ?equal@Identifier@JSC@@SA_NPBVStringImpl@WTF@@PBD@Z
     ?equalUTF16WithUTF8@Unicode@WTF@@YA_NPB_W0PBD1@Z
     ?evaluate@DebuggerCallFrame@JSC@@QBE?AVJSValue@2@ABVUString@2@AAV32@@Z
trunk/Source/JavaScriptCore/interpreter/Interpreter.cpp
r98422 r98624
         // FIXME: We can use the preparser in strict mode, we just need additional logic
         // to prevent duplicates.
-        LiteralParser preparser(callFrame, programSource.characters(), programSource.length(), LiteralParser::NonStrictJSON);
+        LiteralParser preparser(callFrame, programSource.characters16(), programSource.length(), LiteralParser::NonStrictJSON);
         if (JSValue parsedObject = preparser.tryLiteralParse())
             return parsedObject;
trunk/Source/JavaScriptCore/parser/SourceProvider.h
r95901 r98624
             return m_source.substringSharingImpl(start, end - start);
         }
-        const UChar* data() const { return m_source.characters(); }
+        const UChar* data() const { return m_data; }
         int length() const { return m_source.length(); }
 
…
             : SourceProvider(url)
             , m_source(source)
+            , m_data(m_source.characters16())
         {
         }
 
         UString m_source;
+        const UChar* m_data;
     };
 
trunk/Source/JavaScriptCore/runtime/Identifier.cpp
r94475 r98624 69 69 70 70 struct IdentifierCStringTranslator { 71 static unsigned hash(const char* c)72 { 73 return StringHasher::computeHash< char>(c);74 } 75 76 static bool equal(StringImpl* r, const char* s)71 static unsigned hash(const LChar* c) 72 { 73 return StringHasher::computeHash<LChar>(c); 74 } 75 76 static bool equal(StringImpl* r, const LChar* s) 77 77 { 78 78 return Identifier::equal(r, s); 79 79 } 80 80 81 static void translate(StringImpl*& location, const char* c, unsigned hash)82 { 83 size_t length = strlen( c);81 static void translate(StringImpl*& location, const LChar* c, unsigned hash) 82 { 83 size_t length = strlen(reinterpret_cast<const char*>(c)); 84 84 UChar* d; 85 85 StringImpl* r = StringImpl::createUninitialized(length, d).leakRef(); 86 86 for (size_t i = 0; i != length; i++) 87 d[i] = static_cast<unsigned char>(c[i]); // use unsigned char to zero-extend instead of sign-extend87 d[i] = c[i]; 88 88 r->setHash(hash); 89 89 location = r; … … 98 98 return StringImpl::empty(); 99 99 if (!c[1]) 100 return add(globalData, globalData->smallStrings.singleCharacterStringRep( static_cast<unsigned char>(c[0])));100 return add(globalData, globalData->smallStrings.singleCharacterStringRep(c[0])); 101 101 102 102 IdentifierTable& identifierTable = *globalData->identifierTable; … … 107 107 return iter->second; 108 108 109 pair<HashSet<StringImpl*>::iterator, bool> addResult = identifierTable.add<const char*, IdentifierCStringTranslator>(c);109 pair<HashSet<StringImpl*>::iterator, bool> addResult = identifierTable.add<const LChar*, IdentifierCStringTranslator>(reinterpret_cast<const LChar*>(c)); 110 110 111 111 // If the string is newly-translated, then we need to adopt it. … … 155 155 156 156 unsigned length = string.length(); 157 const UChar* characters = string.characters();158 157 159 158 // An empty string is not a number. 
160 159 if (!length) 161 160 return 0; 161 162 const UChar* characters = string.characters16(); 162 163 163 164 // Get the first character, turning it into a digit. -
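The Identifier.cpp change drops the explicit static_cast<unsigned char> because reading bytes through LChar already zero-extends them. A small sketch of that idea follows; widenLatin1 is a hypothetical helper name, and LChar and UChar are assumed here to be unsigned char and a 16-bit unsigned type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

using LChar = unsigned char; // assumed 8-bit Latin-1 code unit
using UChar = char16_t;      // assumed 16-bit code unit

// Hypothetical helper: widen a null-terminated Latin-1 C string to 16-bit units.
// Viewing the bytes as LChar means a byte such as 0xE9 widens to 0x00E9.
// Widening a plain (possibly signed) char could instead sign-extend it to 0xFFE9.
std::vector<UChar> widenLatin1(const char* s)
{
    assert(s); // caller must pass a non-null string
    const LChar* bytes = reinterpret_cast<const LChar*>(s);
    std::size_t length = std::strlen(s);
    std::vector<UChar> out(length);
    for (std::size_t i = 0; i != length; ++i)
        out[i] = bytes[i]; // unsigned 8-bit to 16-bit: zero-extension
    return out;
}
```

Whether plain char is signed is implementation-defined, which is why pushing the unsigned type into the function signatures is more robust than casting at each use.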
trunk/Source/JavaScriptCore/runtime/Identifier.h
r97675 r98624 71 71 friend bool operator!=(const Identifier&, const Identifier&); 72 72 73 friend bool operator==(const Identifier&, const LChar*); 73 74 friend bool operator==(const Identifier&, const char*); 75 friend bool operator!=(const Identifier&, const LChar*); 74 76 friend bool operator!=(const Identifier&, const char*); 75 77 76 static bool equal(const StringImpl*, const char*); 78 static bool equal(const StringImpl*, const LChar*); 79 static inline bool equal(const StringImpl*a, const char*b) { return Identifier::equal(a, reinterpret_cast<const LChar*>(b)); }; 77 80 static bool equal(const StringImpl*, const UChar*, unsigned length); 78 81 static bool equal(const StringImpl* a, const StringImpl* b) { return ::equal(a, b); } … … 85 88 86 89 static bool equal(const Identifier& a, const Identifier& b) { return a.m_string.impl() == b.m_string.impl(); } 87 static bool equal(const Identifier& a, const char* b) { return equal(a.m_string.impl(), b); }90 static bool equal(const Identifier& a, const LChar* b) { return equal(a.m_string.impl(), b); } 88 91 89 92 static PassRefPtr<StringImpl> add(ExecState*, const UChar*, int length); … … 126 129 } 127 130 128 inline bool operator==(const Identifier& a, const char* b)131 inline bool operator==(const Identifier& a, const LChar* b) 129 132 { 130 133 return Identifier::equal(a, b); 131 134 } 132 135 133 inline bool operator!=(const Identifier& a, const char* b) 136 inline bool operator==(const Identifier& a, const char* b) 137 { 138 return Identifier::equal(a, reinterpret_cast<const LChar*>(b)); 139 } 140 141 inline bool operator!=(const Identifier& a, const LChar* b) 134 142 { 135 143 return !Identifier::equal(a, b); 136 144 } 137 145 138 inline bool Identifier::equal(const StringImpl* r, const char* s) 146 inline bool operator!=(const Identifier& a, const char* b) 147 { 148 return !Identifier::equal(a, reinterpret_cast<const LChar*>(b)); 149 } 150 151 inline bool Identifier::equal(const StringImpl* r, const LChar* s) 
139 152 { 140 153 return WTF::equal(r, s); -
trunk/Source/JavaScriptCore/runtime/JSString.cpp
r98593 r98624
         StringImpl* string = m_fibers[i]->m_value.impl();
         unsigned length = string->length();
-        StringImpl::copyChars(position, string->characters(), length);
+        StringImpl::copyChars(position, string->characters16(), length);
         position += length;
         m_fibers[i].clear();
…
         unsigned length = string->length();
         position -= length;
-        StringImpl::copyChars(position, string->characters(), length);
+        StringImpl::copyChars(position, string->characters16(), length);
     }
 
trunk/Source/JavaScriptCore/runtime/RegExp.cpp
r95936 r98624
     if (m_state == JITCode) {
         if (s.is8Bit())
-            result = Yarr::execute(m_representation->m_regExpJITCode, s.latin1().data(), startOffset, s.length(), offsetVector);
+            result = Yarr::execute(m_representation->m_regExpJITCode, s.characters8(), startOffset, s.length(), offsetVector);
         else
-            result = Yarr::execute(m_representation->m_regExpJITCode, s.characters(), startOffset, s.length(), offsetVector);
+            result = Yarr::execute(m_representation->m_regExpJITCode, s.characters16(), startOffset, s.length(), offsetVector);
 #if ENABLE(YARR_JIT_DEBUG)
         matchCompareWithInterpreter(s, startOffset, offsetVector, result);
trunk/Source/JavaScriptCore/runtime/StringPrototype.cpp
r98501 r98624
         if (i < rangeCount) {
             if (int srcLen = substringRanges[i].length) {
-                StringImpl::copyChars(buffer + bufferPos, source.characters() + substringRanges[i].position, srcLen);
+                StringImpl::copyChars(buffer + bufferPos, source.characters16() + substringRanges[i].position, srcLen);
                 bufferPos += srcLen;
             }
…
         if (i < separatorCount) {
             if (int sepLen = separators[i].length()) {
-                StringImpl::copyChars(buffer + bufferPos, separators[i].characters(), sepLen);
+                StringImpl::copyChars(buffer + bufferPos, separators[i].characters16(), sepLen);
                 bufferPos += sepLen;
             }
trunk/Source/JavaScriptCore/runtime/UString.cpp
r94452 r98624 74 74 75 75 // Construct a string with latin1 data. 76 UString::UString(const LChar* characters, unsigned length) 77 : m_impl(characters ? StringImpl::create(characters, length) : 0) 78 { 79 } 80 76 81 UString::UString(const char* characters, unsigned length) 77 : m_impl(characters ? StringImpl::create( characters, length) : 0)82 : m_impl(characters ? StringImpl::create(reinterpret_cast<const LChar*>(characters), length) : 0) 78 83 { 79 84 } 80 85 81 86 // Construct a string with latin1 data, from a null-terminated source. 87 UString::UString(const LChar* characters) 88 : m_impl(characters ? StringImpl::create(characters) : 0) 89 { 90 } 91 82 92 UString::UString(const char* characters) 83 : m_impl(characters ? StringImpl::create( characters) : 0)93 : m_impl(characters ? StringImpl::create(reinterpret_cast<const LChar*>(characters)) : 0) 84 94 { 85 95 } … … 230 240 } 231 241 242 // This method assumes that all simple checks have been performed by 243 // the inlined operator==() in the header file. 244 bool equalSlowCase(const UString& s1, const UString& s2) 245 { 246 StringImpl* rep1 = s1.impl(); 247 StringImpl* rep2 = s2.impl(); 248 unsigned size1 = rep1->length(); 249 250 // At this point we know 251 // (a) that the strings are the same length and 252 // (b) that they are greater than zero length. 253 bool s1Is8Bit = rep1->is8Bit(); 254 bool s2Is8Bit = rep2->is8Bit(); 255 256 if (s1Is8Bit) { 257 const LChar* d1 = rep1->characters8(); 258 if (s2Is8Bit) { 259 const LChar* d2 = rep2->characters8(); 260 261 if (d1 == d2) // Check to see if the data pointers are the same. 262 return true; 263 264 // Do quick checks for sizes 1 and 2. 
265 switch (size1) { 266 case 1: 267 return d1[0] == d2[0]; 268 case 2: 269 return (d1[0] == d2[0]) & (d1[1] == d2[1]); 270 default: 271 return (!memcmp(d1, d2, size1 * sizeof(LChar))); 272 } 273 } 274 275 const UChar* d2 = rep2->characters16(); 276 277 for (unsigned i = 0; i < size1; i++) { 278 if (d1[i] != d2[i]) 279 return false; 280 } 281 return true; 282 } 283 284 if (s2Is8Bit) { 285 const UChar* d1 = rep1->characters16(); 286 const LChar* d2 = rep2->characters8(); 287 288 for (unsigned i = 0; i < size1; i++) { 289 if (d1[i] != d2[i]) 290 return false; 291 } 292 return true; 293 294 } 295 296 const UChar* d1 = rep1->characters16(); 297 const UChar* d2 = rep2->characters16(); 298 299 if (d1 == d2) // Check to see if the data pointers are the same. 300 return true; 301 302 // Do quick checks for sizes 1 and 2. 303 switch (size1) { 304 case 1: 305 return d1[0] == d2[0]; 306 case 2: 307 return (d1[0] == d2[0]) & (d1[1] == d2[1]); 308 default: 309 return (!memcmp(d1, d2, size1 * sizeof(UChar))); 310 } 311 } 312 232 313 bool operator<(const UString& s1, const UString& s2) 233 314 { … … 318 399 { 319 400 unsigned length = this->length(); 320 const UChar* characters = this->characters(); 401 402 if (is8Bit()) 403 return CString(reinterpret_cast<const char*>(characters8()), length); 321 404 322 405 // Allocate a buffer big enough to hold all the characters … … 332 415 if (length > numeric_limits<unsigned>::max() / 3) 333 416 return CString(); 417 418 const UChar* characters = this->characters16(); 334 419 Vector<char, 1024> bufferVector(length * 3); 335 420 -
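The four-way dispatch in equalSlowCase (8/8, 8/16, 16/8, 16/16) can be sketched in isolation as below. The StringView8or16 struct and the function names are hypothetical; the preconditions mirror the comment in the patch (equal, non-zero lengths), and the size 1 and 2 fast paths are omitted for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

using LChar = unsigned char; // assumed 8-bit code unit
using UChar = char16_t;      // assumed 16-bit code unit

// Hypothetical view type: exactly one of data8 / data16 is non-null.
struct StringView8or16 {
    const LChar* data8;
    const UChar* data16;
    std::size_t length;
    bool is8Bit() const { return data8 != nullptr; }
};

// Mixed widths cannot use memcmp; compare unit by unit after promotion.
template<typename A, typename B>
bool equalMixed(const A* a, const B* b, std::size_t length)
{
    for (std::size_t i = 0; i < length; ++i) {
        if (a[i] != b[i])
            return false;
    }
    return true;
}

// Assumes the caller already established equal, non-zero lengths.
bool equalSameLength(const StringView8or16& a, const StringView8or16& b)
{
    assert(a.length == b.length && a.length > 0);
    if (a.is8Bit() && b.is8Bit())
        return !std::memcmp(a.data8, b.data8, a.length * sizeof(LChar));
    if (!a.is8Bit() && !b.is8Bit())
        return !std::memcmp(a.data16, b.data16, a.length * sizeof(UChar));
    if (a.is8Bit())
        return equalMixed(a.data8, b.data16, a.length);
    return equalMixed(a.data16, b.data8, a.length);
}
```

Only the same-width cases can fall back to memcmp, since the byte layouts differ when one side is 8-bit and the other 16-bit.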
trunk/Source/JavaScriptCore/runtime/UString.h
r94981 r98624 40 40 41 41 // Construct a string with latin1 data. 42 UString(const LChar* characters, unsigned length); 42 43 UString(const char* characters, unsigned length); 43 44 44 45 // Construct a string with latin1 data, from a null-terminated source. 46 UString(const LChar* characters); 45 47 UString(const char* characters); 46 48 … … 74 76 if (!m_impl) 75 77 return 0; 76 return m_impl->characters(); 77 } 78 79 bool is8Bit() const { return false; } 78 return m_impl->characters16(); 79 } 80 81 const LChar* characters8() const 82 { 83 if (!m_impl) 84 return 0; 85 ASSERT(m_impl->is8Bit()); 86 return m_impl->characters8(); 87 } 88 89 const UChar* characters16() const 90 { 91 if (!m_impl) 92 return 0; 93 ASSERT(!m_impl->is8Bit()); 94 return m_impl->characters16(); 95 } 96 97 bool is8Bit() const { return m_impl->is8Bit(); } 80 98 81 99 CString ascii() const; … … 87 105 if (!m_impl || index >= m_impl->length()) 88 106 return 0; 89 return m_impl->characters()[index]; 107 if (is8Bit()) 108 return m_impl->characters8()[index]; 109 return m_impl->characters16()[index]; 90 110 } 91 111 … … 101 121 size_t find(const UString& str, unsigned start = 0) const 102 122 { return m_impl ? m_impl->find(str.impl(), start) : notFound; } 103 size_t find(const char* str, unsigned start = 0) const123 size_t find(const LChar* str, unsigned start = 0) const 104 124 { return m_impl ? m_impl->find(str, start) : notFound; } 105 125 … … 116 136 }; 117 137 138 NEVER_INLINE bool equalSlowCase(const UString& s1, const UString& s2); 139 118 140 ALWAYS_INLINE bool operator==(const UString& s1, const UString& s2) 119 141 { 120 142 StringImpl* rep1 = s1.impl(); 121 143 StringImpl* rep2 = s2.impl(); 144 145 if (rep1 == rep2) // If they're the same rep, they're equal. 
146 return true; 147 122 148 unsigned size1 = 0; 123 149 unsigned size2 = 0; 124 150 125 if (rep1 == rep2) // If they're the same rep, they're equal.126 return true;127 128 151 if (rep1) 129 152 size1 = rep1->length(); 130 153 131 154 if (rep2) 132 155 size2 = rep2->length(); 133 156 134 157 if (size1 != size2) // If the lengths are not the same, we're done. 135 158 return false; 136 159 137 160 if (!size1) 138 161 return true; 139 140 // At this point we know 141 // (a) that the strings are the same length and 142 // (b) that they are greater than zero length. 143 const UChar* d1 = rep1->characters(); 144 const UChar* d2 = rep2->characters(); 145 146 if (d1 == d2) // Check to see if the data pointers are the same. 147 return true; 148 149 // Do quick checks for sizes 1 and 2. 150 switch (size1) { 151 case 1: 152 return d1[0] == d2[0]; 153 case 2: 154 return (d1[0] == d2[0]) & (d1[1] == d2[1]); 155 default: 156 return memcmp(d1, d2, size1 * sizeof(UChar)) == 0; 157 } 162 163 if (size1 == 1) 164 return (*rep1)[0] == (*rep2)[0]; 165 166 return equalSlowCase(s1, s2); 158 167 } 159 168 -
trunk/Source/JavaScriptCore/wtf/StringHasher.h
r98495 r98624
     template<typename T> static inline unsigned computeHash(const T* data, unsigned length)
     {
-        return computeHash<T, defaultCoverter>(data, length);
+        return computeHash<T, defaultConverter>(data, length);
     }
 
     template<typename T> static inline unsigned computeHash(const T* data)
     {
-        return computeHash<T, defaultCoverter>(data);
+        return computeHash<T, defaultConverter>(data);
     }
 
…
 
 private:
-    static inline UChar defaultCoverter(UChar ch)
+    static inline UChar defaultConverter(UChar ch)
     {
         return ch;
     }
 
-    static inline UChar defaultCoverter(char ch)
+    static inline UChar defaultConverter(LChar ch)
     {
-        return static_cast<unsigned char>(ch);
+        return ch;
     }
 
trunk/Source/JavaScriptCore/wtf/text/AtomicString.cpp
r94475 r98624
 
 struct CStringTranslator {
-    static unsigned hash(const char* c)
+    static unsigned hash(const LChar* c)
     {
         return StringHasher::computeHash(c);
     }
 
-    static inline bool equal(StringImpl* r, const char* s)
+    static inline bool equal(StringImpl* r, const LChar* s)
     {
         return WTF::equal(r, s);
     }
 
-    static void translate(StringImpl*& location, const char* const& c, unsigned hash)
+    static void translate(StringImpl*& location, const LChar* const& c, unsigned hash)
     {
         location = StringImpl::create(c).leakRef();
…
 };
 
-PassRefPtr<StringImpl> AtomicString::add(const char* c)
+PassRefPtr<StringImpl> AtomicString::add(const LChar* c)
 {
     if (!c)
…
         return StringImpl::empty();
 
-    return addToStringTable<const char*, CStringTranslator>(c);
+    return addToStringTable<const LChar*, CStringTranslator>(c);
 }
 
trunk/Source/JavaScriptCore/wtf/text/AtomicString.h
r94475 r98624 42 42 43 43 AtomicString() { } 44 AtomicString(const LChar* s) : m_string(add(s)) { } 44 45 AtomicString(const char* s) : m_string(add(s)) { } 45 46 AtomicString(const UChar* s, unsigned length) : m_string(add(s, length)) { } … … 67 68 68 69 bool contains(UChar c) const { return m_string.contains(c); } 69 bool contains(const char* s, bool caseSensitive = true) const70 bool contains(const LChar* s, bool caseSensitive = true) const 70 71 { return m_string.contains(s, caseSensitive); } 71 72 bool contains(const String& s, bool caseSensitive = true) const … … 73 74 74 75 size_t find(UChar c, size_t start = 0) const { return m_string.find(c, start); } 75 size_t find(const char* s, size_t start = 0, bool caseSentitive = true) const76 size_t find(const LChar* s, size_t start = 0, bool caseSentitive = true) const 76 77 { return m_string.find(s, start, caseSentitive); } 77 78 size_t find(const String& s, size_t start = 0, bool caseSentitive = true) const … … 120 121 String m_string; 121 122 122 static PassRefPtr<StringImpl> add(const char*); 123 static PassRefPtr<StringImpl> add(const LChar*); 124 ALWAYS_INLINE static PassRefPtr<StringImpl> add(const char* s) { return add(reinterpret_cast<const LChar*>(s)); }; 123 125 static PassRefPtr<StringImpl> add(const UChar*, unsigned length); 126 ALWAYS_INLINE static PassRefPtr<StringImpl> add(const char* s, unsigned length) { return add(reinterpret_cast<const char*>(s), length); }; 124 127 static PassRefPtr<StringImpl> add(const UChar*, unsigned length, unsigned existingHash); 125 128 static PassRefPtr<StringImpl> add(const UChar*); … … 135 138 136 139 inline bool operator==(const AtomicString& a, const AtomicString& b) { return a.impl() == b.impl(); } 137 bool operator==(const AtomicString& a, const char* b);138 inline bool operator==(const AtomicString& a, const char* b) { return WTF::equal(a.impl(), b); }140 bool operator==(const AtomicString&, const LChar*); 141 inline bool operator==(const AtomicString& a, const 
char* b) { return WTF::equal(a.impl(), reinterpret_cast<const LChar*>(b)); } 139 142 inline bool operator==(const AtomicString& a, const Vector<UChar>& b) { return a.impl() && equal(a.impl(), b.data(), b.size()); } 140 143 inline bool operator==(const AtomicString& a, const String& b) { return equal(a.impl(), b.impl()); } 141 inline bool operator==(const char* a, const AtomicString& b) { return b == a; }144 inline bool operator==(const LChar* a, const AtomicString& b) { return b == a; } 142 145 inline bool operator==(const String& a, const AtomicString& b) { return equal(a.impl(), b.impl()); } 143 146 inline bool operator==(const Vector<UChar>& a, const AtomicString& b) { return b == a; } 144 147 145 148 inline bool operator!=(const AtomicString& a, const AtomicString& b) { return a.impl() != b.impl(); } 146 inline bool operator!=(const AtomicString& a, const char *b) { return !(a == b); } 149 inline bool operator!=(const AtomicString& a, const LChar* b) { return !(a == b); } 150 inline bool operator!=(const AtomicString& a, const char* b) { return !(a == b); } 147 151 inline bool operator!=(const AtomicString& a, const String& b) { return !equal(a.impl(), b.impl()); } 148 152 inline bool operator!=(const AtomicString& a, const Vector<UChar>& b) { return !(a == b); } 149 inline bool operator!=(const char* a, const AtomicString& b) { return !(b == a); }153 inline bool operator!=(const LChar* a, const AtomicString& b) { return !(b == a); } 150 154 inline bool operator!=(const String& a, const AtomicString& b) { return !equal(a.impl(), b.impl()); } 151 155 inline bool operator!=(const Vector<UChar>& a, const AtomicString& b) { return !(a == b); } 152 156 153 157 inline bool equalIgnoringCase(const AtomicString& a, const AtomicString& b) { return equalIgnoringCase(a.impl(), b.impl()); } 154 inline bool equalIgnoringCase(const AtomicString& a, const char* b) { return equalIgnoringCase(a.impl(), b); } 158 inline bool equalIgnoringCase(const AtomicString& a, const LChar* 
b) { return equalIgnoringCase(a.impl(), b); } 159 inline bool equalIgnoringCase(const AtomicString& a, const char* b) { return equalIgnoringCase(a.impl(), reinterpret_cast<const LChar*>(b)); } 155 160 inline bool equalIgnoringCase(const AtomicString& a, const String& b) { return equalIgnoringCase(a.impl(), b.impl()); } 156 inline bool equalIgnoringCase(const char* a, const AtomicString& b) { return equalIgnoringCase(a, b.impl()); } 161 inline bool equalIgnoringCase(const LChar* a, const AtomicString& b) { return equalIgnoringCase(a, b.impl()); } 162 inline bool equalIgnoringCase(const char* a, const AtomicString& b) { return equalIgnoringCase(reinterpret_cast<const LChar*>(a), b.impl()); } 157 163 inline bool equalIgnoringCase(const String& a, const AtomicString& b) { return equalIgnoringCase(a.impl(), b.impl()); } 158 164 -
trunk/Source/JavaScriptCore/wtf/text/StringConcatenate.h
r90813 r98624 59 59 60 60 template<> 61 class StringTypeAdapter<LChar> { 62 public: 63 StringTypeAdapter<LChar>(LChar buffer) 64 : m_buffer(buffer) 65 { 66 } 67 68 unsigned length() { return 1; } 69 void writeTo(UChar* destination) { *destination = m_buffer; } 70 71 private: 72 LChar m_buffer; 73 }; 74 75 template<> 61 76 class StringTypeAdapter<UChar> { 62 77 public: … … 98 113 99 114 template<> 115 class StringTypeAdapter<LChar*> { 116 public: 117 StringTypeAdapter<LChar*>(LChar* buffer) 118 : m_buffer(buffer) 119 , m_length(strlen(reinterpret_cast<char*>(buffer))) 120 { 121 } 122 123 unsigned length() { return m_length; } 124 125 void writeTo(UChar* destination) 126 { 127 for (unsigned i = 0; i < m_length; ++i) 128 destination[i] = m_buffer[i]; 129 } 130 131 private: 132 const LChar* m_buffer; 133 unsigned m_length; 134 }; 135 136 template<> 100 137 class StringTypeAdapter<const UChar*> { 101 138 public: … … 150 187 151 188 template<> 189 class StringTypeAdapter<const LChar*> { 190 public: 191 StringTypeAdapter<const LChar*>(const LChar* buffer) 192 : m_buffer(buffer) 193 , m_length(strlen(reinterpret_cast<const char*>(buffer))) 194 { 195 } 196 197 unsigned length() { return m_length; } 198 199 void writeTo(UChar* destination) 200 { 201 for (unsigned i = 0; i < m_length; ++i) 202 destination[i] = m_buffer[i]; 203 } 204 205 private: 206 const LChar* m_buffer; 207 unsigned m_length; 208 }; 209 210 template<> 152 211 class StringTypeAdapter<Vector<char> > { 153 212 public: … … 169 228 private: 170 229 const Vector<char>& m_buffer; 230 }; 231 232 template<> 233 class StringTypeAdapter<Vector<LChar> > { 234 public: 235 StringTypeAdapter<Vector<LChar> >(const Vector<LChar>& buffer) 236 : m_buffer(buffer) 237 { 238 } 239 240 size_t length() { return m_buffer.size(); } 241 242 void writeTo(UChar* destination) 243 { 244 for (size_t i = 0; i < m_buffer.size(); ++i) 245 destination[i] = m_buffer[i]; 246 } 247 248 private: 249 const Vector<LChar>& m_buffer; 171 250 }; 172 
251 -
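The StringTypeAdapter specializations added above share one pattern: each reports a length, then widens every 8-bit LChar into the 16-bit destination buffer. Below is a minimal standalone sketch of that pattern. It assumes stand-in typedefs for LChar and UChar, and the names AdapterSketch and concatenateSketch are hypothetical, not part of WTF.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Stand-ins for WTF's character types (assumption: LChar is an 8-bit
// Latin-1 code unit, UChar is a 16-bit UTF-16 code unit).
typedef unsigned char LChar;
typedef char16_t UChar;

// Hypothetical adapter template; only the two specializations below are defined.
template<typename T> class AdapterSketch;

// Adapter for a NUL-terminated Latin-1 buffer.
template<> class AdapterSketch<const LChar*> {
public:
    explicit AdapterSketch(const LChar* buffer)
        : m_buffer(buffer)
        , m_length(static_cast<unsigned>(std::strlen(reinterpret_cast<const char*>(buffer))))
    {
    }

    unsigned length() const { return m_length; }

    // Each LChar is zero-extended to a UChar, so 0xE9 becomes U+00E9.
    void writeTo(UChar* destination) const
    {
        for (unsigned i = 0; i < m_length; ++i)
            destination[i] = m_buffer[i];
    }

private:
    const LChar* m_buffer;
    unsigned m_length;
};

// Adapter for a single 16-bit character.
template<> class AdapterSketch<UChar> {
public:
    explicit AdapterSketch(UChar c) : m_char(c) { }
    unsigned length() const { return 1; }
    void writeTo(UChar* destination) const { *destination = m_char; }

private:
    UChar m_char;
};

// Sums the adapter lengths, allocates once, then lets each adapter write
// its characters at the right offset.
std::u16string concatenateSketch(const LChar* first, UChar second)
{
    AdapterSketch<const LChar*> a(first);
    AdapterSketch<UChar> b(second);
    std::u16string result(a.length() + b.length(), u'\0');
    a.writeTo(&result[0]);
    b.writeTo(&result[0] + a.length());
    return result;
}
```

Because Latin-1 code points coincide with the first 256 Unicode code points, the widening step needs no lookup table. That is presumably why the adapters in the patch use a plain element-wise assignment.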
trunk/Source/JavaScriptCore/wtf/text/StringHash.h
r95090 r98624 54 54 return false; 55 55 56 if (a->is8Bit()) { 57 if (b->is8Bit()) { 58 // Both a & b are 8 bit. 59 const LChar* aChars = a->characters8(); 60 const LChar* bChars = b->characters8(); 61 62 unsigned i = 0; 63 64 // FIXME: perhaps we should have a more abstract macro that indicates when 65 // going 4 bytes at a time is unsafe 66 #if (CPU(X86) || CPU(X86_64)) 67 const unsigned charsPerInt = sizeof(uint32_t) / sizeof(char); 68 69 if (aLength > charsPerInt) { 70 unsigned stopCount = aLength & ~(charsPerInt - 1); 71 72 const uint32_t* aIntCharacters = reinterpret_cast<const uint32_t*>(aChars); 73 const uint32_t* bIntCharacters = reinterpret_cast<const uint32_t*>(bChars); 74 for (unsigned j = 0; i < stopCount; i += charsPerInt, ++j) { 75 if (aIntCharacters[j] != bIntCharacters[j]) 76 return false; 77 } 78 } 79 #endif 80 for (; i < aLength; ++i) { 81 if (aChars[i] != bChars[i]) 82 return false; 83 } 84 85 return true; 86 } 87 88 // We know that a is 8 bit & b is 16 bit. 89 const LChar* aChars = a->characters8(); 90 const UChar* bChars = b->characters16(); 91 for (unsigned i = 0; i != aLength; ++i) { 92 if (*aChars++ != *bChars++) 93 return false; 94 } 95 96 return true; 97 } 98 99 if (b->is8Bit()) { 100 // We know that a is 16 bit and b is 8 bit. 101 const UChar* aChars = a->characters16(); 102 const LChar* bChars = b->characters8(); 103 for (unsigned i = 0; i != aLength; ++i) { 104 if (*aChars++ != *bChars++) 105 return false; 106 } 107 108 return true; 109 } 110 111 // Both a & b are 16 bit.
56 112 // FIXME: perhaps we should have a more abstract macro that indicates when 57 113 // going 4 bytes at a time is unsafe 58 114 #if CPU(ARM) || CPU(SH4) || CPU(MIPS) || CPU(SPARC) 59 const UChar* aChars = a->characters ();60 const UChar* bChars = b->characters ();115 const UChar* aChars = a->characters16(); 116 const UChar* bChars = b->characters16(); 61 117 for (unsigned i = 0; i != aLength; ++i) { 62 118 if (*aChars++ != *bChars++) … … 66 122 #else 67 123 /* Do it 4-bytes-at-a-time on architectures where it's safe */ 68 const uint32_t* aChars = reinterpret_cast<const uint32_t*>(a->characters ());69 const uint32_t* bChars = reinterpret_cast<const uint32_t*>(b->characters ());124 const uint32_t* aChars = reinterpret_cast<const uint32_t*>(a->characters16()); 125 const uint32_t* bChars = reinterpret_cast<const uint32_t*>(b->characters16()); 70 126 71 127 unsigned halfLength = aLength >> 1; … … 113 169 } 114 170 115 static unsigned hash(const char* data, unsigned length) 116 { 117 return StringHasher::computeHash<char, foldCase<char> >(data, length); 118 } 119 171 static unsigned hash(const LChar* data, unsigned length) 172 { 173 return StringHasher::computeHash<LChar, foldCase<LChar> >(data, length); 174 } 175 176 static inline unsigned hash(const char* data, unsigned length) 177 { 178 return CaseFoldingHash::hash(reinterpret_cast<const LChar*>(data), length); 179 } 180 120 181 static bool equal(const StringImpl* a, const StringImpl* b) 121 182 { -
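The StringHash::equal() change above dispatches on the width of each operand: 8/8, 8/16, 16/8 and 16/16. The sketch below shows only that dispatch. SketchString and equalSketch are hypothetical names, the typedefs are stand-ins for WTF's types, and a plain memcmp replaces the word-at-a-time loop used in the patch for the 8/8 case.

```cpp
#include <cassert>
#include <cstring>

// Stand-ins for WTF's character types (assumption, not the real headers).
typedef unsigned char LChar;
typedef char16_t UChar;

// Hypothetical string holder: exactly one of data8 / data16 is non-null.
struct SketchString {
    const LChar* data8;
    const UChar* data16;
    unsigned length;
    bool is8Bit() const { return data8 != nullptr; }
};

// Element-wise comparison. Both character types promote to int, so an
// 8-bit 0xE9 compares equal to a 16-bit 0x00E9.
template<typename A, typename B>
static bool equalChars(const A* a, const B* b, unsigned length)
{
    for (unsigned i = 0; i < length; ++i) {
        if (a[i] != b[i])
            return false;
    }
    return true;
}

bool equalSketch(const SketchString& a, const SketchString& b)
{
    if (a.length != b.length)
        return false;

    if (a.is8Bit()) {
        if (b.is8Bit())
            return !std::memcmp(a.data8, b.data8, a.length); // both 8 bit
        return equalChars(a.data8, b.data16, a.length);      // a 8 bit, b 16 bit
    }

    if (b.is8Bit())
        return equalChars(a.data16, b.data8, a.length);      // a 16 bit, b 8 bit

    return equalChars(a.data16, b.data16, a.length);         // both 16 bit
}
```

Comparing in place keeps the mixed-width cases from allocating a temporary 16-bit copy, which fits the ChangeLog's stated goal of moving callers away from characters().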
trunk/Source/JavaScriptCore/wtf/text/StringImpl.cpp
r98316 r98624 56 56 57 57 BufferOwnership ownership = bufferOwnership(); 58 59 if (has16BitShadow()) { 60 ASSERT(m_copyData16); 61 fastFree(m_copyData16); 62 } 63 58 64 if (ownership == BufferInternal) 59 65 return; 60 66 if (ownership == BufferOwned) { 61 ASSERT(m_data); 62 fastFree(const_cast<UChar*>(m_data)); 67 // We use m_data8, but since it is a union with m_data16 this works either way. 68 ASSERT(m_data8); 69 fastFree(const_cast<LChar*>(m_data8)); 63 70 return; 64 71 } … … 67 74 ASSERT(m_substringBuffer); 68 75 m_substringBuffer->deref(); 76 } 77 78 PassRefPtr<StringImpl> StringImpl::createUninitialized(unsigned length, LChar*& data) 79 { 80 if (!length) { 81 data = 0; 82 return empty(); 83 } 84 85 // Allocate a single buffer large enough to contain the StringImpl 86 // struct as well as the data which it contains. This removes one 87 // heap allocation from this call. 88 if (length > ((std::numeric_limits<unsigned>::max() - sizeof(StringImpl)) / sizeof(LChar))) 89 CRASH(); 90 size_t size = sizeof(StringImpl) + length * sizeof(LChar); 91 StringImpl* string = static_cast<StringImpl*>(fastMalloc(size)); 92 93 data = reinterpret_cast<LChar*>(string + 1); 94 return adoptRef(new (string) StringImpl(length, Force8BitConstructor)); 69 95 } 70 96 … … 119 145 } 120 146 121 PassRefPtr<StringImpl> StringImpl::create(const char* characters, unsigned length)147 PassRefPtr<StringImpl> StringImpl::create(const LChar* characters, unsigned length) 122 148 { 123 149 if (!characters || !length) … … 133 159 } 134 160 135 PassRefPtr<StringImpl> StringImpl::create(const char* string)161 PassRefPtr<StringImpl> StringImpl::create(const LChar* string) 136 162 { 137 163 if (!string) 138 164 return empty(); 139 size_t length = strlen( string);165 size_t length = strlen(reinterpret_cast<const char*>(string)); 140 166 if (length > numeric_limits<unsigned>::max()) 141 167 CRASH(); … … 143 169 } 144 170 171 const UChar* StringImpl::getData16SlowCase() const 172 { 173 if (has16BitShadow()) 
174 return m_copyData16; 175 176 if (bufferOwnership() == BufferSubstring) { 177 // If this is a substring, return a pointer into the parent string. 178 // TODO: Consider severing this string from the parent string 179 unsigned offset = m_data8 - m_substringBuffer->characters8(); 180 return m_substringBuffer->characters16() + offset; 181 } 182 183 unsigned len = length(); 184 m_copyData16 = static_cast<UChar*>(fastMalloc(len * sizeof(UChar))); 185 for (size_t i = 0; i < len; i++) 186 m_copyData16[i] = m_data8[i]; 187 188 m_hashAndFlags |= s_hashFlagHas16BitShadow; 189 190 return m_copyData16; 191 } 192 145 193 bool StringImpl::containsOnlyWhitespace() 146 194 { … … 148 196 // that are not whitespace from the point of view of RenderText; I wonder if 149 197 // that's a problem in practice. 150 for (unsigned i = 0; i < m_length; i++) 151 if (!isASCIISpace(m_data[i])) 198 if (is8Bit()) { 199 for (unsigned i = 0; i < m_length; i++) { 200 UChar c = m_data8[i]; 201 if (!isASCIISpace(c)) 202 return false; 203 } 204 205 return true; 206 } 207 208 for (unsigned i = 0; i < m_length; i++) { 209 UChar c = m_data16[i]; 210 if (!isASCIISpace(c)) 152 211 return false; 212 } 153 213 return true; 154 214 } … … 164 224 length = maxLength; 165 225 } 166 return create(m_data + start, length); 226 if (is8Bit()) 227 return create(m_data8 + start, length); 228 229 return create(m_data16 + start, length); 167 230 } 168 231 169 232 UChar32 StringImpl::characterStartingAt(unsigned i) 170 233 { 171 if (U16_IS_SINGLE(m_data[i])) 172 return m_data[i]; 173 if (i + 1 < m_length && U16_IS_LEAD(m_data[i]) && U16_IS_TRAIL(m_data[i + 1])) 174 return U16_GET_SUPPLEMENTARY(m_data[i], m_data[i + 1]); 234 if (is8Bit()) 235 return m_data8[i]; 236 if (U16_IS_SINGLE(m_data16[i])) 237 return m_data16[i]; 238 if (i + 1 < m_length && U16_IS_LEAD(m_data16[i]) && U16_IS_TRAIL(m_data16[i + 1])) 239 return U16_GET_SUPPLEMENTARY(m_data16[i], m_data16[i + 1]); 175 240 return 0; 176 241 } … … 182 247 183 248 // 
First scan the string for uppercase and non-ASCII characters: 249 bool noUpper = true; 184 250 UChar ored = 0; 185 bool noUpper = true; 186 const UChar *end = m_data + m_length; 187 for (const UChar* chp = m_data; chp != end; chp++) { 251 if (is8Bit()) { 252 const LChar* end = m_data8 + m_length; 253 for (const LChar* chp = m_data8; chp != end; chp++) { 254 if (UNLIKELY(isASCIIUpper(*chp))) 255 noUpper = false; 256 ored |= *chp; 257 } 258 // Nothing to do if the string is all ASCII with no uppercase. 259 if (noUpper && !(ored & ~0x7F)) 260 return this; 261 262 if (m_length > static_cast<unsigned>(numeric_limits<int32_t>::max())) 263 CRASH(); 264 int32_t length = m_length; 265 266 LChar* data8; 267 RefPtr<StringImpl> newImpl = createUninitialized(length, data8); 268 269 if (!(ored & ~0x7F)) { 270 for (int32_t i = 0; i < length; i++) 271 data8[i] = toASCIILower(m_data8[i]); 272 273 return newImpl.release(); 274 } 275 276 // Do a slower implementation for cases that include non-ASCII Latin-1 characters. 277 for (int32_t i = 0; i < length; i++) 278 data8[i] = static_cast<LChar>(Unicode::toLower(m_data8[i])); 279 280 return newImpl.release(); 281 } 282 283 const UChar *end = m_data16 + m_length; 284 for (const UChar* chp = m_data16; chp != end; chp++) { 188 285 if (UNLIKELY(isASCIIUpper(*chp))) 189 286 noUpper = false; 190 287 ored |= *chp; 191 288 } 192 193 289 // Nothing to do if the string is all ASCII with no uppercase. 194 290 if (noUpper && !(ored & ~0x7F)) … … 199 295 int32_t length = m_length; 200 296 201 UChar* data;202 RefPtr<StringImpl> newImpl = createUninitialized(m_length, data);203 204 297 if (!(ored & ~0x7F)) { 205 // Do a faster loop for the case where all the characters are ASCII. 
206 for (int i = 0; i < length; i++) { 207 UChar c = m_data[i]; 208 data[i] = toASCIILower(c); 209 } 210 return newImpl; 298 UChar* data16; 299 RefPtr<StringImpl> newImpl = createUninitialized(m_length, data16); 300 301 for (int32_t i = 0; i < length; i++) { 302 UChar c = m_data16[i]; 303 data16[i] = toASCIILower(c); 304 } 305 return newImpl.release(); 211 306 } 212 307 213 308 // Do a slower implementation for cases that include non-ASCII characters. 309 UChar* data16; 310 RefPtr<StringImpl> newImpl = createUninitialized(m_length, data16); 311 214 312 bool error; 215 int32_t realLength = Unicode::toLower(data , length, m_data, m_length, &error);313 int32_t realLength = Unicode::toLower(data16, length, m_data16, m_length, &error); 216 314 if (!error && realLength == length) 217 return newImpl; 218 newImpl = createUninitialized(realLength, data); 219 Unicode::toLower(data, realLength, m_data, m_length, &error); 315 return newImpl.release(); 316 317 newImpl = createUninitialized(realLength, data16); 318 Unicode::toLower(data16, realLength, m_data16, m_length, &error); 220 319 if (error) 221 320 return this; 222 return newImpl ;321 return newImpl.release(); 223 322 } 224 323 … … 228 327 // but in empirical testing, few actual calls to upper() are no-ops, so 229 328 // it wouldn't be worth the extra time for pre-scanning. 230 UChar* data;231 RefPtr<StringImpl> newImpl = createUninitialized(m_length, data);232 329 233 330 if (m_length > static_cast<unsigned>(numeric_limits<int32_t>::max())) … … 235 332 int32_t length = m_length; 236 333 334 if (is8Bit()) { 335 LChar* data8; 336 RefPtr<StringImpl> newImpl = createUninitialized(m_length, data8); 337 338 // Do a faster loop for the case where all the characters are ASCII. 
339 char ored = 0; 340 for (int i = 0; i < length; i++) { 341 char c = m_data8[i]; 342 ored |= c; 343 data8[i] = toASCIIUpper(c); 344 } 345 if (!(ored & ~0x7F)) 346 return newImpl.release(); 347 348 // Do a slower implementation for cases that include non-ASCII Latin-1 characters. 349 for (int32_t i = 0; i < length; i++) 350 data8[i] = static_cast<LChar>(Unicode::toUpper(m_data8[i])); 351 352 return newImpl.release(); 353 } 354 355 UChar* data16; 356 RefPtr<StringImpl> newImpl = createUninitialized(m_length, data16); 357 237 358 // Do a faster loop for the case where all the characters are ASCII. 238 359 UChar ored = 0; 239 360 for (int i = 0; i < length; i++) { 240 UChar c = m_data [i];361 UChar c = m_data16[i]; 241 362 ored |= c; 242 data [i] = toASCIIUpper(c);363 data16[i] = toASCIIUpper(c); 243 364 } 244 365 if (!(ored & ~0x7F)) … … 247 368 // Do a slower implementation for cases that include non-ASCII characters. 248 369 bool error; 249 int32_t realLength = Unicode::toUpper(data, length, m_data, m_length, &error); 370 newImpl = createUninitialized(m_length, data16); 371 int32_t realLength = Unicode::toUpper(data16, length, m_data16, m_length, &error); 250 372 if (!error && realLength == length) 251 373 return newImpl; 252 newImpl = createUninitialized(realLength, data );253 Unicode::toUpper(data , realLength, m_data, m_length, &error);374 newImpl = createUninitialized(realLength, data16); 375 Unicode::toUpper(data16, realLength, m_data16, m_length, &error); 254 376 if (error) 255 377 return this; … … 262 384 return this; 263 385 386 if (!(character & ~0x7F)) { 387 LChar* data; 388 RefPtr<StringImpl> newImpl = createUninitialized(m_length, data); 389 for (unsigned i = 0; i < m_length; ++i) 390 data[i] = character; 391 return newImpl.release(); 392 } 264 393 UChar* data; 265 394 RefPtr<StringImpl> newImpl = createUninitialized(m_length, data); … … 271 400 PassRefPtr<StringImpl> StringImpl::foldCase() 272 401 { 273 UChar* data;274 RefPtr<StringImpl> newImpl = 
createUninitialized(m_length, data);275 276 402 if (m_length > static_cast<unsigned>(numeric_limits<int32_t>::max())) 277 403 CRASH(); 278 404 int32_t length = m_length; 279 405 406 if (is8Bit()) { 407 // Do a faster loop for the case where all the characters are ASCII. 408 LChar* data; 409 RefPtr<StringImpl> newImpl = createUninitialized(m_length, data); 410 LChar ored = 0; 411 412 for (int32_t i = 0; i < length; i++) { 413 LChar c = m_data8[i]; 414 data[i] = toASCIILower(c); 415 ored |= c; 416 } 417 418 if (!(ored & ~0x7F)) 419 return newImpl.release(); 420 421 // Do a slower implementation for cases that include non-ASCII Latin-1 characters. 422 for (int32_t i = 0; i < length; i++) 423 data[i] = static_cast<LChar>(Unicode::toLower(m_data8[i])); return newImpl.release(); 424 } 425 280 426 // Do a faster loop for the case where all the characters are ASCII. 427 UChar* data; 428 RefPtr<StringImpl> newImpl = createUninitialized(m_length, data); 281 429 UChar ored = 0; 282 430 for (int32_t i = 0; i < length; i++) { 283 UChar c = m_data [i];431 UChar c = m_data16[i]; 284 432 ored |= c; 285 433 data[i] = toASCIILower(c); … … 290 438 // Do a slower implementation for cases that include non-ASCII characters. 291 439 bool error; 292 int32_t realLength = Unicode::foldCase(data, length, m_data , m_length, &error);440 int32_t realLength = Unicode::foldCase(data, length, m_data16, m_length, &error); 293 441 if (!error && realLength == length) 294 442 return newImpl.release(); 295 443 newImpl = createUninitialized(realLength, data); 296 Unicode::foldCase(data, realLength, m_data , m_length, &error);444 Unicode::foldCase(data, realLength, m_data16, m_length, &error); 297 445 if (error) 298 446 return this; … … 310 458 311 459 // skip white space from start 312 while (start <= end && predicate( m_data[start]))460 while (start <= end && predicate(is8Bit() ? 
m_data8[start] : m_data16[start])) 313 461 start++; 314 462 … … 318 466 319 467 // skip white space from end 320 while (end && predicate( m_data[end]))468 while (end && predicate(is8Bit() ? m_data8[end] : m_data16[end])) 321 469 end--; 322 470 323 471 if (!start && end == m_length - 1) 324 472 return this; 325 return create(m_data + start, end + 1 - start); 473 if (is8Bit()) 474 return create(m_data8 + start, end + 1 - start); 475 return create(m_data16 + start, end + 1 - start); 326 476 } 327 477 … … 357 507 } 358 508 509 // FIXME: Add 8-bit path. Likely requires templatized StringBuffer class 359 510 PassRefPtr<StringImpl> StringImpl::removeCharacters(CharacterMatchFunctionPtr findMatch) 360 511 { 361 const UChar* from = m_data;512 const UChar* from = characters(); 362 513 const UChar* fromend = from + m_length; 363 514 … … 370 521 StringBuffer data(m_length); 371 522 UChar* to = data.characters(); 372 unsigned outc = from - m_data;523 unsigned outc = from - characters16(); 373 524 374 525 if (outc) 375 memcpy(to, m_data, outc * sizeof(UChar));526 memcpy(to, characters16(), outc * sizeof(UChar)); 376 527 377 528 while (true) { … … 389 540 } 390 541 542 // FIXME: Add 8-bit path. 
Likely requires templatized StringBuffer class 391 543 template <class UCharPredicate> 392 544 inline PassRefPtr<StringImpl> StringImpl::simplifyMatchedCharactersToSpace(UCharPredicate predicate) … … 394 546 StringBuffer data(m_length); 395 547 396 const UChar* from = m_data;548 const UChar* from = characters16(); 397 549 const UChar* fromend = from + m_length; 398 550 int outc = 0; … … 438 590 int StringImpl::toIntStrict(bool* ok, int base) 439 591 { 440 return charactersToIntStrict( m_data, m_length, ok, base);592 return charactersToIntStrict(characters16(), m_length, ok, base); 441 593 } 442 594 443 595 unsigned StringImpl::toUIntStrict(bool* ok, int base) 444 596 { 445 return charactersToUIntStrict( m_data, m_length, ok, base);597 return charactersToUIntStrict(characters16(), m_length, ok, base); 446 598 } 447 599 448 600 int64_t StringImpl::toInt64Strict(bool* ok, int base) 449 601 { 450 return charactersToInt64Strict(m_data, m_length, ok, base); 602 return charactersToInt64Strict(characters16(), m_length, ok, base); 603 451 604 } 452 605 453 606 uint64_t StringImpl::toUInt64Strict(bool* ok, int base) 454 607 { 455 return charactersToUInt64Strict( m_data, m_length, ok, base);608 return charactersToUInt64Strict(characters16(), m_length, ok, base); 456 609 } 457 610 458 611 intptr_t StringImpl::toIntPtrStrict(bool* ok, int base) 459 612 { 460 return charactersToIntPtrStrict( m_data, m_length, ok, base);613 return charactersToIntPtrStrict(characters16(), m_length, ok, base); 461 614 } 462 615 463 616 int StringImpl::toInt(bool* ok) 464 617 { 465 return charactersToInt( m_data, m_length, ok);618 return charactersToInt(characters16(), m_length, ok); 466 619 } 467 620 468 621 unsigned StringImpl::toUInt(bool* ok) 469 622 { 470 return charactersToUInt( m_data, m_length, ok);623 return charactersToUInt(characters16(), m_length, ok); 471 624 } 472 625 473 626 int64_t StringImpl::toInt64(bool* ok) 474 627 { 475 return charactersToInt64( m_data, m_length, ok);628 return 
charactersToInt64(characters16(), m_length, ok); 476 629 } 477 630 478 631 uint64_t StringImpl::toUInt64(bool* ok) 479 632 { 480 return charactersToUInt64( m_data, m_length, ok);633 return charactersToUInt64(characters16(), m_length, ok); 481 634 } 482 635 483 636 intptr_t StringImpl::toIntPtr(bool* ok) 484 637 { 485 return charactersToIntPtr(m_data, m_length, ok); 638 return charactersToIntPtr(characters16(), m_length, ok); 639 486 640 } 487 641 488 642 double StringImpl::toDouble(bool* ok, bool* didReadNumber) 489 643 { 490 return charactersToDouble( m_data, m_length, ok, didReadNumber);644 return charactersToDouble(characters16(), m_length, ok, didReadNumber); 491 645 } 492 646 493 647 float StringImpl::toFloat(bool* ok, bool* didReadNumber) 494 648 { 495 return charactersToFloat( m_data, m_length, ok, didReadNumber);496 } 497 498 static bool equal(const UChar* a, const char* b, int length)649 return charactersToFloat(characters16(), m_length, ok, didReadNumber); 650 } 651 652 static bool equal(const UChar* a, const LChar* b, int length) 499 653 { 500 654 ASSERT(length >= 0); … … 507 661 } 508 662 509 bool equalIgnoringCase(const UChar* a, const char* b, unsigned length)663 bool equalIgnoringCase(const UChar* a, const LChar* b, unsigned length) 510 664 { 511 665 while (length--) { … … 548 702 size_t StringImpl::find(UChar c, unsigned start) 549 703 { 550 return WTF::find( m_data, m_length, c, start);704 return WTF::find(characters16(), m_length, c, start); 551 705 } 552 706 553 707 size_t StringImpl::find(CharacterMatchFunctionPtr matchFunction, unsigned start) 554 708 { 555 return WTF::find( m_data, m_length, matchFunction, start);556 } 557 558 size_t StringImpl::find(const char* matchString, unsigned index)709 return WTF::find(characters16(), m_length, matchFunction, start); 710 } 711 712 size_t StringImpl::find(const LChar* matchString, unsigned index) 559 713 { 560 714 // Check for null or empty string to match against 561 715 if (!matchString) 562 716 
return notFound; 563 size_t matchStringLength = strlen( matchString);717 size_t matchStringLength = strlen(reinterpret_cast<const char*>(matchString)); 564 718 if (matchStringLength > numeric_limits<unsigned>::max()) 565 719 CRASH(); … … 570 724 // Optimization 1: fast case for strings of length 1. 571 725 if (matchLength == 1) 572 return WTF::find(characters (), length(), *(const unsigned char*)matchString, index);726 return WTF::find(characters16(), length(), *matchString, index); 573 727 574 728 // Check index & matchLength are in range. … … 582 736 583 737 const UChar* searchCharacters = characters() + index; 584 const unsigned char* matchCharacters = (const unsigned char*)matchString;585 738 586 739 // Optimization 2: keep a running hash of the strings, … … 590 743 for (unsigned i = 0; i < matchLength; ++i) { 591 744 searchHash += searchCharacters[i]; 592 matchHash += match Characters[i];745 matchHash += matchString[i]; 593 746 } 594 747 … … 605 758 } 606 759 607 size_t StringImpl::findIgnoringCase(const char* matchString, unsigned index)760 size_t StringImpl::findIgnoringCase(const LChar* matchString, unsigned index) 608 761 { 609 762 // Check for null or empty string to match against 610 763 if (!matchString) 611 764 return notFound; 612 size_t matchStringLength = strlen( matchString);765 size_t matchStringLength = strlen(reinterpret_cast<const char*>(matchString)); 613 766 if (matchStringLength > numeric_limits<unsigned>::max()) 614 767 CRASH(); … … 649 802 // Optimization 1: fast case for strings of length 1. 650 803 if (matchLength == 1) 651 return WTF::find(characters (), length(), matchString->characters()[0], index);804 return WTF::find(characters16(), length(), matchString->characters16()[0], index); 652 805 653 806 // Check index & matchLength are in range. 
… … 717 870 size_t StringImpl::reverseFind(UChar c, unsigned index) 718 871 { 719 return WTF::reverseFind( m_data, m_length, c, index);872 return WTF::reverseFind(characters16(), m_length, c, index); 720 873 } 721 874 … … 731 884 // Optimization 1: fast case for strings of length 1. 732 885 if (matchLength == 1) 733 return WTF::reverseFind(characters (), length(), matchString->characters()[0], index);886 return WTF::reverseFind(characters16(), length(), matchString->characters()[0], index); 734 887 735 888 // Check index & matchLength are in range. … … 804 957 return this; 805 958 unsigned i; 806 for (i = 0; i != m_length; ++i) 807 if (m_data[i] == oldC) 959 for (i = 0; i != m_length; ++i) { 960 UChar c = is8Bit() ? m_data8[i] : m_data16[i]; 961 if (c == oldC) 808 962 break; 963 } 809 964 if (i == m_length) 810 965 return this; 811 966 967 if (is8Bit()) { 968 if (oldC > 0xff) 969 // Looking for a 16 bit char in an 8 bit string, we're done. 970 return this; 971 972 if (newC <= 0xff) { 973 LChar* data; 974 LChar oldChar = static_cast<LChar>(oldC); 975 LChar newChar = static_cast<LChar>(newC); 976 977 RefPtr<StringImpl> newImpl = createUninitialized(m_length, data); 978 979 for (i = 0; i != m_length; ++i) { 980 LChar ch = m_data8[i]; 981 if (ch == oldChar) 982 ch = newChar; 983 data[i] = ch; 984 } 985 return newImpl.release(); 986 } 987 988 // There is the possibility we need to up convert from 8 to 16 bit, 989 // create a 16 bit string for the result. 
990 UChar* data; 991 RefPtr<StringImpl> newImpl = createUninitialized(m_length, data); 992 993 for (i = 0; i != m_length; ++i) { 994 UChar ch = m_data8[i]; 995 if (ch == oldC) 996 ch = newC; 997 data[i] = ch; 998 } 999 1000 return newImpl.release(); 1001 } 1002 812 1003 UChar* data; 813 1004 RefPtr<StringImpl> newImpl = createUninitialized(m_length, data); 814 1005 815 1006 for (i = 0; i != m_length; ++i) { 816 UChar ch = m_data [i];1007 UChar ch = m_data16[i]; 817 1008 if (ch == oldC) 818 1009 ch = newC; … … 829 1020 if (!lengthToReplace && !lengthToInsert) 830 1021 return this; 831 UChar* data;832 1022 833 1023 if ((length() - lengthToReplace) >= (numeric_limits<unsigned>::max() - lengthToInsert)) 834 1024 CRASH(); 835 1025 1026 if (is8Bit() && (!str || str->is8Bit())) { 1027 LChar* data; 1028 RefPtr<StringImpl> newImpl = 1029 createUninitialized(length() - lengthToReplace + lengthToInsert, data); 1030 memcpy(data, m_data8, position * sizeof(LChar)); 1031 if (str) 1032 memcpy(data + position, str->m_data8, lengthToInsert * sizeof(LChar)); 1033 memcpy(data + position + lengthToInsert, m_data8 + position + lengthToReplace, 1034 (length() - position - lengthToReplace) * sizeof(LChar)); 1035 return newImpl.release(); 1036 } 1037 UChar* data; 836 1038 RefPtr<StringImpl> newImpl = 837 1039 createUninitialized(length() - lengthToReplace + lengthToInsert, data); 838 memcpy(data, characters(), position * sizeof(UChar)); 839 if (str) 840 memcpy(data + position, str->characters(), lengthToInsert * sizeof(UChar)); 841 memcpy(data + position + lengthToInsert, characters() + position + lengthToReplace, 842 (length() - position - lengthToReplace) * sizeof(UChar)); 1040 if (is8Bit()) 1041 for (unsigned i = 0; i < position; i++) 1042 data[i] = m_data8[i]; 1043 else 1044 memcpy(data, m_data16, position * sizeof(UChar)); 1045 if (str) { 1046 if (str->is8Bit()) 1047 for (unsigned i = 0; i < lengthToInsert; i++) 1048 data[i + position] = str->m_data8[i]; 1049 else 1050 memcpy(data + 
position, str->m_data16, lengthToInsert * sizeof(UChar)); 1051 } 1052 if (is8Bit()) { 1053 for (unsigned i = 0; i < length() - position - lengthToReplace; i++) 1054 data[i + position + lengthToInsert] = m_data8[i + position + lengthToReplace]; 1055 } else { 1056 memcpy(data + position + lengthToInsert, characters() + position + lengthToReplace, 1057 (length() - position - lengthToReplace) * sizeof(UChar)); 1058 } 843 1059 return newImpl.release(); 844 1060 } … … 853 1069 unsigned matchCount = 0; 854 1070 855 // Count the matches 1071 // Count the matches. 856 1072 while ((srcSegmentStart = find(pattern, srcSegmentStart)) != notFound) { 857 1073 ++matchCount; … … 859 1075 } 860 1076 861 // If we have 0 matches , we don't have to do any more work1077 // If we have 0 matches then we don't have to do any more work to do. 862 1078 if (!matchCount) 863 1079 return this; … … 873 1089 newSize += replaceSize; 874 1090 875 UChar* data; 876 RefPtr<StringImpl> newImpl = createUninitialized(newSize, data); 877 878 // Construct the new data 1091 // Construct the new data. 879 1092 size_t srcSegmentEnd; 880 1093 unsigned srcSegmentLength; 881 1094 srcSegmentStart = 0; 882 1095 unsigned dstOffset = 0; 883 1096 bool srcIs8Bit = is8Bit(); 1097 bool replacementIs8Bit = replacement->is8Bit(); 1098 1099 // There are 4 cases: 1100 // 1. This and replacement are both 8 bit. 1101 // 2. This and replacement are both 16 bit. 1102 // 3. This is 8 bit and replacement is 16 bit. 1103 // 4. This is 16 bit and replacement is 8 bit. 
1104 if (srcIs8Bit && replacementIs8Bit) { 1105 // Case 1 1106 LChar* data; 1107 RefPtr<StringImpl> newImpl = createUninitialized(newSize, data); 1108 1109 while ((srcSegmentEnd = find(pattern, srcSegmentStart)) != notFound) { 1110 srcSegmentLength = srcSegmentEnd - srcSegmentStart; 1111 memcpy(data + dstOffset, m_data8 + srcSegmentStart, srcSegmentLength * sizeof(LChar)); 1112 dstOffset += srcSegmentLength; 1113 memcpy(data + dstOffset, replacement->m_data8, repStrLength * sizeof(LChar)); 1114 dstOffset += repStrLength; 1115 srcSegmentStart = srcSegmentEnd + 1; 1116 } 1117 1118 srcSegmentLength = m_length - srcSegmentStart; 1119 memcpy(data + dstOffset, m_data8 + srcSegmentStart, srcSegmentLength * sizeof(LChar)); 1120 1121 ASSERT(dstOffset + srcSegmentLength == newImpl->length()); 1122 1123 return newImpl.release(); 1124 } 1125 1126 UChar* data; 1127 RefPtr<StringImpl> newImpl = createUninitialized(newSize, data); 1128 884 1129 while ((srcSegmentEnd = find(pattern, srcSegmentStart)) != notFound) { 885 1130 srcSegmentLength = srcSegmentEnd - srcSegmentStart; 886 memcpy(data + dstOffset, m_data + srcSegmentStart, srcSegmentLength * sizeof(UChar)); 1131 if (srcIs8Bit) { 1132 // Case 3. 1133 for (unsigned i = 0; i < srcSegmentLength; i++) 1134 data[i + dstOffset] = m_data8[i + srcSegmentStart]; 1135 } else { 1136 // Cases 2 & 4. 1137 memcpy(data + dstOffset, m_data16 + srcSegmentStart, srcSegmentLength * sizeof(UChar)); 1138 } 887 1139 dstOffset += srcSegmentLength; 888 memcpy(data + dstOffset, replacement->m_data, repStrLength * sizeof(UChar)); 1140 if (replacementIs8Bit) { 1141 // Case 4. 1142 for (unsigned i = 0; i < repStrLength; i++) 1143 data[i + dstOffset] = replacement->m_data8[i]; 1144 } else { 1145 // Cases 2 & 3. 
1146 memcpy(data + dstOffset, replacement->m_data16, repStrLength * sizeof(UChar)); 1147 } 889 1148 dstOffset += repStrLength; 890 1149 srcSegmentStart = srcSegmentEnd + 1; … … 892 1151 893 1152 srcSegmentLength = m_length - srcSegmentStart; 894 memcpy(data + dstOffset, m_data + srcSegmentStart, srcSegmentLength * sizeof(UChar)); 1153 if (srcIs8Bit) { 1154 // Case 3. 1155 for (unsigned i = 0; i < srcSegmentLength; i++) 1156 data[i + dstOffset] = m_data8[i + srcSegmentStart]; 1157 } else { 1158 // Cases 2 & 4. 1159 memcpy(data + dstOffset, m_data16 + srcSegmentStart, srcSegmentLength * sizeof(UChar)); 1160 } 895 1161 896 1162 ASSERT(dstOffset + srcSegmentLength == newImpl->length()); … … 912 1178 unsigned matchCount = 0; 913 1179 914 // Count the matches 1180 // Count the matches. 915 1181 while ((srcSegmentStart = find(pattern, srcSegmentStart)) != notFound) { 916 1182 ++matchCount; … … 931 1197 newSize += matchCount * repStrLength; 932 1198 933 UChar* data;934 RefPtr<StringImpl> newImpl = createUninitialized(newSize, data);935 1199 936 1200 // Construct the new data … … 939 1203 srcSegmentStart = 0; 940 1204 unsigned dstOffset = 0; 941 1205 bool srcIs8Bit = is8Bit(); 1206 bool replacementIs8Bit = replacement->is8Bit(); 1207 1208 // There are 4 cases: 1209 // 1. This and replacement are both 8 bit. 1210 // 2. This and replacement are both 16 bit. 1211 // 3. This is 8 bit and replacement is 16 bit. 1212 // 4. This is 16 bit and replacement is 8 bit. 
1213 if (srcIs8Bit && replacementIs8Bit) { 1214 // Case 1 1215 LChar* data; 1216 RefPtr<StringImpl> newImpl = createUninitialized(newSize, data); 1217 while ((srcSegmentEnd = find(pattern, srcSegmentStart)) != notFound) { 1218 srcSegmentLength = srcSegmentEnd - srcSegmentStart; 1219 memcpy(data + dstOffset, m_data8 + srcSegmentStart, srcSegmentLength * sizeof(LChar)); 1220 dstOffset += srcSegmentLength; 1221 memcpy(data + dstOffset, replacement->m_data8, repStrLength * sizeof(LChar)); 1222 dstOffset += repStrLength; 1223 srcSegmentStart = srcSegmentEnd + patternLength; 1224 } 1225 1226 srcSegmentLength = m_length - srcSegmentStart; 1227 memcpy(data + dstOffset, m_data8 + srcSegmentStart, srcSegmentLength * sizeof(LChar)); 1228 1229 ASSERT(dstOffset + srcSegmentLength == newImpl->length()); 1230 1231 return newImpl.release(); 1232 } 1233 1234 UChar* data; 1235 RefPtr<StringImpl> newImpl = createUninitialized(newSize, data); 942 1236 while ((srcSegmentEnd = find(pattern, srcSegmentStart)) != notFound) { 943 1237 srcSegmentLength = srcSegmentEnd - srcSegmentStart; 944 memcpy(data + dstOffset, m_data + srcSegmentStart, srcSegmentLength * sizeof(UChar)); 1238 if (srcIs8Bit) { 1239 // Case 3. 1240 for (unsigned i = 0; i < srcSegmentLength; i++) 1241 data[i + dstOffset] = m_data8[i + srcSegmentStart]; 1242 } else { 1243 // Cases 2 & 4. 1244 memcpy(data + dstOffset, m_data16 + srcSegmentStart, srcSegmentLength * sizeof(UChar)); 1245 } 945 1246 dstOffset += srcSegmentLength; 946 memcpy(data + dstOffset, replacement->m_data, repStrLength * sizeof(UChar)); 1247 if (replacementIs8Bit) { 1248 // Case 4. 
1249 for (unsigned i = 0; i < repStrLength; i++) 1250 data[i + dstOffset] = replacement->m_data8[i]; 1251 } else { 1252 // Cases 2 & 3. 1253 memcpy(data + dstOffset, replacement->m_data16, repStrLength * sizeof(UChar)); 1254 } 947 1255 dstOffset += repStrLength; 948 1256 srcSegmentStart = srcSegmentEnd + patternLength; … … 950 1258 951 1259 srcSegmentLength = m_length - srcSegmentStart; 952 memcpy(data + dstOffset, m_data + srcSegmentStart, srcSegmentLength * sizeof(UChar)); 1260 if (srcIs8Bit) { 1261 // Case 3. 1262 for (unsigned i = 0; i < srcSegmentLength; i++) 1263 data[i + dstOffset] = m_data8[i + srcSegmentStart]; 1264 } else { 1265 // Cases 2 & 4. 1266 memcpy(data + dstOffset, m_data16 + srcSegmentStart, srcSegmentLength * sizeof(UChar)); 1267 } 953 1268 954 1269 ASSERT(dstOffset + srcSegmentLength == newImpl->length()); … … 962 1277 } 963 1278 964 bool equal(const StringImpl* a, const char* b)1279 bool equal(const StringImpl* a, const LChar* b, unsigned length) 965 1280 { 966 1281 if (!a) … … 969 1284 return !a; 970 1285 1286 if (length != a->length()) 1287 return false; 1288 1289 if (a->is8Bit()) { 1290 const LChar* aChars = a->characters8(); 1291 1292 for (unsigned i = 0; i != length; ++i) { 1293 unsigned char bc = b[i]; 1294 unsigned char ac = aChars[i]; 1295 if (!bc) 1296 return false; 1297 if (ac != bc) 1298 return false; 1299 } 1300 1301 return true; 1302 } 1303 1304 const UChar* aChars = a->characters16(); 1305 1306 for (unsigned i = 0; i != length; ++i) { 1307 UChar bc = b[i]; 1308 UChar ac = aChars[i]; 1309 if (!bc) 1310 return false; 1311 if (ac != bc) 1312 return false; 1313 } 1314 1315 return true; 1316 } 1317 1318 bool equal(const StringImpl* a, const LChar* b) 1319 { 1320 if (!a) 1321 return !b; 1322 if (!b) 1323 return !a; 1324 971 1325 unsigned length = a->length(); 972 const UChar* as = a->characters(); 1326 1327 if (a->is8Bit()) { 1328 const LChar* aPtr = a->characters8(); 1329 for (unsigned i = 0; i != length; ++i) { 1330 unsigned char bc = 
b[i]; 1331 unsigned char ac = aPtr[i]; 1332 if (!bc) 1333 return false; 1334 if (ac != bc) 1335 return false; 1336 } 1337 1338 return !b[length]; 1339 } 1340 1341 const UChar* aPtr = a->characters16(); 973 1342 for (unsigned i = 0; i != length; ++i) { 974 1343 unsigned char bc = b[i]; 975 1344 if (!bc) 976 1345 return false; 977 if (a s[i] != bc)1346 if (aPtr[i] != bc) 978 1347 return false; 979 1348 } … … 1000 1369 return true; 1001 1370 #else 1002 /* Do it 4-bytes-at-a-time on architectures where it's safe */ 1003 1004 const uint32_t* aCharacters = reinterpret_cast<const uint32_t*>(a->characters()); 1371 if (a->is8Bit()) { 1372 const LChar* as = a->characters8(); 1373 for (unsigned i = 0; i != length; ++i) 1374 if (as[i] != b[i]) 1375 return false; 1376 return true; 1377 } 1378 1379 // Do comparison 4-bytes-at-a-time on architectures where it's safe. 1380 1381 const uint32_t* aCharacters = reinterpret_cast<const uint32_t*>(a->characters16()); 1005 1382 const uint32_t* bCharacters = reinterpret_cast<const uint32_t*>(b); 1006 1383 1007 1384 unsigned halfLength = length >> 1; 1008 1385 for (unsigned i = 0; i != halfLength; ++i) { … … 1010 1387 return false; 1011 1388 } 1012 1389 1013 1390 if (length & 1 && *reinterpret_cast<const uint16_t*>(aCharacters) != *reinterpret_cast<const uint16_t*>(bCharacters)) 1014 1391 return false; 1015 1392 1016 1393 return true; 1017 1394 #endif … … 1023 1400 } 1024 1401 1025 bool equalIgnoringCase(StringImpl* a, const char* b)1402 bool equalIgnoringCase(StringImpl* a, const LChar* b) 1026 1403 { 1027 1404 if (!a) … … 1031 1408 1032 1409 unsigned length = a->length(); 1033 const UChar* as = a->characters();1034 1410 1035 1411 // Do a faster loop for the case where all the characters are ASCII. 
1036 1412 UChar ored = 0; 1037 1413 bool equal = true; 1414 if (a->is8Bit()) { 1415 const LChar* as = a->characters8(); 1416 for (unsigned i = 0; i != length; ++i) { 1417 LChar bc = b[i]; 1418 if (!bc) 1419 return false; 1420 UChar ac = as[i]; 1421 ored |= ac; 1422 equal = equal && (toASCIILower(ac) == toASCIILower(bc)); 1423 } 1424 1425 // Do a slower implementation for cases that include non-ASCII characters. 1426 if (ored & ~0x7F) { 1427 equal = true; 1428 for (unsigned i = 0; i != length; ++i) 1429 equal = equal && (foldCase(as[i]) == foldCase(b[i])); 1430 } 1431 1432 return equal && !b[length]; 1433 } 1434 1435 const UChar* as = a->characters16(); 1038 1436 for (unsigned i = 0; i != length; ++i) { 1039 char bc = b[i];1437 LChar bc = b[i]; 1040 1438 if (!bc) 1041 1439 return false; … … 1049 1447 equal = true; 1050 1448 for (unsigned i = 0; i != length; ++i) { 1051 unsigned char bc = b[i]; 1052 equal = equal && (foldCase(as[i]) == foldCase(bc)); 1449 equal = equal && (foldCase(as[i]) == foldCase(b[i])); 1053 1450 } 1054 1451 } … … 1072 1469 { 1073 1470 for (unsigned i = 0; i < m_length; ++i) { 1074 WTF::Unicode::Direction charDirection = WTF::Unicode::direction( m_data[i]);1471 WTF::Unicode::Direction charDirection = WTF::Unicode::direction(is8Bit() ? m_data8[i] : m_data16[i]); 1075 1472 if (charDirection == WTF::Unicode::LeftToRight) { 1076 1473 if (hasStrongDirectionality) … … 1091 1488 PassRefPtr<StringImpl> StringImpl::adopt(StringBuffer& buffer) 1092 1489 { 1490 // FIXME: handle 8-bit StringBuffer when it exists. 1093 1491 unsigned length = buffer.length(); 1094 1492 if (length == 0) … … 1101 1499 // Use createUninitialized instead of 'new StringImpl' so that the string and its buffer 1102 1500 // get allocated in a single memory block. 
1103 UChar* data;1104 1501 unsigned length = string.m_length; 1105 1502 if (length >= numeric_limits<unsigned>::max()) 1106 1503 CRASH(); 1107 RefPtr<StringImpl> terminatedString = createUninitialized(length + 1, data); 1108 memcpy(data, string.m_data, length * sizeof(UChar)); 1109 data[length] = 0; 1504 RefPtr<StringImpl> terminatedString; 1505 if (string.is8Bit()) { 1506 LChar* data; 1507 terminatedString = createUninitialized(length + 1, data); 1508 memcpy(data, string.m_data8, length * sizeof(LChar)); 1509 data[length] = 0; 1510 } else { 1511 UChar* data; 1512 terminatedString = createUninitialized(length + 1, data); 1513 memcpy(data, string.m_data16, length * sizeof(UChar)); 1514 data[length] = 0; 1515 } 1110 1516 terminatedString->m_length--; 1111 1517 terminatedString->m_hashAndFlags = (string.m_hashAndFlags & ~s_flagMask) | s_hashFlagHasTerminatingNullCharacter; -
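The comparison loops added to StringImpl.cpp above share one shape for both buffer widths: walk the StringImpl's characters, reject the C string if it ends early or differs, then require the C string to end exactly at `length`. The following is a minimal sketch of that shape only, not the code from the patch; `equalToNullTerminated` is a hypothetical name, and `char16_t` stands in for `UChar` as an assumption.

```cpp
#include <cassert>
#include <cstddef>

typedef unsigned char LChar; // same typedef the patch adds to Unicode.h

// Hypothetical helper: compare `length` characters of `a` (8-bit or 16-bit)
// against the null-terminated Latin-1 string `b`.
template <typename CharType>
bool equalToNullTerminated(const CharType* a, size_t length, const LChar* b)
{
    for (size_t i = 0; i != length; ++i) {
        LChar bc = b[i];
        if (!bc)          // b ended before a did
            return false;
        if (a[i] != bc)   // integer promotion makes the mixed-width compare safe
            return false;
    }
    return !b[length];    // b must not be longer than a
}
```

One template covers both widths here; the patch instead writes the two branches out by hand behind an `is8Bit()` check.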
trunk/Source/JavaScriptCore/wtf/text/StringImpl.h
r98495 r98624 45 45 namespace JSC { 46 46 struct IdentifierCStringTranslator; 47 struct IdentifierCharBufferTranslator; 47 48 struct IdentifierUCharBufferTranslator; 48 49 } … … 63 64 WTF_MAKE_NONCOPYABLE(StringImpl); WTF_MAKE_FAST_ALLOCATED; 64 65 friend struct JSC::IdentifierCStringTranslator; 66 friend struct JSC::IdentifierCharBufferTranslator; 65 67 friend struct JSC::IdentifierUCharBufferTranslator; 66 68 friend struct WTF::CStringTranslator; … … 84 86 : m_refCount(s_refCountFlagIsStaticString) 85 87 , m_length(length) 86 , m_data (characters)88 , m_data16(characters) 87 89 , m_buffer(0) 88 90 , m_hashAndFlags(s_hashFlagIsIdentifier | BufferOwned) … … 94 96 } 95 97 96 // Create a normal string with internal storage (BufferInternal) 98 // Used to construct static strings, which have an special refCount that can never hit zero. 99 // This means that the static string will never be destroyed, which is important because 100 // static strings will be shared across threads & ref-counted in a non-threadsafe manner. 101 StringImpl(const LChar* characters, unsigned length, ConstructStaticStringTag) 102 : m_refCount(s_refCountFlagIsStaticString) 103 , m_length(length) 104 , m_data8(characters) 105 , m_buffer(0) 106 , m_hashAndFlags(s_hashFlag8BitBuffer | s_hashFlagIsIdentifier | BufferOwned) 107 { 108 // Ensure that the hash is computed so that AtomicStringHash can call existingHash() 109 // with impunity. The empty string is special because it is never entered into 110 // AtomicString's HashKey, but still needs to compare correctly. 111 hash(); 112 } 113 114 // FIXME: there has to be a less hacky way to do this. 
115 enum Force8Bit { Force8BitConstructor }; 116 // Create a normal 8-bit string with internal storage (BufferInternal) 117 StringImpl(unsigned length, Force8Bit) 118 : m_refCount(s_refCountIncrement) 119 , m_length(length) 120 , m_data8(reinterpret_cast<const LChar*>(this + 1)) 121 , m_buffer(0) 122 , m_hashAndFlags(s_hashFlag8BitBuffer | BufferInternal) 123 { 124 ASSERT(m_data8); 125 ASSERT(m_length); 126 } 127 128 // Create a normal 16-bit string with internal storage (BufferInternal) 97 129 StringImpl(unsigned length) 98 130 : m_refCount(s_refCountIncrement) 99 131 , m_length(length) 100 , m_data (reinterpret_cast<const UChar*>(this + 1))132 , m_data16(reinterpret_cast<const UChar*>(this + 1)) 101 133 , m_buffer(0) 102 134 , m_hashAndFlags(BufferInternal) 103 135 { 104 ASSERT(m_data );136 ASSERT(m_data16); 105 137 ASSERT(m_length); 106 138 } … … 110 142 : m_refCount(s_refCountIncrement) 111 143 , m_length(length) 112 , m_data (characters)144 , m_data16(characters) 113 145 , m_buffer(0) 114 146 , m_hashAndFlags(BufferOwned) 115 147 { 116 ASSERT(m_data );148 ASSERT(m_data16); 117 149 ASSERT(m_length); 118 150 } 119 151 120 // Used to create new strings that are a substring of an existing StringImpl (BufferSubstring) 152 // Used to create new strings that are a substring of an existing 8-bit StringImpl (BufferSubstring) 153 StringImpl(const LChar* characters, unsigned length, PassRefPtr<StringImpl> base) 154 : m_refCount(s_refCountIncrement) 155 , m_length(length) 156 , m_data8(characters) 157 , m_substringBuffer(base.leakRef()) 158 , m_hashAndFlags(s_hashFlag8BitBuffer | BufferSubstring) 159 { 160 ASSERT(is8Bit()); 161 ASSERT(m_data8); 162 ASSERT(m_length); 163 ASSERT(m_substringBuffer->bufferOwnership() != BufferSubstring); 164 } 165 166 // Used to create new strings that are a substring of an existing 16-bit StringImpl (BufferSubstring) 121 167 StringImpl(const UChar* characters, unsigned length, PassRefPtr<StringImpl> base) 122 168 : 
m_refCount(s_refCountIncrement) 123 169 , m_length(length) 124 , m_data (characters)170 , m_data16(characters) 125 171 , m_substringBuffer(base.leakRef()) 126 172 , m_hashAndFlags(BufferSubstring) 127 173 { 128 ASSERT(m_data); 174 ASSERT(!is8Bit()); 175 ASSERT(m_data16); 129 176 ASSERT(m_length); 130 177 ASSERT(m_substringBuffer->bufferOwnership() != BufferSubstring); … … 135 182 136 183 static PassRefPtr<StringImpl> create(const UChar*, unsigned length); 137 static PassRefPtr<StringImpl> create(const char*, unsigned length); 138 static PassRefPtr<StringImpl> create(const char*); 139 static ALWAYS_INLINE PassRefPtr<StringImpl> create(PassRefPtr<StringImpl> rep, unsigned offset, unsigned length) 184 static PassRefPtr<StringImpl> create(const LChar*, unsigned length); 185 ALWAYS_INLINE static PassRefPtr<StringImpl> create(const char* s, unsigned length) { return create(reinterpret_cast<const LChar*>(s), length); }; 186 static PassRefPtr<StringImpl> create(const LChar*); 187 ALWAYS_INLINE static PassRefPtr<StringImpl> create(const char* s) { return create(reinterpret_cast<const LChar*>(s)); }; 188 189 static ALWAYS_INLINE PassRefPtr<StringImpl> create8(PassRefPtr<StringImpl> rep, unsigned offset, unsigned length) 140 190 { 141 191 ASSERT(rep); … … 145 195 return empty(); 146 196 197 ASSERT(rep->is8Bit()); 147 198 StringImpl* ownerRep = (rep->bufferOwnership() == BufferSubstring) ? rep->m_substringBuffer : rep.get(); 148 return adoptRef(new StringImpl(rep->m_data + offset, length, ownerRep)); 149 } 150 199 return adoptRef(new StringImpl(rep->m_data8 + offset, length, ownerRep)); 200 } 201 202 static ALWAYS_INLINE PassRefPtr<StringImpl> create(PassRefPtr<StringImpl> rep, unsigned offset, unsigned length) 203 { 204 ASSERT(rep); 205 ASSERT(length <= rep->length()); 206 207 if (!length) 208 return empty(); 209 210 StringImpl* ownerRep = (rep->bufferOwnership() == BufferSubstring) ? 
rep->m_substringBuffer : rep.get(); 211 if (rep->is8Bit()) 212 return adoptRef(new StringImpl(rep->m_data8 + offset, length, ownerRep)); 213 return adoptRef(new StringImpl(rep->m_data16 + offset, length, ownerRep)); 214 } 215 216 static PassRefPtr<StringImpl> createUninitialized(unsigned length, LChar*& data); 151 217 static PassRefPtr<StringImpl> createUninitialized(unsigned length, UChar*& data); 152 static ALWAYS_INLINE PassRefPtr<StringImpl> tryCreateUninitialized(unsigned length, UChar*& output)218 template <typename T> static ALWAYS_INLINE PassRefPtr<StringImpl> tryCreateUninitialized(unsigned length, T*& output) 153 219 { 154 220 if (!length) { … … 157 223 } 158 224 159 if (length > ((std::numeric_limits<unsigned>::max() - sizeof(StringImpl)) / sizeof( UChar))) {225 if (length > ((std::numeric_limits<unsigned>::max() - sizeof(StringImpl)) / sizeof(T))) { 160 226 output = 0; 161 227 return 0; 162 228 } 163 229 StringImpl* resultImpl; 164 if (!tryFastMalloc(sizeof( UChar) * length + sizeof(StringImpl)).getValue(resultImpl)) {230 if (!tryFastMalloc(sizeof(T) * length + sizeof(StringImpl)).getValue(resultImpl)) { 165 231 output = 0; 166 232 return 0; 167 233 } 168 output = reinterpret_cast<UChar*>(resultImpl + 1); 234 output = reinterpret_cast<T*>(resultImpl + 1); 235 236 if (sizeof(T) == sizeof(char)) 237 return adoptRef(new(resultImpl) StringImpl(length, Force8BitConstructor)); 238 169 239 return adoptRef(new(resultImpl) StringImpl(length)); 170 240 } … … 175 245 static PassRefPtr<StringImpl> reallocate(PassRefPtr<StringImpl> originalString, unsigned length, UChar*& data); 176 246 177 static unsigned dataOffset() { return OBJECT_OFFSETOF(StringImpl, m_data); } 247 static unsigned flagsOffset() { return OBJECT_OFFSETOF(StringImpl, m_hashAndFlags); } 248 static unsigned flagIs8Bit() { return s_hashFlag8BitBuffer; } 249 static unsigned dataOffset() { return OBJECT_OFFSETOF(StringImpl, m_data8); } 178 250 static PassRefPtr<StringImpl> 
createWithTerminatingNullCharacter(const StringImpl&); 179 251 … … 192 264 193 265 unsigned length() const { return m_length; } 194 const UChar* characters() const { return m_data; } 266 bool is8Bit() const { return m_hashAndFlags & s_hashFlag8BitBuffer; } 267 268 // FIXME: Remove all unnecessary usages of characters() 269 ALWAYS_INLINE const LChar* characters8() const { ASSERT(is8Bit()); return m_data8; } 270 ALWAYS_INLINE const UChar* characters16() const { ASSERT(!is8Bit()); return m_data16; } 271 ALWAYS_INLINE const UChar* characters() const 272 { 273 if (!is8Bit()) 274 return m_data16; 275 276 return getData16SlowCase(); 277 } 195 278 196 279 size_t cost() … … 207 290 } 208 291 292 bool has16BitShadow() const { return m_hashAndFlags & s_hashFlagHas16BitShadow; } 209 293 bool isIdentifier() const { return m_hashAndFlags & s_hashFlagIsIdentifier; } 210 294 void setIsIdentifier(bool isIdentifier) … … 236 320 { 237 321 ASSERT(!hasHash()); 238 ASSERT(hash == StringHasher::computeHash(m_data, m_length)); // Multiple clients assume that StringHasher is the canonical string hash function. 322 // Multiple clients assume that StringHasher is the canonical string hash function. 323 ASSERT(hash == (is8Bit() ? StringHasher::computeHash(m_data8, m_length) : StringHasher::computeHash(m_data16, m_length))); 239 324 ASSERT(!(hash & (s_flagMask << (8 * sizeof(hash) - s_flagCount)))); // Verify that enough high bits are empty. 
240 325 … … 265 350 unsigned hash() const 266 351 { 267 if (!hasHash()) 268 setHash(StringHasher::computeHash(m_data, m_length)); 352 if (!hasHash()) { 353 if (is8Bit()) 354 setHash(StringHasher::computeHash(m_data8, m_length)); 355 else 356 setHash(StringHasher::computeHash(m_data16, m_length)); 357 } 269 358 return existingHash(); 270 359 } … … 292 381 static StringImpl* empty(); 293 382 294 static void copyChars(UChar* destination, const UChar* source, unsigned numCharacters) 295 { 383 // FIXME: Does this really belong in StringImpl? 384 template <typename T> static void copyChars(T* destination, const T* source, unsigned numCharacters) 385 { 386 if (numCharacters == 1) { 387 *destination = *source; 388 return; 389 } 390 296 391 if (numCharacters <= s_copyCharsInlineCutOff) { 297 for (unsigned i = 0; i < numCharacters; ++i) 392 unsigned i = 0; 393 #if (CPU(X86) || CPU(X86_64)) 394 const unsigned charsPerInt = sizeof(uint32_t) / sizeof(T); 395 396 if (numCharacters > charsPerInt) { 397 unsigned stopCount = numCharacters & ~(charsPerInt - 1); 398 399 const uint32_t* srcCharacters = reinterpret_cast<const uint32_t*>(source); 400 uint32_t* destCharacters = reinterpret_cast<uint32_t*>(destination); 401 for (unsigned j = 0; i < stopCount; i += charsPerInt, ++j) 402 destCharacters[j] = srcCharacters[j]; 403 } 404 #endif 405 for (; i < numCharacters; ++i) 298 406 destination[i] = source[i]; 299 407 } else 300 memcpy(destination, source, numCharacters * sizeof( UChar));408 memcpy(destination, source, numCharacters * sizeof(T)); 301 409 } 302 410 … … 308 416 PassRefPtr<StringImpl> substring(unsigned pos, unsigned len = UINT_MAX); 309 417 310 UChar operator[](unsigned i) { ASSERT(i < m_length); return m_data[i]; } 418 UChar operator[](unsigned i) const 419 { 420 ASSERT(i < m_length); 421 if (is8Bit()) 422 return m_data8[i]; 423 return m_data16[i]; 424 } 311 425 UChar32 characterStartingAt(unsigned); 312 426 … … 332 446 333 447 PassRefPtr<StringImpl> fill(UChar); 448 // 
FIXME: Do we need fill(char) or can we just do the right thing if UChar is ASCII? 334 449 PassRefPtr<StringImpl> foldCase(); 335 450 … … 341 456 PassRefPtr<StringImpl> removeCharacters(CharacterMatchFunctionPtr); 342 457 458 // FIXME: Do we need char version, or is it okay to just pass in an ASCII char for 8-bit? Same for reverseFind, replace 343 459 size_t find(UChar, unsigned index = 0); 344 460 size_t find(CharacterMatchFunctionPtr, unsigned index = 0); 345 size_t find(const char*, unsigned index = 0); 461 size_t find(const LChar*, unsigned index = 0); 462 ALWAYS_INLINE size_t find(const char* s, unsigned index = 0) { return find(reinterpret_cast<const LChar*>(s), index); }; 346 463 size_t find(StringImpl*, unsigned index = 0); 347 size_t findIgnoringCase(const char*, unsigned index = 0); 464 size_t findIgnoringCase(const LChar*, unsigned index = 0); 465 ALWAYS_INLINE size_t findIgnoringCase(const char* s, unsigned index = 0) { return findIgnoringCase(reinterpret_cast<const LChar*>(s), index); }; 348 466 size_t findIgnoringCase(StringImpl*, unsigned index = 0); 349 467 … … 377 495 template <class UCharPredicate> PassRefPtr<StringImpl> stripMatchedCharacters(UCharPredicate); 378 496 template <class UCharPredicate> PassRefPtr<StringImpl> simplifyMatchedCharactersToSpace(UCharPredicate); 497 NEVER_INLINE const UChar* getData16SlowCase() const; 379 498 380 499 // The bottom bit in the ref count indicates a static (immortal) string. 
… … 387 506 COMPILE_ASSERT(s_flagCount == StringHasher::flagCount, StringHasher_reserves_enough_bits_for_StringImpl_flags); 388 507 508 static const unsigned s_hashFlagHas16BitShadow = 1u << 7; 509 static const unsigned s_hashFlag8BitBuffer = 1u << 6; 389 510 static const unsigned s_hashFlagHasTerminatingNullCharacter = 1u << 5; 390 511 static const unsigned s_hashFlagIsAtomic = 1u << 4; … … 395 516 unsigned m_refCount; 396 517 unsigned m_length; 397 const UChar* m_data; 518 union { 519 const LChar* m_data8; 520 const UChar* m_data16; 521 }; 398 522 union { 399 523 void* m_buffer; 400 524 StringImpl* m_substringBuffer; 525 mutable UChar* m_copyData16; 401 526 }; 402 527 mutable unsigned m_hashAndFlags; … … 404 529 405 530 bool equal(const StringImpl*, const StringImpl*); 406 bool equal(const StringImpl*, const char*); 407 inline bool equal(const char* a, StringImpl* b) { return equal(b, a); } 531 bool equal(const StringImpl*, const LChar*); 532 inline bool equal(const StringImpl* a, const char* b) { return equal(a, reinterpret_cast<const LChar*>(b)); } 533 bool equal(const StringImpl*, const LChar*, unsigned); 534 inline bool equal(const LChar* a, StringImpl* b) { return equal(b, a); } 535 inline bool equal(const char* a, StringImpl* b) { return equal(b, reinterpret_cast<const LChar*>(a)); } 408 536 bool equal(const StringImpl*, const UChar*, unsigned); 409 537 410 538 bool equalIgnoringCase(StringImpl*, StringImpl*); 411 bool equalIgnoringCase(StringImpl*, const char*); 412 inline bool equalIgnoringCase(const char* a, StringImpl* b) { return equalIgnoringCase(b, a); } 413 bool equalIgnoringCase(const UChar* a, const char* b, unsigned length); 414 inline bool equalIgnoringCase(const char* a, const UChar* b, unsigned length) { return equalIgnoringCase(b, a, length); } 539 bool equalIgnoringCase(StringImpl*, const LChar*); 540 inline bool equalIgnoringCase(const LChar* a, StringImpl* b) { return equalIgnoringCase(b, a); } 541 bool equalIgnoringCase(const UChar*, 
const LChar*, unsigned); 542 inline bool equalIgnoringCase(const UChar* a, const char* b, unsigned length) { return equalIgnoringCase(a, reinterpret_cast<const LChar*>(b), length); } 543 inline bool equalIgnoringCase(const LChar* a, const UChar* b, unsigned length) { return equalIgnoringCase(b, a, length); } 544 inline bool equalIgnoringCase(const char* a, const UChar* b, unsigned length) { return equalIgnoringCase(b, reinterpret_cast<const LChar*>(a), length); } 415 545 416 546 bool equalIgnoringNullity(StringImpl*, StringImpl*); … … 437 567 inline PassRefPtr<StringImpl> StringImpl::isolatedCopy() const 438 568 { 439 return create(m_data, m_length); 569 if (is8Bit()) 570 return create(m_data8, m_length); 571 return create(m_data16, m_length); 440 572 } 441 573 -
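The StringImpl.h changes above add a width flag, two width-specific accessors, and a `characters()` fallback that hands 16-bit data back to callers that have not been converted yet. The sketch below models that contract under loud assumptions: `TwoWidthString` and its members are hypothetical, it uses `std::vector` where StringImpl uses a union and flag bits, and the caching shown is only one plausible reading of the "16 bit shadow" described in the ChangeLog.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Illustrative model only; this is not WTF::StringImpl.
class TwoWidthString {
public:
    explicit TwoWidthString(const std::string& latin1)
        : m_is8Bit(true), m_data8(latin1.begin(), latin1.end()) { }
    explicit TwoWidthString(const std::u16string& utf16)
        : m_is8Bit(false), m_data16(utf16.begin(), utf16.end()) { }

    bool is8Bit() const { return m_is8Bit; }
    size_t length() const { return m_is8Bit ? m_data8.size() : m_data16.size(); }

    // Width-specific accessors: the caller must have checked is8Bit() first.
    const unsigned char* characters8() const { assert(m_is8Bit); return m_data8.data(); }
    const char16_t* characters16() const { assert(!m_is8Bit); return m_data16.data(); }

    // Legacy accessor: always 16-bit. An 8-bit string is widened once and the
    // widened copy is kept, so repeated calls return the same pointer.
    const char16_t* characters() const
    {
        if (!m_is8Bit)
            return m_data16.data();
        if (m_shadow16.empty() && !m_data8.empty())
            m_shadow16.assign(m_data8.begin(), m_data8.end());
        return m_shadow16.data();
    }

    bool has16BitShadow() const { return !m_shadow16.empty(); }

private:
    bool m_is8Bit;
    std::vector<unsigned char> m_data8;
    std::vector<char16_t> m_data16;
    mutable std::vector<char16_t> m_shadow16; // lazily built 16-bit copy
};
```

The cost of the fallback is visible in the model: every 8-bit string that reaches `characters()` ends up holding both buffers, which is why the ChangeLog sets the goal of moving callers to the width-specific accessors over time.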
trunk/Source/JavaScriptCore/wtf/text/WTFString.cpp
r98316 r98624 62 62 63 63 // Construct a string with latin1 data. 64 String::String(const LChar* characters, unsigned length) 65 : m_impl(characters ? StringImpl::create(characters, length) : 0) 66 { 67 } 68 64 69 String::String(const char* characters, unsigned length) 65 : m_impl(characters ? StringImpl::create( characters, length) : 0)70 : m_impl(characters ? StringImpl::create(reinterpret_cast<const LChar*>(characters), length) : 0) 66 71 { 67 72 } 68 73 69 74 // Construct a string with latin1 data, from a null-terminated source. 75 String::String(const LChar* characters) 76 : m_impl(characters ? StringImpl::create(characters) : 0) 77 { 78 } 79 70 80 String::String(const char* characters) 71 : m_impl(characters ? StringImpl::create( characters) : 0)81 : m_impl(characters ? StringImpl::create(reinterpret_cast<const LChar*>(characters)) : 0) 72 82 { 73 83 } … … 96 106 } 97 107 98 void String::append( char c)108 void String::append(LChar c) 99 109 { 100 110 // FIXME: This is extremely inefficient. So much so that we might want to take this … … 339 349 340 350 QByteArray ba = buffer.toUtf8(); 341 return StringImpl::create( ba.constData(), ba.length());351 return StringImpl::create(reinterpret_cast<const LChar*>(ba.constData()), ba.length()); 342 352 343 353 #elif OS(WINCE) … … 356 366 return String(""); 357 367 if (written > 0) 358 return StringImpl::create( buffer.data(), written);368 return StringImpl::create(reinterpret_cast<const LChar*>(buffer.data()), written); 359 369 360 370 bufferSize <<= 1; … … 397 407 va_end(args); 398 408 399 return StringImpl::create( buffer.data(), len);409 return StringImpl::create(reinterpret_cast<const LChar*>(buffer.data()), len); 400 410 #endif 401 411 } … … 718 728 } 719 729 720 String String::fromUTF8(const char* stringStart, size_t length)730 String String::fromUTF8(const LChar* stringStart, size_t length) 721 731 { 722 732 if (length > numeric_limits<unsigned>::max()) … … 733 743 734 744 // Try converting into the buffer. 
735 const char* stringCurrent = stringStart;736 if (convertUTF8ToUTF16(&stringCurrent, stringStart + length, &buffer, bufferEnd) != conversionOK)745 const char* stringCurrent = reinterpret_cast<const char*>(stringStart); 746 if (convertUTF8ToUTF16(&stringCurrent, reinterpret_cast<const char *>(stringStart + length), &buffer, bufferEnd) != conversionOK) 737 747 return String(); 738 748 … … 747 757 } 748 758 749 String String::fromUTF8(const char* string)759 String String::fromUTF8(const LChar* string) 750 760 { 751 761 if (!string) 752 762 return String(); 753 return fromUTF8(string, strlen( string));754 } 755 756 String String::fromUTF8WithLatin1Fallback(const char* string, size_t size)763 return fromUTF8(string, strlen(reinterpret_cast<const char*>(string))); 764 } 765 766 String String::fromUTF8WithLatin1Fallback(const LChar* string, size_t size) 757 767 { 758 768 String utf8 = fromUTF8(string, size); -
trunk/Source/JavaScriptCore/wtf/text/WTFString.h
r98316 r98624 90 90 91 91 // Construct a string with latin1 data. 92 WTF_EXPORT_PRIVATE String(const LChar* characters, unsigned length); 92 93 WTF_EXPORT_PRIVATE String(const char* characters, unsigned length); 93 94 94 95 // Construct a string with latin1 data, from a null-terminated source. 96 WTF_EXPORT_PRIVATE String(const LChar* characters); 95 97 WTF_EXPORT_PRIVATE String(const char* characters); 96 98 … … 156 158 size_t find(CharacterMatchFunctionPtr matchFunction, unsigned start = 0) const 157 159 { return m_impl ? m_impl->find(matchFunction, start) : notFound; } 158 size_t find(const char* str, unsigned start = 0) const160 size_t find(const LChar* str, unsigned start = 0) const 159 161 { return m_impl ? m_impl->find(str, start) : notFound; } 160 162 … … 166 168 167 169 // Case insensitive string matching. 168 WTF_EXPORT_PRIVATE size_t findIgnoringCase(const char* str, unsigned start = 0) const170 WTF_EXPORT_PRIVATE size_t findIgnoringCase(const LChar* str, unsigned start = 0) const 169 171 { return m_impl ? m_impl->findIgnoringCase(str, start) : notFound; } 170 172 WTF_EXPORT_PRIVATE size_t findIgnoringCase(const String& str, unsigned start = 0) const … … 174 176 175 177 // Wrappers for find & reverseFind adding dynamic sensitivity check. 176 size_t find(const char* str, unsigned start, bool caseSensitive) const178 size_t find(const LChar* str, unsigned start, bool caseSensitive) const 177 179 { return caseSensitive ? 
find(str, start) : findIgnoringCase(str, start); } 178 180 size_t find(const String& str, unsigned start, bool caseSensitive) const … … 186 188 187 189 bool contains(UChar c) const { return find(c) != notFound; } 188 bool contains(const char* str, bool caseSensitive = true) const { return find(str, 0, caseSensitive) != notFound; }190 bool contains(const LChar* str, bool caseSensitive = true) const { return find(str, 0, caseSensitive) != notFound; } 189 191 bool contains(const String& str, bool caseSensitive = true) const { return find(str, 0, caseSensitive) != notFound; } 190 192 … … 195 197 196 198 WTF_EXPORT_PRIVATE void append(const String&); 197 WTF_EXPORT_PRIVATE void append(char); 199 WTF_EXPORT_PRIVATE void append(LChar); 200 inline WTF_EXPORT_PRIVATE void append(char c) { append(static_cast<LChar>(c)); }; 198 201 WTF_EXPORT_PRIVATE void append(UChar); 199 202 WTF_EXPORT_PRIVATE void append(const UChar*, unsigned length); … … 300 303 // String::fromUTF8 will return a null string if 301 304 // the input data contains invalid UTF-8 sequences. 302 WTF_EXPORT_PRIVATE static String fromUTF8(const char*, size_t); 303 WTF_EXPORT_PRIVATE static String fromUTF8(const char*); 305 WTF_EXPORT_PRIVATE static String fromUTF8(const LChar*, size_t); 306 WTF_EXPORT_PRIVATE static String fromUTF8(const LChar*); 307 inline WTF_EXPORT_PRIVATE static String fromUTF8(const char* s, size_t length) { return fromUTF8(reinterpret_cast<const LChar*>(s), length); }; 308 inline WTF_EXPORT_PRIVATE static String fromUTF8(const char* s) { return fromUTF8(reinterpret_cast<const LChar*>(s)); }; 304 309 305 310 // Tries to convert the passed in string to UTF-8, but will fall back to Latin-1 if the string is not valid UTF-8. 
306 WTF_EXPORT_PRIVATE static String fromUTF8WithLatin1Fallback(const char*, size_t); 311 WTF_EXPORT_PRIVATE static String fromUTF8WithLatin1Fallback(const LChar*, size_t); 312 inline WTF_EXPORT_PRIVATE static String fromUTF8WithLatin1Fallback(const char* s, size_t length) { return fromUTF8WithLatin1Fallback(reinterpret_cast<const LChar*>(s), length); }; 307 313 308 314 // Determines the writing direction using the Unicode Bidi Algorithm rules P2 and P3. … … 339 345 340 346 inline bool operator==(const String& a, const String& b) { return equal(a.impl(), b.impl()); } 341 inline bool operator==(const String& a, const char* b) { return equal(a.impl(), b); } 342 inline bool operator==(const char* a, const String& b) { return equal(a, b.impl()); } 347 inline bool operator==(const String& a, const LChar* b) { return equal(a.impl(), b); } 348 inline bool operator==(const String& a, const char* b) { return equal(a.impl(), reinterpret_cast<const LChar*>(b)); } 349 inline bool operator==(const LChar* a, const String& b) { return equal(a, b.impl()); } 350 inline bool operator==(const char* a, const String& b) { return equal(reinterpret_cast<const LChar*>(a), b.impl()); } 343 351 344 352 inline bool operator!=(const String& a, const String& b) { return !equal(a.impl(), b.impl()); } 345 inline bool operator!=(const String& a, const char* b) { return !equal(a.impl(), b); } 346 inline bool operator!=(const char* a, const String& b) { return !equal(a, b.impl()); } 353 inline bool operator!=(const String& a, const LChar* b) { return !equal(a.impl(), b); } 354 inline bool operator!=(const String& a, const char* b) { return !equal(a.impl(), reinterpret_cast<const LChar*>(b)); } 355 inline bool operator!=(const LChar* a, const String& b) { return !equal(a, b.impl()); } 356 inline bool operator!=(const char* a, const String& b) { return !equal(reinterpret_cast<const LChar*>(a), b.impl()); } 347 357 348 358 inline bool equalIgnoringCase(const String& a, const String& b) { return 
equalIgnoringCase(a.impl(), b.impl()); } 349 inline bool equalIgnoringCase(const String& a, const char* b) { return equalIgnoringCase(a.impl(), b); } 350 inline bool equalIgnoringCase(const char* a, const String& b) { return equalIgnoringCase(a, b.impl()); } 359 inline bool equalIgnoringCase(const String& a, const LChar* b) { return equalIgnoringCase(a.impl(), b); } 360 inline bool equalIgnoringCase(const String& a, const char* b) { return equalIgnoringCase(a.impl(), reinterpret_cast<const LChar*>(b)); } 361 inline bool equalIgnoringCase(const LChar* a, const String& b) { return equalIgnoringCase(a, b.impl()); } 362 inline bool equalIgnoringCase(const char* a, const String& b) { return equalIgnoringCase(reinterpret_cast<const LChar*>(a), b.impl()); } 351 363 352 364 inline bool equalPossiblyIgnoringCase(const String& a, const String& b, bool ignoreCase) -
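Most of the WTFString.h changes above follow one pattern: the real implementation takes `const LChar*`, and a thin inline overload taking `const char*` forwards to it through `reinterpret_cast`. A small sketch of that pattern follows; `latin1Length` is a hypothetical function chosen for illustration and does not exist in WTF.

```cpp
#include <cassert>
#include <cstddef>

typedef unsigned char LChar;

// Hypothetical primary implementation, written against LChar.
static size_t latin1Length(const LChar* s)
{
    size_t n = 0;
    while (s[n])
        ++n;
    return n;
}

// Convenience overload so callers can keep passing string literals.
// The cast is what selects the LChar overload: without it, overload
// resolution would pick this same function again and recurse forever.
static inline size_t latin1Length(const char* s)
{
    return latin1Length(reinterpret_cast<const LChar*>(s));
}
```

Keeping the `char` overloads inline means existing call sites compile unchanged while the exported symbols move to the `LChar` signatures.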
trunk/Source/JavaScriptCore/wtf/unicode/Unicode.h
r95555 r98624 40 40 COMPILE_ASSERT(sizeof(UChar) == 2, UCharIsTwoBytes); 41 41 42 // Define platform neutral 8 bit character type (L is for Latin-1). 43 typedef unsigned char LChar; 44 42 45 #endif // WTF_UNICODE_H -
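A short sketch of why the 8-bit character type is declared unsigned rather than plain `char`: the patch widens 8-bit data to 16-bit by plain assignment (for example `data[i + dstOffset] = m_data8[i + srcSegmentStart]`), and that only yields the right code unit if bytes at or above 0x80 are not sign-extended. The helper names are hypothetical and `uint16_t` stands in for `UChar` as an assumption.

```cpp
#include <cassert>
#include <cstdint>

typedef unsigned char LChar; // as defined in Unicode.h above
typedef uint16_t WideChar;   // stand-in for UChar (assumption)

// Hypothetical helper: widen one Latin-1 byte to a 16-bit code unit.
static WideChar widenLatin1(LChar c)
{
    return c; // zero-extends: 0xE9 becomes 0x00E9
}

// What the same widening does if the byte travels through a signed type.
static WideChar widenThroughSignedChar(signed char c)
{
    return static_cast<WideChar>(c); // sign-extends negative values
}
```

With Latin-1 input such as U+00E9, the unsigned path produces the matching 16-bit code unit, while the signed path produces an unrelated value in the 0xFF80 to 0xFFFF range.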
trunk/Source/JavaScriptCore/yarr/YarrJIT.cpp
r96169 r98624 2501 2501 } 2502 2502 2503 int execute(YarrCodeBlock& jitObject, const char* input, unsigned start, unsigned length, int* output)2503 int execute(YarrCodeBlock& jitObject, const LChar* input, unsigned start, unsigned length, int* output) 2504 2504 { 2505 2505 return jitObject.execute(input, start, length, output); -
trunk/Source/JavaScriptCore/yarr/YarrJIT.h
r95901 r98624 49 49 50 50 class YarrCodeBlock { 51 typedef int (*YarrJITCode8)(const char* input, unsigned start, unsigned length, int* output) YARR_CALL;51 typedef int (*YarrJITCode8)(const LChar* input, unsigned start, unsigned length, int* output) YARR_CALL; 52 52 typedef int (*YarrJITCode16)(const UChar* input, unsigned start, unsigned length, int* output) YARR_CALL; 53 53 … … 69 69 void set16BitCode(MacroAssembler::CodeRef ref) { m_ref16 = ref; } 70 70 71 int execute(const char* input, unsigned start, unsigned length, int* output)71 int execute(const LChar* input, unsigned start, unsigned length, int* output) 72 72 { 73 73 ASSERT(has8BitCode()); … … 92 92 void jitCompile(YarrPattern&, YarrCharSize, JSGlobalData*, YarrCodeBlock& jitObject); 93 93 int execute(YarrCodeBlock& jitObject, const UChar* input, unsigned start, unsigned length, int* output); 94 int execute(YarrCodeBlock& jitObject, const char* input, unsigned start, unsigned length, int* output);94 int execute(YarrCodeBlock& jitObject, const LChar* input, unsigned start, unsigned length, int* output); 95 95 96 96 } } // namespace JSC::Yarr -
trunk/Source/JavaScriptCore/yarr/YarrParser.h
r95901 r98624 232 232 , m_backReferenceLimit(backReferenceLimit) 233 233 , m_err(NoError) 234 , m_data(pattern )234 , m_data(pattern.characters16()) 235 235 , m_size(pattern.length()) 236 236 , m_index(0) … … 794 794 unsigned m_backReferenceLimit; 795 795 ErrorCode m_err; 796 const U String&m_data;796 const UChar* m_data; 797 797 unsigned m_size; 798 798 unsigned m_index; -
trunk/Source/WebCore/ChangeLog
r98618 r98624 1 2011-10-27 Michael Saboff <msaboff@apple.com> 2 3 Investigate storing strings in 8-bit buffers when possible 4 https://bugs.webkit.org/show_bug.cgi?id=66161 5 6 Changes to support 8 bit StringImpl changes. 7 8 Reviewed by Geoffrey Garen. 9 10 No new tests, refactored StringImpl for 8 bit strings. 11 12 * platform/text/cf/StringImplCF.cpp: 13 (WTF::StringImpl::createCFString): 14 1 15 2011-10-27 Nat Duca <nduca@chromium.org> 2 16 -
trunk/Source/WebCore/platform/text/cf/StringImplCF.cpp
r93012 r98624 137 137 CFAllocatorRef allocator = (m_length && isMainThread()) ? StringWrapperCFAllocator::allocator() : 0; 138 138 if (!allocator) 139 return CFStringCreateWithCharacters(0, reinterpret_cast<const UniChar*>( m_data), m_length);139 return CFStringCreateWithCharacters(0, reinterpret_cast<const UniChar*>(characters()), m_length); 140 140 141 141 // Put pointer to the StringImpl in a global so the allocator can store it with the CFString. … … 143 143 StringWrapperCFAllocator::currentString = this; 144 144 145 CFStringRef string = CFStringCreateWithCharactersNoCopy(allocator, reinterpret_cast<const UniChar*>( m_data), m_length, kCFAllocatorNull);145 CFStringRef string = CFStringCreateWithCharactersNoCopy(allocator, reinterpret_cast<const UniChar*>(characters()), m_length, kCFAllocatorNull); 146 146 147 147 // The allocator cleared the global when it read it, but also clear it here just in case. -
trunk/Source/WebKit2/ChangeLog
r98622 r98624 1 2011-10-27 Michael Saboff <msaboff@apple.com> 2 3 Investigate storing strings in 8-bit buffers when possible 4 https://bugs.webkit.org/show_bug.cgi?id=66161 5 6 Added export of StringImpl::getData16SlowCase for linking tests. 7 8 Reviewed by Geoffrey Garen. 9 10 * win/WebKit2.def: 11 1 12 2011-10-27 Sam Weinig <sam@webkit.org> 2 13 -
trunk/Source/WebKit2/win/WebKit2.def
r98598 r98624 147 147 ?absoluteBoundingBoxRectIgnoringTransforms@RenderObject@WebCore@@QBE?AVIntRect@2@XZ 148 148 ?add@AtomicString@WTF@@CA?AV?$PassRefPtr@VStringImpl@WTF@@@2@PBD@Z 149 ?add@AtomicString@WTF@@CA?AV?$PassRefPtr@VStringImpl@WTF@@@2@PBE@Z 149 150 ?addSlowCase@AtomicString@WTF@@CA?AV?$PassRefPtr@VStringImpl@WTF@@@2@PAVStringImpl@2@@Z 150 151 ?cacheDOMStructure@WebCore@@YAPAVStructure@JSC@@PAVJSDOMGlobalObject@1@PAV23@PBUClassInfo@3@@Z … … 153 154 ?createWrapper@WebCore@@YA?AVJSValue@JSC@@PAVExecState@3@PAVJSDOMGlobalObject@1@PAVNode@1@@Z 154 155 ?ensureShadowRoot@Element@WebCore@@QAEPAVShadowRoot@2@XZ 155 ?equal@WTF@@YA_NPBVStringImpl@1@PB D@Z156 ?equal@WTF@@YA_NPBVStringImpl@1@PBE@Z 156 157 ?externalRepresentation@WebCore@@YA?AVString@WTF@@PAVElement@1@I@Z 157 158 ?getCachedDOMStructure@WebCore@@YAPAVStructure@JSC@@PAVJSDOMGlobalObject@1@PBUClassInfo@3@@Z 159 ?getData16SlowCase@StringImpl@WTF@@ABEPB_WXZ 158 160 ?getElementById@TreeScope@WebCore@@QBEPAVElement@2@ABVAtomicString@WTF@@@Z 159 161 ?getLocationAndLengthFromRange@TextIterator@WebCore@@SA_NPAVElement@2@PBVRange@2@AAI2@Z