Changeset 28793 in webkit
- Timestamp: Dec 16, 2007 8:19:25 PM
- Location: trunk/JavaScriptCore
- Files: 10 edited
trunk/JavaScriptCore/ChangeLog
r28785 r28793
+2007-12-16 Darin Adler <darin@apple.com>
+
+        Reviewed by Maciej.
+
+        - http://bugs.webkit.org/show_bug.cgi?id=16438
+        - removed some more unused code
+        - changed quite a few more names to WebKit-style
+        - moved more things out of pcre_internal.h
+        - changed some indentation to WebKit-style
+        - improved design of the functions for reading and writing
+          2-byte values from the opcode stream (in pcre_internal.h)
+
+        * pcre/dftables.cpp:
+        (main): Added the kjs prefix in the normal way in lieu of using macros.
+
+        * pcre/pcre_compile.cpp: Moved some definitions here from pcre_internal.h.
+        (errorText): Name changes, fewer typedefs.
+        (checkEscape): Ditto. Changed uppercase conversion to use toASCIIUpper.
+        (isCountedRepeat): Name change.
+        (readRepeatCounts): Name change.
+        (firstSignificantOpcode): Got rid of the use of OP_lengths, which is
+        very lightly used here. Hard-coded the length of OP_BRANUMBER.
+        (firstSignificantOpcodeSkippingAssertions): Ditto. Also changed to
+        use the advanceToEndOfBracket function.
+        (getOthercaseRange): Name changes.
+        (encodeUTF8): Ditto.
+        (compileBranch): Name changes. Removed unused after_manual_callout and
+        the code to handle it. Removed code to handle OP_ONCE since we never
+        emit this opcode. Changed to use advanceToEndOfBracket in more places.
+        (compileBracket): Name changes.
+        (branchIsAnchored): Removed code to handle OP_ONCE since we never emit
+        this opcode.
+        (bracketIsAnchored): Name changes.
+        (branchNeedsLineStart): More of the same.
+        (bracketNeedsLineStart): Ditto.
+        (branchFindFirstAssertedCharacter): Removed OP_ONCE code.
+        (bracketFindFirstAssertedCharacter): More of the same.
+        (calculateCompiledPatternLengthAndFlags): Ditto.
+        (returnError): Name changes.
+        (jsRegExpCompile): Ditto.
+
+        * pcre/pcre_exec.cpp: Moved some definitions here from pcre_internal.h.
+        (matchRef): Updated names.
+        Improved macros to use the do { } while(0) idiom so they expand to single
+        statements rather than to blocks or multiple statements. And refactored
+        the recursive match macros.
+        (MatchStack::pushNewFrame): Name changes.
+        (getUTF8CharAndIncrementLength): Name changes.
+        (match): Name changes. Removed the ONCE opcode.
+        (jsRegExpExecute): Name changes.
+
+        * pcre/pcre_internal.h: Removed quite a few unneeded includes. Rewrote
+        quite a few comments. Removed the macros that add kjs prefixes to the
+        functions with external linkage; instead renamed the functions. Removed
+        the unneeded typedefs pcre_uint16, pcre_uint32, and uschar. Removed the
+        dead and not-all-working code for LINK_SIZE values other than 2, although
+        we aim to keep the abstraction working. Removed the OP_LENGTHS macro.
+        (put2ByteValue): Replaces put2ByteOpcodeValueAtOffset.
+        (get2ByteValue): Replaces get2ByteOpcodeValueAtOffset.
+        (put2ByteValueAndAdvance): Replaces put2ByteOpcodeValueAtOffsetAndAdvance.
+        (putLinkValueAllowZero): Replaces putOpcodeValueAtOffset; doesn't do the
+        addition, since a comma is really no better than a plus sign. Added an
+        assertion to catch out of range values and changed the parameter type to
+        int rather than unsigned.
+        (getLinkValueAllowZero): Replaces getOpcodeValueAtOffset.
+        (putLinkValue): New function that most former callers of the
+        putOpcodeValueAtOffset function can use; asserts the value that is
+        being stored is non-zero and then calls putLinkValueAllowZero.
+        (getLinkValue): Ditto.
+        (putLinkValueAndAdvance): Replaces putOpcodeValueAtOffsetAndAdvance. No
+        caller was using an offset, which makes sense given the advancing behavior.
+        (putLinkValueAllowZeroAndAdvance): Ditto.
+        (isBracketOpcode): Added. For use in an assertion.
+        (advanceToEndOfBracket): Renamed from moveOpcodePtrPastAnyAlternateBranches,
+        and removed comments about how it's not well designed. This function takes
+        a pointer to the beginning of a bracket and advances to the end of the
+        bracket.
+
+        * pcre/pcre_tables.cpp: Updated names.
+        * pcre/pcre_ucp_searchfuncs.cpp:
+        (kjs_pcre_ucp_othercase): Ditto.
+        * pcre/pcre_xclass.cpp:
+        (getUTF8CharAndAdvancePointer): Ditto.
+        (kjs_pcre_xclass): Ditto.
+        * pcre/ucpinternal.h: Ditto.
+
+        * wtf/ASCIICType.h:
+        (WTF::isASCIIAlpha): Added an int overload, like the one we already have for
+        isASCIIDigit.
+        (WTF::isASCIIAlphanumeric): Ditto.
+        (WTF::isASCIIHexDigit): Ditto.
+        (WTF::isASCIILower): Ditto.
+        (WTF::isASCIISpace): Ditto.
+        (WTF::toASCIILower): Ditto.
+        (WTF::toASCIIUpper): Ditto.
+
 2007-12-16 Darin Adler <darin@apple.com>
 
…
         Reviewed by Maciej.
 
-        Centralize code for subjectPtr adjustments using inlines, only ever check for a single trailing surrogate (as UTF16 only allows one), possibly fix PCRE bugs involving char classes and garbled UTF16 strings.
+        Centralize code for subjectPtr adjustments using inlines, only ever check for a single
+        trailing surrogate (as UTF16 only allows one), possibly fix PCRE bugs involving char
+        classes and garbled UTF16 strings.
 
         * pcre/pcre_exec.cpp:
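The ChangeLog entries for pcre/pcre_internal.h describe the new helpers for reading and writing 2-byte values only in prose. The following is a minimal sketch of how such helpers could look, not the code from this changeset. The function names are taken from the ChangeLog; the bodies, the big-endian byte order, and the exact assertions are assumptions made for illustration.

```cpp
#include <cassert>

// Assumption: link values occupy 2 bytes, matching the ChangeLog's note that
// only LINK_SIZE == 2 remains supported.
constexpr int LINK_SIZE = 2;

// Store a 16-bit value at opcodePtr, high byte first (byte order is assumed).
inline void put2ByteValue(unsigned char* opcodePtr, int value)
{
    assert(value >= 0 && value <= 0xFFFF); // catch out-of-range values
    opcodePtr[0] = static_cast<unsigned char>(value >> 8);
    opcodePtr[1] = static_cast<unsigned char>(value & 0xFF);
}

// Read back a 16-bit value written by put2ByteValue.
inline int get2ByteValue(const unsigned char* opcodePtr)
{
    return (opcodePtr[0] << 8) | opcodePtr[1];
}

// Write the value, then move the caller's pointer past it.
inline void put2ByteValueAndAdvance(unsigned char*& opcodePtr, int value)
{
    put2ByteValue(opcodePtr, value);
    opcodePtr += 2;
}

// Link values may be zero only where the caller says so explicitly,
// for example to mark a bracket that is still open.
inline void putLinkValueAllowZero(unsigned char* opcodePtr, int value)
{
    put2ByteValue(opcodePtr, value);
}

inline int getLinkValueAllowZero(const unsigned char* opcodePtr)
{
    return get2ByteValue(opcodePtr);
}

// The common case: a zero link is treated as a bug and asserted against.
inline void putLinkValue(unsigned char* opcodePtr, int value)
{
    assert(value);
    putLinkValueAllowZero(opcodePtr, value);
}

inline int getLinkValue(const unsigned char* opcodePtr)
{
    int value = getLinkValueAllowZero(opcodePtr);
    assert(value);
    return value;
}

inline void putLinkValueAndAdvance(unsigned char*& opcodePtr, int value)
{
    putLinkValue(opcodePtr, value);
    opcodePtr += LINK_SIZE;
}
```

With helpers of this shape, a call such as `putLinkValue(previous + 1, code - previous)` does the pointer arithmetic at the call site with a plain `+`, which is the point the ChangeLog makes about dropping the old offset parameter.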
trunk/JavaScriptCore/pcre/dftables.cpp
r27730 r28793
         "128 (ASCII characters). These tables are used when no external tables are\n"
         "passed to PCRE. */\n\n"
-        "const unsigned char _pcre_default_tables[%d] = {\n\n"
+        "const unsigned char kjs_pcre_default_tables[%d] = {\n\n"
         "/* This table is a lower casing table. */\n\n", tables_length);
trunk/JavaScriptCore/pcre/pcre_compile.cpp
r28785 r28793 51 51 using namespace WTF; 52 52 53 /* Negative values for the firstchar and reqchar variables */ 54 55 #define REQ_UNSET (-2) 56 #define REQ_NONE (-1) 57 53 58 /************************************************* 54 59 * Code parameters and static tables * … … 89 94 }; 90 95 91 /* Table of sizes for the fixed-length opcodes. It's defined in a macro so that92 the definition is next to the definition of the opcodes in pcre_internal.h. */93 94 static const uschar OP_lengths[] = { OP_LENGTHS };95 96 96 /* The texts of compile-time error messages. These are "char *" because they 97 97 are passed to the outside world. */ 98 98 99 static const char* error _text(ErrorCode code)99 static const char* errorText(ErrorCode code) 100 100 { 101 static const char error _texts[] =101 static const char errorTexts[] = 102 102 /* 1 */ 103 103 "\\ at end of pattern\0" … … 124 124 125 125 int i = code; 126 const char* text = error _texts;126 const char* text = errorTexts; 127 127 while (i > 1) 128 128 i -= !*text++; … … 142 142 needOuterBracket = false; 143 143 } 144 const u schar* start_code; /* The start of the compiled code */144 const unsigned char* start_code; /* The start of the compiled code */ 145 145 const UChar* start_pattern; /* The start of the pattern */ 146 146 int top_backref; /* Maximum back reference */ … … 152 152 /* Definitions to allow mutual recursion */ 153 153 154 static bool compileBracket(int, int*, u schar**, const UChar**, const UChar*, ErrorCode*, int, int*, int*, CompileData&);155 static bool bracketIsAnchored(const u schar* code);156 static bool bracketNeedsLineStart(const u schar* code, unsigned captureMap, unsigned backrefMap);157 static int bracketFindFirstAssertedCharacter(const u schar* code, bool inassert);154 static bool compileBracket(int, int*, unsigned char**, const UChar**, const UChar*, ErrorCode*, int, int*, int*, CompileData&); 155 static bool bracketIsAnchored(const unsigned char* code); 156 static bool 
bracketNeedsLineStart(const unsigned char* code, unsigned captureMap, unsigned backrefMap); 157 static int bracketFindFirstAssertedCharacter(const unsigned char* code, bool inassert); 158 158 159 159 /************************************************* … … 179 179 */ 180 180 181 static int check _escape(const UChar** ptrptr, const UChar* patternEnd, ErrorCode* errorcodeptr, int bracount, bool isclass)181 static int checkEscape(const UChar** ptrptr, const UChar* patternEnd, ErrorCode* errorcodeptr, int bracount, bool isclass) 182 182 { 183 183 const UChar* ptr = *ptrptr + 1; … … 209 209 } else { 210 210 switch (c) { 211 case '1': 212 case '2': 213 case '3': 214 case '4': 215 case '5': 216 case '6': 217 case '7': 218 case '8': 219 case '9': 220 /* Escape sequences starting with a non-zero digit are backreferences, 221 unless there are insufficient brackets, in which case they are octal 222 escape sequences. Those sequences end on the first non-octal character 223 or when we overflow 0-255, whichever comes first. */ 224 225 if (!isclass) { 226 const UChar* oldptr = ptr; 227 c -= '0'; 228 while ((ptr + 1 < patternEnd) && isASCIIDigit(ptr[1]) && c <= bracount) 229 c = c * 10 + *(++ptr) - '0'; 230 if (c <= bracount) { 231 c = -(ESC_REF + c); 211 case '1': 212 case '2': 213 case '3': 214 case '4': 215 case '5': 216 case '6': 217 case '7': 218 case '8': 219 case '9': 220 /* Escape sequences starting with a non-zero digit are backreferences, 221 unless there are insufficient brackets, in which case they are octal 222 escape sequences. Those sequences end on the first non-octal character 223 or when we overflow 0-255, whichever comes first. 
*/ 224 225 if (!isclass) { 226 const UChar* oldptr = ptr; 227 c -= '0'; 228 while ((ptr + 1 < patternEnd) && isASCIIDigit(ptr[1]) && c <= bracount) 229 c = c * 10 + *(++ptr) - '0'; 230 if (c <= bracount) { 231 c = -(ESC_REF + c); 232 break; 233 } 234 ptr = oldptr; /* Put the pointer back and fall through */ 235 } 236 237 /* Handle an octal number following \. If the first digit is 8 or 9, 238 this is not octal. */ 239 240 if ((c = *ptr) >= '8') 232 241 break; 233 } 234 ptr = oldptr; /* Put the pointer back and fall through */ 235 } 236 237 /* Handle an octal number following \. If the first digit is 8 or 9, 238 this is not octal. */ 239 240 if ((c = *ptr) >= '8') 241 break; 242 242 243 243 /* \0 always starts an octal number, but we may drop through to here with a 244 244 larger first octal digit. */ 245 246 case '0': { 247 c -= '0'; 248 int i; 249 for (i = 1; i <= 2; ++i) { 250 if (ptr + i >= patternEnd || ptr[i] < '0' || ptr[i] > '7') 251 break; 252 int cc = c * 8 + ptr[i] - '0'; 253 if (cc > 255) 254 break; 255 c = cc; 245 246 case '0': { 247 c -= '0'; 248 int i; 249 for (i = 1; i <= 2; ++i) { 250 if (ptr + i >= patternEnd || ptr[i] < '0' || ptr[i] > '7') 251 break; 252 int cc = c * 8 + ptr[i] - '0'; 253 if (cc > 255) 254 break; 255 c = cc; 256 } 257 ptr += i - 1; 258 break; 256 259 } 257 ptr += i - 1; 258 break; 259 } 260 case 'x': { 261 c = 0; 262 int i; 263 for (i = 1; i <= 2; ++i) { 264 if (ptr + i >= patternEnd || !isASCIIHexDigit(ptr[i])) { 265 c = 'x'; 266 i = 1; 267 break; 268 } 269 int cc = ptr[i]; 270 if (cc >= 'a') 271 cc -= 32; /* Convert to upper case */ 272 c = c * 16 + cc - ((cc < 'A') ? '0' : ('A' - 10)); 260 261 case 'x': { 262 c = 0; 263 int i; 264 for (i = 1; i <= 2; ++i) { 265 if (ptr + i >= patternEnd || !isASCIIHexDigit(ptr[i])) { 266 c = 'x'; 267 i = 1; 268 break; 269 } 270 int cc = ptr[i]; 271 if (cc >= 'a') 272 cc -= 32; /* Convert to upper case */ 273 c = c * 16 + cc - ((cc < 'A') ? 
'0' : ('A' - 10)); 274 } 275 ptr += i - 1; 276 break; 273 277 } 274 ptr += i - 1; 275 break; 276 } 277 case 'u': { 278 c = 0; 279 int i; 280 for (i = 1; i <= 4; ++i) { 281 if (ptr + i >= patternEnd || !isASCIIHexDigit(ptr[i])) { 282 c = 'u'; 283 i = 1; 284 break; 285 } 286 int cc = ptr[i]; 287 if (cc >= 'a') 288 cc -= 32; /* Convert to upper case */ 289 c = c * 16 + cc - ((cc < 'A') ? '0' : ('A' - 10)); 278 279 case 'u': { 280 c = 0; 281 int i; 282 for (i = 1; i <= 4; ++i) { 283 if (ptr + i >= patternEnd || !isASCIIHexDigit(ptr[i])) { 284 c = 'u'; 285 i = 1; 286 break; 287 } 288 int cc = ptr[i]; 289 if (cc >= 'a') 290 cc -= 32; /* Convert to upper case */ 291 c = c * 16 + cc - ((cc < 'A') ? '0' : ('A' - 10)); 292 } 293 ptr += i - 1; 294 break; 290 295 } 291 ptr += i - 1; 292 break; 293 294 /* Other special escapes not starting with a digit are straightforward */ 295 } 296 case 'c': 297 if (++ptr == patternEnd) { 298 *errorcodeptr = ERR2; 299 return 0; 296 297 case 'c': 298 if (++ptr == patternEnd) { 299 *errorcodeptr = ERR2; 300 return 0; 301 } 302 c = *ptr; 303 304 /* A letter is upper-cased; then the 0x40 bit is flipped. This coding 305 is ASCII-specific, but then the whole concept of \cx is ASCII-specific. */ 306 c = toASCIIUpper(c) ^ 0x40; 307 break; 300 308 } 301 c = *ptr;302 303 /* A letter is upper-cased; then the 0x40 bit is flipped. This coding304 is ASCII-specific, but then the whole concept of \cx is ASCII-specific. 
*/305 306 if (c >= 'a' && c <= 'z')307 c -= 32;308 c ^= 0x40;309 break;310 }311 309 } 312 310 … … 314 312 return c; 315 313 } 316 317 318 314 319 315 /************************************************* … … 332 328 */ 333 329 334 static bool is _counted_repeat(const UChar* p, const UChar* patternEnd)330 static bool isCountedRepeat(const UChar* p, const UChar* patternEnd) 335 331 { 336 332 if (p >= patternEnd || !isASCIIDigit(*p)) … … 356 352 } 357 353 358 359 354 /************************************************* 360 355 * Read repeat counts * … … 362 357 363 358 /* Read an item of the form {n,m} and return the values. This is called only 364 after is _counted_repeat() has confirmed that a repeat-count quantifier exists,359 after isCountedRepeat() has confirmed that a repeat-count quantifier exists, 365 360 so the syntax is guaranteed to be correct, but we need to check the values. 366 361 … … 376 371 */ 377 372 378 static const UChar* read _repeat_counts(const UChar* p, int* minp, int* maxp, ErrorCode* errorcodeptr)373 static const UChar* readRepeatCounts(const UChar* p, int* minp, int* maxp, ErrorCode* errorcodeptr) 379 374 { 380 375 int min = 0; … … 420 415 } 421 416 422 423 417 /************************************************* 424 418 * Find first significant op code * … … 427 421 /* This is called by several functions that scan a compiled expression looking 428 422 for a fixed first character, or an anchoring op code etc. It skips over things 429 that do not influence this. For some calls, a change of option is important. 430 For some calls, it makes sense to skip negative forward and all backward 431 assertions, and also the \b assertion; for others it does not. 423 that do not influence this. 
432 424 433 425 Arguments: 434 426 code pointer to the start of the group 435 skipassert true if certain assertions are to be skipped436 437 427 Returns: pointer to the first significant opcode 438 428 */ 439 429 440 static const u schar* firstSignificantOpCode(const uschar* code)430 static const unsigned char* firstSignificantOpcode(const unsigned char* code) 441 431 { 442 432 while (*code == OP_BRANUMBER) 443 code += OP_lengths[*code];433 code += 3; 444 434 return code; 445 435 } 446 436 447 static const u schar* firstSignificantOpCodeSkippingAssertions(const uschar* code)437 static const unsigned char* firstSignificantOpcodeSkippingAssertions(const unsigned char* code) 448 438 { 449 439 while (true) { 450 440 switch (*code) { 451 case OP_ASSERT_NOT:452 do {453 code += getOpcodeValueAtOffset(code, 1);454 } while (*code == OP_ALT);455 c ode += OP_lengths[*code];456 break;457 case OP_WORD_BOUNDARY:458 case OP_NOT_WORD_BOUNDARY:459 case OP_BRANUMBER:460 code += OP_lengths[*code];461 break;462 default:463 return code;441 case OP_ASSERT_NOT: 442 advanceToEndOfBracket(code); 443 code += 1 + LINK_SIZE; 444 break; 445 case OP_WORD_BOUNDARY: 446 case OP_NOT_WORD_BOUNDARY: 447 ++code; 448 break; 449 case OP_BRANUMBER: 450 code += 3; 451 break; 452 default: 453 return code; 464 454 } 465 455 } 466 ASSERT_NOT_REACHED();467 456 } 468 469 470 /*************************************************471 * Find the fixed length of a pattern *472 *************************************************/473 474 /* Scan a pattern and compute the fixed length of subject that will match it,475 if the length is fixed. 
This is needed for dealing with backward assertions.476 In UTF8 mode, the result is in characters rather than bytes.477 478 Arguments:479 code points to the start of the pattern (the bracket)480 options the compiling options481 482 Returns: the fixed length, or -1 if there is no fixed length,483 or -2 if \C was encountered484 */485 486 static int find_fixedlength(uschar* code, int options)487 {488 int length = -1;489 490 int branchlength = 0;491 uschar* cc = code + 1 + LINK_SIZE;492 493 /* Scan along the opcodes for this branch. If we get to the end of the494 branch, check the length against that of the other branches. */495 496 while (true) {497 int d;498 int op = *cc;499 if (op >= OP_BRA)500 op = OP_BRA;501 502 switch (op) {503 case OP_BRA:504 case OP_ONCE:505 d = find_fixedlength(cc, options);506 if (d < 0)507 return d;508 branchlength += d;509 do {510 cc += getOpcodeValueAtOffset(cc, 1);511 } while (*cc == OP_ALT);512 cc += 1 + LINK_SIZE;513 break;514 515 /* Reached end of a branch; if it's a ket it is the end of a nested516 call. If it's ALT it is an alternation in a nested call. If it is517 END it's the end of the outer call. All can be handled by the same code. 
*/518 519 case OP_ALT:520 case OP_KET:521 case OP_KETRMAX:522 case OP_KETRMIN:523 case OP_END:524 if (length < 0)525 length = branchlength;526 else if (length != branchlength)527 return -1;528 if (*cc != OP_ALT)529 return length;530 cc += 1 + LINK_SIZE;531 branchlength = 0;532 break;533 534 /* Skip over assertive subpatterns */535 536 case OP_ASSERT:537 case OP_ASSERT_NOT:538 do {539 cc += getOpcodeValueAtOffset(cc, 1);540 } while (*cc == OP_ALT);541 /* Fall through */542 543 /* Skip over things that don't match chars */544 545 case OP_BRANUMBER:546 case OP_CIRC:547 case OP_DOLL:548 case OP_NOT_WORD_BOUNDARY:549 case OP_WORD_BOUNDARY:550 cc += OP_lengths[*cc];551 break;552 553 /* Handle literal characters */554 555 case OP_CHAR:556 case OP_CHAR_IGNORING_CASE:557 case OP_NOT:558 branchlength++;559 cc += 2;560 while ((*cc & 0xc0) == 0x80)561 cc++;562 break;563 564 case OP_ASCII_CHAR:565 case OP_ASCII_LETTER_IGNORING_CASE:566 branchlength++;567 cc += 2;568 break;569 570 /* Handle exact repetitions. The count is already in characters, but we571 need to skip over a multibyte character in UTF8 mode. 
*/572 573 case OP_EXACT:574 branchlength += get2ByteOpcodeValueAtOffset(cc,1);575 cc += 4;576 while((*cc & 0x80) == 0x80)577 cc++;578 break;579 580 case OP_TYPEEXACT:581 branchlength += get2ByteOpcodeValueAtOffset(cc,1);582 cc += 4;583 break;584 585 /* Handle single-char matchers */586 587 case OP_NOT_DIGIT:588 case OP_DIGIT:589 case OP_NOT_WHITESPACE:590 case OP_WHITESPACE:591 case OP_NOT_WORDCHAR:592 case OP_WORDCHAR:593 case OP_NOT_NEWLINE:594 branchlength++;595 cc++;596 break;597 598 /* Check a class for variable quantification */599 600 case OP_XCLASS:601 cc += getOpcodeValueAtOffset(cc, 1) - 33;602 /* Fall through */603 604 case OP_CLASS:605 case OP_NCLASS:606 cc += 33;607 608 switch (*cc) {609 case OP_CRSTAR:610 case OP_CRMINSTAR:611 case OP_CRQUERY:612 case OP_CRMINQUERY:613 return -1;614 615 case OP_CRRANGE:616 case OP_CRMINRANGE:617 if (get2ByteOpcodeValueAtOffset(cc, 1) != get2ByteOpcodeValueAtOffset(cc, 3))618 return -1;619 branchlength += get2ByteOpcodeValueAtOffset(cc, 1);620 cc += 5;621 break;622 623 default:624 branchlength++;625 }626 break;627 628 /* Anything else is variable length */629 630 default:631 return -1;632 }633 }634 ASSERT_NOT_REACHED();635 }636 637 638 /*************************************************639 * Complete a callout item *640 *************************************************/641 642 /* A callout item contains the length of the next item in the pattern, which643 we can't fill in till after we have reached the relevant point. 
This is used644 for both automatic and manual callouts.645 646 Arguments:647 previous_callout points to previous callout item648 ptr current pattern pointer649 cd pointers to tables etc650 */651 652 static void complete_callout(uschar* previous_callout, const UChar* ptr, const CompileData& cd)653 {654 int length = ptr - cd.start_pattern - getOpcodeValueAtOffset(previous_callout, 2);655 putOpcodeValueAtOffset(previous_callout, 2 + LINK_SIZE, length);656 }657 658 659 457 660 458 /************************************************* … … 676 474 */ 677 475 678 static bool get _othercase_range(int* cptr, int d, int* ocptr, int* odptr)476 static bool getOthercaseRange(int* cptr, int d, int* ocptr, int* odptr) 679 477 { 680 478 int c, othercase = 0; 681 479 682 480 for (c = *cptr; c <= d; c++) { 683 if ((othercase = _pcre_ucp_othercase(c)) >= 0)481 if ((othercase = kjs_pcre_ucp_othercase(c)) >= 0) 684 482 break; 685 483 } … … 692 490 693 491 for (++c; c <= d; c++) { 694 if ( _pcre_ucp_othercase(c) != next)492 if (kjs_pcre_ucp_othercase(c) != next) 695 493 break; 696 494 next++; … … 717 515 */ 718 516 719 // FIXME: This should be removed as soon as all UTF8 uses are removed from PCRE 720 int _pcre_ord2utf8(int cvalue, uschar *buffer) 517 static int encodeUTF8(int cvalue, unsigned char *buffer) 721 518 { 722 519 int i; 723 for (i = 0; i < _pcre_utf8_table1_size; i++)724 if (cvalue <= _pcre_utf8_table1[i])520 for (i = 0; i < kjs_pcre_utf8_table1_size; i++) 521 if (cvalue <= kjs_pcre_utf8_table1[i]) 725 522 break; 726 523 buffer += i; … … 729 526 cvalue >>= 6; 730 527 } 731 *buffer = _pcre_utf8_table2[i] | cvalue;528 *buffer = kjs_pcre_utf8_table2[i] | cvalue; 732 529 return i + 1; 733 530 } … … 759 556 760 557 static bool 761 compileBranch(int options, int* brackets, u schar** codeptr,558 compileBranch(int options, int* brackets, unsigned char** codeptr, 762 559 const UChar** ptrptr, const UChar* patternEnd, ErrorCode* errorcodeptr, int *firstbyteptr, 763 560 int* reqbyteptr, 
CompileData& cd) … … 767 564 int bravalue = 0; 768 565 int reqvary, tempreqvary; 769 int after_manual_callout = 0;770 566 int c; 771 u schar* code = *codeptr;772 u schar* tempcode;567 unsigned char* code = *codeptr; 568 unsigned char* tempcode; 773 569 bool groupsetfirstbyte = false; 774 570 const UChar* ptr = *ptrptr; 775 571 const UChar* tempptr; 776 uschar* previous = NULL; 777 uschar* previous_callout = NULL; 778 uschar classbits[32]; 572 unsigned char* previous = NULL; 573 unsigned char classbits[32]; 779 574 780 575 bool class_utf8; 781 u schar* class_utf8data;782 u schar utf8_char[6];576 unsigned char* class_utf8data; 577 unsigned char utf8_char[6]; 783 578 784 579 /* Initialize no first byte, no required byte. REQ_UNSET means "no char … … 815 610 int subfirstbyte; 816 611 int mclength; 817 u schar mcbuffer[8];612 unsigned char mcbuffer[8]; 818 613 819 614 /* Next byte in the pattern */ … … 824 619 a quantifier. */ 825 620 826 bool is_quantifier = c == '*' || c == '+' || c == '?' || (c == '{' && is_counted_repeat(ptr + 1, patternEnd)); 827 828 if (!is_quantifier && previous_callout && after_manual_callout-- <= 0) { 829 complete_callout(previous_callout, ptr, cd); 830 previous_callout = NULL; 831 } 621 bool is_quantifier = c == '*' || c == '+' || c == '?' || (c == '{' && isCountedRepeat(ptr + 1, patternEnd)); 832 622 833 623 switch (c) { … … 922 712 bit map. */ 923 713 924 memset(classbits, 0, 32 * sizeof(u schar));714 memset(classbits, 0, 32 * sizeof(unsigned char)); 925 715 926 716 /* Process characters until ] is reached. 
The first pass … … 939 729 940 730 if (c == '\\') { 941 c = check _escape(&ptr, patternEnd, errorcodeptr, *brackets, true);731 c = checkEscape(&ptr, patternEnd, errorcodeptr, *brackets, true); 942 732 if (c < 0) { 943 733 class_charcount += 2; /* Greater than 1 is what matters */ … … 1006 796 if (d == '\\') { 1007 797 const UChar* oldptr = ptr; 1008 d = check _escape(&ptr, patternEnd, errorcodeptr, *brackets, true);798 d = checkEscape(&ptr, patternEnd, errorcodeptr, *brackets, true); 1009 799 1010 800 /* \X is literal X; any other special means the '-' was literal */ … … 1037 827 int cc = c; 1038 828 int origd = d; 1039 while (get _othercase_range(&cc, origd, &occ, &ocd)) {829 while (getOthercaseRange(&cc, origd, &occ, &ocd)) { 1040 830 if (occ >= c && ocd <= d) 1041 831 continue; /* Skip embedded ranges */ … … 1056 846 else { 1057 847 *class_utf8data++ = XCL_RANGE; 1058 class_utf8data += _pcre_ord2utf8(occ, class_utf8data);848 class_utf8data += encodeUTF8(occ, class_utf8data); 1059 849 } 1060 class_utf8data += _pcre_ord2utf8(ocd, class_utf8data);850 class_utf8data += encodeUTF8(ocd, class_utf8data); 1061 851 } 1062 852 } … … 1066 856 1067 857 *class_utf8data++ = XCL_RANGE; 1068 class_utf8data += _pcre_ord2utf8(c, class_utf8data);1069 class_utf8data += _pcre_ord2utf8(d, class_utf8data);858 class_utf8data += encodeUTF8(c, class_utf8data); 859 class_utf8data += encodeUTF8(d, class_utf8data); 1070 860 1071 861 /* With UCP support, we are done. 
Without UCP support, there is no … … 1104 894 class_utf8 = true; 1105 895 *class_utf8data++ = XCL_SINGLE; 1106 class_utf8data += _pcre_ord2utf8(c, class_utf8data);896 class_utf8data += encodeUTF8(c, class_utf8data); 1107 897 1108 898 if (options & IgnoreCaseOption) { 1109 899 int othercase; 1110 if ((othercase = _pcre_ucp_othercase(c)) >= 0) {900 if ((othercase = kjs_pcre_ucp_othercase(c)) >= 0) { 1111 901 *class_utf8data++ = XCL_SINGLE; 1112 class_utf8data += _pcre_ord2utf8(othercase, class_utf8data);902 class_utf8data += encodeUTF8(othercase, class_utf8data); 1113 903 } 1114 904 } … … 1198 988 /* Now fill in the complete length of the item */ 1199 989 1200 put OpcodeValueAtOffset(previous,1, code - previous);990 putLinkValue(previous + 1, code - previous); 1201 991 break; /* End of class handling */ 1202 992 } … … 1223 1013 if (!is_quantifier) 1224 1014 goto NORMAL_CHAR; 1225 ptr = read _repeat_counts(ptr+1, &repeat_min, &repeat_max, errorcodeptr);1015 ptr = readRepeatCounts(ptr + 1, &repeat_min, &repeat_max, errorcodeptr); 1226 1016 if (*errorcodeptr) 1227 1017 goto FAILED; … … 1261 1051 /* Save start of previous item, in case we have to move it up to make space 1262 1052 for an inserted OP_ONCE for the additional '+' extension. */ 1053 /* FIXME: Probably don't need this because we don't use OP_ONCE. 
*/ 1263 1054 1264 1055 tempcode = previous; … … 1289 1080 1290 1081 if (code[-1] & 0x80) { 1291 u schar *lastchar = code - 1;1082 unsigned char *lastchar = code - 1; 1292 1083 while((*lastchar & 0xc0) == 0x80) 1293 1084 lastchar--; … … 1335 1126 int prop_value = -1; 1336 1127 1337 u schar* oldcode = code;1128 unsigned char* oldcode = code; 1338 1129 code = previous; /* Usually overwrite previous item */ 1339 1130 … … 1358 1149 else { 1359 1150 *code++ = OP_UPTO + repeat_type; 1360 put2Byte OpcodeValueAtOffsetAndAdvance(code, 0, repeat_max);1151 put2ByteValueAndAdvance(code, repeat_max); 1361 1152 } 1362 1153 } … … 1375 1166 goto END_REPEAT; 1376 1167 *code++ = OP_UPTO + repeat_type; 1377 put2Byte OpcodeValueAtOffsetAndAdvance(code, 0, repeat_max - 1);1168 put2ByteValueAndAdvance(code, repeat_max - 1); 1378 1169 } 1379 1170 } … … 1384 1175 else { 1385 1176 *code++ = OP_EXACT + op_type; /* NB EXACT doesn't have repeat_type */ 1386 put2Byte OpcodeValueAtOffsetAndAdvance(code, 0, repeat_min);1177 put2ByteValueAndAdvance(code, repeat_min); 1387 1178 1388 1179 /* If the maximum is unlimited, insert an OP_STAR. Before doing so, … … 1421 1212 repeat_max -= repeat_min; 1422 1213 *code++ = OP_UPTO + repeat_type; 1423 put2Byte OpcodeValueAtOffsetAndAdvance(code, 0, repeat_max);1214 put2ByteValueAndAdvance(code, repeat_max); 1424 1215 } 1425 1216 } … … 1463 1254 else { 1464 1255 *code++ = OP_CRRANGE + repeat_type; 1465 put2Byte OpcodeValueAtOffsetAndAdvance(code, 0, repeat_min);1256 put2ByteValueAndAdvance(code, repeat_min); 1466 1257 if (repeat_max == -1) 1467 1258 repeat_max = 0; /* 2-byte encoding for max */ 1468 put2Byte OpcodeValueAtOffsetAndAdvance(code, 0, repeat_max);1259 put2ByteValueAndAdvance(code, repeat_max); 1469 1260 } 1470 1261 } … … 1473 1264 cases. 
*/ 1474 1265 1475 else if (*previous >= OP_BRA || *previous == OP_ONCE) {1266 else if (*previous >= OP_BRA) { 1476 1267 int ketoffset = 0; 1477 1268 int len = code - previous; 1478 u schar* bralink = NULL;1269 unsigned char* bralink = NULL; 1479 1270 1480 1271 /* If the maximum repeat count is unlimited, find the end of the bracket … … 1485 1276 1486 1277 if (repeat_max == -1) { 1487 uschar* ket = previous; 1488 do { 1489 ket += getOpcodeValueAtOffset(ket, 1); 1490 } while (*ket != OP_KET); 1278 const unsigned char* ket = previous; 1279 advanceToEndOfBracket(ket); 1491 1280 ketoffset = code - ket; 1492 1281 } … … 1540 1329 int offset = (!bralink) ? 0 : previous - bralink; 1541 1330 bralink = previous; 1542 put OpcodeValueAtOffsetAndAdvance(previous, 0, offset);1331 putLinkValueAllowZeroAndAdvance(previous, offset); 1543 1332 } 1544 1333 … … 1581 1370 int offset = (!bralink) ? 0 : code - bralink; 1582 1371 bralink = code; 1583 put OpcodeValueAtOffsetAndAdvance(code, 0, offset);1372 putLinkValueAllowZeroAndAdvance(code, offset); 1584 1373 } 1585 1374 … … 1593 1382 while (bralink) { 1594 1383 int offset = code - bralink + 1; 1595 u schar* bra = code - offset;1596 int oldlinkoffset = get OpcodeValueAtOffset(bra,1);1597 bralink = oldlinkoffset ? bralink - oldlinkoffset : 0;1384 unsigned char* bra = code - offset; 1385 int oldlinkoffset = getLinkValueAllowZero(bra + 1); 1386 bralink = (!oldlinkoffset) ? 
0 : bralink - oldlinkoffset; 1598 1387 *code++ = OP_KET; 1599 put OpcodeValueAtOffsetAndAdvance(code, 0, offset);1600 put OpcodeValueAtOffset(bra,1, offset);1388 putLinkValueAndAdvance(code, offset); 1389 putLinkValue(bra + 1, offset); 1601 1390 } 1602 1391 } … … 1639 1428 if (*(++ptr) == '?') { 1640 1429 switch (*(++ptr)) { 1641 case ':': /* Non-extracting bracket */1642 bravalue = OP_BRA;1643 ptr++;1644 break;1645 1646 case '=': /* Positive lookahead */1647 bravalue = OP_ASSERT;1648 ptr++;1649 break;1650 1651 case '!': /* Negative lookahead */1652 bravalue = OP_ASSERT_NOT;1653 ptr++;1654 break;1655 1430 case ':': /* Non-extracting bracket */ 1431 bravalue = OP_BRA; 1432 ptr++; 1433 break; 1434 1435 case '=': /* Positive lookahead */ 1436 bravalue = OP_ASSERT; 1437 ptr++; 1438 break; 1439 1440 case '!': /* Negative lookahead */ 1441 bravalue = OP_ASSERT_NOT; 1442 ptr++; 1443 break; 1444 1656 1445 /* Character after (? not specially recognized */ 1657 1658 default: /* Option setting */1659 *errorcodeptr = ERR12;1660 goto FAILED;1661 }1446 1447 default: 1448 *errorcodeptr = ERR12; 1449 goto FAILED; 1450 } 1662 1451 } 1663 1452 … … 1670 1459 bravalue = OP_BRA + EXTRACT_BASIC_MAX + 1; 1671 1460 code[1 + LINK_SIZE] = OP_BRANUMBER; 1672 put2Byte OpcodeValueAtOffset(code, 2+LINK_SIZE, *brackets);1461 put2ByteValue(code + 2 + LINK_SIZE, *brackets); 1673 1462 skipbytes = 3; 1674 1463 } … … 1682 1471 new setting for the ims options if they have changed. */ 1683 1472 1684 previous = (bravalue >= OP_ ONCE) ? code : 0;1473 previous = (bravalue >= OP_BRAZERO) ? 
code : 0; 1685 1474 *code = bravalue; 1686 1475 tempcode = code; … … 1715 1504 groupsetfirstbyte = false; 1716 1505 1717 if (bravalue >= OP_BRA || bravalue == OP_ONCE) {1506 if (bravalue >= OP_BRA) { 1718 1507 /* If we have not yet set a firstbyte in this branch, take it from the 1719 1508 subpattern, remembering that it was set here so that a repeat of more … … 1775 1564 case '\\': 1776 1565 tempptr = ptr; 1777 c = check _escape(&ptr, patternEnd, errorcodeptr, *brackets, false);1566 c = checkEscape(&ptr, patternEnd, errorcodeptr, *brackets, false); 1778 1567 1779 1568 /* Handle metacharacters introduced by \. For ones like \d, the ESC_ values … … 1802 1591 previous = code; 1803 1592 *code++ = OP_REF; 1804 put2Byte OpcodeValueAtOffsetAndAdvance(code, 0, number);1593 put2ByteValueAndAdvance(code, number); 1805 1594 } 1806 1595 … … 1838 1627 } 1839 1628 } else { 1840 mclength = _pcre_ord2utf8(c, mcbuffer);1629 mclength = encodeUTF8(c, mcbuffer); 1841 1630 1842 1631 *code++ = (options & IgnoreCaseOption) ? 
OP_CHAR_IGNORING_CASE : OP_CHAR; … … 1888 1677 return false; 1889 1678 } 1890 1891 1892 1893 1679 1894 1680 /************************************************* … … 1919 1705 1920 1706 static bool 1921 compileBracket(int options, int* brackets, u schar** codeptr,1707 compileBracket(int options, int* brackets, unsigned char** codeptr, 1922 1708 const UChar** ptrptr, const UChar* patternEnd, ErrorCode* errorcodeptr, int skipbytes, 1923 1709 int* firstbyteptr, int* reqbyteptr, CompileData& cd) 1924 1710 { 1925 1711 const UChar* ptr = *ptrptr; 1926 u schar* code = *codeptr;1927 u schar* last_branch = code;1928 u schar* start_bracket = code;1712 unsigned char* code = *codeptr; 1713 unsigned char* last_branch = code; 1714 unsigned char* start_bracket = code; 1929 1715 int firstbyte = REQ_UNSET; 1930 1716 int reqbyte = REQ_UNSET; … … 1932 1718 /* Offset is set zero to mark that this bracket is still open */ 1933 1719 1934 put OpcodeValueAtOffset(code,1, 0);1720 putLinkValueAllowZero(code + 1, 0); 1935 1721 code += 1 + LINK_SIZE + skipbytes; 1936 1722 … … 1998 1784 int length = code - last_branch; 1999 1785 do { 2000 int prev_length = get OpcodeValueAtOffset(last_branch,1);2001 put OpcodeValueAtOffset(last_branch,1, length);1786 int prev_length = getLinkValueAllowZero(last_branch + 1); 1787 putLinkValue(last_branch + 1, length); 2002 1788 length = prev_length; 2003 1789 last_branch -= length; … … 2007 1793 2008 1794 *code = OP_KET; 2009 put OpcodeValueAtOffset(code,1, code - start_bracket);1795 putLinkValue(code + 1, code - start_bracket); 2010 1796 code += 1 + LINK_SIZE; 2011 1797 … … 2025 1811 2026 1812 *code = OP_ALT; 2027 put OpcodeValueAtOffset(code,1, code - last_branch);1813 putLinkValue(code + 1, code - last_branch); 2028 1814 last_branch = code; 2029 1815 code += 1 + LINK_SIZE; … … 2032 1818 ASSERT_NOT_REACHED(); 2033 1819 } 2034 2035 1820 2036 1821 /************************************************* … … 2051 1836 */ 2052 1837 2053 static bool branchIsAnchored(const 
u schar* code)1838 static bool branchIsAnchored(const unsigned char* code) 2054 1839 { 2055 const u schar* scode = firstSignificantOpCode(code);1840 const unsigned char* scode = firstSignificantOpcode(code); 2056 1841 int op = *scode; 2057 1842 2058 1843 /* Brackets */ 2059 if (op >= OP_BRA || op == OP_ASSERT || op == OP_ONCE)1844 if (op >= OP_BRA || op == OP_ASSERT) 2060 1845 return bracketIsAnchored(scode); 2061 1846 … … 2064 1849 } 2065 1850 2066 static bool bracketIsAnchored(const u schar* code)1851 static bool bracketIsAnchored(const unsigned char* code) 2067 1852 { 2068 1853 do { 2069 1854 if (!branchIsAnchored(code + 1 + LINK_SIZE)) 2070 1855 return false; 2071 code += get OpcodeValueAtOffset(code,1);1856 code += getLinkValue(code + 1); 2072 1857 } while (*code == OP_ALT); /* Loop for each alternative */ 2073 1858 return true; … … 2096 1881 */ 2097 1882 2098 static bool branchNeedsLineStart(const u schar* code, unsigned captureMap, unsigned backrefMap)1883 static bool branchNeedsLineStart(const unsigned char* code, unsigned captureMap, unsigned backrefMap) 2099 1884 { 2100 const u schar* scode = firstSignificantOpCode(code);1885 const unsigned char* scode = firstSignificantOpcode(code); 2101 1886 int op = *scode; 2102 1887 … … 2105 1890 int captureNum = op - OP_BRA; 2106 1891 if (captureNum > EXTRACT_BASIC_MAX) 2107 captureNum = get2Byte OpcodeValueAtOffset(scode,2 + LINK_SIZE);1892 captureNum = get2ByteValue(scode + 2 + LINK_SIZE); 2108 1893 int bracketMask = (captureNum < 32) ? 
(1 << captureNum) : 1; 2109 1894 return bracketNeedsLineStart(scode, captureMap | bracketMask, backrefMap); … … 2111 1896 2112 1897 /* Other brackets */ 2113 if (op == OP_BRA || op == OP_ASSERT || op == OP_ONCE)1898 if (op == OP_BRA || op == OP_ASSERT) 2114 1899 return bracketNeedsLineStart(scode, captureMap, backrefMap); 2115 1900 … … 2124 1909 } 2125 1910 2126 static bool bracketNeedsLineStart(const u schar* code, unsigned captureMap, unsigned backrefMap)1911 static bool bracketNeedsLineStart(const unsigned char* code, unsigned captureMap, unsigned backrefMap) 2127 1912 { 2128 1913 do { 2129 1914 if (!branchNeedsLineStart(code + 1 + LINK_SIZE, captureMap, backrefMap)) 2130 1915 return false; 2131 code += get OpcodeValueAtOffset(code,1);1916 code += getLinkValue(code + 1); 2132 1917 } while (*code == OP_ALT); /* Loop for each alternative */ 2133 1918 return true; … … 2154 1939 */ 2155 1940 2156 static int branchFindFirstAssertedCharacter(const u schar* code, bool inassert)1941 static int branchFindFirstAssertedCharacter(const unsigned char* code, bool inassert) 2157 1942 { 2158 const u schar* scode = firstSignificantOpCodeSkippingAssertions(code);1943 const unsigned char* scode = firstSignificantOpcodeSkippingAssertions(code); 2159 1944 int op = *scode; 2160 1945 … … 2168 1953 case OP_BRA: 2169 1954 case OP_ASSERT: 2170 case OP_ONCE:2171 1955 return bracketFindFirstAssertedCharacter(scode, op == OP_ASSERT); 2172 1956 … … 2187 1971 } 2188 1972 2189 static int bracketFindFirstAssertedCharacter(const u schar* code, bool inassert)1973 static int bracketFindFirstAssertedCharacter(const unsigned char* code, bool inassert) 2190 1974 { 2191 1975 int c = -1; … … 2198 1982 else if (c != d) 2199 1983 return -1; 2200 code += get OpcodeValueAtOffset(code,1);1984 code += getLinkValue(code + 1); 2201 1985 } while (*code == OP_ALT); 2202 1986 return c; … … 2218 2002 unsigned brastackptr = 0; 2219 2003 int brastack[BRASTACK_SIZE]; 2220 u schar bralenstack[BRASTACK_SIZE];2004 
unsigned char bralenstack[BRASTACK_SIZE]; 2221 2005 int bracount = 0; 2222 2006 … … 2233 2017 2234 2018 case '\\': 2235 c = check _escape(&ptr, patternEnd, &errorcode, bracount, false);2019 c = checkEscape(&ptr, patternEnd, &errorcode, bracount, false); 2236 2020 if (errorcode != 0) 2237 2021 return -1; … … 2244 2028 if (c > 127) { 2245 2029 int i; 2246 for (i = 0; i < _pcre_utf8_table1_size; i++)2247 if (c <= _pcre_utf8_table1[i]) break;2030 for (i = 0; i < kjs_pcre_utf8_table1_size; i++) 2031 if (c <= kjs_pcre_utf8_table1[i]) break; 2248 2032 length += i; 2249 2033 lastitemlength += i; … … 2267 2051 cd.top_backref = refnum; 2268 2052 length += 2; /* For single back reference */ 2269 if (safelyCheckNextChar(ptr, patternEnd, '{') && is _counted_repeat(ptr + 2, patternEnd)) {2270 ptr = read _repeat_counts(ptr + 2, &minRepeats, &maxRepeats, &errorcode);2053 if (safelyCheckNextChar(ptr, patternEnd, '{') && isCountedRepeat(ptr + 2, patternEnd)) { 2054 ptr = readRepeatCounts(ptr + 2, &minRepeats, &maxRepeats, &errorcode); 2271 2055 if (errorcode) 2272 2056 return -1; … … 2299 2083 2300 2084 case '{': 2301 if (!is _counted_repeat(ptr+1, patternEnd))2085 if (!isCountedRepeat(ptr + 1, patternEnd)) 2302 2086 goto NORMAL_CHAR; 2303 ptr = read _repeat_counts(ptr+1, &minRepeats, &maxRepeats, &errorcode);2087 ptr = readRepeatCounts(ptr + 1, &minRepeats, &maxRepeats, &errorcode); 2304 2088 if (errorcode != 0) 2305 2089 return -1; … … 2366 2150 2367 2151 if (*ptr == '\\') { 2368 c = check _escape(&ptr, patternEnd, &errorcode, bracount, true);2152 c = checkEscape(&ptr, patternEnd, &errorcode, bracount, true); 2369 2153 if (errorcode != 0) 2370 2154 return -1; … … 2401 2185 if (safelyCheckNextChar(ptr, patternEnd, '\\')) { 2402 2186 ptr++; 2403 d = check _escape(&ptr, patternEnd, &errorcode, bracount, true);2187 d = checkEscape(&ptr, patternEnd, &errorcode, bracount, true); 2404 2188 if (errorcode != 0) 2405 2189 return -1; … … 2422 2206 2423 2207 if ((d > 255 || (ignoreCase && d > 
127))) { 2424 u schar buffer[6];2208 unsigned char buffer[6]; 2425 2209 if (!class_utf8) /* Allow for XCLASS overhead */ 2426 2210 { … … 2439 2223 int cc = c; 2440 2224 int origd = d; 2441 while (get _othercase_range(&cc, origd, &occ, &ocd)) {2225 while (getOthercaseRange(&cc, origd, &occ, &ocd)) { 2442 2226 if (occ >= c && ocd <= d) 2443 2227 continue; /* Skip embedded */ … … 2456 2240 /* An extra item is needed */ 2457 2241 2458 length += 1 + _pcre_ord2utf8(occ, buffer) +2459 ((occ == ocd) ? 0 : _pcre_ord2utf8(ocd, buffer));2242 length += 1 + encodeUTF8(occ, buffer) + 2243 ((occ == ocd) ? 0 : encodeUTF8(ocd, buffer)); 2460 2244 } 2461 2245 } … … 2463 2247 /* The length of the (possibly extended) range */ 2464 2248 2465 length += 1 + _pcre_ord2utf8(c, buffer) + _pcre_ord2utf8(d, buffer);2249 length += 1 + encodeUTF8(c, buffer) + encodeUTF8(d, buffer); 2466 2250 } 2467 2251 … … 2475 2259 else { 2476 2260 if ((c > 255 || (ignoreCase && c > 127))) { 2477 u schar buffer[6];2261 unsigned char buffer[6]; 2478 2262 class_optcount = 10; /* Ensure > 1 */ 2479 2263 if (!class_utf8) /* Allow for XCLASS overhead */ … … 2482 2266 length += LINK_SIZE + 2; 2483 2267 } 2484 length += (ignoreCase ? 2 : 1) * (1 + _pcre_ord2utf8(c, buffer));2268 length += (ignoreCase ? 2 : 1) * (1 + encodeUTF8(c, buffer)); 2485 2269 } 2486 2270 } … … 2508 2292 we also need extra for wrapping the whole thing in a sub-pattern. */ 2509 2293 2510 if (safelyCheckNextChar(ptr, patternEnd, '{') && is _counted_repeat(ptr+2, patternEnd)) {2511 ptr = read _repeat_counts(ptr+2, &minRepeats, &maxRepeats, &errorcode);2294 if (safelyCheckNextChar(ptr, patternEnd, '{') && isCountedRepeat(ptr + 2, patternEnd)) { 2295 ptr = readRepeatCounts(ptr + 2, &minRepeats, &maxRepeats, &errorcode); 2512 2296 if (errorcode != 0) 2513 2297 return -1; … … 2538 2322 if (safelyCheckNextChar(ptr, patternEnd, '?')) { 2539 2323 switch (c = (ptr + 2 < patternEnd ? 
ptr[2] : 0)) { 2540 2541 2542 2543 2324 /* Non-referencing groups and lookaheads just move the pointer on, and 2325 then behave like a non-special bracket, except that they don't increment 2326 the count of extracting brackets. Ditto for the "once only" bracket, 2327 which is in Perl from version 5.005. */ 2544 2328 2545 2329 case ':': … … 2549 2333 break; 2550 2334 2551 2552 2553 2554 2335 /* Else loop checking valid options until ) is met. Anything else is an 2336 error. If we are without any brackets, i.e. at top level, the settings 2337 act as if specified in the options, so massage the options immediately. 2338 This is for backward compatibility with Perl 5.004. */ 2555 2339 2556 2340 default: … … 2605 2389 duplength = 0; 2606 2390 2607 /* Leave ptr at the final char; for read _repeat_counts this happens2391 /* Leave ptr at the final char; for readRepeatCounts this happens 2608 2392 automatically; for the others we need an increment. */ 2609 2393 2610 if ((ptr + 1 < patternEnd) && (c = ptr[1]) == '{' && is _counted_repeat(ptr+2, patternEnd)) {2611 ptr = read _repeat_counts(ptr+2, &minRepeats, &maxRepeats, &errorcode);2394 if ((ptr + 1 < patternEnd) && (c = ptr[1]) == '{' && isCountedRepeat(ptr + 2, patternEnd)) { 2395 ptr = readRepeatCounts(ptr + 2, &minRepeats, &maxRepeats, &errorcode); 2612 2396 if (errorcode) 2613 2397 return -1; … … 2672 2456 if (c > 127) { 2673 2457 int i; 2674 for (i = 0; i < _pcre_utf8_table1_size; i++)2675 if (c <= _pcre_utf8_table1[i])2458 for (i = 0; i < kjs_pcre_utf8_table1_size; i++) 2459 if (c <= kjs_pcre_utf8_table1[i]) 2676 2460 break; 2677 2461 length += i; … … 2709 2493 */ 2710 2494 2711 static JSRegExp* returnError(ErrorCode errorcode, const char** errorptr)2495 static inline JSRegExp* returnError(ErrorCode errorcode, const char** errorptr) 2712 2496 { 2713 *errorptr = error _text(errorcode);2497 *errorptr = errorText(errorcode); 2714 2498 return 0; 2715 2499 } … … 2746 2530 passed around in the compile data block. 
*/ 2747 2531 2748 const u schar* codeStart = (const uschar*)(re + 1);2532 const unsigned char* codeStart = (const unsigned char*)(re + 1); 2749 2533 cd.start_code = codeStart; 2750 2534 cd.start_pattern = (const UChar*)pattern; … … 2756 2540 const UChar* ptr = (const UChar*)pattern; 2757 2541 const UChar* patternEnd = pattern + patternLength; 2758 u schar* code = (uschar*)codeStart;2542 unsigned char* code = (unsigned char*)codeStart; 2759 2543 int firstbyte, reqbyte; 2760 2544 int bracketCount = 0; -
trunk/JavaScriptCore/pcre/pcre_exec.cpp
r28627 r28793 74 74 struct { 75 75 const UChar* subjectPtr; 76 const u schar* instructionPtr;77 int offset _top;76 const unsigned char* instructionPtr; 77 int offsetTop; 78 78 const UChar* subpatternStart; 79 79 } args; … … 84 84 store local variables on the current MatchFrame. */ 85 85 struct { 86 const u schar* data;87 const u schar* startOfRepeatingBracket;86 const unsigned char* data; 87 const unsigned char* startOfRepeatingBracket; 88 88 const UChar* subjectPtrAtStartOfInstruction; // Several instrutions stash away a subjectPtr here for later compare 89 const u schar* instructionPtrAtStartOfOnce;89 const unsigned char* instructionPtrAtStartOfOnce; 90 90 91 int repeat _othercase;91 int repeatOthercase; 92 92 93 93 int ctype; … … 98 98 int number; 99 99 int offset; 100 int save _offset1;101 int save _offset2;102 int save _offset3;100 int saveOffset1; 101 int saveOffset2; 102 int saveOffset3; 103 103 104 104 const UChar* subpatternStart; … … 110 110 111 111 struct MatchData { 112 int* offset _vector; /* Offset vector */113 int offset _end; /* One past the end */114 int offset _max; /* The maximum usable for return data */115 bool offset _overflow; /* Set if too many extractions */116 const UChar* start _subject; /* Start of the subject string */117 const UChar* end _subject; /* End of the subject string */118 const UChar* end _match_ptr; /* Subject position at end match */119 int end _offset_top; /* Highwater mark at end of match */112 int* offsetVector; /* Offset vector */ 113 int offsetEnd; /* One past the end */ 114 int offsetMax; /* The maximum usable for return data */ 115 bool offsetOverflow; /* Set if too many extractions */ 116 const UChar* startSubject; /* Start of the subject string */ 117 const UChar* endSubject; /* End of the subject string */ 118 const UChar* endMatchPtr; /* Subject position at end match */ 119 int endOffsetTop; /* Highwater mark at end of match */ 120 120 bool multiline; 121 121 bool ignoreCase; … … 123 123 124 124 /* Non-error 
returns from the match() function. Error returns are externally 125 defined PCRE_ERROR_xxxcodes, which are all negative. */125 defined error codes, which are all negative. */ 126 126 127 127 #define MATCH_MATCH 1 128 128 #define MATCH_NOMATCH 0 129 130 /* The maximum remaining length of subject we are prepared to search for a 131 req_byte match. */ 132 133 #define REQ_BYTE_MAX 1000 134 135 /* The below limit restricts the number of recursive match calls in order to 136 limit the maximum amount of storage. 137 138 This limit is tied to the size of MatchFrame. Right now we allow PCRE to allocate up 139 to MATCH_RECURSION_LIMIT - 16 * sizeof(MatchFrame) bytes of "stack" space before we give up. 140 Currently that's 100000 - 16 * (23 * 4) ~ 90MB. */ 141 142 #define MATCH_RECURSION_LIMIT 100000 129 143 130 144 #ifdef DEBUG … … 139 153 p points to characters 140 154 length number to print 141 is _subject true if printing from within md.start_subject142 md pointer to matching data block, if is _subject is true155 isSubject true if printing from within md.startSubject 156 md pointer to matching data block, if isSubject is true 143 157 */ 144 158 145 static void pchars(const UChar* p, int length, bool is _subject, const MatchData& md)159 static void pchars(const UChar* p, int length, bool isSubject, const MatchData& md) 146 160 { 147 if (is _subject && length > md.end_subject - p)148 length = md.end _subject - p;161 if (isSubject && length > md.endSubject - p) 162 length = md.endSubject - p; 149 163 while (length-- > 0) { 150 164 int c; … … 159 173 #endif 160 174 161 162 163 175 /************************************************* 164 176 * Match a back-reference * … … 177 189 */ 178 190 179 static bool match _ref(int offset, const UChar* subjectPtr, int length, const MatchData& md)191 static bool matchRef(int offset, const UChar* subjectPtr, int length, const MatchData& md) 180 192 { 181 const UChar* p = md.start _subject + md.offset_vector[offset];193 const UChar* p = 
md.startSubject + md.offsetVector[offset]; 182 194 183 195 #ifdef DEBUG 184 if (subjectPtr >= md.end _subject)196 if (subjectPtr >= md.endSubject) 185 197 printf("matching subject <null>"); 186 198 else { … … 195 207 /* Always fail if not enough characters left */ 196 208 197 if (length > md.end _subject - subjectPtr)209 if (length > md.endSubject - subjectPtr) 198 210 return false; 199 211 … … 203 215 while (length-- > 0) { 204 216 UChar c = *p++; 205 int othercase = _pcre_ucp_othercase(c);217 int othercase = kjs_pcre_ucp_othercase(c); 206 218 UChar d = *subjectPtr++; 207 219 if (c != d && othercase != d) … … 239 251 #endif 240 252 241 #define CHECK_RECURSION_LIMIT \ 242 if (stack.size >= MATCH_LIMIT_RECURSION) \ 243 return matchError(JSRegExpErrorRecursionLimit, stack); 244 245 #define RECURSE_WITH_RETURN_NUMBER(num) \ 246 CHECK_RECURSION_LIMIT \ 253 #define RECURSIVE_MATCH_COMMON(num) \ 254 if (stack.size >= MATCH_RECURSION_LIMIT) \ 255 return matchError(JSRegExpErrorRecursionLimit, stack); \ 247 256 goto RECURSE;\ 248 RRETURN_##num: 257 RRETURN_##num: \ 258 stack.popCurrentFrame(); 249 259 250 260 #define RECURSIVE_MATCH(num, ra, rb) \ 251 {\ 252 stack.pushNewFrame((ra), (rb), RMATCH_WHERE(num)); \ 253 RECURSE_WITH_RETURN_NUMBER(num) \ 254 stack.popCurrentFrame(); \ 255 } 261 do { \ 262 stack.pushNewFrame((ra), (rb), RMATCH_WHERE(num)); \ 263 RECURSIVE_MATCH_COMMON(num) \ 264 } while (0) 256 265 257 266 #define RECURSIVE_MATCH_STARTNG_NEW_GROUP(num, ra, rb) \ 258 {\ 259 stack.pushNewFrame((ra), (rb), RMATCH_WHERE(num)); \ 260 startNewGroup(stack.currentFrame); \ 261 RECURSE_WITH_RETURN_NUMBER(num) \ 262 stack.popCurrentFrame(); \ 263 } 267 do { \ 268 stack.pushNewFrame((ra), (rb), RMATCH_WHERE(num)); \ 269 startNewGroup(stack.currentFrame); \ 270 RECURSIVE_MATCH_COMMON(num) \ 271 } while (0) 264 272 265 273 #define RRETURN goto RRETURN_LABEL 266 274 267 #define RRETURN_NO_MATCH \ 268 {\ 269 is_match = false;\ 270 RRETURN;\ 271 } 275 #define RRETURN_NO_MATCH do 
{ isMatch = false; RRETURN; } while (0) 272 276 273 277 /************************************************* … … 285 289 subjectPtr pointer in subject 286 290 instructionPtr position in code 287 offset _top current top pointer291 offsetTop current top pointer 288 292 md pointer to "static" info for the match 289 293 290 294 Returns: MATCH_MATCH if matched ) these values are >= 0 291 295 MATCH_NOMATCH if failed to match ) 292 a negative PCRE_ERROR_xxxvalue if aborted by an error condition296 a negative error value if aborted by an error condition 293 297 (e.g. stopped by repeated call or recursion limit) 294 298 */ … … 322 326 } 323 327 324 inline void pushNewFrame(const u schar* instructionPtr, const UChar* subpatternStart, ReturnLocation returnLocation)328 inline void pushNewFrame(const unsigned char* instructionPtr, const UChar* subpatternStart, ReturnLocation returnLocation) 325 329 { 326 330 MatchFrame* newframe = allocateNextFrame(); … … 328 332 329 333 newframe->args.subjectPtr = currentFrame->args.subjectPtr; 330 newframe->args.offset _top = currentFrame->args.offset_top;334 newframe->args.offsetTop = currentFrame->args.offsetTop; 331 335 newframe->args.instructionPtr = instructionPtr; 332 336 newframe->args.subpatternStart = subpatternStart; … … 362 366 if there are extra bytes. This is called when we know we are in UTF-8 mode. 
*/ 363 367 364 static inline void getUTF8CharAndIncrementLength(int& c, const u schar* subjectPtr, int& len)368 static inline void getUTF8CharAndIncrementLength(int& c, const unsigned char* subjectPtr, int& len) 365 369 { 366 370 c = *subjectPtr; 367 371 if ((c & 0xc0) == 0xc0) { 368 int gcaa = _pcre_utf8_table4[c & 0x3f]; /* Number of additional bytes */372 int gcaa = kjs_pcre_utf8_table4[c & 0x3f]; /* Number of additional bytes */ 369 373 int gcss = 6 * gcaa; 370 c = (c & _pcre_utf8_table3[gcaa]) << gcss;374 c = (c & kjs_pcre_utf8_table3[gcaa]) << gcss; 371 375 for (int gcii = 1; gcii <= gcaa; gcii++) { 372 376 gcss -= 6; … … 402 406 } 403 407 404 static int match(const UChar* subjectPtr, const u schar* instructionPtr, int offset_top, MatchData& md)408 static int match(const UChar* subjectPtr, const unsigned char* instructionPtr, int offsetTop, MatchData& md) 405 409 { 406 int is _match = false;410 int isMatch = false; 407 411 int min; 408 412 bool minimize = false; /* Initialization not really needed, but some compilers think so. */ … … 413 417 #ifdef USE_COMPUTED_GOTO_FOR_MATCH_OPCODE_LOOP 414 418 #define EMIT_JUMP_TABLE_ENTRY(opcode) &&LABEL_OP_##opcode, 415 static void* opcode _jump_table[256] = { FOR_EACH_OPCODE(EMIT_JUMP_TABLE_ENTRY) };419 static void* opcodeJumpTable[256] = { FOR_EACH_OPCODE(EMIT_JUMP_TABLE_ENTRY) }; 416 420 #undef EMIT_JUMP_TABLE_ENTRY 417 421 #endif … … 419 423 /* One-time setup of the opcode jump table. 
*/ 420 424 #ifdef USE_COMPUTED_GOTO_FOR_MATCH_OPCODE_LOOP 421 for (int i = 255; !opcode _jump_table[i]; i--)422 opcode _jump_table[i] = &&CAPTURING_BRACKET;425 for (int i = 255; !opcodeJumpTable[i]; i--) 426 opcodeJumpTable[i] = &&CAPTURING_BRACKET; 423 427 #endif 424 428 … … 432 436 stack.currentFrame->args.subjectPtr = subjectPtr; 433 437 stack.currentFrame->args.instructionPtr = instructionPtr; 434 stack.currentFrame->args.offset _top = offset_top;438 stack.currentFrame->args.offsetTop = offsetTop; 435 439 stack.currentFrame->args.subpatternStart = 0; 436 440 startNewGroup(stack.currentFrame); … … 449 453 #ifdef USE_COMPUTED_GOTO_FOR_MATCH_OPCODE_LOOP 450 454 #define BEGIN_OPCODE(opcode) LABEL_OP_##opcode 451 #define NEXT_OPCODE goto *opcode _jump_table[*stack.currentFrame->args.instructionPtr]455 #define NEXT_OPCODE goto *opcodeJumpTable[*stack.currentFrame->args.instructionPtr] 452 456 #else 453 457 #define BEGIN_OPCODE(opcode) case OP_##opcode … … 468 472 do { 469 473 RECURSIVE_MATCH_STARTNG_NEW_GROUP(2, stack.currentFrame->args.instructionPtr + 1 + LINK_SIZE, stack.currentFrame->args.subpatternStart); 470 if (is _match)474 if (isMatch) 471 475 RRETURN; 472 stack.currentFrame->args.instructionPtr += get OpcodeValueAtOffset(stack.currentFrame->args.instructionPtr,1);476 stack.currentFrame->args.instructionPtr += getLinkValue(stack.currentFrame->args.instructionPtr + 1); 473 477 } while (*stack.currentFrame->args.instructionPtr == OP_ALT); 474 478 DPRINTF(("bracket 0 failed\n")); … … 484 488 485 489 BEGIN_OPCODE(END): 486 md.end _match_ptr = stack.currentFrame->args.subjectPtr; /* Record where we ended */487 md.end _offset_top = stack.currentFrame->args.offset_top; /* and how many extracts were taken */488 is _match = true;490 md.endMatchPtr = stack.currentFrame->args.subjectPtr; /* Record where we ended */ 491 md.endOffsetTop = stack.currentFrame->args.offsetTop; /* and how many extracts were taken */ 492 isMatch = true; 489 493 RRETURN; 490 494 … … 498 502 do 
{ 499 503 RECURSIVE_MATCH_STARTNG_NEW_GROUP(6, stack.currentFrame->args.instructionPtr + 1 + LINK_SIZE, NULL); 500 if (is _match)504 if (isMatch) 501 505 break; 502 stack.currentFrame->args.instructionPtr += get OpcodeValueAtOffset(stack.currentFrame->args.instructionPtr,1);506 stack.currentFrame->args.instructionPtr += getLinkValue(stack.currentFrame->args.instructionPtr + 1); 503 507 } while (*stack.currentFrame->args.instructionPtr == OP_ALT); 504 508 if (*stack.currentFrame->args.instructionPtr == OP_KET) … … 508 512 mark, since extracts may have been taken during the assertion. */ 509 513 510 moveOpcodePtrPastAnyAlternateBranches(stack.currentFrame->args.instructionPtr);514 advanceToEndOfBracket(stack.currentFrame->args.instructionPtr); 511 515 stack.currentFrame->args.instructionPtr += 1 + LINK_SIZE; 512 stack.currentFrame->args.offset _top = md.end_offset_top;516 stack.currentFrame->args.offsetTop = md.endOffsetTop; 513 517 NEXT_OPCODE; 514 518 … … 518 522 do { 519 523 RECURSIVE_MATCH_STARTNG_NEW_GROUP(7, stack.currentFrame->args.instructionPtr + 1 + LINK_SIZE, NULL); 520 if (is _match)524 if (isMatch) 521 525 RRETURN_NO_MATCH; 522 stack.currentFrame->args.instructionPtr += get OpcodeValueAtOffset(stack.currentFrame->args.instructionPtr,1);526 stack.currentFrame->args.instructionPtr += getLinkValue(stack.currentFrame->args.instructionPtr + 1); 523 527 } while (*stack.currentFrame->args.instructionPtr == OP_ALT); 524 528 … … 526 530 NEXT_OPCODE; 527 531 528 /* "Once" brackets are like assertion brackets except that after a match, 529 the point in the subject string is not moved back. Thus there can never be 530 a move back into the brackets. Friedl calls these "atomic" subpatterns. 531 Check the alternative branches in turn - the matching won't pass the KET 532 for this kind of subpattern. If any one branch matches, we carry on as at 533 the end of a normal bracket, leaving the subject pointer. 
*/ 534 535 BEGIN_OPCODE(ONCE): 536 stack.currentFrame->locals.instructionPtrAtStartOfOnce = stack.currentFrame->args.instructionPtr; 537 stack.currentFrame->locals.subjectPtrAtStartOfInstruction = stack.currentFrame->args.subjectPtr; 538 539 do { 540 RECURSIVE_MATCH_STARTNG_NEW_GROUP(9, stack.currentFrame->args.instructionPtr + 1 + LINK_SIZE, stack.currentFrame->args.subpatternStart); 541 if (is_match) 542 break; 543 stack.currentFrame->args.instructionPtr += getOpcodeValueAtOffset(stack.currentFrame->args.instructionPtr, 1); 544 } while (*stack.currentFrame->args.instructionPtr == OP_ALT); 545 546 /* If hit the end of the group (which could be repeated), fail */ 547 548 if (*stack.currentFrame->args.instructionPtr != OP_ONCE && *stack.currentFrame->args.instructionPtr != OP_ALT) 532 /* An alternation is the end of a branch; scan along to find the end of the 533 bracketed group and go to there. */ 534 535 BEGIN_OPCODE(ALT): 536 advanceToEndOfBracket(stack.currentFrame->args.instructionPtr); 537 NEXT_OPCODE; 538 539 /* BRAZERO and BRAMINZERO occur just before a bracket group, indicating 540 that it may occur zero times. It may repeat infinitely, or not at all - 541 i.e. it could be ()* or ()? in the pattern. Brackets with fixed upper 542 repeat limits are compiled as a number of copies, with the optional ones 543 preceded by BRAZERO or BRAMINZERO. */ 544 545 BEGIN_OPCODE(BRAZERO): { 546 stack.currentFrame->locals.startOfRepeatingBracket = stack.currentFrame->args.instructionPtr + 1; 547 RECURSIVE_MATCH_STARTNG_NEW_GROUP(14, stack.currentFrame->locals.startOfRepeatingBracket, stack.currentFrame->args.subpatternStart); 548 if (isMatch) 549 549 RRETURN; 550 551 /* Continue as from after the assertion, updating the offsets high water 552 mark, since extracts may have been taken. 
*/ 553 554 moveOpcodePtrPastAnyAlternateBranches(stack.currentFrame->args.instructionPtr); 555 556 stack.currentFrame->args.offset_top = md.end_offset_top; 557 stack.currentFrame->args.subjectPtr = md.end_match_ptr; 550 advanceToEndOfBracket(stack.currentFrame->locals.startOfRepeatingBracket); 551 stack.currentFrame->args.instructionPtr = stack.currentFrame->locals.startOfRepeatingBracket + 1 + LINK_SIZE; 552 NEXT_OPCODE; 553 } 554 555 BEGIN_OPCODE(BRAMINZERO): { 556 stack.currentFrame->locals.startOfRepeatingBracket = stack.currentFrame->args.instructionPtr + 1; 557 advanceToEndOfBracket(stack.currentFrame->locals.startOfRepeatingBracket); 558 RECURSIVE_MATCH_STARTNG_NEW_GROUP(15, stack.currentFrame->locals.startOfRepeatingBracket + 1 + LINK_SIZE, stack.currentFrame->args.subpatternStart); 559 if (isMatch) 560 RRETURN; 561 stack.currentFrame->args.instructionPtr++; 562 NEXT_OPCODE; 563 } 564 565 /* End of a group, repeated or non-repeating. If we are at the end of 566 an assertion "group", stop matching and return MATCH_MATCH, but record the 567 current high water mark for use by positive assertions. Do this also 568 for the "once" (not-backup up) groups. 
*/ 569 570 BEGIN_OPCODE(KET): 571 BEGIN_OPCODE(KETRMIN): 572 BEGIN_OPCODE(KETRMAX): 573 stack.currentFrame->locals.instructionPtrAtStartOfOnce = stack.currentFrame->args.instructionPtr - getLinkValue(stack.currentFrame->args.instructionPtr + 1); 574 stack.currentFrame->args.subpatternStart = stack.currentFrame->locals.subpatternStart; 575 stack.currentFrame->locals.subpatternStart = stack.currentFrame->previousFrame->args.subpatternStart; 576 577 if (*stack.currentFrame->locals.instructionPtrAtStartOfOnce == OP_ASSERT || *stack.currentFrame->locals.instructionPtrAtStartOfOnce == OP_ASSERT_NOT) { 578 md.endOffsetTop = stack.currentFrame->args.offsetTop; 579 isMatch = true; 580 RRETURN; 581 } 582 583 /* In all other cases except a conditional group we have to check the 584 group number back at the start and if necessary complete handling an 585 extraction by setting the offsets and bumping the high water mark. */ 586 587 stack.currentFrame->locals.number = *stack.currentFrame->locals.instructionPtrAtStartOfOnce - OP_BRA; 588 589 /* For extended extraction brackets (large number), we have to fish out 590 the number from a dummy opcode at the start. */ 591 592 if (stack.currentFrame->locals.number > EXTRACT_BASIC_MAX) 593 stack.currentFrame->locals.number = get2ByteValue(stack.currentFrame->locals.instructionPtrAtStartOfOnce + 2 + LINK_SIZE); 594 stack.currentFrame->locals.offset = stack.currentFrame->locals.number << 1; 595 596 #ifdef DEBUG 597 printf("end bracket %d", stack.currentFrame->locals.number); 598 printf("\n"); 599 #endif 600 601 /* Test for a numbered group. This includes groups called as a result 602 of recursion. Note that whole-pattern recursion is coded as a recurse 603 into group 0, so it won't be picked up here. Instead, we catch it when 604 the OP_END is reached. 
*/
      605
      606      if (stack.currentFrame->locals.number > 0) {
      607          if (stack.currentFrame->locals.offset >= md.offsetMax)
      608              md.offsetOverflow = true;
      609          else {
      610              md.offsetVector[stack.currentFrame->locals.offset] =
      611                  md.offsetVector[md.offsetEnd - stack.currentFrame->locals.number];
      612              md.offsetVector[stack.currentFrame->locals.offset+1] = stack.currentFrame->args.subjectPtr - md.startSubject;
      613              if (stack.currentFrame->args.offsetTop <= stack.currentFrame->locals.offset)
      614                  stack.currentFrame->args.offsetTop = stack.currentFrame->locals.offset + 2;
      615          }
      616      }
 558  617
 559  618      /* For a non-repeating ket, just continue at this level. This also
… …
 569  628
 570  629      /* The repeating kets try the rest of the pattern or restart from the
 571           preceding bracket, in the appropriate order. We need to reset any options
 572           that changed within the bracket before re-running it, so check the next
 573           opcode. */
 574
 575           if (*stack.currentFrame->args.instructionPtr == OP_KETRMIN) {
 576               RECURSIVE_MATCH(10, stack.currentFrame->args.instructionPtr + 1 + LINK_SIZE, stack.currentFrame->args.subpatternStart);
 577               if (is_match)
 578                   RRETURN;
 579               RECURSIVE_MATCH_STARTNG_NEW_GROUP(11, stack.currentFrame->locals.instructionPtrAtStartOfOnce, stack.currentFrame->args.subpatternStart);
 580               if (is_match)
 581                   RRETURN;
 582           } else { /* OP_KETRMAX */
 583               RECURSIVE_MATCH_STARTNG_NEW_GROUP(12, stack.currentFrame->locals.instructionPtrAtStartOfOnce, stack.currentFrame->args.subpatternStart);
 584               if (is_match)
 585                   RRETURN;
 586               RECURSIVE_MATCH(13, stack.currentFrame->args.instructionPtr + 1 + LINK_SIZE, stack.currentFrame->args.subpatternStart);
 587               if (is_match)
 588                   RRETURN;
 589           }
 590           RRETURN;
 591
 592           /* An alternation is the end of a branch; scan along to find the end of the
 593           bracketed group and go to there. */
 594
 595       BEGIN_OPCODE(ALT):
 596           moveOpcodePtrPastAnyAlternateBranches(stack.currentFrame->args.instructionPtr);
 597           NEXT_OPCODE;
 598
 599           /* BRAZERO and BRAMINZERO occur just before a bracket group, indicating
 600           that it may occur zero times. It may repeat infinitely, or not at all -
 601           i.e. it could be ()* or ()? in the pattern. Brackets with fixed upper
 602           repeat limits are compiled as a number of copies, with the optional ones
 603           preceded by BRAZERO or BRAMINZERO. */
 604
 605       BEGIN_OPCODE(BRAZERO): {
 606           stack.currentFrame->locals.startOfRepeatingBracket = stack.currentFrame->args.instructionPtr + 1;
 607           RECURSIVE_MATCH_STARTNG_NEW_GROUP(14, stack.currentFrame->locals.startOfRepeatingBracket, stack.currentFrame->args.subpatternStart);
 608           if (is_match)
 609               RRETURN;
 610           moveOpcodePtrPastAnyAlternateBranches(stack.currentFrame->locals.startOfRepeatingBracket);
 611           stack.currentFrame->args.instructionPtr = stack.currentFrame->locals.startOfRepeatingBracket + 1 + LINK_SIZE;
 612           NEXT_OPCODE;
 613       }
 614
 615       BEGIN_OPCODE(BRAMINZERO): {
 616           stack.currentFrame->locals.startOfRepeatingBracket = stack.currentFrame->args.instructionPtr + 1;
 617           moveOpcodePtrPastAnyAlternateBranches(stack.currentFrame->locals.startOfRepeatingBracket);
 618           RECURSIVE_MATCH_STARTNG_NEW_GROUP(15, stack.currentFrame->locals.startOfRepeatingBracket + 1 + LINK_SIZE, stack.currentFrame->args.subpatternStart);
 619           if (is_match)
 620               RRETURN;
 621           stack.currentFrame->args.instructionPtr++;
 622           NEXT_OPCODE;
 623       }
 624
 625           /* End of a group, repeated or non-repeating. If we are at the end of
 626           an assertion "group", stop matching and return MATCH_MATCH, but record the
 627           current high water mark for use by positive assertions. Do this also
 628           for the "once" (not-backup up) groups. */
 629
 630       BEGIN_OPCODE(KET):
 631       BEGIN_OPCODE(KETRMIN):
 632       BEGIN_OPCODE(KETRMAX):
 633           stack.currentFrame->locals.instructionPtrAtStartOfOnce = stack.currentFrame->args.instructionPtr - getOpcodeValueAtOffset(stack.currentFrame->args.instructionPtr, 1);
 634           stack.currentFrame->args.subpatternStart = stack.currentFrame->locals.subpatternStart;
 635           stack.currentFrame->locals.subpatternStart = stack.currentFrame->previousFrame->args.subpatternStart;
 636
 637           if (*stack.currentFrame->locals.instructionPtrAtStartOfOnce == OP_ASSERT || *stack.currentFrame->locals.instructionPtrAtStartOfOnce == OP_ASSERT_NOT || *stack.currentFrame->locals.instructionPtrAtStartOfOnce == OP_ONCE) {
 638               md.end_match_ptr = stack.currentFrame->args.subjectPtr; /* For ONCE */
 639               md.end_offset_top = stack.currentFrame->args.offset_top;
 640               is_match = true;
 641               RRETURN;
 642           }
 643
 644           /* In all other cases except a conditional group we have to check the
 645           group number back at the start and if necessary complete handling an
 646           extraction by setting the offsets and bumping the high water mark. */
 647
 648           stack.currentFrame->locals.number = *stack.currentFrame->locals.instructionPtrAtStartOfOnce - OP_BRA;
 649
 650           /* For extended extraction brackets (large number), we have to fish out
 651           the number from a dummy opcode at the start. */
 652
 653           if (stack.currentFrame->locals.number > EXTRACT_BASIC_MAX)
 654               stack.currentFrame->locals.number = get2ByteOpcodeValueAtOffset(stack.currentFrame->locals.instructionPtrAtStartOfOnce, 2+LINK_SIZE);
 655           stack.currentFrame->locals.offset = stack.currentFrame->locals.number << 1;
 656
 657       #ifdef DEBUG
 658           printf("end bracket %d", stack.currentFrame->locals.number);
 659           printf("\n");
 660       #endif
 661
 662           /* Test for a numbered group. This includes groups called as a result
 663           of recursion. Note that whole-pattern recursion is coded as a recurse
 664           into group 0, so it won't be picked up here. Instead, we catch it when
 665           the OP_END is reached. */
 666
 667           if (stack.currentFrame->locals.number > 0) {
 668               if (stack.currentFrame->locals.offset >= md.offset_max)
 669                   md.offset_overflow = true;
 670               else {
 671                   md.offset_vector[stack.currentFrame->locals.offset] =
 672                       md.offset_vector[md.offset_end - stack.currentFrame->locals.number];
 673                   md.offset_vector[stack.currentFrame->locals.offset+1] = stack.currentFrame->args.subjectPtr - md.start_subject;
 674                   if (stack.currentFrame->args.offset_top <= stack.currentFrame->locals.offset)
 675                       stack.currentFrame->args.offset_top = stack.currentFrame->locals.offset + 2;
 676               }
 677           }
 678
 679           /* For a non-repeating ket, just continue at this level. This also
 680           happens for a repeating ket if no characters were matched in the group.
 681           This is the forcible breaking of infinite loops as implemented in Perl
 682           5.005. If there is an options reset, it will get obeyed in the normal
 683           course of events. */
 684
 685           if (*stack.currentFrame->args.instructionPtr == OP_KET || stack.currentFrame->args.subjectPtr == stack.currentFrame->locals.subjectPtrAtStartOfInstruction) {
 686               stack.currentFrame->args.instructionPtr += 1 + LINK_SIZE;
 687               NEXT_OPCODE;
 688           }
 689
 690           /* The repeating kets try the rest of the pattern or restart from the
 691  630      preceding bracket, in the appropriate order. */
 692  631
 693  632      if (*stack.currentFrame->args.instructionPtr == OP_KETRMIN) {
 694  633          RECURSIVE_MATCH(16, stack.currentFrame->args.instructionPtr + 1 + LINK_SIZE, stack.currentFrame->args.subpatternStart);
 695               if (is_match)
      634          if (isMatch)
 696  635              RRETURN;
 697  636          RECURSIVE_MATCH_STARTNG_NEW_GROUP(17, stack.currentFrame->locals.instructionPtrAtStartOfOnce, stack.currentFrame->args.subpatternStart);
 698               if (is_match)
      637          if (isMatch)
 699  638              RRETURN;
 700  639      } else { /* OP_KETRMAX */
 701  640          RECURSIVE_MATCH_STARTNG_NEW_GROUP(18, stack.currentFrame->locals.instructionPtrAtStartOfOnce, stack.currentFrame->args.subpatternStart);
 702               if (is_match)
      641          if (isMatch)
 703  642              RRETURN;
 704  643          RECURSIVE_MATCH(19, stack.currentFrame->args.instructionPtr + 1 + LINK_SIZE, stack.currentFrame->args.subpatternStart);
 705               if (is_match)
      644          if (isMatch)
 706  645              RRETURN;
 707  646      }
… …
 711  650
 712  651  BEGIN_OPCODE(CIRC):
 713           if (stack.currentFrame->args.subjectPtr != md.start_subject && (!md.multiline || !isNewline(stack.currentFrame->args.subjectPtr[-1])))
      652      if (stack.currentFrame->args.subjectPtr != md.startSubject && (!md.multiline || !isNewline(stack.currentFrame->args.subjectPtr[-1])))
 714  653          RRETURN_NO_MATCH;
 715  654      stack.currentFrame->args.instructionPtr++;
… …
 719  658
 720  659  BEGIN_OPCODE(DOLL):
 721           if (stack.currentFrame->args.subjectPtr < md.end_subject && (!md.multiline || !isNewline(*stack.currentFrame->args.subjectPtr)))
      660      if (stack.currentFrame->args.subjectPtr < md.endSubject && (!md.multiline || !isNewline(*stack.currentFrame->args.subjectPtr)))
 722  661          RRETURN_NO_MATCH;
 723  662      stack.currentFrame->args.instructionPtr++;
… …
 731  670      bool previousCharIsWordChar = false;
 732  671
 733           if (stack.currentFrame->args.subjectPtr > md.start_subject)
      672      if (stack.currentFrame->args.subjectPtr > md.startSubject)
 734  673          previousCharIsWordChar = isWordChar(stack.currentFrame->args.subjectPtr[-1]);
 735           if (stack.currentFrame->args.subjectPtr < md.end_subject)
      674      if (stack.currentFrame->args.subjectPtr < md.endSubject)
 736  675          currentCharIsWordChar = isWordChar(*stack.currentFrame->args.subjectPtr);
 737  676
… …
 746  685
 747  686  BEGIN_OPCODE(NOT_NEWLINE):
 748           if (stack.currentFrame->args.subjectPtr >= md.end_subject)
      687      if (stack.currentFrame->args.subjectPtr >= md.endSubject)
 749  688          RRETURN_NO_MATCH;
 750  689      if (isNewline(*stack.currentFrame->args.subjectPtr++))
… …
 754  693
 755  694  BEGIN_OPCODE(NOT_DIGIT):
 756           if (stack.currentFrame->args.subjectPtr >= md.end_subject)
      695      if (stack.currentFrame->args.subjectPtr >= md.endSubject)
 757  696          RRETURN_NO_MATCH;
 758  697      if (isASCIIDigit(*stack.currentFrame->args.subjectPtr++))
… …
 762  701
 763  702  BEGIN_OPCODE(DIGIT):
 764           if (stack.currentFrame->args.subjectPtr >= md.end_subject)
      703      if (stack.currentFrame->args.subjectPtr >= md.endSubject)
 765  704          RRETURN_NO_MATCH;
 766  705      if (!isASCIIDigit(*stack.currentFrame->args.subjectPtr++))
… …
 770  709
 771  710  BEGIN_OPCODE(NOT_WHITESPACE):
 772           if (stack.currentFrame->args.subjectPtr >= md.end_subject)
      711      if (stack.currentFrame->args.subjectPtr >= md.endSubject)
 773  712          RRETURN_NO_MATCH;
 774  713      if (isSpaceChar(*stack.currentFrame->args.subjectPtr++))
… …
 778  717
 779  718  BEGIN_OPCODE(WHITESPACE):
 780           if (stack.currentFrame->args.subjectPtr >= md.end_subject)
      719      if (stack.currentFrame->args.subjectPtr >= md.endSubject)
 781  720          RRETURN_NO_MATCH;
 782  721      if (!isSpaceChar(*stack.currentFrame->args.subjectPtr++))
… …
 786  725
 787  726  BEGIN_OPCODE(NOT_WORDCHAR):
 788           if (stack.currentFrame->args.subjectPtr >= md.end_subject)
      727      if (stack.currentFrame->args.subjectPtr >= md.endSubject)
 789  728          RRETURN_NO_MATCH;
 790  729      if (isWordChar(*stack.currentFrame->args.subjectPtr++))
… …
 794  733
 795  734  BEGIN_OPCODE(WORDCHAR):
 796           if (stack.currentFrame->args.subjectPtr >= md.end_subject)
      735      if (stack.currentFrame->args.subjectPtr >= md.endSubject)
 797  736          RRETURN_NO_MATCH;
 798  737      if (!isWordChar(*stack.currentFrame->args.subjectPtr++))
… …
 810  749
 811  750  BEGIN_OPCODE(REF):
 812           stack.currentFrame->locals.offset = get2ByteOpcodeValueAtOffset(stack.currentFrame->args.instructionPtr,1) << 1; /* Doubled ref number */
      751      stack.currentFrame->locals.offset = get2ByteValue(stack.currentFrame->args.instructionPtr + 1) << 1; /* Doubled ref number */
 813  752      stack.currentFrame->args.instructionPtr += 3; /* Advance past item */
 814  753
… …
 818  757      minima. */
 819  758
 820           if (stack.currentFrame->locals.offset >= stack.currentFrame->args.offset_top || md.offset_vector[stack.currentFrame->locals.offset] < 0)
      759      if (stack.currentFrame->locals.offset >= stack.currentFrame->args.offsetTop || md.offsetVector[stack.currentFrame->locals.offset] < 0)
 821  760          stack.currentFrame->locals.length = 0;
 822  761      else
 823               stack.currentFrame->locals.length = md.offset_vector[stack.currentFrame->locals.offset+1] - md.offset_vector[stack.currentFrame->locals.offset];
      762          stack.currentFrame->locals.length = md.offsetVector[stack.currentFrame->locals.offset+1] - md.offsetVector[stack.currentFrame->locals.offset];
 824  763
 825  764      /* Set up for repetition, or handle the non-repeated case */
… …
 838  777      case OP_CRMINRANGE:
 839  778          minimize = (*stack.currentFrame->args.instructionPtr == OP_CRMINRANGE);
 840               min = get2ByteOpcodeValueAtOffset(stack.currentFrame->args.instructionPtr,1);
 841               stack.currentFrame->locals.max = get2ByteOpcodeValueAtOffset(stack.currentFrame->args.instructionPtr,3);
      779          min = get2ByteValue(stack.currentFrame->args.instructionPtr + 1);
      780          stack.currentFrame->locals.max = get2ByteValue(stack.currentFrame->args.instructionPtr + 3);
 842  781          if (stack.currentFrame->locals.max == 0)
 843  782              stack.currentFrame->locals.max = INT_MAX;
… …
 846  785
 847  786      default: /* No repeat follows */
 848               if (!match_ref(stack.currentFrame->locals.offset, stack.currentFrame->args.subjectPtr, stack.currentFrame->locals.length, md))
      787          if (!matchRef(stack.currentFrame->locals.offset, stack.currentFrame->args.subjectPtr, stack.currentFrame->locals.length, md))
 849  788              RRETURN_NO_MATCH;
 850  789          stack.currentFrame->args.subjectPtr += stack.currentFrame->locals.length;
… …
 861  800
 862  801      for (int i = 1; i <= min; i++) {
 863               if (!match_ref(stack.currentFrame->locals.offset, stack.currentFrame->args.subjectPtr, stack.currentFrame->locals.length, md))
      802          if (!matchRef(stack.currentFrame->locals.offset, stack.currentFrame->args.subjectPtr, stack.currentFrame->locals.length, md))
 864  803              RRETURN_NO_MATCH;
 865  804          stack.currentFrame->args.subjectPtr += stack.currentFrame->locals.length;
… …
 877  816      for (stack.currentFrame->locals.fi = min;; stack.currentFrame->locals.fi++) {
 878  817          RECURSIVE_MATCH(20, stack.currentFrame->args.instructionPtr, stack.currentFrame->args.subpatternStart);
 879               if (is_match)
      818          if (isMatch)
 880  819              RRETURN;
 881               if (stack.currentFrame->locals.fi >= stack.currentFrame->locals.max || !match_ref(stack.currentFrame->locals.offset, stack.currentFrame->args.subjectPtr, stack.currentFrame->locals.length, md))
      820          if (stack.currentFrame->locals.fi >= stack.currentFrame->locals.max || !matchRef(stack.currentFrame->locals.offset, stack.currentFrame->args.subjectPtr, stack.currentFrame->locals.length, md))
 882  821              RRETURN;
 883  822          stack.currentFrame->args.subjectPtr += stack.currentFrame->locals.length;
… …
 891  830      stack.currentFrame->locals.subjectPtrAtStartOfInstruction = stack.currentFrame->args.subjectPtr;
 892  831      for (int i = min; i < stack.currentFrame->locals.max; i++) {
 893               if (!match_ref(stack.currentFrame->locals.offset, stack.currentFrame->args.subjectPtr, stack.currentFrame->locals.length, md))
      832          if (!matchRef(stack.currentFrame->locals.offset, stack.currentFrame->args.subjectPtr, stack.currentFrame->locals.length, md))
 894  833              break;
 895  834          stack.currentFrame->args.subjectPtr += stack.currentFrame->locals.length;
… …
 897  836      while (stack.currentFrame->args.subjectPtr >= stack.currentFrame->locals.subjectPtrAtStartOfInstruction) {
 898  837          RECURSIVE_MATCH(21, stack.currentFrame->args.instructionPtr, stack.currentFrame->args.subpatternStart);
 899               if (is_match)
      838          if (isMatch)
 900  839              RRETURN;
 901  840          stack.currentFrame->args.subjectPtr -= stack.currentFrame->locals.length;
… …
 934  873      case OP_CRMINRANGE:
 935  874          minimize = (*stack.currentFrame->args.instructionPtr == OP_CRMINRANGE);
 936               min = get2ByteOpcodeValueAtOffset(stack.currentFrame->args.instructionPtr,1);
 937               stack.currentFrame->locals.max = get2ByteOpcodeValueAtOffset(stack.currentFrame->args.instructionPtr,3);
      875          min = get2ByteValue(stack.currentFrame->args.instructionPtr + 1);
      876          stack.currentFrame->locals.max = get2ByteValue(stack.currentFrame->args.instructionPtr + 3);
 938  877          if (stack.currentFrame->locals.max == 0)
 939  878              stack.currentFrame->locals.max = INT_MAX;
… …
 949  888
 950  889      for (int i = 1; i <= min; i++) {
 951               if (stack.currentFrame->args.subjectPtr >= md.end_subject)
      890          if (stack.currentFrame->args.subjectPtr >= md.endSubject)
 952  891              RRETURN_NO_MATCH;
 953  892          int c = *stack.currentFrame->args.subjectPtr++;
… …
 972  911      for (stack.currentFrame->locals.fi = min;; stack.currentFrame->locals.fi++) {
 973  912          RECURSIVE_MATCH(22, stack.currentFrame->args.instructionPtr, stack.currentFrame->args.subpatternStart);
 974               if (is_match)
      913          if (isMatch)
 975  914              RRETURN;
 976               if (stack.currentFrame->locals.fi >= stack.currentFrame->locals.max || stack.currentFrame->args.subjectPtr >= md.end_subject)
      915          if (stack.currentFrame->locals.fi >= stack.currentFrame->locals.max || stack.currentFrame->args.subjectPtr >= md.endSubject)
 977  916              RRETURN;
 978  917          int c = *stack.currentFrame->args.subjectPtr++;
… …
 992  931
 993  932      for (int i = min; i < stack.currentFrame->locals.max; i++) {
 994               if (stack.currentFrame->args.subjectPtr >= md.end_subject)
      933          if (stack.currentFrame->args.subjectPtr >= md.endSubject)
 995  934              break;
 996  935          int c = *stack.currentFrame->args.subjectPtr;
… …
1006  945      for (;;) {
1007  946          RECURSIVE_MATCH(24, stack.currentFrame->args.instructionPtr, stack.currentFrame->args.subpatternStart);
1008               if (is_match)
      947          if (isMatch)
1009  948              RRETURN;
1010  949          if (stack.currentFrame->args.subjectPtr-- == stack.currentFrame->locals.subjectPtrAtStartOfInstruction)
… …
1020  959  BEGIN_OPCODE(XCLASS):
1021  960      stack.currentFrame->locals.data = stack.currentFrame->args.instructionPtr + 1 + LINK_SIZE; /* Save for matching */
1022           stack.currentFrame->args.instructionPtr += getOpcodeValueAtOffset(stack.currentFrame->args.instructionPtr,1); /* Advance past the item */
      961      stack.currentFrame->args.instructionPtr += getLinkValue(stack.currentFrame->args.instructionPtr + 1); /* Advance past the item */
1023  962
1024  963      switch (*stack.currentFrame->args.instructionPtr) {
… …
1035  974      case OP_CRMINRANGE:
1036  975          minimize = (*stack.currentFrame->args.instructionPtr == OP_CRMINRANGE);
1037               min = get2ByteOpcodeValueAtOffset(stack.currentFrame->args.instructionPtr,1);
1038               stack.currentFrame->locals.max = get2ByteOpcodeValueAtOffset(stack.currentFrame->args.instructionPtr,3);
      976          min = get2ByteValue(stack.currentFrame->args.instructionPtr + 1);
      977          stack.currentFrame->locals.max = get2ByteValue(stack.currentFrame->args.instructionPtr + 3);
1039  978          if (stack.currentFrame->locals.max == 0)
1040  979              stack.currentFrame->locals.max = INT_MAX;
… …
1044  983      default: /* No repeat follows */
1045  984          min = stack.currentFrame->locals.max = 1;
1046           }
      985      }
1047  986
1048  987      /* First, ensure the minimum number of matches are present. */
1049  988
1050  989      for (int i = 1; i <= min; i++) {
1051               if (stack.currentFrame->args.subjectPtr >= md.end_subject)
      990          if (stack.currentFrame->args.subjectPtr >= md.endSubject)
1052  991              RRETURN_NO_MATCH;
1053  992          int c = *stack.currentFrame->args.subjectPtr++;
1054               if (!_pcre_xclass(c, stack.currentFrame->locals.data))
      993          if (!kjs_pcre_xclass(c, stack.currentFrame->locals.data))
1055  994              RRETURN_NO_MATCH;
1056  995      }
… …
1068 1007      for (stack.currentFrame->locals.fi = min;; stack.currentFrame->locals.fi++) {
1069 1008          RECURSIVE_MATCH(26, stack.currentFrame->args.instructionPtr, stack.currentFrame->args.subpatternStart);
1070               if (is_match)
     1009          if (isMatch)
1071 1010              RRETURN;
1072               if (stack.currentFrame->locals.fi >= stack.currentFrame->locals.max || stack.currentFrame->args.subjectPtr >= md.end_subject)
     1011          if (stack.currentFrame->locals.fi >= stack.currentFrame->locals.max || stack.currentFrame->args.subjectPtr >= md.endSubject)
1073 1012              RRETURN;
1074 1013          int c = *stack.currentFrame->args.subjectPtr++;
1075               if (!_pcre_xclass(c, stack.currentFrame->locals.data))
     1014          if (!kjs_pcre_xclass(c, stack.currentFrame->locals.data))
1076 1015              RRETURN;
1077 1016      }
… …
1084 1023      stack.currentFrame->locals.subjectPtrAtStartOfInstruction = stack.currentFrame->args.subjectPtr;
1085 1024      for (int i = min; i < stack.currentFrame->locals.max; i++) {
1086               if (stack.currentFrame->args.subjectPtr >= md.end_subject)
     1025          if (stack.currentFrame->args.subjectPtr >= md.endSubject)
1087 1026              break;
1088 1027          int c = *stack.currentFrame->args.subjectPtr;
1089               if (!_pcre_xclass(c, stack.currentFrame->locals.data))
     1028          if (!kjs_pcre_xclass(c, stack.currentFrame->locals.data))
1090 1029              break;
1091 1030          ++stack.currentFrame->args.subjectPtr;
… …
1093 1032      for(;;) {
1094 1033          RECURSIVE_MATCH(27, stack.currentFrame->args.instructionPtr, stack.currentFrame->args.subpatternStart);
1095               if (is_match)
     1034          if (isMatch)
1096 1035              RRETURN;
1097 1036          if (stack.currentFrame->args.subjectPtr-- == stack.currentFrame->locals.subjectPtrAtStartOfInstruction)
… …
1110 1049      getUTF8CharAndIncrementLength(stack.currentFrame->locals.fc, stack.currentFrame->args.instructionPtr, stack.currentFrame->locals.length);
1111 1050      stack.currentFrame->args.instructionPtr += stack.currentFrame->locals.length;
1112           if (stack.currentFrame->args.subjectPtr >= md.end_subject)
     1051      if (stack.currentFrame->args.subjectPtr >= md.endSubject)
1113 1052          RRETURN_NO_MATCH;
1114 1053      if (stack.currentFrame->locals.fc != *stack.currentFrame->args.subjectPtr++)
… …
1123 1062      getUTF8CharAndIncrementLength(stack.currentFrame->locals.fc, stack.currentFrame->args.instructionPtr, stack.currentFrame->locals.length);
1124 1063      stack.currentFrame->args.instructionPtr += stack.currentFrame->locals.length;
1125           if (stack.currentFrame->args.subjectPtr >= md.end_subject)
     1064      if (stack.currentFrame->args.subjectPtr >= md.endSubject)
1126 1065          RRETURN_NO_MATCH;
1127 1066      int dc = *stack.currentFrame->args.subjectPtr++;
1128           if (stack.currentFrame->locals.fc != dc && _pcre_ucp_othercase(stack.currentFrame->locals.fc) != dc)
     1067      if (stack.currentFrame->locals.fc != dc && kjs_pcre_ucp_othercase(stack.currentFrame->locals.fc) != dc)
1129 1068          RRETURN_NO_MATCH;
1130 1069      NEXT_OPCODE;
… …
1134 1073
1135 1074  BEGIN_OPCODE(ASCII_CHAR):
1136           if (md.end_subject == stack.currentFrame->args.subjectPtr)
     1075      if (md.endSubject == stack.currentFrame->args.subjectPtr)
1137 1076          RRETURN_NO_MATCH;
1138 1077      if (*stack.currentFrame->args.subjectPtr != stack.currentFrame->args.instructionPtr[1])
… …
1145 1084
1146 1085  BEGIN_OPCODE(ASCII_LETTER_IGNORING_CASE):
1147           if (md.end_subject == stack.currentFrame->args.subjectPtr)
     1086      if (md.endSubject == stack.currentFrame->args.subjectPtr)
1148 1087          RRETURN_NO_MATCH;
1149 1088      if ((*stack.currentFrame->args.subjectPtr | 0x20) != stack.currentFrame->args.instructionPtr[1])
… …
1156 1095
1157 1096  BEGIN_OPCODE(EXACT):
1158           min = stack.currentFrame->locals.max = get2ByteOpcodeValueAtOffset(stack.currentFrame->args.instructionPtr,1);
     1097      min = stack.currentFrame->locals.max = get2ByteValue(stack.currentFrame->args.instructionPtr + 1);
1159 1098      minimize = false;
1160 1099      stack.currentFrame->args.instructionPtr += 3;
… …
1164 1103  BEGIN_OPCODE(MINUPTO):
1165 1104      min = 0;
1166           stack.currentFrame->locals.max = get2ByteOpcodeValueAtOffset(stack.currentFrame->args.instructionPtr,1);
     1105      stack.currentFrame->locals.max = get2ByteValue(stack.currentFrame->args.instructionPtr + 1);
1167 1106      minimize = *stack.currentFrame->args.instructionPtr == OP_MINUPTO;
1168 1107      stack.currentFrame->args.instructionPtr += 3;
… …
1185 1124      stack.currentFrame->locals.length = 1;
1186 1125      getUTF8CharAndIncrementLength(stack.currentFrame->locals.fc, stack.currentFrame->args.instructionPtr, stack.currentFrame->locals.length);
1187           if (min * (stack.currentFrame->locals.fc > 0xFFFF ? 2 : 1) > md.end_subject - stack.currentFrame->args.subjectPtr)
     1126      if (min * (stack.currentFrame->locals.fc > 0xFFFF ? 2 : 1) > md.endSubject - stack.currentFrame->args.subjectPtr)
1188 1127          RRETURN_NO_MATCH;
1189 1128      stack.currentFrame->args.instructionPtr += stack.currentFrame->locals.length;
1190 1129
1191 1130      if (stack.currentFrame->locals.fc <= 0xFFFF) {
1192               int othercase = md.ignoreCase ? _pcre_ucp_othercase(stack.currentFrame->locals.fc) : -1;
     1131          int othercase = md.ignoreCase ? kjs_pcre_ucp_othercase(stack.currentFrame->locals.fc) : -1;
1193 1132
1194 1133          for (int i = 1; i <= min; i++) {
… …
1202 1141
1203 1142      if (minimize) {
1204               stack.currentFrame->locals.repeat_othercase = othercase;
     1143          stack.currentFrame->locals.repeatOthercase = othercase;
1205 1144          for (stack.currentFrame->locals.fi = min;; stack.currentFrame->locals.fi++) {
1206 1145              RECURSIVE_MATCH(28, stack.currentFrame->args.instructionPtr, stack.currentFrame->args.subpatternStart);
1207                   if (is_match)
     1146              if (isMatch)
1208 1147                  RRETURN;
1209                   if (stack.currentFrame->locals.fi >= stack.currentFrame->locals.max || stack.currentFrame->args.subjectPtr >= md.end_subject)
     1148              if (stack.currentFrame->locals.fi >= stack.currentFrame->locals.max || stack.currentFrame->args.subjectPtr >= md.endSubject)
1210 1149                  RRETURN;
1211                   if (*stack.currentFrame->args.subjectPtr != stack.currentFrame->locals.fc && *stack.currentFrame->args.subjectPtr != stack.currentFrame->locals.repeat_othercase)
     1150              if (*stack.currentFrame->args.subjectPtr != stack.currentFrame->locals.fc && *stack.currentFrame->args.subjectPtr != stack.currentFrame->locals.repeatOthercase)
1212 1151                  RRETURN;
1213 1152              ++stack.currentFrame->args.subjectPtr;
… …
1217 1156      stack.currentFrame->locals.subjectPtrAtStartOfInstruction = stack.currentFrame->args.subjectPtr;
1218 1157      for (int i = min; i < stack.currentFrame->locals.max; i++) {
1219               if (stack.currentFrame->args.subjectPtr >= md.end_subject)
     1158          if (stack.currentFrame->args.subjectPtr >= md.endSubject)
1220 1159              break;
1221 1160          if (*stack.currentFrame->args.subjectPtr != stack.currentFrame->locals.fc && *stack.currentFrame->args.subjectPtr != othercase)
… …
1225 1164      while (stack.currentFrame->args.subjectPtr >= stack.currentFrame->locals.subjectPtrAtStartOfInstruction) {
1226 1165          RECURSIVE_MATCH(29, stack.currentFrame->args.instructionPtr, stack.currentFrame->args.subpatternStart);
1227               if (is_match)
     1166          if (isMatch)
1228 1167              RRETURN;
1229 1168          --stack.currentFrame->args.subjectPtr;
… …
1247 1186      for (stack.currentFrame->locals.fi = min;; stack.currentFrame->locals.fi++) {
1248 1187          RECURSIVE_MATCH(30, stack.currentFrame->args.instructionPtr, stack.currentFrame->args.subpatternStart);
1249               if (is_match)
     1188          if (isMatch)
1250 1189              RRETURN;
1251               if (stack.currentFrame->locals.fi >= stack.currentFrame->locals.max || stack.currentFrame->args.subjectPtr >= md.end_subject)
     1190          if (stack.currentFrame->locals.fi >= stack.currentFrame->locals.max || stack.currentFrame->args.subjectPtr >= md.endSubject)
1252 1191              RRETURN;
1253 1192          if (*stack.currentFrame->args.subjectPtr != stack.currentFrame->locals.fc)
… …
1259 1198      stack.currentFrame->locals.subjectPtrAtStartOfInstruction = stack.currentFrame->args.subjectPtr;
1260 1199      for (int i = min; i < stack.currentFrame->locals.max; i++) {
1261               if (stack.currentFrame->args.subjectPtr > md.end_subject - 2)
     1200          if (stack.currentFrame->args.subjectPtr > md.endSubject - 2)
1262 1201              break;
1263 1202          if (*stack.currentFrame->args.subjectPtr != stack.currentFrame->locals.fc)
… …
1267 1206      while (stack.currentFrame->args.subjectPtr >= stack.currentFrame->locals.subjectPtrAtStartOfInstruction) {
1268 1207          RECURSIVE_MATCH(31, stack.currentFrame->args.instructionPtr, stack.currentFrame->args.subpatternStart);
1269               if (is_match)
     1208          if (isMatch)
1270 1209              RRETURN;
1271 1210          stack.currentFrame->args.subjectPtr -= 2;
… …
1280 1219
1281 1220  BEGIN_OPCODE(NOT): {
1282           if (stack.currentFrame->args.subjectPtr >= md.end_subject)
     1221      if (stack.currentFrame->args.subjectPtr >= md.endSubject)
1283 1222          RRETURN_NO_MATCH;
1284 1223      stack.currentFrame->args.instructionPtr++;
… …
1304 1243
1305 1244  BEGIN_OPCODE(NOTEXACT):
1306           min = stack.currentFrame->locals.max = get2ByteOpcodeValueAtOffset(stack.currentFrame->args.instructionPtr,1);
     1245      min = stack.currentFrame->locals.max = get2ByteValue(stack.currentFrame->args.instructionPtr + 1);
1307 1246      minimize = false;
1308 1247      stack.currentFrame->args.instructionPtr += 3;
… …
1312 1251  BEGIN_OPCODE(NOTMINUPTO):
1313 1252      min = 0;
1314           stack.currentFrame->locals.max = get2ByteOpcodeValueAtOffset(stack.currentFrame->args.instructionPtr,1);
     1253      stack.currentFrame->locals.max = get2ByteValue(stack.currentFrame->args.instructionPtr + 1);
1315 1254      minimize = *stack.currentFrame->args.instructionPtr == OP_NOTMINUPTO;
1316 1255      stack.currentFrame->args.instructionPtr += 3;
… …
1330 1269
1331 1270  REPEATNOTCHAR:
1332           if (min > md.end_subject - stack.currentFrame->args.subjectPtr)
     1271      if (min > md.endSubject - stack.currentFrame->args.subjectPtr)
1333 1272          RRETURN_NO_MATCH;
1334 1273      stack.currentFrame->locals.fc = *stack.currentFrame->args.instructionPtr++;
… …
1362 1301      for (stack.currentFrame->locals.fi = min;; stack.currentFrame->locals.fi++) {
1363 1302          RECURSIVE_MATCH(38, stack.currentFrame->args.instructionPtr, stack.currentFrame->args.subpatternStart);
1364               if (is_match)
     1303          if (isMatch)
1365 1304              RRETURN;
1366 1305          int d = *stack.currentFrame->args.subjectPtr++;
1367 1306          if (d < 128)
1368 1307              d = toLowerCase(d);
1369               if (stack.currentFrame->locals.fi >= stack.currentFrame->locals.max || stack.currentFrame->args.subjectPtr >= md.end_subject || stack.currentFrame->locals.fc == d)
     1308          if (stack.currentFrame->locals.fi >= stack.currentFrame->locals.max || stack.currentFrame->args.subjectPtr >= md.endSubject || stack.currentFrame->locals.fc == d)
1370 1309              RRETURN;
1371 1310      }
… …
1379 1318
1380 1319      for (int i = min; i < stack.currentFrame->locals.max; i++) {
1381               if (stack.currentFrame->args.subjectPtr >= md.end_subject)
     1320          if (stack.currentFrame->args.subjectPtr >= md.endSubject)
1382 1321              break;
1383 1322          int d = *stack.currentFrame->args.subjectPtr;
… …
1390 1329      for (;;) {
1391 1330          RECURSIVE_MATCH(40, stack.currentFrame->args.instructionPtr, stack.currentFrame->args.subpatternStart);
1392               if (is_match)
     1331          if (isMatch)
1393 1332              RRETURN;
1394 1333          if (stack.currentFrame->args.subjectPtr-- == stack.currentFrame->locals.subjectPtrAtStartOfInstruction)
… …
1416 1355      for (stack.currentFrame->locals.fi = min;; stack.currentFrame->locals.fi++) {
1417 1356          RECURSIVE_MATCH(42, stack.currentFrame->args.instructionPtr, stack.currentFrame->args.subpatternStart);
1418               if (is_match)
     1357          if (isMatch)
1419 1358              RRETURN;
1420 1359          int d = *stack.currentFrame->args.subjectPtr++;
1421               if (stack.currentFrame->locals.fi >= stack.currentFrame->locals.max || stack.currentFrame->args.subjectPtr >= md.end_subject || stack.currentFrame->locals.fc == d)
     1360          if (stack.currentFrame->locals.fi >= stack.currentFrame->locals.max || stack.currentFrame->args.subjectPtr >= md.endSubject || stack.currentFrame->locals.fc == d)
1422 1361              RRETURN;
1423 1362      }
… …
1431 1370
1432 1371      for (int i = min; i < stack.currentFrame->locals.max; i++) {
1433               if (stack.currentFrame->args.subjectPtr >= md.end_subject)
     1372          if (stack.currentFrame->args.subjectPtr >= md.endSubject)
1434 1373              break;
1435 1374          int d = *stack.currentFrame->args.subjectPtr;
… …
1440 1379      for (;;) {
1441 1380          RECURSIVE_MATCH(44, stack.currentFrame->args.instructionPtr, stack.currentFrame->args.subpatternStart);
1442               if (is_match)
     1381          if (isMatch)
1443 1382              RRETURN;
1444 1383          if (stack.currentFrame->args.subjectPtr-- == stack.currentFrame->locals.subjectPtrAtStartOfInstruction)
… …
1456 1395
1457 1396  BEGIN_OPCODE(TYPEEXACT):
1458           min = stack.currentFrame->locals.max = get2ByteOpcodeValueAtOffset(stack.currentFrame->args.instructionPtr,1);
     1397      min = stack.currentFrame->locals.max = get2ByteValue(stack.currentFrame->args.instructionPtr + 1);
1459 1398      minimize = true;
1460 1399      stack.currentFrame->args.instructionPtr += 3;
… …
1464 1403  BEGIN_OPCODE(TYPEMINUPTO):
1465 1404      min = 0;
1466           stack.currentFrame->locals.max = get2ByteOpcodeValueAtOffset(stack.currentFrame->args.instructionPtr,1);
     1405      stack.currentFrame->locals.max = get2ByteValue(stack.currentFrame->args.instructionPtr + 1);
1467 1406      minimize = *stack.currentFrame->args.instructionPtr == OP_TYPEMINUPTO;
1468 1407      stack.currentFrame->args.instructionPtr += 3;
… …
1489 1428      the minimum number of characters before we start. */
1490 1429
1491           if (min > md.end_subject - stack.currentFrame->args.subjectPtr)
     1430      if (min > md.endSubject - stack.currentFrame->args.subjectPtr)
1492 1431          RRETURN_NO_MATCH;
1493 1432      if (min > 0) {
… …
1566 1505      for (stack.currentFrame->locals.fi = min;; stack.currentFrame->locals.fi++) {
1567 1506          RECURSIVE_MATCH(48, stack.currentFrame->args.instructionPtr, stack.currentFrame->args.subpatternStart);
1568               if (is_match)
     1507          if (isMatch)
1569 1508              RRETURN;
1570               if (stack.currentFrame->locals.fi >= stack.currentFrame->locals.max || stack.currentFrame->args.subjectPtr >= md.end_subject)
     1509          if (stack.currentFrame->locals.fi >= stack.currentFrame->locals.max || stack.currentFrame->args.subjectPtr >= md.endSubject)
1571 1510              RRETURN;
1572 1511
… …
1625 1564      case OP_NOT_NEWLINE:
1626 1565          for (int i = min; i < stack.currentFrame->locals.max; i++) {
1627                   if (stack.currentFrame->args.subjectPtr >= md.end_subject || isNewline(*stack.currentFrame->args.subjectPtr))
     1566              if (stack.currentFrame->args.subjectPtr >= md.endSubject || isNewline(*stack.currentFrame->args.subjectPtr))
1628 1567                  break;
1629 1568              stack.currentFrame->args.subjectPtr++;
… …
1633 1572      case OP_NOT_DIGIT:
1634 1573          for (int i = min; i < stack.currentFrame->locals.max; i++) {
1635                   if (stack.currentFrame->args.subjectPtr >= md.end_subject)
     1574              if (stack.currentFrame->args.subjectPtr >= md.endSubject)
1636 1575                  break;
1637 1576              int c = *stack.currentFrame->args.subjectPtr;
… …
1644 1583      case OP_DIGIT:
1645 1584          for (int i = min; i < stack.currentFrame->locals.max; i++) {
1646                   if (stack.currentFrame->args.subjectPtr >= md.end_subject)
     1585              if (stack.currentFrame->args.subjectPtr >= md.endSubject)
1647 1586                  break;
1648 1587              int c = *stack.currentFrame->args.subjectPtr;
… …
1655 1594      case OP_NOT_WHITESPACE:
1656 1595          for (int i = min; i < stack.currentFrame->locals.max; i++) {
1657                   if (stack.currentFrame->args.subjectPtr >= md.end_subject)
     1596              if (stack.currentFrame->args.subjectPtr >= md.endSubject)
1658 1597                  break;
1659 1598              int c = *stack.currentFrame->args.subjectPtr;
… …
1666 1605      case OP_WHITESPACE:
1667 1606          for (int i = min; i < stack.currentFrame->locals.max; i++) {
1668                   if (stack.currentFrame->args.subjectPtr >= md.end_subject)
     1607              if (stack.currentFrame->args.subjectPtr >= md.endSubject)
1669 1608                  break;
1670 1609              int c = *stack.currentFrame->args.subjectPtr;
… …
1677 1616      case OP_NOT_WORDCHAR:
1678 1617          for (int i = min; i < stack.currentFrame->locals.max; i++) {
1679                   if (stack.currentFrame->args.subjectPtr >= md.end_subject)
     1618              if (stack.currentFrame->args.subjectPtr >= md.endSubject)
1680 1619                  break;
1681 1620              int c = *stack.currentFrame->args.subjectPtr;
… …
1688 1627      case OP_WORDCHAR:
1689 1628          for (int i = min; i < stack.currentFrame->locals.max; i++) {
1690                   if (stack.currentFrame->args.subjectPtr >= md.end_subject)
     1629              if (stack.currentFrame->args.subjectPtr >= md.endSubject)
1691 1630                  break;
1692 1631              int c = *stack.currentFrame->args.subjectPtr;
… …
1706 1645      for (;;) {
1707 1646          RECURSIVE_MATCH(52, stack.currentFrame->args.instructionPtr, stack.currentFrame->args.subpatternStart);
1708               if (is_match)
     1647          if (isMatch)
1709 1648              RRETURN;
1710 1649          if (stack.currentFrame->args.subjectPtr-- == stack.currentFrame->locals.subjectPtrAtStartOfInstruction)
… …
1756 1695
1757 1696      if (stack.currentFrame->locals.number > EXTRACT_BASIC_MAX)
1758               stack.currentFrame->locals.number = get2ByteOpcodeValueAtOffset(stack.currentFrame->args.instructionPtr, 2+LINK_SIZE);
     1697          stack.currentFrame->locals.number = get2ByteValue(stack.currentFrame->args.instructionPtr + 2 + LINK_SIZE);
1759 1698      stack.currentFrame->locals.offset = stack.currentFrame->locals.number << 1;
1760 1699
… …
1765 1704  #endif
1766 1705
1767           if (stack.currentFrame->locals.offset < md.offset_max) {
1768               stack.currentFrame->locals.save_offset1 = md.offset_vector[stack.currentFrame->locals.offset];
1769               stack.currentFrame->locals.save_offset2 = md.offset_vector[stack.currentFrame->locals.offset + 1];
1770               stack.currentFrame->locals.save_offset3 = md.offset_vector[md.offset_end - stack.currentFrame->locals.number];
1771
1772               DPRINTF(("saving %d %d %d\n", stack.currentFrame->locals.save_offset1, stack.currentFrame->locals.save_offset2, stack.currentFrame->locals.save_offset3));
1773               md.offset_vector[md.offset_end - stack.currentFrame->locals.number] = stack.currentFrame->args.subjectPtr - md.start_subject;
     1706      if (stack.currentFrame->locals.offset < md.offsetMax) {
     1707          stack.currentFrame->locals.saveOffset1 = md.offsetVector[stack.currentFrame->locals.offset];
     1708          stack.currentFrame->locals.saveOffset2 = md.offsetVector[stack.currentFrame->locals.offset + 1];
     1709          stack.currentFrame->locals.saveOffset3 = md.offsetVector[md.offsetEnd - stack.currentFrame->locals.number];
     1710
     1711          DPRINTF(("saving %d %d %d\n", stack.currentFrame->locals.saveOffset1, stack.currentFrame->locals.saveOffset2, stack.currentFrame->locals.saveOffset3));
     1712          md.offsetVector[md.offsetEnd - stack.currentFrame->locals.number] = stack.currentFrame->args.subjectPtr - md.startSubject;
1774 1713
1775 1714          do {
1776 1715              RECURSIVE_MATCH_STARTNG_NEW_GROUP(1, stack.currentFrame->args.instructionPtr + 1 + LINK_SIZE, stack.currentFrame->args.subpatternStart);
1777                   if (is_match)
     1716              if (isMatch)
1778 1717                  RRETURN;
1779                   stack.currentFrame->args.instructionPtr += getOpcodeValueAtOffset(stack.currentFrame->args.instructionPtr,1);
     1718              stack.currentFrame->args.instructionPtr += getLinkValue(stack.currentFrame->args.instructionPtr + 1);
1780 1719          } while (*stack.currentFrame->args.instructionPtr == OP_ALT);
1781 1720
1782 1721          DPRINTF(("bracket %d failed\n", stack.currentFrame->locals.number));
1783 1722
1784               md.offset_vector[stack.currentFrame->locals.offset] = stack.currentFrame->locals.save_offset1;
1785               md.offset_vector[stack.currentFrame->locals.offset + 1] = stack.currentFrame->locals.save_offset2;
1786               md.offset_vector[md.offset_end - stack.currentFrame->locals.number] = stack.currentFrame->locals.save_offset3;
     1723          md.offsetVector[stack.currentFrame->locals.offset] = stack.currentFrame->locals.saveOffset1;
     1724          md.offsetVector[stack.currentFrame->locals.offset + 1] = stack.currentFrame->locals.saveOffset2;
     1725          md.offsetVector[md.offsetEnd - stack.currentFrame->locals.number] = stack.currentFrame->locals.saveOffset3;
1787 1726
1788 1727          RRETURN;
… …
1846 1785
1847 1786  RETURN:
1848           ASSERT(is_match == MATCH_MATCH || is_match == MATCH_NOMATCH);
1849           return is_match;
     1787      ASSERT(isMatch == MATCH_MATCH || isMatch == MATCH_NOMATCH);
     1788      return isMatch;
1850 1789  }
1851 1790
… …
1904 1843  }
1905 1844
1906       static bool tryRequiredByteOptimization(const UChar*& subjectPtr, const UChar* endSubject, int req_byte, int req_byte2, bool req_byte_caseless, bool hasFirstByte, const UChar*& req_byte_ptr)
     1845  static bool tryRequiredByteOptimization(const UChar*& subjectPtr, const UChar* endSubject, int req_byte, int req_byte2, bool req_byte_caseless, bool hasFirstByte, const UChar*& reqBytePtr)
1907 1846  {
1908 1847      /* If req_byte is set, we know that that character must appear in the subject
… …
1926 1865      place we found it at last time. */
1927 1866
1928           if (p > req_byte_ptr) {
     1867      if (p > reqBytePtr) {
1929 1868          if (req_byte_caseless) {
1930 1869              while (p < endSubject) {
… …
1953 1892      the start hasn't passed this character yet. */
1954 1893
1955           req_byte_ptr = p;
     1894      reqBytePtr = p;
1956 1895      }
1957 1896  }
… …
1968 1907      ASSERT(offsets || offsetcount == 0);
1969 1908
1970           MatchData match_block;
1971           match_block.start_subject = subject;
1972           match_block.end_subject = match_block.start_subject + length;
1973           const UChar* end_subject = match_block.end_subject;
1974
1975           match_block.multiline = (re->options & MatchAcrossMultipleLinesOption);
1976           match_block.ignoreCase = (re->options & IgnoreCaseOption);
     1909      MatchData matchBlock;
     1910      matchBlock.startSubject = subject;
     1911      matchBlock.endSubject = matchBlock.startSubject + length;
     1912      const UChar* endSubject = matchBlock.endSubject;
     1913
     1914      matchBlock.multiline = (re->options & MatchAcrossMultipleLinesOption);
     1915      matchBlock.ignoreCase = (re->options & IgnoreCaseOption);
1977 1916
1978 1917      /* If the expression has got more back references than the offsets supplied can
… …
1989 1928      if (re->top_backref > 0 && re->top_backref >= ocount/3) {
1990 1929          ocount = re->top_backref * 3 + 3;
1991               match_block.offset_vector = new int[ocount];
1992               if (!match_block.offset_vector)
     1930          matchBlock.offsetVector = new int[ocount];
     1931          if (!matchBlock.offsetVector)
1993 1932              return JSRegExpErrorNoMemory;
1994 1933          using_temporary_offsets = true;
1995 1934      } else
1996               match_block.offset_vector = offsets;
1997
1998           match_block.offset_end = ocount;
1999           match_block.offset_max = (2*ocount)/3;
2000           match_block.offset_overflow = false;
     1935          matchBlock.offsetVector = offsets;
     1936
     1937      matchBlock.offsetEnd = ocount;
     1938      matchBlock.offsetMax = (2*ocount)/3;
     1939      matchBlock.offsetOverflow = false;
2001 1940
2002 1941      /* Compute the minimum number of offsets that we need to reset each time. Doing
… …
2012 1951      initialize them to avoid reading uninitialized locations.
*/ 2013 1952 2014 if (match _block.offset_vector) {2015 int* iptr = match _block.offset_vector + ocount;1953 if (matchBlock.offsetVector) { 1954 int* iptr = matchBlock.offsetVector + ocount; 2016 1955 int* iend = iptr - resetcount/2 + 1; 2017 1956 while (--iptr >= iend) … … 2048 1987 the loop runs just once. */ 2049 1988 2050 const UChar* start _match = subject + start_offset;2051 const UChar* req _byte_ptr = start_match - 1;1989 const UChar* startMatch = subject + start_offset; 1990 const UChar* reqBytePtr = startMatch - 1; 2052 1991 bool useMultiLineFirstCharOptimization = re->options & UseMultiLineFirstByteOptimizationOption; 2053 1992 2054 1993 do { 2055 1994 /* Reset the maximum number of extractions we might see. */ 2056 if (match _block.offset_vector) {2057 int* iptr = match _block.offset_vector;1995 if (matchBlock.offsetVector) { 1996 int* iptr = matchBlock.offsetVector; 2058 1997 int* iend = iptr + resetcount; 2059 1998 while (iptr < iend) … … 2061 2000 } 2062 2001 2063 tryFirstByteOptimization(start _match, end_subject, first_byte, first_byte_caseless, useMultiLineFirstCharOptimization, match_block.start_subject + start_offset);2064 if (tryRequiredByteOptimization(start _match, end_subject, req_byte, req_byte2, req_byte_caseless, first_byte >= 0, req_byte_ptr))2002 tryFirstByteOptimization(startMatch, endSubject, first_byte, first_byte_caseless, useMultiLineFirstCharOptimization, matchBlock.startSubject + start_offset); 2003 if (tryRequiredByteOptimization(startMatch, endSubject, req_byte, req_byte2, req_byte_caseless, first_byte >= 0, reqBytePtr)) 2065 2004 break; 2066 2005 … … 2073 2012 2074 2013 /* The code starts after the JSRegExp block and the capture name table. 
*/ 2075 const u schar* start_code = (const uschar*)(re + 1);2014 const unsigned char* start_code = (const unsigned char*)(re + 1); 2076 2015 2077 int returnCode = match(start _match, start_code, 2, match_block);2016 int returnCode = match(startMatch, start_code, 2, matchBlock); 2078 2017 2079 2018 /* When the result is no match, advance the pointer to the next character … … 2081 2020 2082 2021 if (returnCode == MATCH_NOMATCH) { 2083 start _match++;2022 startMatch++; 2084 2023 continue; 2085 2024 } … … 2095 2034 if (using_temporary_offsets) { 2096 2035 if (offsetcount >= 4) { 2097 memcpy(offsets + 2, match _block.offset_vector + 2, (offsetcount - 2) * sizeof(int));2036 memcpy(offsets + 2, matchBlock.offsetVector + 2, (offsetcount - 2) * sizeof(int)); 2098 2037 DPRINTF(("Copied offsets from temporary memory\n")); 2099 2038 } 2100 if (match _block.end_offset_top > offsetcount)2101 match _block.offset_overflow = true;2039 if (matchBlock.endOffsetTop > offsetcount) 2040 matchBlock.offsetOverflow = true; 2102 2041 2103 2042 DPRINTF(("Freeing temporary memory\n")); 2104 delete [] match _block.offset_vector;2043 delete [] matchBlock.offsetVector; 2105 2044 } 2106 2045 2107 returnCode = match _block.offset_overflow ? 0 : match_block.end_offset_top / 2;2046 returnCode = matchBlock.offsetOverflow ? 
0 : matchBlock.endOffsetTop / 2; 2108 2047 2109 2048 if (offsetcount < 2) 2110 2049 returnCode = 0; 2111 2050 else { 2112 offsets[0] = start _match - match_block.start_subject;2113 offsets[1] = match _block.end_match_ptr - match_block.start_subject;2051 offsets[0] = startMatch - matchBlock.startSubject; 2052 offsets[1] = matchBlock.endMatchPtr - matchBlock.startSubject; 2114 2053 } 2115 2054 2116 2055 DPRINTF((">>>> returning %d\n", rc)); 2117 2056 return returnCode; 2118 } while (start _match <= end_subject);2057 } while (startMatch <= endSubject); 2119 2058 2120 2059 if (using_temporary_offsets) { 2121 2060 DPRINTF(("Freeing temporary memory\n")); 2122 delete [] match _block.offset_vector;2061 delete [] matchBlock.offsetVector; 2123 2062 } 2124 2063 -
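The offset-vector setup in the pcre_exec.cpp hunk above can be restated as a small standalone sketch. It assumes only what the hunk shows: a temporary vector is allocated when `top_backref` is positive and at least `ocount / 3`, and `offsetMax` is two thirds of the final `ocount` (the hunk also uses slots counted back from `offsetEnd` as scratch space while a bracket is being matched). The `OffsetPlan` struct and `planOffsets` function are hypothetical names used for illustration, and how `ocount` is derived from `offsetcount` lies outside the visible hunk, so it is taken as an input here.

```cpp
#include <cassert>

// Hypothetical summary of the sizing decision; not a type from the changeset.
struct OffsetPlan {
    int ocount;           // number of ints the matcher will use
    int offsetMax;        // (2 * ocount) / 3, as computed in the hunk
    bool usingTemporary;  // true when a larger temporary vector is needed
};

// Mirrors the visible logic: grow the vector when the highest back
// reference would not fit in the caller-supplied space.
static OffsetPlan planOffsets(int ocount, int topBackref)
{
    assert(ocount >= 0 && topBackref >= 0);
    OffsetPlan plan;
    plan.usingTemporary = topBackref > 0 && topBackref >= ocount / 3;
    if (plan.usingTemporary)
        ocount = topBackref * 3 + 3;
    plan.ocount = ocount;
    plan.offsetMax = (2 * ocount) / 3;
    return plan;
}
```

For instance, a caller that supplies 6 ints for a pattern whose highest back reference is 2 would, under these assumptions, get a 9-int temporary vector with `offsetMax` equal to 6.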
trunk/JavaScriptCore/pcre/pcre_internal.h
r28525 r28793 77 77 #endif 78 78 79 #include "pcre.h" 80 79 81 /* The value of LINK_SIZE determines the number of bytes used to store links as 80 82 offsets within the compiled regex. The default is 2, which allows for compiled 81 patterns up to 64K long. This covers the vast majority of cases. However, PCRE 82 can also be compiled to use 3 or 4 bytes instead. This allows for longer 83 patterns in extreme cases. On systems that support it, "configure" can be used 84 to override this default. */ 83 patterns up to 64K long. */ 85 84 86 85 #define LINK_SIZE 2 87 88 /* The below limit restricts the number of recursive match calls in order to89 limit the maximum amount of stack (or heap, if NO_RECURSE is defined) that is used. The90 value of MATCH_LIMIT_RECURSION applies only to recursive calls of match().91 92 This limit is tied to the size of MatchFrame. Right now we allow PCRE to allocate up93 to MATCH_LIMIT_RECURSION - 16 * sizeof(MatchFrame) bytes of "stack" space before we give up.94 Currently that's 100000 - 16 * (23 * 4) ~ 90MB95 */96 97 #define MATCH_LIMIT_RECURSION 10000098 99 #define _pcre_default_tables kjs_pcre_default_tables100 #define _pcre_ord2utf8 kjs_pcre_ord2utf8101 #define _pcre_utf8_table1 kjs_pcre_utf8_table1102 #define _pcre_utf8_table2 kjs_pcre_utf8_table2103 #define _pcre_utf8_table3 kjs_pcre_utf8_table3104 #define _pcre_utf8_table4 kjs_pcre_utf8_table4105 #define _pcre_xclass kjs_pcre_xclass106 86 107 87 /* Define DEBUG to get debugging output on stdout. */ … … 121 101 #define DPRINTF(p) /*nothing*/ 122 102 #endif 123 124 /* Standard C headers plus the external interface definition. The only time125 setjmp and stdarg are used is when NO_RECURSE is set. */126 127 #include <ctype.h>128 #include <limits.h>129 #include <setjmp.h>130 #include <stdarg.h>131 #include <stddef.h>132 #include <stdio.h>133 #include <stdlib.h>134 #include <string.h>135 136 /* Include the public PCRE header and the definitions of UCP character property137 values. 
*/138 139 #include "pcre.h"140 141 typedef unsigned short pcre_uint16;142 typedef unsigned pcre_uint32;143 typedef unsigned char uschar;144 103 145 104 /* PCRE keeps offsets in its compiled code as 2-byte quantities (always stored … … 149 108 for almost everybody. However, I received a request for an even bigger limit. 150 109 For this reason, and also to make the code easier to maintain, the storing and 151 loading of offsets from the byte string is now handled by the macros that are 152 defined here. 153 154 The macros are controlled by the value of LINK_SIZE. This defaults to 2 in 155 the config.h file, but can be overridden by using -D on the command line. This 156 is automated on Unix systems via the "configure" command. */ 157 158 #if LINK_SIZE == 2 159 160 static inline void putOpcodeValueAtOffset(uschar* opcodePtr, size_t offset, unsigned short value) 161 { 162 opcodePtr[offset] = value >> 8; 163 opcodePtr[offset + 1] = value & 255; 164 } 165 166 static inline short getOpcodeValueAtOffset(const uschar* opcodePtr, size_t offset) 167 { 168 return ((opcodePtr[offset] << 8) | opcodePtr[offset + 1]); 169 } 170 171 #define MAX_PATTERN_SIZE (1 << 16) 172 173 #elif LINK_SIZE == 3 174 175 static inline void putOpcodeValueAtOffset(uschar* opcodePtr, size_t offset, unsigned value) 176 { 177 ASSERT(!(value & 0xFF000000)); // This function only allows values < 2^24 178 opcodePtr[offset] = value >> 16; 179 opcodePtr[offset + 1] = value >> 8; 180 opcodePtr[offset + 2] = value & 255; 181 } 182 183 static inline int getOpcodeValueAtOffset(const uschar* opcodePtr, size_t offset) 184 { 185 return ((opcodePtr[offset] << 16) | (opcodePtr[offset + 1] << 8) | opcodePtr[offset + 2]); 186 } 187 188 #define MAX_PATTERN_SIZE (1 << 24) 189 190 #elif LINK_SIZE == 4 191 192 static inline void putOpcodeValueAtOffset(uschar* opcodePtr, size_t offset, unsigned value) 193 { 194 opcodePtr[offset] = value >> 24; 195 opcodePtr[offset + 1] = value >> 16; 196 opcodePtr[offset + 2] = value >> 8; 
197 opcodePtr[offset + 3] = value & 255; 198 } 199 200 static inline int getOpcodeValueAtOffset(const uschar* opcodePtr, size_t offset) 201 { 202 return ((opcodePtr[offset] << 24) | (opcodePtr[offset + 1] << 16) | (opcodePtr[offset + 2] << 8) | opcodePtr[offset + 3]); 203 } 204 205 #define MAX_PATTERN_SIZE (1 << 30) /* Keep it positive */ 206 207 #else 208 #error LINK_SIZE must be either 2, 3, or 4 209 #endif 210 211 static inline void putOpcodeValueAtOffsetAndAdvance(uschar*& opcodePtr, size_t offset, unsigned short value) 212 { 213 putOpcodeValueAtOffset(opcodePtr, offset, value); 214 opcodePtr += LINK_SIZE; 215 } 110 loading of offsets from the byte string is now handled by the functions that are 111 defined here. */ 216 112 217 113 /* PCRE uses some other 2-byte quantities that do not change when the size of … … 219 115 capturing parenthesis numbers in back references. */ 220 116 221 static inline void put2ByteOpcodeValueAtOffset(uschar* opcodePtr, size_t offset, unsigned short value) 222 { 223 opcodePtr[offset] = value >> 8; 224 opcodePtr[offset + 1] = value & 255; 225 } 226 227 static inline short get2ByteOpcodeValueAtOffset(const uschar* opcodePtr, size_t offset) 228 { 229 return ((opcodePtr[offset] << 8) | opcodePtr[offset + 1]); 230 } 231 232 static inline void put2ByteOpcodeValueAtOffsetAndAdvance(uschar*& opcodePtr, size_t offset, unsigned short value) 233 { 234 put2ByteOpcodeValueAtOffset(opcodePtr, offset, value); 117 static inline void put2ByteValue(unsigned char* opcodePtr, int value) 118 { 119 ASSERT(value >= 0 && value <= 0xFFFF); 120 opcodePtr[0] = value >> 8; 121 opcodePtr[1] = value; 122 } 123 124 static inline int get2ByteValue(const unsigned char* opcodePtr) 125 { 126 return (opcodePtr[0] << 8) | opcodePtr[1]; 127 } 128 129 static inline void put2ByteValueAndAdvance(unsigned char*& opcodePtr, int value) 130 { 131 put2ByteValue(opcodePtr, value); 235 132 opcodePtr += 2; 133 } 134 135 static inline void putLinkValueAllowZero(unsigned char* 
opcodePtr, int value) 136 { 137 put2ByteValue(opcodePtr, value); 138 } 139 140 static inline int getLinkValueAllowZero(const unsigned char* opcodePtr) 141 { 142 return get2ByteValue(opcodePtr); 143 } 144 145 #define MAX_PATTERN_SIZE (1 << 16) 146 147 static inline void putLinkValue(unsigned char* opcodePtr, int value) 148 { 149 ASSERT(value); 150 putLinkValueAllowZero(opcodePtr, value); 151 } 152 153 static inline int getLinkValue(const unsigned char* opcodePtr) 154 { 155 int value = getLinkValueAllowZero(opcodePtr); 156 ASSERT(value); 157 return value; 158 } 159 160 static inline void putLinkValueAndAdvance(unsigned char*& opcodePtr, int value) 161 { 162 putLinkValue(opcodePtr, value); 163 opcodePtr += LINK_SIZE; 164 } 165 166 static inline void putLinkValueAllowZeroAndAdvance(unsigned char*& opcodePtr, int value) 167 { 168 putLinkValueAllowZero(opcodePtr, value); 169 opcodePtr += LINK_SIZE; 236 170 } 237 171 … … 246 180 }; 247 181 248 /* Negative values for the firstchar and reqchar variables */249 250 #define REQ_UNSET (-2)251 #define REQ_NONE (-1)252 253 /* The maximum remaining length of subject we are prepared to search for a254 req_byte match. */255 256 #define REQ_BYTE_MAX 1000257 258 182 /* Flags added to firstbyte or reqbyte; a "non-literal" item is either a 259 183 variable-length repeat, or a anything other than literal characters. */ … … 367 291 macro(ASSERT_NOT) \ 368 292 \ 369 macro(ONCE) \370 \371 293 macro(BRAZERO) \ 372 294 macro(BRAMINZERO) \ … … 382 304 383 305 /* The highest extraction number before we have to start using additional 384 bytes. (Originally PCRE didn't have support for extraction counts high ter than306 bytes. (Originally PCRE didn't have support for extraction counts higher than 385 307 this number.) The value is limited by the number of opcodes left after OP_BRA, 386 308 i.e. 255 - OP_BRA. We actually set it a bit lower to leave room for additional 387 309 opcodes. 
*/ 388 310 311 /* FIXME: Note that OP_BRA + 100 is > 128, so the two comments above 312 are in conflict! */ 313 389 314 #define EXTRACT_BASIC_MAX 100 390 391 /* This macro defines the length of fixed length operations in the compiled392 regex. The lengths are used when searching for specific things, and also in the393 debugging printing of a compiled regex. We use a macro so that it can be394 defined close to the definitions of the opcodes themselves.395 396 As things have been extended, some of these are no longer fixed lenths, but are397 minima instead. For example, the length of a single-character repeat may vary398 in UTF-8 mode. The code that uses this table must know about such things. */399 400 #define OP_LENGTHS \401 1, /* End */ \402 1, 1, 1, 1, 1, 1, 1, 1, /* \B, \b, \D, \d, \S, \s, \W, \w */ \403 1, /* Any */ \404 1, 1, /* ^, $ */ \405 2, 2, /* Char, Charnc - minimum lengths */ \406 2, 2, /* ASCII char or non-cased */ \407 2, /* not */ \408 /* Positive single-char repeats ** These are */ \409 2, 2, 2, 2, 2, 2, /* *, *?, +, +?, ?, ?? ** minima in */ \410 4, 4, 4, /* upto, minupto, exact ** UTF-8 mode */ \411 /* Negative single-char repeats - only for chars < 256 */ \412 2, 2, 2, 2, 2, 2, /* NOT *, *?, +, +?, ?, ?? */ \413 4, 4, 4, /* NOT upto, minupto, exact */ \414 /* Positive type repeats */ \415 2, 2, 2, 2, 2, 2, /* Type *, *?, +, +?, ?, ?? */ \416 4, 4, 4, /* Type upto, minupto, exact */ \417 /* Character class & ref repeats */ \418 1, 1, 1, 1, 1, 1, /* *, *?, +, +?, ?, ?? 
*/ \419 5, 5, /* CRRANGE, CRMINRANGE */ \420 33, /* CLASS */ \421 33, /* NCLASS */ \422 0, /* XCLASS - variable length */ \423 3, /* REF */ \424 1 + LINK_SIZE, /* Alt */ \425 1 + LINK_SIZE, /* Ket */ \426 1 + LINK_SIZE, /* KetRmax */ \427 1 + LINK_SIZE, /* KetRmin */ \428 1 + LINK_SIZE, /* Assert */ \429 1 + LINK_SIZE, /* Assert not */ \430 1 + LINK_SIZE, /* Once */ \431 1, 1, /* BRAZERO, BRAMINZERO */ \432 3, /* BRANUMBER */ \433 1 + LINK_SIZE /* BRA */ \434 435 315 436 316 /* The index of names and the … … 443 323 444 324 struct JSRegExp { 445 pcre_uint32options;446 447 pcre_uint16top_bracket;448 pcre_uint16top_backref;325 unsigned options; 326 327 unsigned short top_bracket; 328 unsigned short top_backref; 449 329 450 // jsRegExpExecute && jsRegExpCompile currently only how to handle ASCII 451 // chars for thse optimizations, however it would be trivial to add support 452 // for optimized UChar first_byte/req_byte scans 453 pcre_uint16 first_byte; 454 pcre_uint16 req_byte; 330 unsigned short first_byte; 331 unsigned short req_byte; 455 332 }; 456 333 … … 460 337 pcre_tables.c module. 
*/ 461 338 462 #define _pcre_utf8_table1_size 6463 464 extern const int _pcre_utf8_table1[6];465 extern const int _pcre_utf8_table2[6];466 extern const int _pcre_utf8_table3[6];467 extern const u schar_pcre_utf8_table4[0x40];468 469 extern const u schar_pcre_default_tables[tables_length];470 471 static inline u schar toLowerCase(uschar c)472 { 473 static const u schar* lowerCaseChars =_pcre_default_tables + lcc_offset;339 #define kjs_pcre_utf8_table1_size 6 340 341 extern const int kjs_pcre_utf8_table1[6]; 342 extern const int kjs_pcre_utf8_table2[6]; 343 extern const int kjs_pcre_utf8_table3[6]; 344 extern const unsigned char kjs_pcre_utf8_table4[0x40]; 345 346 extern const unsigned char kjs_pcre_default_tables[tables_length]; 347 348 static inline unsigned char toLowerCase(unsigned char c) 349 { 350 static const unsigned char* lowerCaseChars = kjs_pcre_default_tables + lcc_offset; 474 351 return lowerCaseChars[c]; 475 352 } 476 353 477 static inline u schar flipCase(uschar c)478 { 479 static const u schar* flippedCaseChars =_pcre_default_tables + fcc_offset;354 static inline unsigned char flipCase(unsigned char c) 355 { 356 static const unsigned char* flippedCaseChars = kjs_pcre_default_tables + fcc_offset; 480 357 return flippedCaseChars[c]; 481 358 } 482 359 483 static inline u schar classBitmapForChar(uschar c)484 { 485 static const u schar* charClassBitmaps =_pcre_default_tables + cbits_offset;360 static inline unsigned char classBitmapForChar(unsigned char c) 361 { 362 static const unsigned char* charClassBitmaps = kjs_pcre_default_tables + cbits_offset; 486 363 return charClassBitmaps[c]; 487 364 } 488 365 489 static inline u schar charTypeForChar(uschar c)490 { 491 const u schar* charTypeMap =_pcre_default_tables + ctypes_offset;366 static inline unsigned char charTypeForChar(unsigned char c) 367 { 368 const unsigned char* charTypeMap = kjs_pcre_default_tables + ctypes_offset; 492 369 return charTypeMap[c]; 493 370 } … … 495 372 static inline bool 
isWordChar(UChar c) 496 373 { 497 /* UTF8 Characters > 128 are assumed to be "non-word" characters. */ 498 return (c < 128 && (charTypeForChar(c) & ctype_word)); 374 return c < 128 && (charTypeForChar(c) & ctype_word); 499 375 } 500 376 501 377 static inline bool isSpaceChar(UChar c) 502 378 { 503 return (c < 128 && (charTypeForChar(c) & ctype_space)); 504 } 505 506 /* Internal shared functions. These are functions that are used by more than 507 one of the exported public functions. They have to be "external" in the C 508 sense, but are not part of the PCRE public API. */ 509 510 extern int _pcre_ucp_othercase(const unsigned int); 511 extern bool _pcre_xclass(int, const uschar*); 379 return c < 128 && (charTypeForChar(c) & ctype_space); 380 } 512 381 513 382 static inline bool isNewline(UChar nl) … … 516 385 } 517 386 518 // FIXME: It's unclear to me if this moves the opcode ptr to the start of all branches 519 // or to the end of all branches -- ecs 520 // FIXME: This abstraction is poor since it assumes that you want to jump based on whatever 521 // the next value in the stream is, and *then* follow any OP_ALT branches. 522 static inline void moveOpcodePtrPastAnyAlternateBranches(const uschar*& opcodePtr) 523 { 524 do { 525 opcodePtr += getOpcodeValueAtOffset(opcodePtr, 1); 526 } while (*opcodePtr == OP_ALT); 527 } 387 static inline bool isBracketStartOpcode(unsigned char opcode) 388 { 389 if (opcode >= OP_BRA) 390 return true; 391 switch (opcode) { 392 case OP_ASSERT: 393 case OP_ASSERT_NOT: 394 return true; 395 default: 396 return false; 397 } 398 } 399 400 static inline void advanceToEndOfBracket(const unsigned char*& opcodePtr) 401 { 402 ASSERT(isBracketStartOpcode(*opcodePtr) || *opcodePtr == OP_ALT); 403 do 404 opcodePtr += getLinkValue(opcodePtr + 1); 405 while (*opcodePtr == OP_ALT); 406 } 407 408 /* Internal shared functions. These are functions that are used in more 409 that one of the source files. 
They have to have external linkage, but 410 but are not part of the public API and so not exported from the library. */ 411 412 extern int kjs_pcre_ucp_othercase(unsigned); 413 extern bool kjs_pcre_xclass(int, const unsigned char*); 528 414 529 415 #endif -
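The reworked helpers in pcre_internal.h store every 2-byte quantity in the opcode stream high byte first. A minimal standalone sketch of that round trip follows. It mirrors `put2ByteValue`, `get2ByteValue` and `put2ByteValueAndAdvance` from the hunk above, but uses the standard `assert` macro in place of WebKit's `ASSERT`, and `roundTrips` is a hypothetical helper added only to exercise them.

```cpp
#include <cassert>

// Store a value in [0, 0xFFFF] as two bytes, high byte first.
static inline void put2ByteValue(unsigned char* opcodePtr, int value)
{
    assert(value >= 0 && value <= 0xFFFF);
    opcodePtr[0] = static_cast<unsigned char>(value >> 8);
    opcodePtr[1] = static_cast<unsigned char>(value & 0xFF);
}

// Read the two bytes back; returning int keeps values above 0x7FFF positive.
static inline int get2ByteValue(const unsigned char* opcodePtr)
{
    return (opcodePtr[0] << 8) | opcodePtr[1];
}

// Same as put2ByteValue, then step the write cursor past the value.
static inline void put2ByteValueAndAdvance(unsigned char*& opcodePtr, int value)
{
    put2ByteValue(opcodePtr, value);
    opcodePtr += 2;
}

// Hypothetical helper (not in the changeset): write two values, read them back.
static inline bool roundTrips(int first, int second)
{
    unsigned char buffer[4] = { 0, 0, 0, 0 };
    unsigned char* cursor = buffer;
    put2ByteValueAndAdvance(cursor, first);
    put2ByteValueAndAdvance(cursor, second);
    return cursor == buffer + 4
        && get2ByteValue(buffer) == first
        && get2ByteValue(buffer + 2) == second;
}
```

One visible difference from the removed `getOpcodeValueAtOffset` is the return type: the old LINK_SIZE == 2 version returned `short`, which on platforms with a 16-bit `short` would turn values above 0x7FFF negative, whereas the new getters return `int`.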
trunk/JavaScriptCore/pcre/pcre_tables.cpp
r27730 r28793
 character. */

-const int _pcre_utf8_table1[6] =
+const int kjs_pcre_utf8_table1[6] =
   { 0x7f, 0x7ff, 0xffff, 0x1fffff, 0x3ffffff, 0x7fffffff};

…
 first byte of a character, indexed by the number of additional bytes. */

-const int _pcre_utf8_table2[6] = { 0, 0xc0, 0xe0, 0xf0, 0xf8, 0xfc};
-const int _pcre_utf8_table3[6] = { 0xff, 0x1f, 0x0f, 0x07, 0x03, 0x01};
+const int kjs_pcre_utf8_table2[6] = { 0, 0xc0, 0xe0, 0xf0, 0xf8, 0xfc};
+const int kjs_pcre_utf8_table3[6] = { 0xff, 0x1f, 0x0f, 0x07, 0x03, 0x01};

 /* Table of the number of extra characters, indexed by the first character
…
 0x3d. */

-const uschar _pcre_utf8_table4[0x40] = {
+const unsigned char kjs_pcre_utf8_table4[0x40] = {
   1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
   1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
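As an illustration of what the first two renamed tables are for, here is a hedged sketch of an encoder driven by them. The table contents are copied from the hunk above. The function `encodeUTF8Sketch` and the local table names are hypothetical, and the loop is one plausible way to use the tables rather than the code in pcre_compile.cpp, which this changeset page does not show.

```cpp
#include <cassert>

// Largest code point representable with N additional bytes (values from the hunk).
static const int utf8Table1[6] = { 0x7f, 0x7ff, 0xffff, 0x1fffff, 0x3ffffff, 0x7fffffff };
// Marker bits for the first byte, indexed by the number of additional bytes.
static const int utf8Table2[6] = { 0, 0xc0, 0xe0, 0xf0, 0xf8, 0xfc };

// Hypothetical encoder: writes 1 to 6 bytes into buffer and returns the count.
// The buffer must have room for 6 bytes; c must be non-negative.
static int encodeUTF8Sketch(int c, unsigned char* buffer)
{
    assert(c >= 0);
    int extra = 0;
    while (extra < 5 && c > utf8Table1[extra])
        ++extra;
    // Fill continuation bytes from the last one backwards, 6 bits at a time.
    for (int i = extra; i > 0; --i) {
        buffer[i] = static_cast<unsigned char>(0x80 | (c & 0x3f));
        c >>= 6;
    }
    buffer[0] = static_cast<unsigned char>(utf8Table2[extra] | c);
    return extra + 1;
}
```

With these tables, U+20AC needs two additional bytes and comes out as the byte sequence E2 82 AC.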
trunk/JavaScriptCore/pcre/pcre_ucp_searchfuncs.cpp
r28161 r28793
 */

-int _pcre_ucp_othercase(const unsigned c)
+int kjs_pcre_ucp_othercase(unsigned c)
 {
     int bot = 0;
trunk/JavaScriptCore/pcre/pcre_xclass.cpp
r28169 r28793
 know we are in UTF-8 mode. */

-static inline void getUTF8CharAndAdvancePointer(int& c, const uschar*& subjectPtr)
+static inline void getUTF8CharAndAdvancePointer(int& c, const unsigned char*& subjectPtr)
 {
     c = *subjectPtr++;
     if ((c & 0xc0) == 0xc0) {
-        int gcaa = _pcre_utf8_table4[c & 0x3f]; /* Number of additional bytes */
+        int gcaa = kjs_pcre_utf8_table4[c & 0x3f]; /* Number of additional bytes */
         int gcss = 6 * gcaa;
-        c = (c & _pcre_utf8_table3[gcaa]) << gcss;
+        c = (c & kjs_pcre_utf8_table3[gcaa]) << gcss;
         while (gcaa-- > 0) {
             gcss -= 6;
…
 }

-bool _pcre_xclass(int c, const uschar* data)
+bool kjs_pcre_xclass(int c, const unsigned char* data)
 {
     bool negated = (*data & XCL_NOT);
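The decoding step in `getUTF8CharAndAdvancePointer` can be tried in isolation with the sketch below. The two tables are local copies with the values PCRE uses for `kjs_pcre_utf8_table3` and `kjs_pcre_utf8_table4`. The function name `decodeUTF8Sketch` is hypothetical, and the statement inside the loop that the changeset viewer elides is filled in here as an assumption based on how UTF-8 continuation bytes work, not copied from the file. Like the original, the sketch assumes well-formed input.

```cpp
#include <cassert>

// Mask for the payload bits of the first byte, indexed by additional byte count.
static const int utf8Table3[6] = { 0xff, 0x1f, 0x0f, 0x07, 0x03, 0x01 };

// Number of additional bytes, indexed by the low 6 bits of the first byte.
static const unsigned char utf8Table4[0x40] = {
    1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
    1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
    2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,
    3,3,3,3,3,3,3,3,4,4,4,4,5,5,5,5 };

// Hypothetical decoder: returns one code point and advances subjectPtr past it.
static int decodeUTF8Sketch(const unsigned char*& subjectPtr)
{
    int c = *subjectPtr++;
    if ((c & 0xc0) == 0xc0) {
        int additional = utf8Table4[c & 0x3f];
        int shift = 6 * additional;
        c = (c & utf8Table3[additional]) << shift;
        while (additional-- > 0) {
            shift -= 6;
            // Assumed loop body: merge the 6 payload bits of each continuation byte.
            c |= (*subjectPtr++ & 0x3f) << shift;
        }
    }
    return c;
}
```

Feeding it the bytes E2 82 AC yields U+20AC and leaves the pointer three bytes further on; a plain ASCII byte is returned unchanged and advances the pointer by one.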
trunk/JavaScriptCore/pcre/ucpinternal.h
r27686 r28793

 typedef struct cnode {
-    pcre_uint32 f0;
-    pcre_uint32 f1;
+    unsigned f0;
+    unsigned f1;
 } cnode;
trunk/JavaScriptCore/wtf/ASCIICType.h
r27686 r28793
 inline bool isASCIIAlpha(wchar_t c) { return (c | 0x20) >= 'a' && (c | 0x20) <= 'z'; }
 #endif
+inline bool isASCIIAlpha(int c) { return (c | 0x20) >= 'a' && (c | 0x20) <= 'z'; }

 inline bool isASCIIAlphanumeric(char c) { return c >= '0' && c <= '9' || (c | 0x20) >= 'a' && (c | 0x20) <= 'z'; }
…
 inline bool isASCIIAlphanumeric(wchar_t c) { return c >= '0' && c <= '9' || (c | 0x20) >= 'a' && (c | 0x20) <= 'z'; }
 #endif
+inline bool isASCIIAlphanumeric(int c) { return c >= '0' && c <= '9' || (c | 0x20) >= 'a' && (c | 0x20) <= 'z'; }

 inline bool isASCIIDigit(char c) { return (c >= '0') & (c <= '9'); }
…
 inline bool isASCIIHexDigit(wchar_t c) { return c >= '0' && c <= '9' || (c | 0x20) >= 'a' && (c | 0x20) <= 'f'; }
 #endif
+inline bool isASCIIHexDigit(int c) { return c >= '0' && c <= '9' || (c | 0x20) >= 'a' && (c | 0x20) <= 'f'; }

 inline bool isASCIILower(char c) { return c >= 'a' && c <= 'z'; }
…
 inline bool isASCIILower(wchar_t c) { return c >= 'a' && c <= 'z'; }
 #endif
+inline bool isASCIILower(int c) { return c >= 'a' && c <= 'z'; }

 inline bool isASCIISpace(char c) { return c == '\t' || c == '\n' || c == '\v' || c =='\f' || c == '\r' || c == ' '; }
…
 inline bool isASCIISpace(wchar_t c) { return c == '\t' || c == '\n' || c == '\v' || c =='\f' || c == '\r' || c == ' '; }
 #endif
+inline bool isASCIISpace(int c) { return c == '\t' || c == '\n' || c == '\v' || c =='\f' || c == '\r' || c == ' '; }

 inline char toASCIILower(char c) { return c | ((c >= 'A' && c <= 'Z') << 5); }
…
 inline wchar_t toASCIILower(wchar_t c) { return c | ((c >= 'A' && c <= 'Z') << 5); }
 #endif
+inline int toASCIILower(int c) { return c | ((c >= 'A' && c <= 'Z') << 5); }

 inline char toASCIIUpper(char c) { return static_cast<char>(c & ~((c >= 'a' && c <= 'z') << 5)); }
…
 inline wchar_t toASCIIUpper(wchar_t c) { return static_cast<wchar_t>(c & ~((c >= 'a' && c <= 'z') << 5)); }
 #endif
+inline int toASCIIUpper(int c) { return static_cast<int>(c & ~((c >= 'a' && c <= 'z') << 5)); }

 }
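The new `int` overloads of the case converters use a branch-free trick: the range test yields a `bool` that converts to 0 or 1, and shifting it left by 5 gives either 0 or 0x20, the bit that separates ASCII upper and lower case. The sketch below copies the two `int` overloads from the hunk into a hypothetical `sketch` namespace so they can be compiled on their own; it is an illustration, not the contents of wtf/ASCIICType.h.

```cpp
#include <cassert>

namespace sketch {

// Sets bit 0x20 only when c is in 'A'..'Z'; other values pass through unchanged.
inline int toASCIILower(int c) { return c | ((c >= 'A' && c <= 'Z') << 5); }

// Clears bit 0x20 only when c is in 'a'..'z'; other values pass through unchanged.
inline int toASCIIUpper(int c) { return static_cast<int>(c & ~((c >= 'a' && c <= 'z') << 5)); }

} // namespace sketch
```

Because the range test is done on the full `int` value, values outside the ASCII range, such as 0x141, are returned untouched even though their low byte looks like a letter.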