Changeset 198866 in webkit


Timestamp:
Mar 30, 2016 5:38:20 PM (8 years ago)
Author:
msaboff@apple.com
Message:

[ES6] Quantified unicode regular expressions do not work for counts greater than 1
https://bugs.webkit.org/show_bug.cgi?id=156044

Reviewed by Mark Lam.

Source/JavaScriptCore:

Fixed incorrect indexing of non-BMP characters in fixed patterns. The old code
indexed the input by code units (one JS character each) instead of by code points,
each of which occupies two code units for a non-BMP character.

  • yarr/YarrInterpreter.cpp:

(JSC::Yarr::Interpreter::matchDisjunction):

LayoutTests:

Added new test cases.

  • js/regexp-unicode-expected.txt:
  • js/script-tests/regexp-unicode.js:
Location:
trunk
Files:
5 edited

  • trunk/LayoutTests/ChangeLog

    r198859 r198866  
    +2016-03-30  Michael Saboff  <msaboff@apple.com>
    +
    +        [ES6] Quantified unicode regular expressions do not work for counts greater than 1
    +        https://bugs.webkit.org/show_bug.cgi?id=156044
    +
    +        Reviewed by Mark Lam.
    +
    +        Added new test cases.
    +
    +        * js/regexp-unicode-expected.txt:
    +        * js/script-tests/regexp-unicode.js:
    +
     2016-03-30  Myles C. Maxfield  <mmaxfield@apple.com>
     
  • trunk/LayoutTests/js/regexp-unicode-expected.txt

    r198624 r198866  
     PASS re2.test("￿") is false
     PASS re2.test("𒍅") is true
    +PASS /𝌆{2}/u.test("𝌆𝌆") is true
    +PASS /𝌆{2}/u.test("𝌆𝌆") is true
    +PASS "𐐁𐐁𐐀".match(/𐐁{1,3}/u)[0] is "𐐁𐐁"
    +PASS "𐐁𐐩".match(/𐐁{1,3}/iu)[0] is "𐐁𐐩"
    +PASS "𐐁𐐩𐐪𐐩".match(/𐐁{1,}/iu)[0] is "𐐁𐐩"
     PASS "𐌑𐌑𐌑".match(/𐌑*a|𐌑*./u)[0] is "𐌑𐌑𐌑"
     PASS "a𐌑𐌑".match(/a𐌑*?$/u)[0] is "a𐌑𐌑"
  • trunk/LayoutTests/js/script-tests/regexp-unicode.js

    r198624 r198866  
     // shouldBe('"\uD803\u{10c01}".match(/\uD803\u{10c01}/u)[0].length', '3');
     
    +// Check quantified matches
    +shouldBeTrue('/\u{1d306}{2}/u.test("\u{1d306}\u{1d306}")');
    +shouldBeTrue('/\uD834\uDF06{2}/u.test("\uD834\uDF06\uD834\uDF06")');
    +shouldBe('"\u{10401}\u{10401}\u{10400}".match(/\u{10401}{1,3}/u)[0]', '"\u{10401}\u{10401}"');
    +shouldBe('"\u{10401}\u{10429}".match(/\u{10401}{1,3}/iu)[0]', '"\u{10401}\u{10429}"');
    +shouldBe('"\u{10401}\u{10429}\u{1042a}\u{10429}".match(/\u{10401}{1,}/iu)[0]', '"\u{10401}\u{10429}"');
    +
     // Check back tracking on partial matches
     shouldBe('"\u{10311}\u{10311}\u{10311}".match(/\u{10311}*a|\u{10311}*./u)[0]', '"\u{10311}\u{10311}\u{10311}"');
  • trunk/Source/JavaScriptCore/ChangeLog

    r198855 r198866  
    +2016-03-30  Michael Saboff  <msaboff@apple.com>
    +
    +        [ES6] Quantified unicode regular expressions do not work for counts greater than 1
    +        https://bugs.webkit.org/show_bug.cgi?id=156044
    +
    +        Reviewed by Mark Lam.
    +
    +        Fixed incorrect indexing of non-BMP characters in fixed patterns.  The old code
    +        was indexing by character units, a single JS character, instead of code points
    +        which is 2 JS characters.
    +
    +        * yarr/YarrInterpreter.cpp:
    +        (JSC::Yarr::Interpreter::matchDisjunction):
    +
     2016-03-30  Mark Lam  <mark.lam@apple.com>
     
  • trunk/Source/JavaScriptCore/yarr/YarrInterpreter.cpp

    r198624 r198866  
                     if (!U_IS_BMP(currentTerm().atom.patternCharacter)) {
                         for (unsigned matchAmount = 0; matchAmount < currentTerm().atom.quantityCount; ++matchAmount) {
    -                        if (!checkSurrogatePair(currentTerm().atom.patternCharacter, currentTerm().inputPosition - matchAmount)) {
    +                        if (!checkSurrogatePair(currentTerm().atom.patternCharacter, currentTerm().inputPosition - 2 * matchAmount)) {
                                 BACKTRACK();
                             }
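
The effect of the changed offset in the hunk above can be illustrated with a small standalone sketch. It is not the Yarr code: `readSurrogatePairAt` and `matchFixedCount` are hypothetical helpers written for this illustration, and the `stride` parameter exists only so the old step (1 code unit per repetition) and the new step (2 code units) can be compared side by side. The sketch assumes, as the hunk suggests, that the offset counts code units backwards from the current input position.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not the Yarr API): decode the surrogate pair that starts
// `negativeOffset` code units before `pos`. Returns false if that position does
// not hold a lead surrogate followed by a trail surrogate.
bool readSurrogatePairAt(const std::u16string& input, std::size_t pos,
                         std::size_t negativeOffset, char32_t& codePoint)
{
    if (pos > input.size() || negativeOffset < 2 || negativeOffset > pos)
        return false;
    const std::size_t index = pos - negativeOffset;
    const char16_t lead = input[index];
    const char16_t trail = input[index + 1];
    const bool isLead = lead >= 0xD800 && lead <= 0xDBFF;
    const bool isTrail = trail >= 0xDC00 && trail <= 0xDFFF;
    if (!isLead || !isTrail)
        return false;
    // Standard UTF-16 decoding of a surrogate pair into a code point.
    codePoint = 0x10000
        + ((static_cast<char32_t>(lead) - 0xD800) << 10)
        + (static_cast<char32_t>(trail) - 0xDC00);
    return true;
}

// Hypothetical helper: check `count` consecutive copies of a non-BMP code point,
// the first of which starts `firstOffset` code units before `pos`.
// `stride` is the number of code units advanced per repetition:
// 2 models the fixed code, 1 models the old behaviour.
bool matchFixedCount(const std::u16string& input, std::size_t pos,
                     std::size_t firstOffset, char32_t expected,
                     unsigned count, unsigned stride)
{
    assert(stride == 1 || stride == 2);
    for (unsigned matchAmount = 0; matchAmount < count; ++matchAmount) {
        const std::size_t step = static_cast<std::size_t>(stride) * matchAmount;
        if (step > firstOffset)
            return false;
        char32_t actual = 0;
        if (!readSurrogatePairAt(input, pos, firstOffset - step, actual) || actual != expected)
            return false;
    }
    return true;
}
```

For the input `u"\U0001D306\U0001D306"` (four code units) with `pos = 4` and `firstOffset = 4`, a count of 2 matches with a stride of 2 but fails with a stride of 1, because the second read starts on a trail surrogate. A count of 1 matches with either stride, which is consistent with the bug title saying that only counts greater than 1 were affected.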