Changeset 148849 in webkit


Timestamp:
Apr 21, 2013 4:26:56 PM
Author:
oliver@apple.com
Message:

JS Lexer and Parser should be more informative when they encounter errors
https://bugs.webkit.org/show_bug.cgi?id=114924

Reviewed by Filip Pizlo.

Source/JavaScriptCore:

Add new tokens to represent the various ways that parsing and lexing have failed.
This gives us the ability to produce better error messages in some cases,
and to indicate whether or not the failure was due to invalid source, or simply
early termination.
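
As a rough illustration of the distinction described above (invalid source versus input that simply ended early), the sketch below classifies a four-digit hex escape three ways. The names HexEscape, HexEscapeResult and parseFourHexDigits are hypothetical stand-ins chosen for this example; they are not the JavaScriptCore types touched by this change, and the end-of-input test is simplified.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a lexer result; not JavaScriptCore's UnicodeHexValue.
enum class HexEscape { Valid, Incomplete, Invalid };

struct HexEscapeResult {
    HexEscape kind;
    int value; // meaningful only when kind == HexEscape::Valid
};

// Parses the four hex digits that would follow "\u", starting at `pos` in `src`.
// Running out of input before four digits are read is reported as Incomplete
// (more input could still fix it); a non-hex character is reported as Invalid.
inline HexEscapeResult parseFourHexDigits(const std::string& src, std::size_t pos)
{
    assert(pos <= src.size());
    int value = 0;
    for (std::size_t i = 0; i < 4; ++i) {
        if (pos + i >= src.size())
            return { HexEscape::Incomplete, -1 };
        unsigned char c = static_cast<unsigned char>(src[pos + i]);
        if (!std::isxdigit(c))
            return { HexEscape::Invalid, -1 };
        int digit = std::isdigit(c) ? c - '0' : std::tolower(c) - 'a' + 10;
        value = value * 16 + digit;
    }
    return { HexEscape::Valid, value };
}
```

A caller can then map Incomplete to an "unterminated" error and Invalid to an "invalid" error, which is the kind of split the new error tokens make possible.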

The jsc prompt now makes use of this so that you can write functions that
are more than one line long.
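
The multi-line prompt behaviour can be sketched roughly as follows: keep appending lines while the accumulated source is only "recoverably" broken, and stop once it is either complete or definitely wrong. Everything here is an assumption made for illustration. SyntaxStatus, checkBrackets and accumulate are hypothetical names, and the toy checker only tracks bracket nesting rather than running a real parser.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical three-way result; not the actual ParserError interface.
enum class SyntaxStatus { Ok, Recoverable, Fatal };

// Toy stand-in for a syntax check: only (), [] and {} nesting is examined.
// Brackets still open at end of input are Recoverable (more input may close
// them); a stray or mismatched closing bracket is Fatal.
inline SyntaxStatus checkBrackets(const std::string& source)
{
    std::vector<char> open;
    for (char c : source) {
        if (c == '(' || c == '[' || c == '{') {
            open.push_back(c);
        } else if (c == ')' || c == ']' || c == '}') {
            char expected = c == ')' ? '(' : c == ']' ? '[' : '{';
            if (open.empty() || open.back() != expected)
                return SyntaxStatus::Fatal;
            open.pop_back();
        }
    }
    return open.empty() ? SyntaxStatus::Ok : SyntaxStatus::Recoverable;
}

struct Accumulated {
    std::string source;     // lines joined with '\n'
    SyntaxStatus status;    // status after the last consumed line
    std::size_t linesUsed;  // how many input lines were consumed
};

// Consumes lines one at a time, as a prompt would, and stops as soon as the
// accumulated source is no longer merely Recoverable.
inline Accumulated accumulate(const std::vector<std::string>& lines)
{
    Accumulated result { "", SyntaxStatus::Ok, 0 };
    for (const std::string& line : lines) {
        result.source += line;
        result.source += '\n';
        ++result.linesUsed;
        result.status = checkBrackets(result.source);
        if (result.status != SyntaxStatus::Recoverable)
            break;
    }
    return result;
}
```

In an interactive loop, a Recoverable status would correspond to showing a continuation prompt, while Ok would hand the accumulated source to the evaluator and Fatal would print the error.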

  • bytecompiler/BytecodeGenerator.cpp:

(JSC::BytecodeGenerator::generate):

  • jsc.cpp:

(stringFromUTF):
(jscSource):
(runInteractive):

  • parser/Lexer.cpp:

(JSC::::parseFourDigitUnicodeHex):
(JSC::::parseIdentifierSlowCase):
(JSC::::parseString):
(JSC::::parseStringSlowCase):
(JSC::::lex):

  • parser/Lexer.h:

(UnicodeHexValue):
(JSC::Lexer::UnicodeHexValue::UnicodeHexValue):
(JSC::Lexer::UnicodeHexValue::valueType):
(JSC::Lexer::UnicodeHexValue::isValid):
(JSC::Lexer::UnicodeHexValue::value):
(Lexer):

  • parser/Parser.h:

(JSC::Parser::getTokenName):
(JSC::Parser::updateErrorMessageSpecialCase):
(JSC::::parse):

  • parser/ParserError.h:

(ParserError):
(JSC::ParserError::ParserError):

  • parser/ParserTokens.h:
  • runtime/Completion.cpp:

(JSC):
(JSC::checkSyntax):

  • runtime/Completion.h:

(JSC):

LayoutTests:

Update test results to cover improved error messages.

  • fast/js/kde/parse-expected.txt:
  • sputnik/Conformance/07_Lexical_Conventions/7.2_White_Space/S7.2_A5_T1-expected.txt:
  • sputnik/Conformance/07_Lexical_Conventions/7.2_White_Space/S7.2_A5_T2-expected.txt:
  • sputnik/Conformance/07_Lexical_Conventions/7.2_White_Space/S7.2_A5_T3-expected.txt:
  • sputnik/Conformance/07_Lexical_Conventions/7.2_White_Space/S7.2_A5_T4-expected.txt:
  • sputnik/Conformance/07_Lexical_Conventions/7.2_White_Space/S7.2_A5_T5-expected.txt:
  • sputnik/Conformance/07_Lexical_Conventions/7.3_Line_Terminators/S7.3_A6_T1-expected.txt:
  • sputnik/Conformance/07_Lexical_Conventions/7.3_Line_Terminators/S7.3_A6_T2-expected.txt:
  • sputnik/Conformance/07_Lexical_Conventions/7.3_Line_Terminators/S7.3_A6_T3-expected.txt:
  • sputnik/Conformance/07_Lexical_Conventions/7.3_Line_Terminators/S7.3_A6_T4-expected.txt:
  • sputnik/Conformance/07_Lexical_Conventions/7.7_Punctuators/S7.7_A2_T1-expected.txt:
  • sputnik/Conformance/07_Lexical_Conventions/7.7_Punctuators/S7.7_A2_T10-expected.txt:
  • sputnik/Conformance/07_Lexical_Conventions/7.7_Punctuators/S7.7_A2_T2-expected.txt:
  • sputnik/Conformance/07_Lexical_Conventions/7.7_Punctuators/S7.7_A2_T3-expected.txt:
  • sputnik/Conformance/07_Lexical_Conventions/7.7_Punctuators/S7.7_A2_T4-expected.txt:
  • sputnik/Conformance/07_Lexical_Conventions/7.7_Punctuators/S7.7_A2_T5-expected.txt:
  • sputnik/Conformance/07_Lexical_Conventions/7.7_Punctuators/S7.7_A2_T6-expected.txt:
  • sputnik/Conformance/07_Lexical_Conventions/7.7_Punctuators/S7.7_A2_T7-expected.txt:
  • sputnik/Conformance/07_Lexical_Conventions/7.7_Punctuators/S7.7_A2_T8-expected.txt:
  • sputnik/Conformance/07_Lexical_Conventions/7.7_Punctuators/S7.7_A2_T9-expected.txt:
  • sputnik/Conformance/13_Function_Definition/S13_A7_T3-expected.txt:
Location:
trunk
Files:
32 edited

  • trunk/LayoutTests/ChangeLog

    r148847 r148849  
     12013-04-21  Oliver Hunt  <oliver@apple.com>
     2
     3        JS Lexer and Parser should be more informative when they encounter errors
     4        https://bugs.webkit.org/show_bug.cgi?id=114924
     5
     6        Reviewed by Filip Pizlo.
     7
     8        Update test results to cover improved error messages.
     9
     10        * fast/js/kde/parse-expected.txt:
     11        * sputnik/Conformance/07_Lexical_Conventions/7.2_White_Space/S7.2_A5_T1-expected.txt:
     12        * sputnik/Conformance/07_Lexical_Conventions/7.2_White_Space/S7.2_A5_T2-expected.txt:
     13        * sputnik/Conformance/07_Lexical_Conventions/7.2_White_Space/S7.2_A5_T3-expected.txt:
     14        * sputnik/Conformance/07_Lexical_Conventions/7.2_White_Space/S7.2_A5_T4-expected.txt:
     15        * sputnik/Conformance/07_Lexical_Conventions/7.2_White_Space/S7.2_A5_T5-expected.txt:
     16        * sputnik/Conformance/07_Lexical_Conventions/7.3_Line_Terminators/S7.3_A6_T1-expected.txt:
     17        * sputnik/Conformance/07_Lexical_Conventions/7.3_Line_Terminators/S7.3_A6_T2-expected.txt:
     18        * sputnik/Conformance/07_Lexical_Conventions/7.3_Line_Terminators/S7.3_A6_T3-expected.txt:
     19        * sputnik/Conformance/07_Lexical_Conventions/7.3_Line_Terminators/S7.3_A6_T4-expected.txt:
     20        * sputnik/Conformance/07_Lexical_Conventions/7.7_Punctuators/S7.7_A2_T1-expected.txt:
     21        * sputnik/Conformance/07_Lexical_Conventions/7.7_Punctuators/S7.7_A2_T10-expected.txt:
     22        * sputnik/Conformance/07_Lexical_Conventions/7.7_Punctuators/S7.7_A2_T2-expected.txt:
     23        * sputnik/Conformance/07_Lexical_Conventions/7.7_Punctuators/S7.7_A2_T3-expected.txt:
     24        * sputnik/Conformance/07_Lexical_Conventions/7.7_Punctuators/S7.7_A2_T4-expected.txt:
     25        * sputnik/Conformance/07_Lexical_Conventions/7.7_Punctuators/S7.7_A2_T5-expected.txt:
     26        * sputnik/Conformance/07_Lexical_Conventions/7.7_Punctuators/S7.7_A2_T6-expected.txt:
     27        * sputnik/Conformance/07_Lexical_Conventions/7.7_Punctuators/S7.7_A2_T7-expected.txt:
     28        * sputnik/Conformance/07_Lexical_Conventions/7.7_Punctuators/S7.7_A2_T8-expected.txt:
     29        * sputnik/Conformance/07_Lexical_Conventions/7.7_Punctuators/S7.7_A2_T9-expected.txt:
     30        * sputnik/Conformance/13_Function_Definition/S13_A7_T3-expected.txt:
     31
    1322013-04-21  Christophe Dumez  <ch.dumez@sisa.samsung.com>
    233
  • trunk/LayoutTests/fast/js/kde/parse-expected.txt

    r90535 r148849  
    2121PASS var f\u0030 = 103; f0 is 103
    2222PASS var \u00E9\u0100\u02AF\u0388\u18A8 = 104; \u00E9\u0100\u02AF\u0388\u18A8; is 104
    23 PASS var f\u00F7; threw exception SyntaxError: Unrecognized token 'f\u00F7'.
    24 PASS var \u0030; threw exception SyntaxError: Unrecognized token '\u0030'.
    25 PASS var test = { }; test.i= 0; test.i\u002b= 1; test.i; threw exception SyntaxError: Unrecognized token 'i\u002b'.
     23PASS var f\u00F7; threw exception SyntaxError: Invalid unicode escape in identifier: 'f\u00F7'.
     24PASS var \u0030; threw exception SyntaxError: Invalid unicode escape in identifier: '\u0030'.
     25PASS var test = { }; test.i= 0; test.i\u002b= 1; test.i; threw exception SyntaxError: Invalid unicode escape in identifier: 'i\u002b'.
    2626PASS var test = { }; test.i= 0; test.i+= 1; test.i; is 1
    2727PASS successfullyParsed is true
  • trunk/LayoutTests/sputnik/Conformance/07_Lexical_Conventions/7.2_White_Space/S7.2_A5_T1-expected.txt

    r90535 r148849  
    1 CONSOLE MESSAGE: line 76: SyntaxError: Unrecognized token '\u0009'
     1CONSOLE MESSAGE: line 76: SyntaxError: Invalid unicode escape in identifier: '\u0009'
    22S7.2_A5_T1
    33
  • trunk/LayoutTests/sputnik/Conformance/07_Lexical_Conventions/7.2_White_Space/S7.2_A5_T2-expected.txt

    r90535 r148849  
    1 CONSOLE MESSAGE: line 76: SyntaxError: Unrecognized token '\u000B'
     1CONSOLE MESSAGE: line 76: SyntaxError: Invalid unicode escape in identifier: '\u000B'
    22S7.2_A5_T2
    33
  • trunk/LayoutTests/sputnik/Conformance/07_Lexical_Conventions/7.2_White_Space/S7.2_A5_T3-expected.txt

    r90535 r148849  
    1 CONSOLE MESSAGE: line 76: SyntaxError: Unrecognized token '\u000C'
     1CONSOLE MESSAGE: line 76: SyntaxError: Invalid unicode escape in identifier: '\u000C'
    22S7.2_A5_T3
    33
  • trunk/LayoutTests/sputnik/Conformance/07_Lexical_Conventions/7.2_White_Space/S7.2_A5_T4-expected.txt

    r90535 r148849  
    1 CONSOLE MESSAGE: line 76: SyntaxError: Unrecognized token '\u0020'
     1CONSOLE MESSAGE: line 76: SyntaxError: Invalid unicode escape in identifier: '\u0020'
    22S7.2_A5_T4
    33
  • trunk/LayoutTests/sputnik/Conformance/07_Lexical_Conventions/7.2_White_Space/S7.2_A5_T5-expected.txt

    r90535 r148849  
    1 CONSOLE MESSAGE: line 76: SyntaxError: Unrecognized token '\u00A0'
     1CONSOLE MESSAGE: line 76: SyntaxError: Invalid unicode escape in identifier: '\u00A0'
    22S7.2_A5_T5
    33
  • trunk/LayoutTests/sputnik/Conformance/07_Lexical_Conventions/7.3_Line_Terminators/S7.3_A6_T1-expected.txt

    r90535 r148849  
    1 CONSOLE MESSAGE: line 76: SyntaxError: Unrecognized token '\u000A'
     1CONSOLE MESSAGE: line 76: SyntaxError: Invalid unicode escape in identifier: '\u000A'
    22S7.3_A6_T1
    33
  • trunk/LayoutTests/sputnik/Conformance/07_Lexical_Conventions/7.3_Line_Terminators/S7.3_A6_T2-expected.txt

    r90535 r148849  
    1 CONSOLE MESSAGE: line 76: SyntaxError: Unrecognized token '\u000D'
     1CONSOLE MESSAGE: line 76: SyntaxError: Invalid unicode escape in identifier: '\u000D'
    22S7.3_A6_T2
    33
  • trunk/LayoutTests/sputnik/Conformance/07_Lexical_Conventions/7.3_Line_Terminators/S7.3_A6_T3-expected.txt

    r90535 r148849  
    1 CONSOLE MESSAGE: line 76: SyntaxError: Unrecognized token '\u2028'
     1CONSOLE MESSAGE: line 76: SyntaxError: Invalid unicode escape in identifier: '\u2028'
    22S7.3_A6_T3
    33
  • trunk/LayoutTests/sputnik/Conformance/07_Lexical_Conventions/7.3_Line_Terminators/S7.3_A6_T4-expected.txt

    r90535 r148849  
    1 CONSOLE MESSAGE: line 76: SyntaxError: Unrecognized token '\u2029'
     1CONSOLE MESSAGE: line 76: SyntaxError: Invalid unicode escape in identifier: '\u2029'
    22S7.3_A6_T4
    33
  • trunk/LayoutTests/sputnik/Conformance/07_Lexical_Conventions/7.7_Punctuators/S7.7_A2_T1-expected.txt

    r89257 r148849  
    1 CONSOLE MESSAGE: line 76: SyntaxError: Unrecognized token '\u007B'
     1CONSOLE MESSAGE: line 76: SyntaxError: Invalid unicode escape in identifier: '\u007B'
    22S7.7_A2_T1
    33
  • trunk/LayoutTests/sputnik/Conformance/07_Lexical_Conventions/7.7_Punctuators/S7.7_A2_T10-expected.txt

    r89257 r148849  
    1 CONSOLE MESSAGE: line 76: SyntaxError: Unrecognized token '\u002F'
     1CONSOLE MESSAGE: line 76: SyntaxError: Invalid unicode escape in identifier: '\u002F'
    22S7.7_A2_T10
    33
  • trunk/LayoutTests/sputnik/Conformance/07_Lexical_Conventions/7.7_Punctuators/S7.7_A2_T2-expected.txt

    r89257 r148849  
    1 CONSOLE MESSAGE: line 76: SyntaxError: Unrecognized token '\u0028'
     1CONSOLE MESSAGE: line 76: SyntaxError: Invalid unicode escape in identifier: '\u0028'
    22S7.7_A2_T2
    33
  • trunk/LayoutTests/sputnik/Conformance/07_Lexical_Conventions/7.7_Punctuators/S7.7_A2_T3-expected.txt

    r89257 r148849  
    1 CONSOLE MESSAGE: line 76: SyntaxError: Unrecognized token '\u005B'
     1CONSOLE MESSAGE: line 76: SyntaxError: Invalid unicode escape in identifier: '\u005B'
    22S7.7_A2_T3
    33
  • trunk/LayoutTests/sputnik/Conformance/07_Lexical_Conventions/7.7_Punctuators/S7.7_A2_T4-expected.txt

    r89257 r148849  
    1 CONSOLE MESSAGE: line 76: SyntaxError: Unrecognized token '\u003B'
     1CONSOLE MESSAGE: line 76: SyntaxError: Invalid unicode escape in identifier: '\u003B'
    22S7.7_A2_T4
    33
  • trunk/LayoutTests/sputnik/Conformance/07_Lexical_Conventions/7.7_Punctuators/S7.7_A2_T5-expected.txt

    r89257 r148849  
    1 CONSOLE MESSAGE: line 77: SyntaxError: Unrecognized token '\u002E'
     1CONSOLE MESSAGE: line 77: SyntaxError: Invalid unicode escape in identifier: '\u002E'
    22S7.7_A2_T5
    33
  • trunk/LayoutTests/sputnik/Conformance/07_Lexical_Conventions/7.7_Punctuators/S7.7_A2_T6-expected.txt

    r89257 r148849  
    1 CONSOLE MESSAGE: line 76: SyntaxError: Unrecognized token '\u002C'
     1CONSOLE MESSAGE: line 76: SyntaxError: Invalid unicode escape in identifier: '\u002C'
    22S7.7_A2_T6
    33
  • trunk/LayoutTests/sputnik/Conformance/07_Lexical_Conventions/7.7_Punctuators/S7.7_A2_T7-expected.txt

    r89257 r148849  
    1 CONSOLE MESSAGE: line 76: SyntaxError: Unrecognized token '\u002B'
     1CONSOLE MESSAGE: line 76: SyntaxError: Invalid unicode escape in identifier: '\u002B'
    22S7.7_A2_T7
    33
  • trunk/LayoutTests/sputnik/Conformance/07_Lexical_Conventions/7.7_Punctuators/S7.7_A2_T8-expected.txt

    r89257 r148849  
    1 CONSOLE MESSAGE: line 76: SyntaxError: Unrecognized token '\u002D'
     1CONSOLE MESSAGE: line 76: SyntaxError: Invalid unicode escape in identifier: '\u002D'
    22S7.7_A2_T8
    33
  • trunk/LayoutTests/sputnik/Conformance/07_Lexical_Conventions/7.7_Punctuators/S7.7_A2_T9-expected.txt

    r89257 r148849  
    1 CONSOLE MESSAGE: line 76: SyntaxError: Unrecognized token '\u002A'
     1CONSOLE MESSAGE: line 76: SyntaxError: Invalid unicode escape in identifier: '\u002A'
    22S7.7_A2_T9
    33
  • trunk/LayoutTests/sputnik/Conformance/13_Function_Definition/S13_A7_T3-expected.txt

    r89257 r148849  
    1 CONSOLE MESSAGE: line 76: SyntaxError: Unrecognized token '\'
     1CONSOLE MESSAGE: line 76: SyntaxError: Invalid escape in identifier: '\'
    22S13_A7_T3
    33
  • trunk/Source/JavaScriptCore/ChangeLog

    r148820 r148849  
     12013-04-21  Oliver Hunt  <oliver@apple.com>
     2
     3        JS Lexer and Parser should be more informative when they encounter errors
     4        https://bugs.webkit.org/show_bug.cgi?id=114924
     5
     6        Reviewed by Filip Pizlo.
     7
     8        Add new tokens to represent the various ways that parsing and lexing have failed.
     9        This gives us the ability to produce better error messages in some cases,
     10        and to indicate whether or not the failure was due to invalid source, or simply
     11        early termination.
     12
     13        The jsc prompt now makes use of this so that you can write functions that
     14        are more than one line long.
     15
     16        * bytecompiler/BytecodeGenerator.cpp:
     17        (JSC::BytecodeGenerator::generate):
     18        * jsc.cpp:
     19        (stringFromUTF):
     20        (jscSource):
     21        (runInteractive):
     22        * parser/Lexer.cpp:
     23        (JSC::::parseFourDigitUnicodeHex):
     24        (JSC::::parseIdentifierSlowCase):
     25        (JSC::::parseString):
     26        (JSC::::parseStringSlowCase):
     27        (JSC::::lex):
     28        * parser/Lexer.h:
     29        (UnicodeHexValue):
     30        (JSC::Lexer::UnicodeHexValue::UnicodeHexValue):
     31        (JSC::Lexer::UnicodeHexValue::valueType):
     32        (JSC::Lexer::UnicodeHexValue::isValid):
     33        (JSC::Lexer::UnicodeHexValue::value):
     34        (Lexer):
     35        * parser/Parser.h:
     36        (JSC::Parser::getTokenName):
     37        (JSC::Parser::updateErrorMessageSpecialCase):
     38        (JSC::::parse):
     39        * parser/ParserError.h:
     40        (ParserError):
     41        (JSC::ParserError::ParserError):
     42        * parser/ParserTokens.h:
     43        * runtime/Completion.cpp:
     44        (JSC):
     45        (JSC::checkSyntax):
     46        * runtime/Completion.h:
     47        (JSC):
     48
    1492013-04-21  Mark Lam  <mark.lam@apple.com>
    250
  • trunk/Source/JavaScriptCore/bytecompiler/BytecodeGenerator.cpp

    r148696 r148849  
    131131
    132132    if (m_expressionTooDeep)
    133         return ParserError::OutOfMemory;
    134     return ParserError::ErrorNone;
     133        return ParserError(ParserError::OutOfMemory);
     134    return ParserError(ParserError::ErrorNone);
    135135}
    136136
  • trunk/Source/JavaScriptCore/jsc.cpp

    r148696 r148849  
    147147};
    148148
    149 static const char interactivePrompt[] = "> ";
     149static const char interactivePrompt[] = ">>> ";
    150150
    151151class StopWatch {
     
    269269}
    270270
    271 static inline SourceCode jscSource(const char* utf8, const String& filename)
     271static inline String stringFromUTF(const char* utf8)
    272272{
    273273    // Find the first non-ascii character, or nul.
     
    276276        pos++;
    277277    size_t asciiLength = pos - utf8;
    278 
     278   
    279279    // Fast case - string is all ascii.
    280280    if (!*pos)
    281         return makeSource(String(utf8, asciiLength), filename);
    282 
     281        return String(utf8, asciiLength);
     282   
    283283    // Slow case - contains non-ascii characters, use fromUTF8WithLatin1Fallback.
    284284    ASSERT(*pos < 0);
    285285    ASSERT(strlen(utf8) == asciiLength + strlen(pos));
    286     String source = String::fromUTF8WithLatin1Fallback(utf8, asciiLength + strlen(pos));
    287     return makeSource(source.impl(), filename);
     286    return String::fromUTF8WithLatin1Fallback(utf8, asciiLength + strlen(pos));
     287}
     288
     289static inline SourceCode jscSource(const char* utf8, const String& filename)
     290{
     291    String str = stringFromUTF(utf8);
     292    return makeSource(str, filename);
    288293}
    289294
     
    608613{
    609614    String interpreterName("Interpreter");
    610 
    611     while (true) {
     615   
     616    bool shouldQuit = false;
     617    while (!shouldQuit) {
    612618#if HAVE(READLINE) && !RUNNING_FROM_XCODE
    613         char* line = readline(interactivePrompt);
    614         if (!line)
    615             break;
    616         if (line[0])
    617             add_history(line);
     619        ParserError error;
     620        String source;
     621        do {
     622            error = ParserError();
     623            char* line = readline(source.isEmpty() ? interactivePrompt : "... ");
     624            source = source + line;
     625            source = source + '\n';
     626            checkSyntax(globalObject->globalExec(), makeSource(source, interpreterName), error);
     627            shouldQuit = !line;
     628            if (!line || !line[0])
     629                break;
     630            if (line[0])
     631                add_history(line);
     632        } while (error.m_syntaxErrorType == ParserError::SyntaxErrorRecoverable);
     633       
     634        if (error.m_type != ParserError::ErrorNone) {
     635            printf("%s:%d\n", error.m_message.utf8().data(), error.m_line);
     636            continue;
     637        }
     638       
     639       
    618640        JSValue evaluationException;
    619         JSValue returnValue = evaluate(globalObject->globalExec(), jscSource(line, interpreterName), JSValue(), &evaluationException);
    620         free(line);
     641        JSValue returnValue = evaluate(globalObject->globalExec(), makeSource(source, interpreterName), JSValue(), &evaluationException);
    621642#else
    622643        printf("%s", interactivePrompt);
  • trunk/Source/JavaScriptCore/parser/Lexer.cpp

    r148696 r148849  
    597597
    598598template <typename T>
    599 int Lexer<T>::parseFourDigitUnicodeHex()
     599typename Lexer<T>::UnicodeHexValue Lexer<T>::parseFourDigitUnicodeHex()
    600600{
    601601    T char1 = peek(1);
     
    604604
    605605    if (UNLIKELY(!isASCIIHexDigit(m_current) || !isASCIIHexDigit(char1) || !isASCIIHexDigit(char2) || !isASCIIHexDigit(char3)))
    606         return -1;
     606        return UnicodeHexValue((m_code + 4) >= m_codeEnd ? UnicodeHexValue::IncompleteHex : UnicodeHexValue::InvalidHex);
    607607
    608608    int result = convertUnicode(m_current, char1, char2, char3);
     
    611611    shift();
    612612    shift();
    613     return result;
     613    return UnicodeHexValue(result);
    614614}
    615615
     
    884884        shift();
    885885        if (UNLIKELY(m_current != 'u'))
    886             return ERRORTOK;
    887         shift();
    888         int character = parseFourDigitUnicodeHex();
    889         if (UNLIKELY(character == -1))
    890             return ERRORTOK;
    891         UChar ucharacter = static_cast<UChar>(character);
     886            return atEnd() ? UNTERMINATED_IDENTIFIER_ESCAPE_ERRORTOK : INVALID_IDENTIFIER_ESCAPE_ERRORTOK;
     887        shift();
     888        UnicodeHexValue character = parseFourDigitUnicodeHex();
     889        if (UNLIKELY(!character.isValid()))
     890            return character.valueType() == UnicodeHexValue::IncompleteHex ? UNTERMINATED_IDENTIFIER_UNICODE_ESCAPE_ERRORTOK : INVALID_IDENTIFIER_UNICODE_ESCAPE_ERRORTOK;
     891        UChar ucharacter = static_cast<UChar>(character.value());
    892892        if (UNLIKELY(m_buffer16.size() ? !isIdentPart(ucharacter) : !isIdentStart(ucharacter)))
    893             return ERRORTOK;
     893            return INVALID_IDENTIFIER_UNICODE_ESCAPE_ERRORTOK;
    894894        if (shouldCreateIdentifier)
    895895            record16(ucharacter);
     
    942942
    943943template <typename T>
    944 template <bool shouldBuildStrings> ALWAYS_INLINE bool Lexer<T>::parseString(JSTokenData* tokenData, bool strictMode)
     944template <bool shouldBuildStrings> ALWAYS_INLINE typename Lexer<T>::StringParseResult Lexer<T>::parseString(JSTokenData* tokenData, bool strictMode)
    945945{
    946946    int startingOffset = currentOffset();
     
    970970                if (!isASCIIHexDigit(m_current) || !isASCIIHexDigit(peek(1))) {
    971971                    m_lexErrorMessage = "\\x can only be followed by a hex character sequence";
    972                     return false;
     972                    return (atEnd() || (isASCIIHexDigit(m_current) && (m_code + 1 == m_codeEnd))) ? StringUnterminated : StringCannotBeParsed;
    973973                }
    974974                T prev = m_current;
     
    10051005        tokenData->ident = 0;
    10061006
    1007     return true;
    1008 }
    1009 
    1010 template <typename T>
    1011 template <bool shouldBuildStrings> bool Lexer<T>::parseStringSlowCase(JSTokenData* tokenData, bool strictMode)
     1007    return StringParsedSuccessfully;
     1008}
     1009
     1010template <typename T>
     1011template <bool shouldBuildStrings> typename Lexer<T>::StringParseResult Lexer<T>::parseStringSlowCase(JSTokenData* tokenData, bool strictMode)
    10121012{
    10131013    T stringQuoteCharacter = m_current;
     
    10351035                if (!isASCIIHexDigit(m_current) || !isASCIIHexDigit(peek(1))) {
    10361036                    m_lexErrorMessage = "\\x can only be followed by a hex character sequence";
    1037                     return false;
     1037                    return StringCannotBeParsed;
    10381038                }
    10391039                T prev = m_current;
     
    10441044            } else if (m_current == 'u') {
    10451045                shift();
    1046                 int character = parseFourDigitUnicodeHex();
    1047                 if (character != -1) {
     1046                UnicodeHexValue character = parseFourDigitUnicodeHex();
     1047                if (character.isValid()) {
    10481048                    if (shouldBuildStrings)
    1049                         record16(character);
     1049                        record16(character.value());
    10501050                } else if (m_current == stringQuoteCharacter) {
    10511051                    if (shouldBuildStrings)
     
    10531053                } else {
    10541054                    m_lexErrorMessage = "\\u can only be followed by a Unicode character sequence";
    1055                     return false;
     1055                    return character.valueType() == UnicodeHexValue::IncompleteHex ? StringUnterminated : StringCannotBeParsed;
    10561056                }
    10571057            } else if (strictMode && isASCIIDigit(m_current)) {
     
    10611061                if (character1 != '0' || isASCIIDigit(m_current)) {
    10621062                    m_lexErrorMessage = "The only valid numeric escape in strict mode is '\\0'";
    1063                     return false;
     1063                    return StringCannotBeParsed;
    10641064                }
    10651065                if (shouldBuildStrings)
     
    10911091            } else {
    10921092                m_lexErrorMessage = "Unterminated string constant";
    1093                 return false;
     1093                return StringUnterminated;
    10941094            }
    10951095
     
    11041104            if (atEnd() || isLineTerminator(m_current)) {
    11051105                m_lexErrorMessage = "Unexpected EOF";
    1106                 return false;
     1106                return atEnd() ? StringUnterminated : StringCannotBeParsed;
    11071107            }
    11081108            // Anything else is just a normal character
     
    11191119
    11201120    m_buffer16.resize(0);
    1121     return true;
     1121    return StringParsedSuccessfully;
    11221122}
    11231123
     
    14631463                goto start;
    14641464            m_lexErrorMessage = "Multiline comment was not closed properly";
     1465            token = UNTERMINATED_MULTILINE_COMMENT_ERRORTOK;
    14651466            goto returnError;
    14661467        }
     
    15821583                    if (strictMode) {
    15831584                        m_lexErrorMessage = "Octal escapes are forbidden in strict mode";
     1585                        token = INVALID_OCTAL_NUMBER_ERRORTOK;
    15841586                        goto returnError;
    15851587                    }
     
    16001602                    if (!parseNumberAfterExponentIndicator()) {
    16011603                        m_lexErrorMessage = "Non-number found after exponent indicator";
     1604                        token = atEnd() ? UNTERMINATED_NUMERIC_LITERAL_ERRORTOK : INVALID_NUMERIC_LITERAL_ERRORTOK;
    16021605                        goto returnError;
    16031606                    }
     
    16121615        if (UNLIKELY(isIdentStart(m_current))) {
    16131616            m_lexErrorMessage = "At least one digit must occur after a decimal point";
     1617            token = atEnd() ? UNTERMINATED_NUMERIC_LITERAL_ERRORTOK : INVALID_NUMERIC_LITERAL_ERRORTOK;
    16141618            goto returnError;
    16151619        }
     
    16181622    case CharacterQuote:
    16191623        if (lexerFlags & LexerFlagsDontBuildStrings) {
    1620             if (UNLIKELY(!parseString<false>(tokenData, strictMode)))
     1624            StringParseResult result = parseString<false>(tokenData, strictMode);
     1625            if (UNLIKELY(result != StringParsedSuccessfully)) {
     1626                token = result == StringUnterminated ? UNTERMINATED_STRING_LITERAL_ERRORTOK : INVALID_STRING_LITERAL_ERRORTOK;
    16211627                goto returnError;
     1628            }
    16221629        } else {
    1623             if (UNLIKELY(!parseString<true>(tokenData, strictMode)))
     1630            StringParseResult result = parseString<true>(tokenData, strictMode);
     1631            if (UNLIKELY(result != StringParsedSuccessfully)) {
     1632                token = result == StringUnterminated ? UNTERMINATED_STRING_LITERAL_ERRORTOK : INVALID_STRING_LITERAL_ERRORTOK;
    16241633                goto returnError;
     1634            }
    16251635        }
    16261636        shift();
     
    16441654    case CharacterInvalid:
    16451655        m_lexErrorMessage = invalidCharacterMessage();
     1656        token = ERRORTOK;
    16461657        goto returnError;
    16471658    default:
    16481659        RELEASE_ASSERT_NOT_REACHED();
    16491660        m_lexErrorMessage = "Internal Error";
     1661        token = ERRORTOK;
    16501662        goto returnError;
    16511663    }
     
    16791691    tokenLocation->line = m_lineNumber;
    16801692    tokenLocation->endOffset = currentOffset();
    1681     return ERRORTOK;
     1693    RELEASE_ASSERT(token & ErrorTokenFlag);
     1694    return token;
    16821695}
    16831696
  • trunk/Source/JavaScriptCore/parser/Lexer.h

    r148696 r148849  
    135135    ALWAYS_INLINE bool atEnd() const;
    136136    ALWAYS_INLINE T peek(int offset) const;
    137     int parseFourDigitUnicodeHex();
     137    struct UnicodeHexValue {
     138       
     139        enum ValueType { ValidHex, IncompleteHex, InvalidHex };
     140       
     141        explicit UnicodeHexValue(int value)
     142            : m_value(value)
     143        {
     144        }
     145        explicit UnicodeHexValue(ValueType type)
     146            : m_value(type == IncompleteHex ? -2 : -1)
     147        {
     148        }
     149
     150        ValueType valueType() const
     151        {
     152            if (m_value >= 0)
     153                return ValidHex;
     154            return m_value == -2 ? IncompleteHex : InvalidHex;
     155        }
     156        bool isValid() const { return m_value >= 0; }
     157        int value() const
     158        {
     159            ASSERT(m_value >= 0);
     160            return m_value;
     161        }
     162       
     163    private:
     164        int m_value;
     165    };
     166    UnicodeHexValue parseFourDigitUnicodeHex();
    138167    void shiftLineTerminator();
    139168
     
    158187    template <bool shouldBuildIdentifiers> ALWAYS_INLINE JSTokenType parseIdentifier(JSTokenData*, unsigned lexerFlags, bool strictMode);
    159188    template <bool shouldBuildIdentifiers> NEVER_INLINE JSTokenType parseIdentifierSlowCase(JSTokenData*, unsigned lexerFlags, bool strictMode);
    160     template <bool shouldBuildStrings> ALWAYS_INLINE bool parseString(JSTokenData*, bool strictMode);
    161     template <bool shouldBuildStrings> NEVER_INLINE bool parseStringSlowCase(JSTokenData*, bool strictMode);
     189    enum StringParseResult {
     190        StringParsedSuccessfully,
     191        StringUnterminated,
     192        StringCannotBeParsed
     193    };
     194    template <bool shouldBuildStrings> ALWAYS_INLINE StringParseResult parseString(JSTokenData*, bool strictMode);
     195    template <bool shouldBuildStrings> NEVER_INLINE StringParseResult parseStringSlowCase(JSTokenData*, bool strictMode);
    162196    ALWAYS_INLINE void parseHex(double& returnValue);
    163197    ALWAYS_INLINE bool parseOctal(double& returnValue);
  • trunk/Source/JavaScriptCore/parser/Parser.h

    r148696 r148849  
    706706        case NUMBER:
    707707        case IDENT:
    708         case STRING:
     708        case STRING:
     709        case UNTERMINATED_IDENTIFIER_ESCAPE_ERRORTOK:
     710        case UNTERMINATED_IDENTIFIER_UNICODE_ESCAPE_ERRORTOK:
     711        case UNTERMINATED_MULTILINE_COMMENT_ERRORTOK:
     712        case UNTERMINATED_NUMERIC_LITERAL_ERRORTOK:
     713        case UNTERMINATED_STRING_LITERAL_ERRORTOK:
     714        case INVALID_IDENTIFIER_ESCAPE_ERRORTOK:
     715        case INVALID_IDENTIFIER_UNICODE_ESCAPE_ERRORTOK:
     716        case INVALID_NUMERIC_LITERAL_ERRORTOK:
     717        case INVALID_OCTAL_NUMBER_ERRORTOK:
     718        case INVALID_STRING_LITERAL_ERRORTOK:
    709719        case ERRORTOK:
    710         case EOFTOK: 
     720        case EOFTOK:
    711721            return 0;
    712722        case LastUntaggedToken:
     
    735745            m_errorMessage = "Unexpected string " + getToken();
    736746            return;
    737         case ERRORTOK:
     747           
     748        case UNTERMINATED_IDENTIFIER_ESCAPE_ERRORTOK:
     749        case UNTERMINATED_IDENTIFIER_UNICODE_ESCAPE_ERRORTOK:
     750            m_errorMessage = "Incomplete unicode escape in identifier: '" + getToken() + '\'';
     751            return;
     752        case UNTERMINATED_MULTILINE_COMMENT_ERRORTOK:
     753            m_errorMessage = "Unterminated multiline comment";
     754            return;
     755        case UNTERMINATED_NUMERIC_LITERAL_ERRORTOK:
     756            m_errorMessage = "Unterminated numeric literal '" + getToken() + '\'';
     757            return;
     758        case UNTERMINATED_STRING_LITERAL_ERRORTOK:
     759            m_errorMessage = "Unterminated string literal '" + getToken() + '\'';
     760            return;
     761        case INVALID_IDENTIFIER_ESCAPE_ERRORTOK:
     762            m_errorMessage = "Invalid escape in identifier: '" + getToken() + '\'';
     763            return;
     764        case INVALID_IDENTIFIER_UNICODE_ESCAPE_ERRORTOK:
     765            m_errorMessage = "Invalid unicode escape in identifier: '" + getToken() + '\'';
     766            return;
     767        case INVALID_NUMERIC_LITERAL_ERRORTOK:
     768            m_errorMessage = "Invalid numeric literal: '" + getToken() + '\'';
     769            return;
     770        case INVALID_OCTAL_NUMBER_ERRORTOK:
     771            m_errorMessage = "Invalid use of octal: '" + getToken() + '\'';
     772            return;
     773        case INVALID_STRING_LITERAL_ERRORTOK:
     774            m_errorMessage = "Invalid string literal: '" + getToken() + '\'';
     775            return;
     776        case ERRORTOK:
    738777            m_errorMessage = "Unrecognized token '" + getToken() + '\'';
    739778            return;
     
    9951034        // likely, and we are currently unable to distinguish between the two cases.
    9961035        if (isFunctionBodyNode(static_cast<ParsedNode*>(0)) || m_hasStackOverflow)
    997             error = ParserError::StackOverflow;
    998         else if (isEvalNode<ParsedNode>())
    999             error = ParserError(ParserError::EvalError, errMsg, errLine);
    1000         else
    1001             error = ParserError(ParserError::SyntaxError, errMsg, errLine);
     1036            error = ParserError(ParserError::StackOverflow, ParserError::SyntaxErrorNone, m_token);
     1037        else {
     1038            ParserError::SyntaxErrorType errorType = ParserError::SyntaxErrorIrrecoverable;
     1039            if (m_token.m_type == EOFTOK)
     1040                errorType = ParserError::SyntaxErrorRecoverable;
     1041            else if (m_token.m_type & UnterminatedErrorTokenFlag)
     1042                errorType = ParserError::SyntaxErrorUnterminatedLiteral;
     1043           
     1044            if (isEvalNode<ParsedNode>())
     1045                error = ParserError(ParserError::EvalError, errorType, m_token, errMsg, errLine);
     1046            else
     1047                error = ParserError(ParserError::SyntaxError, errorType, m_token, errMsg, errLine);
     1048        }
    10021049    }
    10031050
  • trunk/Source/JavaScriptCore/parser/ParserError.h

    r143147 r148849  
    2929#include "Error.h"
    3030#include "ExceptionHelpers.h"
     31#include "ParserTokens.h"
    3132#include <wtf/text/WTFString.h>
    3233
     
    3435
    3536struct ParserError {
    36     enum ErrorType { ErrorNone, StackOverflow, SyntaxError, EvalError, OutOfMemory } m_type;
     37    enum SyntaxErrorType {
     38        SyntaxErrorNone,
     39        SyntaxErrorIrrecoverable,
     40        SyntaxErrorUnterminatedLiteral,
     41        SyntaxErrorRecoverable
     42    };
     43
     44    enum ErrorType {
     45        ErrorNone,
     46        StackOverflow,
     47        EvalError,
     48        OutOfMemory,
     49        SyntaxError
     50    };
     51
     52    ErrorType m_type;
     53    SyntaxErrorType m_syntaxErrorType;
     54    JSToken m_token;
    3755    String m_message;
    3856    int m_line;
     
    4260    {
    4361    }
    44 
    45     ParserError(ErrorType type)
     62   
     63    explicit ParserError(ErrorType type)
    4664        : m_type(type)
     65        , m_syntaxErrorType(SyntaxErrorNone)
    4766        , m_line(-1)
    4867    {
    4968    }
    5069
    51     ParserError(ErrorType type, String msg, int line)
     70    ParserError(ErrorType type, SyntaxErrorType syntaxError, JSToken token)
    5271        : m_type(type)
     72        , m_syntaxErrorType(syntaxError)
     73        , m_token(token)
     74        , m_line(-1)
     75    {
     76    }
     77
     78    ParserError(ErrorType type, SyntaxErrorType syntaxError, JSToken token, String msg, int line)
     79        : m_type(type)
     80        , m_syntaxErrorType(syntaxError)
     81        , m_token(token)
    5382        , m_message(msg)
    5483        , m_line(line)
     
    73102        return createOutOfMemoryError(globalObject); // Appease Qt bot
    74103    }
     104#undef GET_ERROR_CODE
    75105};
    76106
  • trunk/Source/JavaScriptCore/parser/ParserTokens.h

    r146318 r148849  
    3939    BinaryOpTokenAllowsInPrecedenceAdditionalShift = 4,
    4040    BinaryOpTokenPrecedenceMask = 15 << BinaryOpTokenPrecedenceShift,
     41    ErrorTokenFlag = 1 << (BinaryOpTokenAllowsInPrecedenceAdditionalShift + BinaryOpTokenPrecedenceShift + 7),
     42    UnterminatedErrorTokenFlag = ErrorTokenFlag << 1
    4143};
    4244
     
    8688    COLON,
    8789    DOT,
    88     ERRORTOK,
    8990    EOFTOK,
    9091    EQUAL,
     
    134135    TIMES = 20 | BINARY_OP_PRECEDENCE(10),
    135136    DIVIDE = 21 | BINARY_OP_PRECEDENCE(10),
    136     MOD = 22 | BINARY_OP_PRECEDENCE(10)
     137    MOD = 22 | BINARY_OP_PRECEDENCE(10),
     138    ERRORTOK = 0 | ErrorTokenFlag,
     139    UNTERMINATED_IDENTIFIER_ESCAPE_ERRORTOK = 0 | ErrorTokenFlag | UnterminatedErrorTokenFlag,
     140    INVALID_IDENTIFIER_ESCAPE_ERRORTOK = 1 | ErrorTokenFlag,
     141    UNTERMINATED_IDENTIFIER_UNICODE_ESCAPE_ERRORTOK = 2 | ErrorTokenFlag | UnterminatedErrorTokenFlag,
     142    INVALID_IDENTIFIER_UNICODE_ESCAPE_ERRORTOK = 3 | ErrorTokenFlag,
     143    UNTERMINATED_MULTILINE_COMMENT_ERRORTOK = 4 | ErrorTokenFlag | UnterminatedErrorTokenFlag,
     144    UNTERMINATED_NUMERIC_LITERAL_ERRORTOK = 5 | ErrorTokenFlag | UnterminatedErrorTokenFlag,
     145    INVALID_OCTAL_NUMBER_ERRORTOK = 6 | ErrorTokenFlag | UnterminatedErrorTokenFlag,
     146    INVALID_NUMERIC_LITERAL_ERRORTOK = 7 | ErrorTokenFlag,
     147    UNTERMINATED_STRING_LITERAL_ERRORTOK = 8 | ErrorTokenFlag | UnterminatedErrorTokenFlag,
     148    INVALID_STRING_LITERAL_ERRORTOK = 9 | ErrorTokenFlag,
    137149};
    138150
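
The flag layout in ParserTokens.h and the classification added to Parser.h can be read together. The following is a minimal, standalone sketch rather than the WebKit code. It assumes a value of 8 for BinaryOpTokenPrecedenceShift (the diff does not show it), and the names prefixed with Sketch or k, as well as the numeric value given to the end-of-file token, are hypothetical. It shows how a single bit test separates "input stopped early" from "input is invalid".

```cpp
#include <cassert>
#include <cstdint>

// Assumption: the diff does not show BinaryOpTokenPrecedenceShift; 8 is used here for illustration.
constexpr uint32_t kPrecedenceShift = 8;
// From the diff: BinaryOpTokenAllowsInPrecedenceAdditionalShift = 4.
constexpr uint32_t kAllowsInAdditionalShift = 4;

// Same shape as the two flags added in ParserTokens.h.
constexpr uint32_t kErrorTokenFlag = 1u << (kAllowsInAdditionalShift + kPrecedenceShift + 7);
constexpr uint32_t kUnterminatedErrorTokenFlag = kErrorTokenFlag << 1;

// A hypothetical subset of token types; only the flag bits matter for this sketch.
enum SketchToken : uint32_t {
    SketchEOF = 1, // placeholder value, not the real EOFTOK
    SketchError = 0 | kErrorTokenFlag,
    SketchUnterminatedString = 8 | kErrorTokenFlag | kUnterminatedErrorTokenFlag,
    SketchInvalidString = 9 | kErrorTokenFlag
};

enum class SyntaxErrorType { None, Irrecoverable, UnterminatedLiteral, Recoverable };

// True for every token that carries the error bit.
inline bool isErrorToken(uint32_t tokenType)
{
    return (tokenType & kErrorTokenFlag) != 0;
}

// Mirrors the order of the checks in the Parser.h hunk:
// end of input first, then the unterminated bit, otherwise irrecoverable.
inline SyntaxErrorType classify(uint32_t tokenType)
{
    if (tokenType == SketchEOF)
        return SyntaxErrorType::Recoverable;
    if (tokenType & kUnterminatedErrorTokenFlag)
        return SyntaxErrorType::UnterminatedLiteral;
    return SyntaxErrorType::Irrecoverable;
}

// Sanity check on the layout: the two flags occupy distinct bits.
inline void checkFlagLayout()
{
    assert((kErrorTokenFlag & kUnterminatedErrorTokenFlag) == 0);
}
```

Because the unterminated tokens carry both bits, code that only asks "is this an error token?" keeps working, while callers that care about early termination can test the second bit.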
  • trunk/Source/JavaScriptCore/runtime/Completion.cpp

    r148696 r148849  
    5252    return true;
    5353}
     54   
     55bool checkSyntax(ExecState* exec, const SourceCode& source, ParserError& error)
     56{
     57    JSLockHolder lock(exec);
     58    RELEASE_ASSERT(exec->vm().identifierTable == wtfThreadData().currentIdentifierTable());
     59    VM* vm = &exec->vm();
     60    RefPtr<ProgramNode> programNode = parse<ProgramNode>(vm, source, 0, Identifier(), JSParseNormal, JSParseProgramCode, error);
     61    return programNode;
     62}
    5463
    5564JSValue evaluate(ExecState* exec, const SourceCode& source, JSValue thisValue, JSValue* returnedException)
  • trunk/Source/JavaScriptCore/runtime/Completion.h

    r140718 r148849  
    2727
    2828namespace JSC {
    29 
     29   
     30    struct ParserError;
    3031    class ExecState;
    3132    class JSScope;
    3233    class SourceCode;
    3334
     35    JS_EXPORT_PRIVATE bool checkSyntax(ExecState*, const SourceCode&, ParserError&);
    3436    JS_EXPORT_PRIVATE bool checkSyntax(ExecState*, const SourceCode&, JSValue* exception = 0);
    3537    JS_EXPORT_PRIVATE JSValue evaluate(ExecState*, const SourceCode&, JSValue thisValue = JSValue(), JSValue* exception = 0);
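
The commit message says the jsc prompt uses the new error information to accept functions that span more than one line. The sketch below illustrates that idea in isolation and is not the jsc.cpp implementation. SyntaxStatus, SyntaxChecker, readCompleteInput and braceChecker are hypothetical names, and the brace-counting checker is only a toy stand-in for a real call to checkSyntax with a ParserError out-parameter. It also assumes that both the recoverable and the unterminated-literal outcomes mean "keep reading".

```cpp
#include <cassert>
#include <functional>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical result type, loosely modelled on ParserError::SyntaxErrorType.
enum class SyntaxStatus { Ok, Recoverable, UnterminatedLiteral, Irrecoverable };

// Hypothetical stand-in for a syntax check over the text accumulated so far.
using SyntaxChecker = std::function<SyntaxStatus(const std::string&)>;

// Appends lines to the buffer until the checker reports either a complete
// program or an error that more input cannot fix, or until input runs out.
inline std::string readCompleteInput(std::istream& in, const SyntaxChecker& check)
{
    assert(static_cast<bool>(check)); // a checker must be supplied
    std::string source;
    std::string line;
    while (std::getline(in, line)) {
        source += line;
        source += '\n';
        SyntaxStatus status = check(source);
        if (status == SyntaxStatus::Ok || status == SyntaxStatus::Irrecoverable)
            break;
        // Recoverable or UnterminatedLiteral: assume more input may complete it.
    }
    return source;
}

// Toy checker (an assumption for illustration): balanced braces mean "complete",
// an unmatched closing brace means "cannot be fixed by reading more".
inline SyntaxStatus braceChecker(const std::string& source)
{
    int depth = 0;
    for (char c : source) {
        if (c == '{')
            ++depth;
        else if (c == '}' && --depth < 0)
            return SyntaxStatus::Irrecoverable;
    }
    return depth == 0 ? SyntaxStatus::Ok : SyntaxStatus::Recoverable;
}
```

With the toy checker, feeding the lines of a three-line function definition returns all three lines as one unit, whereas a stray closing brace stops after the first line.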