JavaScriptCore RegExp Processing

presented by Michael Saboff

Engine is called: YARR (Yet Another RegExp Runtime)

Has both interpreter and JIT backends

Uses backtracking algorithm (not DFA based)
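
To make "backtracking" concrete, here is a minimal sketch of a backtracking matcher in Python. It is not YARR code; it only supports literal characters, `.`, and `x*`, and the function names are made up for illustration. The point is the inner loop that gives characters back one at a time when the rest of the pattern fails.

```python
def match_here(pattern: str, text: str) -> bool:
    """Return True if pattern matches at the very start of text."""
    if not pattern:
        return True
    # 'x*': consume greedily, then backtrack until the rest matches.
    if len(pattern) >= 2 and pattern[1] == "*":
        count = 0
        while count < len(text) and pattern[0] in (".", text[count]):
            count += 1
        # Backtracking step: give back one character per iteration.
        while count >= 0:
            if match_here(pattern[2:], text[count:]):
                return True
            count -= 1
        return False
    # Single literal character or '.'.
    if text and pattern[0] in (".", text[0]):
        return match_here(pattern[1:], text[1:])
    return False


def search(pattern: str, text: str) -> bool:
    """Return True if pattern matches anywhere in text."""
    return any(match_here(pattern, text[i:]) for i in range(len(text) + 1))
```

For example, `search("a*ab", "aaab")` first lets `a*` consume all three `a`s, fails to match `ab` against `b`, then gives one `a` back and succeeds.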

First, YARR parses a regex and converts it into a YarrPattern.

Then the YARR interpreter converts the YarrPattern into bytecode that it can run.

Question: Why not use a DFA-based system? It takes a lot longer to compile a DFA. Also, because JS regular expressions are irregular (they can describe languages that are not regular), a purely DFA-based system could not handle all of them.
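
The notes do not say which features make JS regexes irregular. Backreferences are one well-known example, so the sketch below assumes that is the kind of case meant. It uses Python's `re` module, which supports the same `\1` syntax; the JS equivalent would be `/^(a+)b\1$/`. The language "n a's, a b, then exactly n a's" requires unbounded counting, which no finite automaton can do.

```python
import re

# A run of "a"s, then "b", then the *same* run of "a"s again.
SAME_RUN = re.compile(r"(a+)b\1")


def same_run_on_both_sides(text: str) -> bool:
    """Return True if text is a^n b a^n for some n >= 1."""
    return SAME_RUN.fullmatch(text) is not None
```

A backtracking engine handles this by remembering what group 1 captured and comparing it again later.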

Question: How would you know if you wanted to have more tiers of RegExps? We would probably want to see which areas the extra tier would help us in. We would also want to know how often we are going to see those cases.

Optimizations that YARR does right now:

Generate different code for different matching purposes. Sometimes the regexp is only used to test for a match; other times the caller wants to access the captured groups.
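
As a rough illustration of why two code variants can pay off, here is a hand-written Python sketch for a digits-dash-digits pattern (roughly `(\d+)-(\d+)` anchored at the start). The pattern and function names are hypothetical and this is not what YARR emits. The match-only variant never stores group boundaries or builds substrings.

```python
from typing import Optional, Tuple


def _digits_end(text: str, pos: int) -> int:
    """Return the index just past the run of ASCII digits starting at pos."""
    while pos < len(text) and "0" <= text[pos] <= "9":
        pos += 1
    return pos


def matches_only(text: str) -> bool:
    """Match-only variant: answers yes/no and records nothing."""
    first_end = _digits_end(text, 0)
    if first_end == 0 or first_end >= len(text) or text[first_end] != "-":
        return False
    return _digits_end(text, first_end + 1) > first_end + 1


def match_with_captures(text: str) -> Optional[Tuple[str, str]]:
    """Capturing variant: also tracks where each group starts and ends."""
    first_end = _digits_end(text, 0)
    if first_end == 0 or first_end >= len(text) or text[first_end] != "-":
        return None
    second_start = first_end + 1
    second_end = _digits_end(text, second_start)
    if second_end == second_start:
        return None
    return text[:first_end], text[second_start:second_end]
```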

Check multiple adjacent characters at once (up to 8)
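
The idea of checking several adjacent characters at once can be sketched in Python by packing up to 8 bytes into one integer, which stands in for a 64-bit machine word. This is an assumption about the general technique, not a description of the code YARR generates, and `pack` and `find_literal` are names invented here.

```python
def pack(chunk: bytes) -> int:
    """Pack up to 8 bytes into one integer, like loading a 64-bit word."""
    return int.from_bytes(chunk, "little")


def find_literal(haystack: bytes, needle: bytes) -> int:
    """Return the first index of needle in haystack, or -1 if absent."""
    if not 1 <= len(needle) <= 8:
        raise ValueError("this sketch only handles needles of 1 to 8 bytes")
    width = len(needle)
    needle_word = pack(needle)
    for i in range(len(haystack) - width + 1):
        # One integer comparison replaces `width` per-character comparisons.
        if pack(haystack[i:i + width]) == needle_word:
            return i
    return -1
```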

Remove unnecessary enclosing patterns, such as /.*abc.*/, which matches the same inputs as /abc/
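
A quick way to check this equivalence is with Python's `re` module, whose `.` also excludes newlines by default. The helper names are made up for this sketch. Both patterns agree on whether a string matches; the matched text differs, since the `.*` form extends to the line boundaries, so an engine applying this rewrite would presumably still have to account for that when the caller asks for the match itself.

```python
import re

ENCLOSED = re.compile(r".*abc.*")
BARE = re.compile(r"abc")


def has_match(pattern, text: str) -> bool:
    """Return True if the compiled pattern matches anywhere in text."""
    return pattern.search(text) is not None


def matched_text(pattern, text: str):
    """Return the matched substring, or None if there is no match."""
    found = pattern.search(text)
    return found.group() if found else None
```

With the input `"xx abc yy"`, both patterns report a match, but `ENCLOSED` matches the whole string while `BARE` matches only `"abc"`.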

Character class canonicalization, such as converting [12345] into ('1' <= c && c <= '5') instead of if (c == '1') return match; else if (c == '2') …
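
The two forms of the [12345] check can be compared directly. This is a small Python sketch with made-up function names, assuming `c` is a single character; the range form needs two comparisons no matter how many members the contiguous class has.

```python
def in_class_chain(c: str) -> bool:
    """Naive form: one equality test per member of [12345]."""
    return c == "1" or c == "2" or c == "3" or c == "4" or c == "5"


def in_class_range(c: str) -> bool:
    """Canonicalized form: a single range check, '1' <= c <= '5'."""
    return "1" <= c <= "5"
```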

JSC Goals

presented by Saam Barati

JetStream 2:

We think that the next thing JSC should optimize is BigInt. We also want to improve our memory usage.

New bytecode format: It separates the parts of the bytecode specific to a particular function from the core instructions. This lets us share the core instructions across processes or save them to disk. Hopefully, it is a win for startup and for non-JIT cases.

JetStream 2 is a new benchmark for JS performance that will be open sourced soon (TM).

Last modified on Oct 12, 2018 4:03:43 PM