Skip to content

Commit 9c6fee7

Browse files
committed
[JSC] Yarr BoyerMoore search should support character-class
https://bugs.webkit.org/show_bug.cgi?id=228613 Reviewed by Saam Barati. JSTests: * stress/regexp-bm-search-character-non-fixed-size.js: Added. (shouldBe): * stress/regexp-bm-search-many-candidate-zero-length.js: Added. (shouldBe): (regexp.a.b.c.d.e.f.g.h.i.j.k.l.m.n.o.p.q.r.s.t.u.v.w.x.y.z.0.1.2.3.4.5.6.7.8.9.t.v.n.r): * stress/regexp-bm-search-non-fixed-size.js: Added. (shouldBe): Source/JavaScriptCore: This patch adds character-class support for BoyerMoore lookahead search in Yarr. Currently, we only support fixed-sized character-class. We can extend it for repeat cases in the future. To apply this character-class thing to jQuery's RegExp, we also allow non-fixed-sized disjunction. For example, /aaaa.*|bbbb/'s disjunction is not fixed-sized. But still we can use (aaaa|bbbb) prefix since this part is fixed-sized and we know minimum-size of this disjunction is 4. Plus, instead of giving up BoyerMoore search when we found non-supported terms, we shorten BoyerMoore search length not to include this term so that we can still have a chance to leverage BoyerMoore search. In the case of /aaaa|bbbb|ccc(d|e|f)/, we previously gave up since it finds `(d|e|f)`. But now, instead we shorten the length from 4 to 3, and construct search pattern with `aaa|bbb|ccc`. This patch improves jquery-todomvc-regexp by 20%. ToT Patched jquery-todomvc-regexp 545.3561+-0.6968 ^ 451.6117+-0.4613 ^ definitely 1.2076x faster This improves Speedometer2/jQuery-TodoMVC by 2%. ---------------------------------------------------------------------------------------------------------------------------------- | subtest | ms | ms | b / a | pValue (significance using False Discovery Rate) | ---------------------------------------------------------------------------------------------------------------------------------- | Elm-TodoMVC |123.470833 |123.550000 |1.000641 | 0.841600 | | VueJS-TodoMVC |26.883333 |26.950000 |1.002480 | 0.846732 | | EmberJS-TodoMVC |127.708333 |127.754167 |1.000359 | 0.934206 | | BackboneJS-TodoMVC |50.545833 |50.445833 |0.998022 | 0.679610 | | Preact-TodoMVC |20.879167 |20.791667 |0.995809 | 0.796541 | | AngularJS-TodoMVC |137.479167 |137.275000 |0.998515 | 0.729817 | | Vanilla-ES2015-TodoMVC |69.079167 |68.912500 |0.997587 | 0.524325 | | Inferno-TodoMVC |65.604167 |66.120833 |1.007876 | 0.145549 | | Flight-TodoMVC |77.029167 |76.708333 |0.995835 | 0.518562 | | Angular2-TypeScript-TodoMVC |40.516667 |40.812500 |1.007302 | 0.513386 | | VanillaJS-TodoMVC |54.762500 |54.895833 |1.002435 | 0.647381 | | jQuery-TodoMVC |255.950000 |250.425000 |0.978414 | 0.000000 (significant) | | EmberJS-Debug-TodoMVC |341.745833 |342.804167 |1.003097 | 0.219937 | | React-TodoMVC |88.854167 |88.700000 |0.998265 | 0.568405 | | React-Redux-TodoMVC |151.266667 |150.804167 |0.996942 | 0.256403 | | Vanilla-ES2015-Babel-Webpack-TodoMVC |65.783333 |65.645833 |0.997910 | 0.437464 | ---------------------------------------------------------------------------------------------------------------------------------- a mean = 246.52898 b mean = 246.85128 pValue = 0.3927330278 (Bigger means are better.) 1.001 times better Results ARE NOT significant * yarr/YarrJIT.cpp: (JSC::Yarr::BoyerMooreInfo::shortenLength): (JSC::Yarr::BoyerMooreInfo::setAll): (JSC::Yarr::BoyerMooreInfo::addCharacters): (JSC::Yarr::BoyerMooreInfo::addRanges): * yarr/YarrJIT.h: (JSC::Yarr::BoyerMooreBitmap::add): (JSC::Yarr::BoyerMooreBitmap::addCharacters): (JSC::Yarr::BoyerMooreBitmap::addRanges): (JSC::Yarr::BoyerMooreBitmap::setAll): (JSC::Yarr::BoyerMooreBitmap::isAllSet const): Canonical link: https://commits.webkit.org/240194@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@280570 268f45cc-cd09-0410-ab3c-d52691b4dbfc
1 parent 685cdeb commit 9c6fee7

7 files changed

Lines changed: 249 additions & 20 deletions

File tree

JSTests/ChangeLog

Lines changed: 15 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,18 @@
1+
2021-08-02 Yusuke Suzuki <ysuzuki@apple.com>
2+
3+
[JSC] Yarr BoyerMoore search should support character-class
4+
https://bugs.webkit.org/show_bug.cgi?id=228613
5+
6+
Reviewed by Saam Barati.
7+
8+
* stress/regexp-bm-search-character-non-fixed-size.js: Added.
9+
(shouldBe):
10+
* stress/regexp-bm-search-many-candidate-zero-length.js: Added.
11+
(shouldBe):
12+
(regexp.a.b.c.d.e.f.g.h.i.j.k.l.m.n.o.p.q.r.s.t.u.v.w.x.y.z.0.1.2.3.4.5.6.7.8.9.t.v.n.r):
13+
* stress/regexp-bm-search-non-fixed-size.js: Added.
14+
(shouldBe):
15+
116
2021-08-02 Yusuke Suzuki <ysuzuki@apple.com>
217

318
[JSC] Update test262
Lines changed: 14 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,14 @@
1+
function shouldBe(actual, expected) {
2+
if (actual !== expected)
3+
throw new Error('bad value: ' + actual);
4+
}
5+
6+
let regexp = /ssssss.*ss/;
7+
let regexpFail = /oooooo.*oo/;
8+
9+
for (var i = 0; i < 1e2; ++i) {
10+
let matched = `aaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbcccccccpppppppppppptttttttttttt<<<<<<<<<<<<<<ddddddddddddddddddddddddddddddddddddddddjjjjjjjjjjssssss src="hey.js" ssHey`.match(regexp);
11+
shouldBe(matched[0], `ssssss src="hey.js" ss`);
12+
let notMatched = `aaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbcccccccpppppppppppptttttttttttt<<<<<<<<<<<<<<ddddddddddddddddddddddddddddddddddddddddjjjjjjjjjjssssss src="hey.js" ssHey`.match(regexpFail);
13+
shouldBe(notMatched, null);
14+
}
Lines changed: 14 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,14 @@
1+
function shouldBe(actual, expected) {
2+
if (actual !== expected)
3+
throw new Error('bad value: ' + actual);
4+
}
5+
6+
var regexp = /a|b|c|d|e|f|g|h|i|j|k|l|m|n|o|p|q|r|s|t|u|v|w|x|y|z|0|1|2|3|4|5|6|7|8|9| |\t|\v|\n|\r|\$|\^|\&|\*|\(|\)/
7+
8+
for (var i = 0; i < 1e2; ++i) {
9+
shouldBe(regexp.test(`%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%`), false);
10+
shouldBe(regexp.test(`テスト`), false);
11+
shouldBe(regexp.test(`testing`), true);
12+
shouldBe(RegExp.leftContext, ``);
13+
shouldBe(RegExp.rightContext, `esting`);
14+
}
Lines changed: 14 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,14 @@
1+
function shouldBe(actual, expected) {
2+
if (actual !== expected)
3+
throw new Error('bad value: ' + actual);
4+
}
5+
6+
let regexp = /<script.*\/>/i;
7+
let regexpFail = /<scripp.*\/>/i;
8+
9+
for (var i = 0; i < 1e2; ++i) {
10+
let matched = `aaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbcccccccpppppppppppptttttttttttt<<<<<<<<<<<<<<ddddddddddddddddddddddddddddddddddddddddjjjjjjjjjj<script src="hey.js" />Hey`.match(regexp);
11+
shouldBe(matched[0], `<script src="hey.js" />`);
12+
let notMatched = `aaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbcccccccpppppppppppptttttttttttt<<<<<<<<<<<<<<ddddddddddddddddddddddddddddddddddddddddjjjjjjjjjj<script src="hey.js" />Hey`.match(regexpFail);
13+
shouldBe(notMatched, null);
14+
}

Source/JavaScriptCore/ChangeLog

Lines changed: 66 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,69 @@
1+
2021-08-02 Yusuke Suzuki <ysuzuki@apple.com>
2+
3+
[JSC] Yarr BoyerMoore search should support character-class
4+
https://bugs.webkit.org/show_bug.cgi?id=228613
5+
6+
Reviewed by Saam Barati.
7+
8+
This patch adds character-class support for BoyerMoore lookahead search in Yarr.
9+
Currently, we only support fixed-sized character-class. We can extend it for repeat cases in the future.
10+
11+
To apply this character-class thing to jQuery's RegExp, we also allow non-fixed-sized disjunction.
12+
For example, /aaaa.*|bbbb/'s disjunction is not fixed-sized. But still we can use (aaaa|bbbb) prefix since
13+
this part is fixed-sized and we know minimum-size of this disjunction is 4.
14+
15+
Plus, instead of giving up BoyerMoore search when we found non-supported terms, we shorten BoyerMoore search
16+
length not to include this term so that we can still have a chance to leverage BoyerMoore search. In the case
17+
of /aaaa|bbbb|ccc(d|e|f)/, we previously gave up since it finds `(d|e|f)`. But now, instead we shorten the length
18+
from 4 to 3, and construct search pattern with `aaa|bbb|ccc`.
19+
20+
This patch improves jquery-todomvc-regexp by 20%.
21+
22+
ToT Patched
23+
24+
jquery-todomvc-regexp 545.3561+-0.6968 ^ 451.6117+-0.4613 ^ definitely 1.2076x faster
25+
26+
This improves Speedometer2/jQuery-TodoMVC by 2%.
27+
28+
----------------------------------------------------------------------------------------------------------------------------------
29+
| subtest | ms | ms | b / a | pValue (significance using False Discovery Rate) |
30+
----------------------------------------------------------------------------------------------------------------------------------
31+
| Elm-TodoMVC |123.470833 |123.550000 |1.000641 | 0.841600 |
32+
| VueJS-TodoMVC |26.883333 |26.950000 |1.002480 | 0.846732 |
33+
| EmberJS-TodoMVC |127.708333 |127.754167 |1.000359 | 0.934206 |
34+
| BackboneJS-TodoMVC |50.545833 |50.445833 |0.998022 | 0.679610 |
35+
| Preact-TodoMVC |20.879167 |20.791667 |0.995809 | 0.796541 |
36+
| AngularJS-TodoMVC |137.479167 |137.275000 |0.998515 | 0.729817 |
37+
| Vanilla-ES2015-TodoMVC |69.079167 |68.912500 |0.997587 | 0.524325 |
38+
| Inferno-TodoMVC |65.604167 |66.120833 |1.007876 | 0.145549 |
39+
| Flight-TodoMVC |77.029167 |76.708333 |0.995835 | 0.518562 |
40+
| Angular2-TypeScript-TodoMVC |40.516667 |40.812500 |1.007302 | 0.513386 |
41+
| VanillaJS-TodoMVC |54.762500 |54.895833 |1.002435 | 0.647381 |
42+
| jQuery-TodoMVC |255.950000 |250.425000 |0.978414 | 0.000000 (significant) |
43+
| EmberJS-Debug-TodoMVC |341.745833 |342.804167 |1.003097 | 0.219937 |
44+
| React-TodoMVC |88.854167 |88.700000 |0.998265 | 0.568405 |
45+
| React-Redux-TodoMVC |151.266667 |150.804167 |0.996942 | 0.256403 |
46+
| Vanilla-ES2015-Babel-Webpack-TodoMVC |65.783333 |65.645833 |0.997910 | 0.437464 |
47+
----------------------------------------------------------------------------------------------------------------------------------
48+
a mean = 246.52898
49+
b mean = 246.85128
50+
pValue = 0.3927330278
51+
(Bigger means are better.)
52+
1.001 times better
53+
Results ARE NOT significant
54+
55+
* yarr/YarrJIT.cpp:
56+
(JSC::Yarr::BoyerMooreInfo::shortenLength):
57+
(JSC::Yarr::BoyerMooreInfo::setAll):
58+
(JSC::Yarr::BoyerMooreInfo::addCharacters):
59+
(JSC::Yarr::BoyerMooreInfo::addRanges):
60+
* yarr/YarrJIT.h:
61+
(JSC::Yarr::BoyerMooreBitmap::add):
62+
(JSC::Yarr::BoyerMooreBitmap::addCharacters):
63+
(JSC::Yarr::BoyerMooreBitmap::addRanges):
64+
(JSC::Yarr::BoyerMooreBitmap::setAll):
65+
(JSC::Yarr::BoyerMooreBitmap::isAllSet const):
66+
167
2021-08-02 Stephan Szabo <stephan.szabo@sony.com>
268

369
[PlayStation] Make C files in testapi compile with a C standard rather than C++ one

0 commit comments

Comments
 (0)