
use TextEncoder to encode string if available #68

Merged
gfx merged 1 commit into master from text_encoder
Jul 8, 2019

@gfx
Member

@gfx gfx commented Jun 16, 2019

This does not affect the benchmark results, because the benchmark dataset does not contain large strings, but it is much more efficient than pure JS or WASM when the input string is large.
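For readers who want the idea in code, here is a minimal sketch of "use TextEncoder if available, otherwise fall back to pure JS". It is not the code from this PR: `utf8Encode` and `sharedTextEncoder` are hypothetical names, and the body of `utf8EncodeJs` is an illustrative fallback that only borrows its name from the benchmark label.

```typescript
// Minimal shape of the encoder we need (matches TextEncoder#encode).
type Utf8Encoder = { encode(input: string): Uint8Array };

// Look up TextEncoder on the global object so this also runs where it is missing.
const TextEncoderCtor = (globalThis as unknown as { TextEncoder?: new () => Utf8Encoder })
  .TextEncoder;

// Hypothetical name: one shared instance, or null when TextEncoder is unavailable.
const sharedTextEncoder: Utf8Encoder | null = TextEncoderCtor ? new TextEncoderCtor() : null;

// Illustrative pure-JS fallback (an assumption, not the PR's implementation).
// Unlike TextEncoder, it does not replace lone surrogates with U+FFFD.
function utf8EncodeJs(str: string): Uint8Array {
  const bytes: number[] = [];
  for (let i = 0; i < str.length; i++) {
    let cp = str.charCodeAt(i);
    // Combine a high/low surrogate pair into a single code point.
    if (cp >= 0xd800 && cp <= 0xdbff && i + 1 < str.length) {
      const low = str.charCodeAt(i + 1);
      if ((low & 0xfc00) === 0xdc00) {
        cp = ((cp - 0xd800) << 10) + (low - 0xdc00) + 0x10000;
        i++;
      }
    }
    if (cp < 0x80) {
      bytes.push(cp); // 1 byte
    } else if (cp < 0x800) {
      bytes.push(0xc0 | (cp >> 6), 0x80 | (cp & 0x3f)); // 2 bytes
    } else if (cp < 0x10000) {
      bytes.push(0xe0 | (cp >> 12), 0x80 | ((cp >> 6) & 0x3f), 0x80 | (cp & 0x3f)); // 3 bytes
    } else {
      bytes.push(
        0xf0 | (cp >> 18),
        0x80 | ((cp >> 12) & 0x3f),
        0x80 | ((cp >> 6) & 0x3f),
        0x80 | (cp & 0x3f),
      ); // 4 bytes
    }
  }
  return Uint8Array.from(bytes);
}

// Hypothetical entry point: prefer the native encoder when it exists.
function utf8Encode(str: string): Uint8Array {
  return sharedTextEncoder ? sharedTextEncoder.encode(str) : utf8EncodeJs(str);
}
```

Creating the encoder once and reusing it avoids constructing a new `TextEncoder` on every call, which matters when many strings are encoded.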

@gfx gfx requested a review from sergeyzenchenko June 16, 2019 13:23
@sergeyzenchenko
Collaborator

Have you tested with different sizes?

@sergeyzenchenko
Collaborator

@gfx

@gfx
Member Author

gfx commented Jun 17, 2019

Yep, as benchmark/encode-string.ts shows:

$ npx ts-node benchmark/encode-string.ts

## string "A" x 10 (byteLength=10)

utf8EncodeJs x 14,402,025 ops/sec ±8.20% (68 runs sampled)
utf8DecodeTE x 843,323 ops/sec ±19.87% (54 runs sampled)

## string "A" x 100 (byteLength=100)

utf8EncodeJs x 1,520,583 ops/sec ±2.55% (86 runs sampled)
utf8DecodeTE x 1,201,906 ops/sec ±4.48% (69 runs sampled)

## string "A" x 200 (byteLength=200)

utf8EncodeJs x 774,931 ops/sec ±1.76% (85 runs sampled)
utf8DecodeTE x 1,071,303 ops/sec ±5.44% (66 runs sampled)

## string "A" x 1000 (byteLength=1000)

utf8EncodeJs x 142,148 ops/sec ±8.10% (72 runs sampled)
utf8DecodeTE x 457,927 ops/sec ±8.05% (44 runs sampled)

## string "A" x 10000 (byteLength=10000)

utf8EncodeJs x 15,303 ops/sec ±3.46% (78 runs sampled)
utf8DecodeTE x 70,942 ops/sec ±9.21% (38 runs sampled)

## string "A" x 100000 (byteLength=100000)

utf8EncodeJs x 1,704 ops/sec ±2.56% (86 runs sampled)
utf8DecodeTE x 8,498 ops/sec ±5.34% (61 runs sampled)

## string "あ" x 10 (byteLength=30)

utf8EncodeJs x 12,520,756 ops/sec ±4.06% (85 runs sampled)
utf8DecodeTE x 1,255,161 ops/sec ±3.06% (70 runs sampled)

## string "あ" x 100 (byteLength=300)

utf8EncodeJs x 940,380 ops/sec ±7.21% (72 runs sampled)
utf8DecodeTE x 698,070 ops/sec ±4.93% (76 runs sampled)

## string "あ" x 200 (byteLength=600)

utf8EncodeJs x 570,138 ops/sec ±3.57% (88 runs sampled)
utf8DecodeTE x 152,060 ops/sec ±25.34% (29 runs sampled)

## string "あ" x 1000 (byteLength=3000)

utf8EncodeJs x 111,823 ops/sec ±8.33% (81 runs sampled)
utf8DecodeTE x 100,644 ops/sec ±10.49% (59 runs sampled)

## string "あ" x 10000 (byteLength=30000)

utf8EncodeJs x 12,831 ops/sec ±2.17% (92 runs sampled)
utf8DecodeTE x 9,405 ops/sec ±14.25% (50 runs sampled)

## string "あ" x 100000 (byteLength=300000)

utf8EncodeJs x 933 ops/sec ±14.97% (70 runs sampled)
utf8DecodeTE x 801 ops/sec ±14.61% (51 runs sampled)

## string "🌏" x 20 (byteLength=40)

utf8EncodeJs x 4,361,021 ops/sec ±23.57% (51 runs sampled)
utf8DecodeTE x 958,067 ops/sec ±8.69% (63 runs sampled)

## string "🌏" x 200 (byteLength=400)

utf8EncodeJs x 612,881 ops/sec ±6.40% (81 runs sampled)
utf8DecodeTE x 339,414 ops/sec ±12.43% (58 runs sampled)

## string "🌏" x 400 (byteLength=800)

utf8EncodeJs x 279,219 ops/sec ±10.39% (71 runs sampled)
utf8DecodeTE x 173,350 ops/sec ±14.61% (55 runs sampled)

## string "🌏" x 2000 (byteLength=4000)

utf8EncodeJs x 27,919 ops/sec ±29.91% (37 runs sampled)
utf8DecodeTE x 41,550 ops/sec ±23.79% (58 runs sampled)

## string "🌏" x 20000 (byteLength=40000)

utf8EncodeJs x 3,842 ops/sec ±22.42% (54 runs sampled)
utf8DecodeTE x 5,670 ops/sec ±4.68% (74 runs sampled)

## string "🌏" x 200000 (byteLength=400000)

utf8EncodeJs x 726 ops/sec ±1.37% (90 runs sampled)
utf8DecodeTE x 640 ops/sec ±3.54% (78 runs sampled)
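The `byteLength` values in the headings above follow from how UTF-8 sizes each character: "A" takes 1 byte, "あ" takes 3, and "🌏" takes 4. Since "🌏" occupies two UTF-16 code units, the "x 20" heading with `byteLength=40` seems consistent with the count being the string's `length` rather than the number of emoji. A small sketch that reproduces those figures, where `utf8ByteLength` is a hypothetical helper and not part of the benchmark script:

```typescript
// Hypothetical helper: count the UTF-8 bytes a string would occupy, without encoding it.
function utf8ByteLength(str: string): number {
  let byteLength = 0;
  for (let i = 0; i < str.length; i++) {
    const unit = str.charCodeAt(i);
    if (unit < 0x80) {
      byteLength += 1; // ASCII
    } else if (unit < 0x800) {
      byteLength += 2;
    } else if (
      unit >= 0xd800 &&
      unit <= 0xdbff &&
      i + 1 < str.length &&
      (str.charCodeAt(i + 1) & 0xfc00) === 0xdc00
    ) {
      byteLength += 4; // surrogate pair: one code point outside the BMP
      i++; // skip the low surrogate
    } else {
      byteLength += 3; // rest of the BMP (lone surrogates are counted as 3 here)
    }
  }
  return byteLength;
}

const ascii = "A".repeat(10);
const hiragana = "あ".repeat(10);
const emoji = "🌏".repeat(10);

console.log(ascii.length, utf8ByteLength(ascii)); // 10 10
console.log(hiragana.length, utf8ByteLength(hiragana)); // 10 30
console.log(emoji.length, utf8ByteLength(emoji)); // 20 40
```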

@gfx gfx merged commit e4cb3ce into master Jul 8, 2019
@gfx gfx deleted the text_encoder branch July 8, 2019 00:43