Commit e4e4c56

Website updates for buffer coercion
1 parent 3605573 commit e4e4c56

4 files changed

Lines changed: 25 additions & 20 deletions

website/guide/custombehavior.html

Lines changed: 5 additions & 2 deletions
@@ -250,8 +250,11 @@ <h2 id="nodejsbuffer-custombehavior">Node.js Buffer binding</h2>
 0x00 rather than throwing a TypeError.</li>
 <li>Duktape only supports the <code>"utf8"</code> encoding (and accepts no
 spelling variants). Most API calls ignore an encoding argument, and
-use the Duktape internal string representation (CESU-8 / extended UTF-8)
-for string-to-buffer conversion.</li>
+use UTF-8 implicitly for string-to-buffer coercion.</li>
+<li>UTF-8 decoding replacement character approach follows
+<a href="http://unicode.org/review/pr-121.html">Unicode Technical Committee Recommended Practice for Replacement Characters</a>
+which matches WHATWG Encoding API specification but differs from Node.js
+(at least up to version v6.8.1).</li>
 <li>Node.js Buffer has additional <code>byteLength</code> (matching
 <code>length</code>), <code>byteOffset</code> (= 0), and
 <code>BYTES_PER_ELEMENT</code> (= 1) properties.</li>
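
As a rough illustration of the replacement-character behavior that the added list item refers to, the sketch below uses the WHATWG TextDecoder in its default non-fatal mode. It is written for a recent Node.js or a browser rather than Duktape, and the sample byte sequences and the codePoints() helper are assumptions made for this note, not part of the commit.

```javascript
// Illustrative sketch only: WHATWG TextDecoder, default (non-fatal) mode.
const decoder = new TextDecoder('utf-8');

// Hypothetical helper: list the code points of a string as hex strings.
function codePoints(str) {
  return Array.from(str, (ch) => ch.codePointAt(0).toString(16));
}

// 0xFF can never occur in well-formed UTF-8, so it is replaced by U+FFFD.
const lone = decoder.decode(new Uint8Array([0x61, 0xff, 0x62]));

// 0xF0 0x90 is the start of a 4-byte sequence cut short by 0x62. The
// truncated prefix is replaced by a single U+FFFD, and 0x62 is then
// decoded normally.
const truncated = decoder.decode(new Uint8Array([0x61, 0xf0, 0x90, 0x62]));

console.log(codePoints(lone));      // [ '61', 'fffd', '62' ]
console.log(codePoints(truncated)); // [ '61', 'fffd', '62' ]
```

Decoders that follow a different replacement policy may emit a different number of U+FFFD characters for the same malformed input, which is the kind of difference the list item points out.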

website/guide/duktapebuiltins.html

Lines changed: 5 additions & 4 deletions
@@ -136,6 +136,7 @@ <h3 id="builtin-duktape-enc">enc()</h3>
 "jx" and "jc"), second argument is the value to encode, and any further
 arguments are format specific.</p>
 
+<!-- XXX: maybe remove support for non-buffer inputs? -->
 <p>For "hex" and "base64", buffer values are encoded as is, other values
 are string coerced and the internal byte representation (extended UTF-8)
 is then encoded. The result is a string. For example, to encode a string
@@ -168,10 +169,10 @@ <h3 id="builtin-duktape-dec">dec()</h3>
 <p>If you wish to get back a string value, you can coerce the plain buffer to
 a string e.g. as follows:</p>
 <pre class="ecmascript-code">
-// Use Node.js Buffer binding for buffer-to-string coercion. The buffer
-// data is decoded as UTF-8 and re-encoded into an Ecmascript string (with
-// surrogate pairs used for non-BMP codepoints).
-var result = Buffer.from(Duktape.dec('base64', 'Zm9v')).toString();
+// Use TextDecoder which decodes the input as UTF-8. You can also use
+// the Node.js Buffer binding to achieve a similar result.
+
+var result = new TextDecoder().decode(Duktape.dec('base64', 'Zm9v'));
 print(typeof result, result); // prints 'string foo'
 </pre>
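
The new comment in the dec() example notes that the Node.js Buffer binding gives a similar result. A minimal sketch of both routes side by side, written for Node.js rather than Duktape: here Buffer.from('Zm9v', 'base64') stands in for Duktape.dec('base64', 'Zm9v'), which is an assumption of this note because Duktape.dec() is not available outside Duktape.

```javascript
// Stand-in for Duktape.dec('base64', 'Zm9v') when running under Node.js
// (an assumption for this sketch): the bytes 0x66 0x6f 0x6f.
const bytes = Buffer.from('Zm9v', 'base64');

// Route 1: WHATWG TextDecoder, which defaults to UTF-8.
const viaTextDecoder = new TextDecoder().decode(bytes);

// Route 2: Node.js Buffer, whose toString() defaults to 'utf8'.
const viaBuffer = bytes.toString();

console.log(typeof viaTextDecoder, viaTextDecoder); // string foo
console.log(viaTextDecoder === viaBuffer);          // true
```

For well-formed input such as this the two routes agree; they may differ only in how malformed UTF-8 is replaced.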

website/guide/internalproperties.html

Lines changed: 14 additions & 13 deletions
@@ -85,25 +85,26 @@ <h1 id="internalproperties">Internal properties</h1>
 
 <p>Internal strings cannot be created from Ecmascript code using the default
 built-ins alone. However, application code can easily add such a binding
-using the C API (this must be considered in sandboxing).</p>
+using the C API which must be considered in sandboxing.</p>
 
 <p>There's no special access control for internal properties: if user code has
 access to the property name (string), it can read/write the property value.
-Any code with the ability to create or use buffers can potentially create an
-internal string by converting a buffer into a string. However, standard Ecmascript
-code with no access to buffer values or ability to create them cannot create internal
-strings (or any invalid UTF-8 strings in general). When sandboxing, ensure that
-the sandboxed code has no access to the <code>Duktape</code> built-in or any
-buffer values.</p>
+The default Ecmascript built-ins don't provide a way of creating an internal
+string: buffer-to-string coercions always involve an encoding such as UTF-8
+which will reject or replace invalid byte sequences. However, C code can
+easily create internal strings. When sandboxing, ensure that custom C bindings
+don't accidentally provide a mechanism to create internal strings by e.g.
+converting a buffer as-is to a string.</p>
 
-<p>As a concrete example, the internal value of a <code>Date</code> can be
-accessed as follows:</p>
+<p>As a concrete example the internal value of a <code>Date</code> instance
+can be accessed as follows:</p>
 <pre class="ecmascript-code">
-// Print the internal timestamp of a Date instance. User code should NEVER
-// actually do this because the internal properties may change between
-// versions in an arbitrary manner!
+// Print the internal timestamp of a Date instance. Assumes a hypothetical
+// rawBufferToString() custom C binding which takes an input buffer and pushes
+// the bytes as-is as a string using duk_push_lstring(), thus creating an
+// internal string.
 
-var key = Duktape.dec('hex', 'ff56616c7565'); // \xFFValue
+var key = rawBufferToString(Duktape.dec('hex', 'ff56616c7565')); // \xFFValue
 var dt = new Date(123456);
 print('internal value is:', dt[key]); // prints 123456
 </pre>

website/guide/sandboxing.html

Lines changed: 1 addition & 1 deletion
@@ -16,5 +16,5 @@ <h1 id="sandboxing">Sandboxing</h1>
 for a detailed discussion of how to implement sandboxing.</p>
 
 <div class="note">
-Sandboxing support in Duktape 1.3 is still a work in progress.
+Sandboxing support in Duktape 2.0 is still a work in progress.
 </div>
