[fork-ffi] enums: fromCharCode — fromCharCode is bound directly to string.char, which builds a string… #80

@Unisay

Description

Package: purescript-lua-enums
File: src/Data/Enum.lua
Function: fromCharCode
Class: semantics
Severity: high

fromCharCode is bound directly to string.char, which builds a string from raw byte values 0..255 and raises an error for any value above 255 ('bad argument #1 to char (invalid value)', confirmed on Lua 5.1 for 256 and 65535). For 128..255 it emits a single raw byte (0x80-0xFF), which is not valid UTF-8 on its own and is not the Latin-1 code point that JS produces. The JS FFI is String.fromCharCode(c), which is valid over the entire Char range 0..65535.

fromCharCode backs charToEnum, which Data.Enum uses for the Char instances of toEnum/succ/pred. As a result, any Char above U+007F is encoded incorrectly and any Char above U+00FF throws at runtime. The binding is correct only for the ASCII subrange 0..127.

Current (Lua):

fromCharCode = (string.char)
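
The failure modes described above can be reproduced with the standard library alone. This is a minimal sketch, not taken from the package's test suite. The local name fromCharCode simply mirrors the current binding, and the exact error text depends on the Lua version.

```lua
-- Mirror of the current binding (illustrative local, not the package module).
local fromCharCode = string.char

-- ASCII subrange: correct.
assert(fromCharCode(65) == "A")

-- 128..255: a single raw byte, which is not a complete UTF-8 sequence.
local s = fromCharCode(233)
assert(#s == 1 and s:byte(1) == 0xE9)

-- Above 255: string.char raises instead of returning a string.
local ok, err = pcall(fromCharCode, 256)
assert(not ok)
print(err)  -- error message wording varies between Lua versions
```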

Expected (JS): String.fromCharCode(c) yields the character for code unit c over 0..65535 (e.g. 65 -> 'A', 233 -> 'é', 256 -> U+0100 'Ā', 65535 -> U+FFFF). The compiled Lua should produce the same character as a valid UTF-8 string.
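
For reference, the UTF-8 byte sequences that these example values should compile to can be written out by hand. In this small sketch the expected table and the hex helper are illustrative names, not part of the package.

```lua
-- Expected UTF-8 encodings for the code units named above, written with
-- decimal byte escapes so the snippet also runs on Lua 5.1.
local expected = {
  [65]    = "A",             -- U+0041: 41
  [233]   = "\195\169",      -- U+00E9: C3 A9
  [256]   = "\196\128",      -- U+0100: C4 80
  [65535] = "\239\191\191",  -- U+FFFF: EF BF BF
}

-- Hypothetical helper: render a string as space-separated hex bytes.
local function hex(s)
  local out = {}
  for i = 1, #s do
    out[i] = string.format("%02X", s:byte(i))
  end
  return table.concat(out, " ")
end

for _, n in ipairs({ 65, 233, 256, 65535 }) do
  print(n, hex(expected[n]))
end
```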

Proposed fix:

Encode the code point as UTF-8 (full BMP range) instead of a single byte. E.g.:
  fromCharCode = function(n)
    -- 1 byte: U+0000..U+007F (ASCII)
    if n < 0x80 then return string.char(n) end
    -- 2 bytes: U+0080..U+07FF
    if n < 0x800 then return string.char(0xC0 + math.floor(n / 0x40), 0x80 + (n % 0x40)) end
    -- 3 bytes: U+0800..U+FFFF (rest of the Char range)
    if n < 0x10000 then return string.char(0xE0 + math.floor(n / 0x1000), 0x80 + (math.floor(n / 0x40) % 0x40), 0x80 + (n % 0x40)) end
    -- 4 bytes: U+10000 and above (outside the Char range, kept for completeness)
    return string.char(0xF0 + math.floor(n / 0x40000), 0x80 + (math.floor(n / 0x1000) % 0x40), 0x80 + (math.floor(n / 0x40) % 0x40), 0x80 + (n % 0x40))
  end
Verified on Lua 5.1: toCharCode(fromCharCode(233))==233, ...(65535)==65535, ...(0x100)==256, no errors.
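
A round-trip check over the whole Char range can be sketched as follows. The package's toCharCode is not shown in this issue, so the decode function here is a hypothetical stand-in that assumes well-formed input and does no validation. The encoder is the proposed fix, copied into a local.

```lua
-- Proposed encoder from this issue, as a local for a self-contained check.
local function fromCharCode(n)
  if n < 0x80 then return string.char(n) end
  if n < 0x800 then
    return string.char(0xC0 + math.floor(n / 0x40), 0x80 + (n % 0x40))
  end
  if n < 0x10000 then
    return string.char(0xE0 + math.floor(n / 0x1000),
                       0x80 + (math.floor(n / 0x40) % 0x40),
                       0x80 + (n % 0x40))
  end
  return string.char(0xF0 + math.floor(n / 0x40000),
                     0x80 + (math.floor(n / 0x1000) % 0x40),
                     0x80 + (math.floor(n / 0x40) % 0x40),
                     0x80 + (n % 0x40))
end

-- Hypothetical decoder (assumption: input is one well-formed sequence).
local function decode(s)
  local b1, b2, b3, b4 = s:byte(1, 4)
  if b1 < 0x80 then return b1 end
  if b1 < 0xE0 then return (b1 - 0xC0) * 0x40 + (b2 - 0x80) end
  if b1 < 0xF0 then
    return (b1 - 0xE0) * 0x1000 + (b2 - 0x80) * 0x40 + (b3 - 0x80)
  end
  return (b1 - 0xF0) * 0x40000 + (b2 - 0x80) * 0x1000
       + (b3 - 0x80) * 0x40 + (b4 - 0x80)
end

-- Every code unit 0..65535 should encode without error and decode back.
for n = 0, 0xFFFF do
  assert(decode(fromCharCode(n)) == n)
end
```

One caveat: code units 0xD800..0xDFFF (surrogates) pass through the 3-byte branch, and strict UTF-8 decoders reject those sequences. Whether that matters depends on how lone surrogates in Char are meant to be handled.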

Found by the FFI audit; reproduced under Lua 5.1.
