
Document load-bearing invariants and subsystems with GHC-style Notes #44

@Unisay

While fixing #37 it became clear that several of the compiler's conventions are load-bearing but poorly documented: the same rule copy-pasted across call sites, an invariant spelled out only where it is consumed rather than where it is established, or a whole subsystem with nothing but a section banner. When the copies drift apart or an invariant goes unnoticed, the result is exactly the kind of silent miscompilation that #37 was.

A repo-wide scan turned up 28 candidates for the GHC-style Note [Title] convention (a free-standing block comment with a bracketed title, referenced from each dependent site via -- See Note [Title]). Two were done as part of #37 (Note [Sequential scoping of Let bindings] and Note [Locals are uniquely named after renameShadowedNames]). This issue tracks the rest.
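For anyone unfamiliar with the convention, here is a minimal sketch of what a Note and its references look like. The Counter type, its functions, and the Note title are hypothetical and exist only for this illustration; nothing here is taken from the repository.

```haskell
{- Note [Counter is never negative]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
(Hypothetical Note, for illustration only.)
A Counter is only built by 'mkCounter', which clamps negative input to zero,
and only changed by 'decrement', which stops at zero. Consumers may therefore
treat the wrapped Int as a natural number without re-checking it.
-}

newtype Counter = Counter Int
  deriving (Show, Eq)

-- | Establishes the invariant.
-- See Note [Counter is never negative]
mkCounter :: Int -> Counter
mkCounter n = Counter (max 0 n)

-- | Preserves the invariant.
-- See Note [Counter is never negative]
decrement :: Counter -> Counter
decrement (Counter n) = Counter (max 0 (n - 1))

-- | Relies on the invariant: with a negative count, 'replicate' would
-- silently return an empty string and hide a bug upstream.
-- See Note [Counter is never negative]
stars :: Counter -> String
stars (Counter n) = replicate n '*'
```

The point of the layout is that the rule is stated once, next to where it is established, and every dependent site carries only a one-line pointer that can be found by searching for the bracketed title.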

Not every item must become a Note. The low-priority ones are listed so the call is made consciously rather than by omission. Each box is independent; pick them off in any order.

One caveat on CoreFn/Laziness.hs: that file is adapted from the upstream purs compiler, so retitling its comments increases divergence from upstream. Weigh that before touching the laziness items.

High

  • PSString is UTF-16 code units, not text (lib/Language/PureScript/PSString.hs): the lone-surrogate invariant and the dual JSON encoding (string vs Word16 array) are relied on by CoreFn FromJSON and by the IR decode-or-escape fallback in another module.
  • Compiling case expressions to decision trees (lib/Language/PureScript/Backend/IR.hs): seven functions and four types implement the algorithm; today there is only a section banner and a paper URL.
  • Graph-based dead code elimination for Lua (lib/Language/PureScript/Backend/Lua/DCE.hs): six cooperating functions, zero explanatory comments. Needs an overview Note.
  • Runtime lazy initialization (Names.hs, Backend/Lua/Fixture.hs, Backend/IR/Query.hs, Backend.hs): the literal string PSLUA_runtime_lazy silently couples four modules, and renaming it on only one side makes the fixture stop being emitted.
  • runtimeLazy calling convention (CoreFn/Laziness.hs, Backend/Lua/Fixture.hs): a cross-module ABI between generated code and the hand-written Lua fixture; the fixture side has no comment.
  • Foreign bindings structure emitted by the Linker (producer Backend/IR/Linker.hs, consumer Backend/IR/DCE.hs): DCE pattern-matches the exact expression shape the Linker emits, but the only comment sits on the consumer side.
  • Laziness transform for recursive binding groups (CoreFn/Laziness.hs): a module overview that three other modules cooperate with but have no named anchor to cite.
  • Delay and force (CoreFn/Laziness.hs): the two core attributes that both traversal passes and the ordering analysis depend on, documented only as a Haddock on one helper.
  • Initialization order (USE-INIT / USE-USE / USE-IMMEDIATE) (CoreFn/Laziness.hs): the named rules and graph encoding that justify searchReachable, reachablesByIndex, and the SCC fallback, currently floating untitled between two functions.
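
To make the PSString item above concrete, here is a rough sketch of why a sequence of UTF-16 code units cannot always be treated as text, and why a dual encoding follows from that. All names (decodeUnits, Encoded, encode) are hypothetical and are not the functions in PSString.hs; the sketch assumes only that a string is stored as a list of Word16 code units.

```haskell
import Data.Bits (shiftL, (.&.))
import Data.Char (chr)
import Data.Word (Word16)

-- Surrogate ranges as defined by UTF-16.
isLead, isTrail :: Word16 -> Bool
isLead w = w >= 0xD800 && w <= 0xDBFF
isTrail w = w >= 0xDC00 && w <= 0xDFFF

-- | Combine a lead/trail surrogate pair into one code point.
combine :: Word16 -> Word16 -> Char
combine hi lo =
  chr (0x10000 + ((fromIntegral hi .&. 0x3FF) `shiftL` 10) + (fromIntegral lo .&. 0x3FF))

-- | Decode code units to text; Nothing if any surrogate is unpaired.
decodeUnits :: [Word16] -> Maybe String
decodeUnits [] = Just []
decodeUnits (hi : lo : rest)
  | isLead hi && isTrail lo = (combine hi lo :) <$> decodeUnits rest
decodeUnits (w : rest)
  | isLead w || isTrail w = Nothing -- lone surrogate: not valid text
  | otherwise = (chr (fromIntegral w) :) <$> decodeUnits rest

-- | Hypothetical stand-in for the dual encoding: text when the units
-- decode cleanly, the raw code units otherwise.
data Encoded = AsText String | AsUnits [Word16]
  deriving (Show, Eq)

encode :: [Word16] -> Encoded
encode units = maybe (AsUnits units) AsText (decodeUnits units)
```

Under these assumptions, encode [0x68, 0x69] gives AsText "hi", while encode [0xDC00] falls back to AsUnits [0xDC00]. Any consumer that handles only the text case silently loses the second kind of string, which is the sort of dependency a Note would pin down.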

Medium

  • Inline annotations and inlining heuristics (Backend/IR/Inliner.hs, Linker.hs, Optimizer.hs): how an annotation travels from a pragma through Ann to the decision sites, spanning three modules.
  • Inliner annotations must all be consumed (Backend/IR.hs): the annotation map is a linear resource filled by parseAnnotations, drained by useAnnotation, with runRepM erroring on leftovers, which is how typo'd pragmas get reported.
  • Newtype constructors are erased (Backend/IR.hs): three sites silently implement one convention (construction is identity, application unwraps, matching skips the constructor) and must stay consistent. No comment today.
  • Binding case scrutinees once (Backend/IR.hs): bind a scrutinee once then refer to it, except for cheap literals and refs. Matters for semantics, since decision trees duplicate the scrutinee. Could fold into the decision-tree Note.
  • Match history pruning (Backend/IR.hs): remembering positive and negative constructor outcomes per scrutinee to skip redundant tag tests. Could be a subsection of the decision-tree Note.
  • Unique node IDs across the UberModule (Backend/IR/DCE.hs): the graph construction assumes Id uniqueness across foreigns, bindings, and exports, recorded only by a terse two-liner.
  • MaxRoseTree: late flattening guarantees termination (CoreFn/Laziness.hs): why the abort monad is interleaved with the tree, a termination argument currently smeared across several comment fragments.
  • Lua reserved words as foreign export keys (Backend/Lua.hs, Lua/Key.hs, Lua/Name.hs): the reserved-key round-trip spanning three modules, hinted at only by a two-line comment.
  • Lua operator precedence (Backend/Lua/Types.hs, Lua/Printer.hs): an anonymous block holding the Lua 5.x precedence table that the HasPrecedence instances transcribe and the Printer's parenthesization depends on.
  • Foreign module source format (Backend/Lua/Linker/Foreign.hs): the FFI file contract whose two non-obvious constraints are each load-bearing for a separate parser function plus a consumer in Lua.hs.
  • Nullary functions and Prim.undefined (Backend/Lua.hs): ParamUnused becoming a no-parameter function and Prim.undefined becoming a no-argument call are one arity-changing encoding that must stay in sync, but only one half is commented.
  • Namespaced De Bruijn indices (Backend/IR/Types.hs): Ref's per-name index scheme; possibly fold into the existing Note [Sequential scoping of Let bindings], since the two conventions are intertwined.
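
The "annotations must all be consumed" item above describes a fill, drain, and check-for-leftovers discipline. A minimal sketch of that idea, using only base and hypothetical names: takeAnnotation plays the role the issue attributes to useAnnotation, and runAll the role attributed to runRepM. The actual signatures and monad in Backend/IR.hs are not reproduced here.

```haskell
import Data.List (partition)

type Name = String

data Annotation = Always | Never
  deriving (Show, Eq)

-- | Annotations parsed from pragmas but not yet claimed by a binding.
type Pending = [(Name, Annotation)]

-- | Claim the annotation for a name, removing it from the pending set.
takeAnnotation :: Name -> Pending -> (Maybe Annotation, Pending)
takeAnnotation name pending =
  case partition ((== name) . fst) pending of
    ((_, ann) : _, rest) -> (Just ann, rest)
    ([], _) -> (Nothing, pending)

-- | Visit every binding, draining annotations as they are used.
-- Anything left over names a binding that does not exist, so it is
-- reported as an error instead of being ignored.
runAll :: Pending -> [Name] -> Either String [(Name, Maybe Annotation)]
runAll pending names =
  case foldl step ([], pending) names of
    (acc, []) -> Right (reverse acc)
    (_, leftovers) ->
      Left ("Unused inline annotations for: " ++ unwords (map fst leftovers))
  where
    step (acc, p) name =
      let (ann, p') = takeAnnotation name p
       in ((name, ann) : acc, p')
```

With this shape, runAll [("fooo", Always)] ["foo"] returns a Left naming fooo: the misspelled pragma is never claimed, so it surfaces at the final check. That end-of-run check is the non-obvious part a Note would need to explain.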

Low (decide consciously; an inline comment may suffice)

  • InternalIdentData is the extension point for generated idents (Names.hs): already a complete Haddock, only two reference sites.
  • Prim is always imported (Backend/IR.hs): a short, single-site cross-pass dependency.
  • IR is assumed well-typed (Backend/IR/Optimizer.hs): a real pass-wide assumption, but currently one consuming site.
  • unsafePerformIO in decodeString (PSString.hs): a classic safety-argument Note, but one site and three lines.
  • Compiling case alternatives to nested ifs (test/.../IR/Spec.hs): an ASCII before/after diagram that belongs with mkCase rather than buried in a test.

Labels: documentation (Improvements or additions to documentation)
