fix: lower De Bruijn indices when a binder is removed (#56)#70
Merged
Conversation
The IR uses per-name De Bruijn indices: a `Ref (Local n) k` selects the k-th enclosing binder named `n`. Two optimizer passes removed a binder without lowering the references that had skipped over it, leaving them pointing one binder too far out -- an unbound local that the Lua backend rejects with `UnexpectedRefBound`. This aborts `Data.Array` compilation (`foldRecM`, #56): * beta reduction drops the applied lambda; * dead-code elimination blanks an unused named binder to `ParamUnused`. Both now repair the survivors with a new `unshift`, the inverse of `shift 1`. `shift` is refactored to share its scope-tracking traversal (`overFreeIndex`) with `unshift` so the two cannot drift apart. Tests: property and example coverage for shift/unshift/substitute, including the classic capture-avoiding cases (Types spec); a well-scopedness invariant over a well-scoped expression generator (this found the DCE sibling bug) plus deterministic regressions for both passes (Optimizer spec); and an eval golden `Golden.TailRecM2Shadow` over the `foldRecM` shape that triggered #56.
There was a problem hiding this comment.
Pull request overview
Fixes a scoping bug in the IR optimizer where removing a binder (via beta reduction or DCE blanking) could leave same-named local references with De Bruijn indices pointing past the remaining binders, causing Lua codegen to abort (e.g. Data.Array.foldRecM, issue #56).
Changes:
- Introduces
unshift(inverse ofshift 1) and refactorsshift/unshiftto share a single scope-tracking traversal (overFreeIndex). - Applies
unshiftwhen beta-reducing away a named lambda binder and when DCE blanks an unused named binder toParamUnused. - Adds direct De Bruijn primitive property tests + optimizer well-scopedness invariant tests, plus a golden regression reproducing the original
tailRecM2/b-name collision shape.
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| test/ps/spago.dhall | Adds tailrec dependency needed by the new PureScript golden regression. |
| test/ps/output/Golden.TailRecM2Shadow.Test/golden.lua | New golden Lua output for the regression case. |
| test/ps/output/Golden.TailRecM2Shadow.Test/golden.ir | New golden IR output for the regression case. |
| test/ps/output/Golden.TailRecM2Shadow.Test/eval/golden.txt | Expected eval output for the golden regression. |
| test/ps/output/Golden.TailRecM2Shadow.Test/eval/.gitignore | Ignores eval runner’s actual.txt. |
| test/ps/output/Golden.TailRecM2Shadow.Test/corefn.json | New golden CoreFn input for the regression case. |
| test/ps/golden/Golden/TailRecM2Shadow/Test.purs | Adds a PureScript repro mirroring the original shadowing/capture scenario. |
| test/Language/PureScript/Backend/IR/Types/Spec.hs | Adds laws/examples for shift/unshift and capture-avoiding substitution. |
| test/Language/PureScript/Backend/IR/Optimizer/Spec.hs | Adds a well-scopedness invariant property + deterministic regressions for beta-reduce and DCE. |
| test/Language/PureScript/Backend/IR/Gen.hs | Adds scopedExp generator to produce well-scoped terms for optimizer invariants. |
| lib/Language/PureScript/Backend/IR/Types.hs | Refactors shift and adds unshift via shared overFreeIndex traversal. |
| lib/Language/PureScript/Backend/IR/Optimizer.hs | Fixes beta reduction to unshift after substitution when removing a binder. |
| lib/Language/PureScript/Backend/IR/DCE.hs | Fixes DCE binder-blanking to unshift outer references when removing a name slot. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #56.
Compiling
Data.Arrayaborted withUnexpected bound reference: Ref Nothing (Local (Name "b")) 1, which blocked thepurescript-lua-arraysport entirely. The single declarationData.Array.foldRecMwas enough to trigger it.Root cause
The IR addresses locals with per-name De Bruijn indices:
Ref (Local n) kmeans the k-th enclosing binder namedn, counting from the innermost. When an optimizer pass removes a binder, every reference that pointed past that binder (index ≥ 1, because it was skipping over it) has to drop by one. Two passes removed a binder and skipped that repair:betaReducedrops the applied lambda. InfoldRecM f b array = tailRecM2 go b 0, the outer parameterbcollides withtailRecM2's own third parameterb. InliningtailRecM2(a single-use dictionary accessor) beta-reduces under that innerb; capture avoidance shifts the outer reference up to index 1, and removing the inner binder left it stranded there instead of lowering it back to 0.ParamUnused. If an inner lambda's parameter is unused but its body still references an outer binder of the same name (at index 1, skipping the inner one), blanking the binder must lower that reference to 0. This is the same class of bug; the well-scopedness property below found it.Both sites now repair the survivors with a new
unshift, the inverse ofshift 1. To keep the two from drifting apart,shiftis refactored to share its scope-tracking traversal (overFreeIndex) withunshift; the strictminIndex < indexguard leaves genuine inner references alone and keeps theNaturalindex from underflowing.Tests
There was no unit coverage for the De Bruijn primitives at all, so the original bug could only surface as an integration crash. This adds a layer that exercises them directly:
unshift . shift 1 ≡ idand asubstitute-identity law as properties, plus the classic capture-avoiding substitution examples (capture under a same-named binder, pass-through of an unrelated binder, a shadowing binder stopping the substitution, a reference reaching past one).foldRecMshape for beta reduction, a shadowing unused binder for DCE.Golden.TailRecM2Shadow, an eval golden over atailRecM2-basedfoldRecMparaphrase whose accumulator parameter is namedbto reproduce the original collision.Verified:
cabal testis green, andpslua --entry Data.Arraynow compilespurescript-lua-arraysto luacheck-clean Lua.