fix(skills): stage updates in a temp dir and swap in-place#13449
Open
SamMorrowDrums wants to merge 4 commits into
Open
fix(skills): stage updates in a temp dir and swap in-place#13449SamMorrowDrums wants to merge 4 commits into
SamMorrowDrums wants to merge 4 commits into
Conversation
Previously, `gh skill update --dir` walked two directory levels up for namespaced skill layouts when computing the install base, causing the skill to be relocated and the original directory to be removed by the post-install migration step. The same migration also relocated namespaced skills found under agent install directories during a normal update, even though users expect updates to land in the same place the skill was discovered. Update now stages each install into a private temp directory and, on success, atomically swaps the contents into the existing skill directory. This: - Always updates in-place, regardless of whether the skill lives under an agent host directory or a custom `--dir`, and regardless of whether the layout is flat or namespaced. - Preserves the skill directory's inode so symlinks, mounts, and external references continue to resolve. - Removes stale files left over from the previous version. - Leaves the existing skill completely untouched if the install fails partway through. The previous test harness silently skipped `verify` whenever `wantErr` was set, which masked the existing failure-preservation assertion. That harness shortcut is removed so the assertion is enforced. Adds an acceptance test that plants a namespaced skill, runs an update with `--dir`, and asserts the skill remains at its original namespaced path with no flat-path duplicate. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Contributor
There was a problem hiding this comment.
Pull request overview
Fixes gh skill update so updates are applied in-place (preserving the existing skill directory) by staging installs in a temporary directory and then replacing the contents of the installed skill directory, preventing accidental relocation/deletion of namespaced installs (closes #13370).
Changes:
- Reworks update flow to stage an updated skill in a temp directory and then copy/move files into the existing install directory (no directory relocation/removal).
- Updates unit tests to assert in-place behavior (including stale-file cleanup) and ensures error-path
verifyblocks actually run. - Adds an acceptance test covering in-place updates for namespaced skills when using
--dir.
Show a summary per file
| File | Description |
|---|---|
| pkg/cmd/skills/update/update.go | Replaces “install then possibly delete/migrate” logic with a staging + in-place content replacement approach. |
| pkg/cmd/skills/update/update_test.go | Updates expectations for in-place behavior; fixes test harness so verify runs even when wantErr is set. |
| acceptance/testdata/skills/skills-update-inplace.txtar | New acceptance test ensuring namespaced --dir updates don’t relocate or delete the original directory. |
Copilot's findings
Tip
Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comments suppressed due to low confidence (3)
pkg/cmd/skills/update/update.go:469
- The implementation deletes the existing contents of
u.local.dirbefore moving/copying staged files in. IfRemoveAllor any subsequentmoveOrCopy/copyPathcall fails partway through (permissions, disk full, interrupted copy, etc.), the skill directory can be left partially updated or empty, which contradicts the function contract/comments (“failed install leaves the existing skill completely untouched”) and the PR description. To make this failure-safe, consider a transactional approach (e.g., rename existing entries into a temporary backup dir first, populate from staging, and restore from backup on error; ideally stage on the same filesystem so renames can be atomic).
if err := os.MkdirAll(u.local.dir, 0o755); err != nil {
return fmt.Errorf("could not ensure skill directory %s: %w", u.local.dir, err)
}
existing, err := os.ReadDir(u.local.dir)
if err != nil {
return fmt.Errorf("could not read skill directory %s: %w", u.local.dir, err)
}
for _, entry := range existing {
if err := os.RemoveAll(filepath.Join(u.local.dir, entry.Name())); err != nil {
return fmt.Errorf("could not clean skill directory %s: %w", u.local.dir, err)
}
}
staged, err := os.ReadDir(stagedSkillDir)
if err != nil {
return fmt.Errorf("could not read staged skill directory %s: %w", stagedSkillDir, err)
}
for _, entry := range staged {
src := filepath.Join(stagedSkillDir, entry.Name())
dst := filepath.Join(u.local.dir, entry.Name())
if err := moveOrCopy(src, dst); err != nil {
return fmt.Errorf("could not move %s into place: %w", entry.Name(), err)
}
}
pkg/cmd/skills/update/update.go:481
moveOrCopyfalls back tocopyPathfor anyos.Renameerror. That can mask real failures (e.g., permission denied, destination already exists, invalid path) and may produce partial copies while still discarding the original error context. It’s safer to only fall back for cross-device rename failures (EXDEV) and otherwise return the rename error.
// moveOrCopy renames src to dst, falling back to a recursive copy when the
// rename crosses filesystem boundaries (e.g. when TMPDIR lives on a separate
// volume from the skill directory).
func moveOrCopy(src, dst string) error {
if err := os.Rename(src, dst); err == nil {
return nil
}
return copyPath(src, dst)
}
pkg/cmd/skills/update/update.go:515
copyPathreads entire files into memory viaos.ReadFilebefore writing. If a skill includes large files, this can cause unnecessary memory spikes. Consider switching to streaming copy (open src/dst andio.Copy) while still applying the desired permissions.
default:
data, err := os.ReadFile(src)
if err != nil {
return err
}
return os.WriteFile(dst, data, info.Mode().Perm())
}
- Files reviewed: 3/3 changed files
- Comments generated: 1
Address review feedback on the in-place update implementation: - Stage into a sibling of the existing skill directory so every rename during the swap is intra-filesystem and atomic. This eliminates the cross-device copy fallback and the associated memory-spike concern (no more os.ReadFile/os.WriteFile of arbitrary skill files). - Replace the "clear-then-fill" sequence with a backup-and-swap: move existing entries into a sibling backup dir, then move staged entries into place. If any step fails, restore from backup. This makes the "failed install leaves the existing skill untouched" guarantee hold even when the failure occurs mid-swap rather than only during install. - Drop the moveOrCopy/copyPath helpers entirely. Adds two unit tests: TestSwapDirectoryContents_PreservesDestInode asserts the directory's identity is preserved across a successful swap, and TestSwapDirectoryContents_RollsBackOnFailure exercises the rollback path by pointing the swap at a non-existent staged dir, verifying original content (including nested subdirs) is fully restored. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Match prior-art style: leading docstring on helper tests (see TestDeduplicateByName_Namespaced) and use assert.Empty for clearer failure output instead of a bare t.Errorf in a loop. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #13370.
Problem
gh skill updatecould relocate skills and delete the original install directory:--dir, namespaced layouts ({dir}/{namespace}/{name}/SKILL.md) had their install base computed by walkingfilepath.Dirtwice, putting the new install one level too shallow. The post-install migration step thenRemoveAlled the original namespaced directory, taking the user's files with it.--dir, the same migration cleanup quietly relocated namespaced skills found under an agent install directory to a flat layout, even though users expect updates to land where the skill currently lives.Either path breaks symlinks, mounts, and any external references that point at the skill directory.
Fix
Each update is staged into a sibling directory of the existing skill dir (same filesystem, so renames are atomic). On success the contents are swapped in via per-entry rename:
If any step fails, the rollback path removes any freshly installed entries and moves the originals back from the backup.
This guarantees:
--dir.Tests
acceptance/testdata/skills/skills-update-inplace.txtarplants a namespaced skill, runsgh skill update --dir, and asserts the skill remains at the original namespaced path with no flat-path duplicate.TestSwapDirectoryContents_PreservesDestInodeasserts the skill directory's identity is preserved across a successful swap (viaos.SameFile).TestSwapDirectoryContents_RollsBackOnFailureexercises the rollback path and asserts the original content (including nested subdirs) is fully restored.--dir" unit test is updated to assert in-place behavior and stale-file cleanup.install_failure_during_update_reports_error_and_continuesunit test had a harness shortcut that silently skipped itsverifyblock wheneverwantErrwas set. That shortcut is removed so the "preserve original on install failure" assertion is now actually enforced.