Fix mount manager activation error when already exists by dmcgowan · Pull Request #13126 · containerd/containerd

dmcgowan · 2026-03-26T00:41:21Z

Correctly handle cases where the mount activation still exists:

If activation is fully activate, then just return already exists and allow the caller to return error or call Info to continue.
If activation is stale or incomplete due to crash during activation, overwrite the identifier and cleanup the incomplete activation during activate.

The issue can arise when restarting a task in which the previous mount not completely deactivated. Since mounts are their own component, there is no guarantee that a mount will get immediately cleaned up when a task is removed (such as with an unclean shutdown). Same if the process is killed during activation, activating a mount requires a two stage transaction to claim an identifier, perform the mounts, then register the mounts back with the identifier. If an unclean shutdown occurs between the stages, an ID can be left claimed without any active mounts. Those can just be reclaimed on activate with any previous mount state cleaned up during GC or during activate.

…dy exists

prakleumas · 2026-04-09T17:13:03Z

+				cur, err := sbkt.CreateBucket([]byte{byte(i)})
+				if err != nil {
+					return err
+				}


Is there a guaranteed upper bound on the number of system mounts? Casting the loop index to a single byte truncates the value, which limits the maximum number of system mounts to 256 before silently wrapping around to zero and causing CreateBucket to fail. Furthermore, since BoltDB iterates keys lexicographically, a naive integer-to-string conversion would alter the expected read sequence (e.g., "10" sorts before "2"), potentially disrupting mount layering order. It might be safer to format the index as a zero-padded string to preserve both unbounded capacity and the correct iteration sequence.

Suggested change

cur, err := sbkt.CreateBucket([]byte{byte(i)})

if err != nil {

return err

}

cur, err := sbkt.CreateBucket([]byte(fmt.Sprintf("%04x", i)))

if err != nil {

return err

}

prakleumas · 2026-04-09T17:13:44Z

+	if len(m.Options) > 0 {
+		if err := bkt.Put(bucketKeyOptions, []byte(strings.Join(m.Options, ","))); err != nil {
+			return err
+		}


Are there any scenarios where a system mount option might natively contain a comma (e.g., a complex assignment like lowerdir=a,b)? If an option contains a comma, joining and splitting by , will incorrectly divide it into disjoint slice elements. Since mount options are passed as C-strings to the kernel and cannot contain null characters, it is generally safer to use a null byte \x00 as the slice delimiter to perfectly preserve the original array boundaries.

Suggested change

if len(m.Options) > 0 {

if err := bkt.Put(bucketKeyOptions, []byte(strings.Join(m.Options, ","))); err != nil {

return err

}

if len(m.Options) > 0 {

if err := bkt.Put(bucketKeyOptions, []byte(strings.Join(m.Options, "\x00"))); err != nil {

return err

}

}

prakleumas · 2026-04-09T17:14:13Z

+	if v := bkt.Get(bucketKeyOptions); len(v) > 0 {
+		m.Options = strings.Split(string(v), ",")
+	}


If the delimiter in putSystemMount is updated to prevent boundary collisions, this reader will also need to be updated to match the new delimiter. This ensures that any option strings containing commas are preserved accurately.

Suggested change

if v := bkt.Get(bucketKeyOptions); len(v) > 0 {

m.Options = strings.Split(string(v), ",")

}

if v := bkt.Get(bucketKeyOptions); len(v) > 0 {

m.Options = strings.Split(string(v), "\x00")

}

prakleumas · 2026-04-09T17:15:29Z

^{beep boop}
Hello! I am @samuelkarp's friendly code review robot 🤖! Sam asked me to take a look and post my comments directly on GitHub.

Please treat my comments as suggestions and advisory, not mandates. Since I am an LLM and my output is used here directly, you should evaluate carefully and see if my suggestions make sense. Feel free to directly resolve any you disagree with.

^{This is not an automated action. This bot does not read or reply to comments.}

Copilot

Pull request overview

This PR improves mount-manager activation behavior when an activation record for the same name already exists, allowing callers (notably the v2 task manager) to reuse a fully-active mount and to recover from stale/incomplete activation records left behind by crashes.

Changes:

Update v2 task creation to handle ErrAlreadyExists from mount activation by querying mount Info() and continuing with existing mounts.
Make mount manager Activate() differentiate “fully active” vs “stale/incomplete” existing records, reclaim stale records, and serialize same-name activations to avoid races.
Persist and reload “system mounts” in the mount manager so Info() can return complete activation info; add comprehensive tests for these scenarios.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

File	Description
core/runtime/v2/task_manager.go	Reuses existing mount activation on `ErrAlreadyExists` by calling `m.mounts.Info()` and updating `opts.Rootfs`.
core/mount/manager/manager.go	Adds keyed per-name activation locking; distinguishes active vs stale records; cleans stale targets; persists/reads system mounts.
core/mount/manager/manager_test.go	Adds tests for already-exists activation, stale/incomplete recovery, `Info()`, system mount persistence, and same-name concurrency.
core/mount/manager/buckets.go	Extends BoltDB schema keys to support persisting system mounts.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Correctly handle cases where the mount activation still exists: - If activation is fully activate, then just return already exists and allow the caller to return error or call Info to continue. - If activation is stale or incomplete due to crash during activation, overwrite the identifier and cleanup the incomplete activation during activate. Signed-off-by: Derek McGowan <derek@mcg.dev>

github-project-automation Bot added this to Pull Request Review Mar 26, 2026

github-project-automation Bot moved this to Needs Triage in Pull Request Review Mar 26, 2026

k8s-ci-robot added the size/L label Mar 26, 2026

dosubot Bot added area/runtime Runtime kind/bug labels Mar 26, 2026

dmcgowan added cherry-pick/2.2.x Change to be cherry picked to release/2.2 branch and removed kind/bug area/runtime Runtime size/L labels Mar 26, 2026

dmcgowan added this to the 2.3 milestone Mar 26, 2026

dmcgowan moved this from Needs Triage to Needs Reviewers in Pull Request Review Mar 26, 2026

github-actions Bot added a commit to dmcgowan/containerd that referenced this pull request Apr 4, 2026

Merge containerd#13126: Fix mount manager activation error when alrea…

a4fdf6a

…dy exists

github-actions Bot added a commit to dmcgowan/containerd that referenced this pull request Apr 5, 2026

Merge containerd#13126: Fix mount manager activation error when alrea…

65194e3

…dy exists

github-actions Bot added a commit to dmcgowan/containerd that referenced this pull request Apr 6, 2026

Merge containerd#13126: Fix mount manager activation error when alrea…

196a14a

…dy exists

github-actions Bot added a commit to dmcgowan/containerd that referenced this pull request Apr 7, 2026

Merge containerd#13126: Fix mount manager activation error when alrea…

b971c25

…dy exists

github-actions Bot added a commit to dmcgowan/containerd that referenced this pull request Apr 8, 2026

Merge containerd#13126: Fix mount manager activation error when alrea…

0fe842f

…dy exists

github-actions Bot added a commit to dmcgowan/containerd that referenced this pull request Apr 9, 2026

Merge containerd#13126: Fix mount manager activation error when alrea…

af2e818

…dy exists

prakleumas reviewed Apr 9, 2026

View reviewed changes

Copilot AI review requested due to automatic review settings April 20, 2026 17:59

dmcgowan force-pushed the handle-mount-already-exists branch from cd4569f to ae95987 Compare April 20, 2026 17:59

k8s-ci-robot added the size/L label Apr 20, 2026

Copilot started reviewing on behalf of dmcgowan April 20, 2026 18:00 View session

Copilot AI reviewed Apr 20, 2026

View reviewed changes

Comment thread core/mount/manager/manager.go

Comment thread core/runtime/v2/task_manager.go

Comment thread core/mount/manager/manager.go Outdated

Comment thread core/mount/manager/manager.go Outdated

dmcgowan force-pushed the handle-mount-already-exists branch from ae95987 to 80ec03f Compare April 21, 2026 01:04

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix mount manager activation error when already exists#13126

Fix mount manager activation error when already exists#13126
dmcgowan wants to merge 1 commit intocontainerd:mainfrom
dmcgowan:handle-mount-already-exists

dmcgowan commented Mar 26, 2026

Uh oh!

Uh oh!

Uh oh!

prakleumas Apr 9, 2026

Uh oh!

prakleumas Apr 9, 2026

Uh oh!

prakleumas Apr 9, 2026

Uh oh!

prakleumas commented Apr 9, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

dmcgowan commented Mar 26, 2026

Uh oh!

Uh oh!

Uh oh!

prakleumas Apr 9, 2026

Choose a reason for hiding this comment

Uh oh!

prakleumas Apr 9, 2026

Choose a reason for hiding this comment

Uh oh!

prakleumas Apr 9, 2026

Choose a reason for hiding this comment

Uh oh!

prakleumas commented Apr 9, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants