Skip to content

Fix network-drive path corruption in read_config() and related call sites#3349

Draft
deruyter92 wants to merge 10 commits into
mainfrom
jaap/safe_resolve_paths
Draft

Fix network-drive path corruption in read_config() and related call sites#3349
deruyter92 wants to merge 10 commits into
mainfrom
jaap/safe_resolve_paths

Conversation

@deruyter92

Copy link
Copy Markdown
Collaborator

Fixes: #3348

Problem:
On Windows 11 with SMB-mounted network drives, Path.resolve() can return a
\\?\Volume{GUID}\... form that Windows cannot open via a plain string.
read_config() called .resolve() to compute curr_dir and immediately wrote
that back into config.yaml and returned it in cfg["project_path"].
All downstream code that builds paths from project_path using str() or
string concatenation then failed with FileNotFoundError.

The PyTorch engine was mostly unaffected because it keeps Path objects in
memory; the TensorFlow engine was broken because predict_videos.py converts
paths to strings before passing them to load_config().

Summary:
This PR addresses the problem in 2 ways:
1 Replace Path(..).resolve() with Path(..).absolute(). The purpose of this line is only to
detect when a project has been moved — symlink resolution is unnecessary, and
.absolute() is safe across all platforms and mount types.
2. Add a helper safe_resolve(path: Path) → Path for safe path resolution for when the resolved path
cannot be opened as a string. This can be used wherever symlink resolution is
genuinely useful (e.g. image/data paths).

Changes:

  • In read_config(), replaced Path(configname).parent.resolve() with
    Path(configname).parent.absolute(). to fix read_config() project path resolution can break path #3348
  • Follow the same pattern in create_project/new.py, create_project/new_3d.py, gui/tabs/analyze_videos.py
  • In generate_training_dataset/frame_extraction.py, removed a redundant .resolve() on the config path before passing it to
    read_config() — the fix in read_config() makes this unnecessary.
  • Add a helper in auxiliaryfunctions.py: safe_resolve(path: Path) → Path
  • in pose_estimation_pytorch/modelzoo/memory_replay.py, replaced .resolve() calls on image/data paths with safe_resolve(), so symlinks are still followed where possible while remaining safe on Win11 SMB.

Tests:

Added TestSafeResolve and TestReadConfigProjectPath in
tests/test_auxiliaryfunctions.py:

  • Verifies safe_resolve() returns the resolved path on normal filesystems.
  • Verifies safe_resolve() falls back to .absolute() when resolve() is
    unusable.
  • Verifies read_config() replaces a persisted \\?\Volume{GUID}\... path
    with a usable directory.
  • Verifies read_config() still auto-updates project_path when a project is
    moved to a new location (the original feature this logic implements).

Notes:

  • We currenly use str paths and pathlib.Path mixed throughout the codebase. We should settle for pathlib.Path everywhere (up to 3rd party boundaries that require str). This requires a bigger refactor, so will be addressed in a separate PR. pathlib.Path is better at handling edge cases like server paths, preventing issues like read_config() project path resolution can break path #3348. Also, we can fix brittle cases where paths are joined manually parent + "/" + file, and make the codebase use more uniform function signatures.
  • The fact that read_config() currently also writes to the config is not very convenient or intuitive. It will be better to keep separate read and sync of the config files. This is addressed in Migration to structured and validated configs #3198

@deruyter92 deruyter92 self-assigned this May 29, 2026
@deruyter92 deruyter92 added bug fix! fix for a real buggy one... config Related to config.yaml, ruamel, YAML parsing, ... high-priority labels May 29, 2026
@deruyter92 deruyter92 added this to the Structured configs milestone May 29, 2026
@deruyter92 deruyter92 marked this pull request as ready for review May 29, 2026 08:16
@deruyter92 deruyter92 force-pushed the jaap/safe_resolve_paths branch from 4f5603e to fddcbaa Compare May 29, 2026 08:45
@deruyter92 deruyter92 requested review from C-Achard and Copilot May 29, 2026 09:01

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes Windows 11 SMB/network-drive path corruption caused by Path.resolve() producing \\?\Volume{GUID}\... paths that break downstream string-based file I/O, primarily impacting TensorFlow workflows via read_config() and related call sites.

Changes:

  • Replaced several .resolve() usages with .absolute() for project/root paths to avoid persisting unusable \\?\Volume{GUID}\... strings.
  • Added safe_resolve(path: Path) -> Path intended to keep symlink resolution where useful while remaining safe on Win11 SMB.
  • Added tests covering safe_resolve() behavior and read_config() project path rewriting.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
deeplabcut/utils/auxiliaryfunctions.py Adds safe_resolve() and switches read_config() to use .absolute() for project_path.
deeplabcut/pose_estimation_pytorch/modelzoo/memory_replay.py Replaces .resolve() with safe_resolve() for project/data/image paths.
deeplabcut/gui/tabs/analyze_videos.py Uses .absolute() instead of .resolve() for video parent folder collection.
deeplabcut/generate_training_dataset/frame_extraction.py Removes redundant .resolve() before calling read_config().
deeplabcut/create_project/new.py Uses .absolute() for working directory instead of .resolve().
deeplabcut/create_project/new_3d.py Uses .absolute() for working directory instead of .resolve().
tests/test_auxiliaryfunctions.py Adds tests for safe_resolve() and read_config() network-drive safety.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread deeplabcut/utils/auxiliaryfunctions.py
Comment thread deeplabcut/utils/auxiliaryfunctions.py Outdated
Comment thread deeplabcut/pose_estimation_pytorch/modelzoo/memory_replay.py Outdated
Comment thread tests/test_auxiliaryfunctions.py Outdated
Comment thread deeplabcut/utils/auxiliaryfunctions.py
@C-Achard C-Achard requested review from AlexEMG and MMathisLab June 15, 2026 21:30
@C-Achard C-Achard marked this pull request as draft June 26, 2026 17:06
@C-Achard

Copy link
Copy Markdown
Collaborator

@deruyter92 For the merge conflict, we should keep the new config refactor routing, correct?

@C-Achard

Copy link
Copy Markdown
Collaborator

Marking as draft before we resolve

@deruyter92

deruyter92 commented Jun 26, 2026

Copy link
Copy Markdown
Collaborator Author

@C-Achard that is correct. So needed a mix of current and upstream changes in this case. I've resolved the conflict. Should be good now! (Feel free to un-draft when the tests are passing)

@C-Achard

Copy link
Copy Markdown
Collaborator

I'll fix the tests next week! Thanks a lot

C-Achard added 2 commits June 29, 2026 09:57
Update `test_auxiliaryfunctions.py` to assert `project_path` as a `pathlib.Path` instead of a string. The tests now cast to `str` only for the `"Volume{"` check, use `.exists()` for filesystem validation, and compare against `project_dir.absolute()` directly.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.

Comment on lines +45 to +65
def safe_resolve(path: Path) -> Path:
"""Return a resolved Path that is safe to use with str-based I/O.

Prefers Path.resolve() so that symlinks are followed (useful on Linux).
Falls back to Path.absolute() when the resolved path cannot be opened
as a plain string — e.g. on Windows 11 + SMB network drives where
resolve() may return an unusable \\\\?\\Volume{GUID}\\... form.

See https://github.com/DeepLabCut/DeepLabCut/issues/3348
"""
resolved = path.resolve()
try:
if resolved.is_dir():
# test if os.listdir works after str-conversion
os.listdir(str(resolved))
else:
open(str(resolved)).close()
return resolved
except OSError:
return path.absolute()

Comment on lines +289 to +293
f = tmp_path / "config.yaml"
f.touch()
result = auxiliaryfunctions.safe_resolve(f)
assert result == f.resolve()
assert result.exists()
Comment on lines +48 to +52
Prefers Path.resolve() so that symlinks are followed (useful on Linux).
Falls back to Path.absolute() when the resolved path cannot be opened
as a plain string — e.g. on Windows 11 + SMB network drives where
resolve() may return an unusable \\\\?\\Volume{GUID}\\... form.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug fix! fix for a real buggy one... config Related to config.yaml, ruamel, YAML parsing, ... high-priority

Projects

None yet

Development

Successfully merging this pull request may close these issues.

read_config() project path resolution can break path

5 participants