Fix network-drive path corruption in read_config() and related call sites#3349
Fix network-drive path corruption in read_config() and related call sites#3349deruyter92 wants to merge 10 commits into
read_config() and related call sites#3349Conversation
4f5603e to
fddcbaa
Compare
There was a problem hiding this comment.
Pull request overview
This PR fixes Windows 11 SMB/network-drive path corruption caused by Path.resolve() producing \\?\Volume{GUID}\... paths that break downstream string-based file I/O, primarily impacting TensorFlow workflows via read_config() and related call sites.
Changes:
- Replaced several
.resolve()usages with.absolute()for project/root paths to avoid persisting unusable\\?\Volume{GUID}\...strings. - Added
safe_resolve(path: Path) -> Pathintended to keep symlink resolution where useful while remaining safe on Win11 SMB. - Added tests covering
safe_resolve()behavior andread_config()project path rewriting.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
deeplabcut/utils/auxiliaryfunctions.py |
Adds safe_resolve() and switches read_config() to use .absolute() for project_path. |
deeplabcut/pose_estimation_pytorch/modelzoo/memory_replay.py |
Replaces .resolve() with safe_resolve() for project/data/image paths. |
deeplabcut/gui/tabs/analyze_videos.py |
Uses .absolute() instead of .resolve() for video parent folder collection. |
deeplabcut/generate_training_dataset/frame_extraction.py |
Removes redundant .resolve() before calling read_config(). |
deeplabcut/create_project/new.py |
Uses .absolute() for working directory instead of .resolve(). |
deeplabcut/create_project/new_3d.py |
Uses .absolute() for working directory instead of .resolve(). |
tests/test_auxiliaryfunctions.py |
Adds tests for safe_resolve() and read_config() network-drive safety. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
@deruyter92 For the merge conflict, we should keep the new config refactor routing, correct? |
|
Marking as draft before we resolve |
|
@C-Achard that is correct. So needed a mix of current and upstream changes in this case. I've resolved the conflict. Should be good now! (Feel free to un-draft when the tests are passing) |
|
I'll fix the tests next week! Thanks a lot |
Update `test_auxiliaryfunctions.py` to assert `project_path` as a `pathlib.Path` instead of a string. The tests now cast to `str` only for the `"Volume{"` check, use `.exists()` for filesystem validation, and compare against `project_dir.absolute()` directly.
| def safe_resolve(path: Path) -> Path: | ||
| """Return a resolved Path that is safe to use with str-based I/O. | ||
|
|
||
| Prefers Path.resolve() so that symlinks are followed (useful on Linux). | ||
| Falls back to Path.absolute() when the resolved path cannot be opened | ||
| as a plain string — e.g. on Windows 11 + SMB network drives where | ||
| resolve() may return an unusable \\\\?\\Volume{GUID}\\... form. | ||
|
|
||
| See https://github.com/DeepLabCut/DeepLabCut/issues/3348 | ||
| """ | ||
| resolved = path.resolve() | ||
| try: | ||
| if resolved.is_dir(): | ||
| # test if os.listdir works after str-conversion | ||
| os.listdir(str(resolved)) | ||
| else: | ||
| open(str(resolved)).close() | ||
| return resolved | ||
| except OSError: | ||
| return path.absolute() | ||
|
|
| f = tmp_path / "config.yaml" | ||
| f.touch() | ||
| result = auxiliaryfunctions.safe_resolve(f) | ||
| assert result == f.resolve() | ||
| assert result.exists() |
| Prefers Path.resolve() so that symlinks are followed (useful on Linux). | ||
| Falls back to Path.absolute() when the resolved path cannot be opened | ||
| as a plain string — e.g. on Windows 11 + SMB network drives where | ||
| resolve() may return an unusable \\\\?\\Volume{GUID}\\... form. | ||
|
|
Fixes: #3348
Problem:
On Windows 11 with SMB-mounted network drives,
Path.resolve()can return a\\?\Volume{GUID}\...form that Windows cannot open via a plain string.read_config()called.resolve()to computecurr_dirand immediately wrotethat back into
config.yamland returned it incfg["project_path"].All downstream code that builds paths from
project_pathusingstr()orstring concatenation then failed with
FileNotFoundError.The PyTorch engine was mostly unaffected because it keeps
Pathobjects inmemory; the TensorFlow engine was broken because
predict_videos.pyconvertspaths to strings before passing them to
load_config().Summary:
This PR addresses the problem in 2 ways:
1 Replace Path(..).resolve() with Path(..).absolute(). The purpose of this line is only to
detect when a project has been moved — symlink resolution is unnecessary, and
.absolute()is safe across all platforms and mount types.2. Add a helper
safe_resolve(path: Path) → Pathfor safe path resolution for when the resolved pathcannot be opened as a string. This can be used wherever symlink resolution is
genuinely useful (e.g. image/data paths).
Changes:
read_config(), replacedPath(configname).parent.resolve()withPath(configname).parent.absolute(). to fix read_config() project path resolution can break path #3348create_project/new.py,create_project/new_3d.py,gui/tabs/analyze_videos.pygenerate_training_dataset/frame_extraction.py, removed a redundant.resolve()on the config path before passing it toread_config()— the fix inread_config()makes this unnecessary.auxiliaryfunctions.py:safe_resolve(path: Path) → Pathpose_estimation_pytorch/modelzoo/memory_replay.py, replaced.resolve()calls on image/data paths withsafe_resolve(), so symlinks are still followed where possible while remaining safe on Win11 SMB.Tests:
Added
TestSafeResolveandTestReadConfigProjectPathintests/test_auxiliaryfunctions.py:safe_resolve()returns the resolved path on normal filesystems.safe_resolve()falls back to.absolute()whenresolve()isunusable.
read_config()replaces a persisted\\?\Volume{GUID}\...pathwith a usable directory.
read_config()still auto-updatesproject_pathwhen a project ismoved to a new location (the original feature this logic implements).
Notes:
strpaths andpathlib.Pathmixed throughout the codebase. We should settle for pathlib.Path everywhere (up to 3rd party boundaries that requirestr). This requires a bigger refactor, so will be addressed in a separate PR.pathlib.Pathis better at handling edge cases like server paths, preventing issues like read_config() project path resolution can break path #3348. Also, we can fix brittle cases where paths are joined manuallyparent + "/" + file, and make the codebase use more uniform function signatures.read_config()currently also writes to the config is not very convenient or intuitive. It will be better to keep separate read and sync of the config files. This is addressed in Migration to structured and validated configs #3198