
update list_videos_in_folder#3303

Open
deruyter92 wants to merge 20 commits into main from jaap/update_list_videos_in_folder

Conversation

@deruyter92
Collaborator

@deruyter92 deruyter92 commented May 4, 2026

Summary

The helper list_videos_in_folder in the PyTorch branch (used to collect videos for inference) currently always filters input files by extension: either the default SUPPORTED_VIDEOS or a custom video_type. When the input paths contain directories, it makes sense to require a list of valid extensions for collecting video files from them. For explicitly specified video files, however, this strict requirement is not always the preferred behavior: for instance, it prevents specifying video files without an extension.

The current PR changes the contract to

  • Always require valid extensions for collecting files from a directory (default SUPPORTED_VIDEOS or custom video_type)
  • By default, don't filter explicitly specified file paths by extension; assume that user-specified files are valid videos.
  • When video_type is provided, do filter the video files by the specified valid extensions.

solves #3300

Details

PR Status

  • relocate list_videos_in_folder to auxfun_videos: more centralized instead of pytorch-specific
  • rename to collect_video_paths, which more accurately reflects its function. Feedback on naming is welcome!
  • conditional filtering:
    • treat specified video files differently from video paths collected from specified directory
    • files are not filtered by extension by default, only when explicitly specifying extensions to filter for.
    • directories are scanned using SUPPORTED_VIDEOS by default or with specified extensions filter if set.
    • All cases are filtered using exclude_patterns, which defaults to ["*_labeled.*", "*_full.*"] to match prior DLC outputs.
  • add basic testing
  • deprecate original list_videos_in_folder and map to new collect_video_paths
  • deprecate get_list_of_videos and map to new collect_video_paths
  • deprecate get_video_list and map to new collect_video_paths
  • keep grab_files_in_folder, get_camerawise_videos (different implementation)
  • decide on inline path.iterdir() scan in create_project/new.py and gui/tabs/create_project.py
  • ...

Examples:
New signature

def collect_video_paths(
    data_path: str | Path | list[str | Path],
    extensions: str | list[str] | None = None,
    shuffle: bool = False,
    exclude_patterns: list[str] | None = None,
) -> list[Path]:
    """
    Collects video paths from a given set of data paths: directories, files or mix of both.
    Optionally filters paths by extension and excludes patterns. Files and directories are treated differently:
    - Files are not filtered by extension by default. Set ``extensions`` if needed, to filter also supplied files.
    - Directory contents are filtered by ``SUPPORTED_VIDEOS`` by default. Specify custom ``extensions`` if needed.
    - exclude patterns are ALWAYS applied for directory contents and supplied files. Set to `[]` to exclude no patterns.

    Args:
        data_path: Path or list of paths to folders containing videos, or individual
            video files. Can be a mix of directories and files.
        extensions: The types of videos to filter for (e.g., ".mp4", ".avi", etc.).
            - If set: filter all videos with the given extensions. Both for directory contents and supplied files.
            - If ``None``, provided files are not filtered, but directory contents are filtered by ``SUPPORTED_VIDEOS``.
        shuffle: Whether to shuffle the order of videos. If False, videos are returned
            in sorted order for deterministic behavior.
        exclude_patterns: Patterns to exclude from the collection. Defaults to ["*_labeled.*", "*_full.*"].
            Set to [] to exclude no patterns.

    Returns:
        The paths of videos to analyze. Duplicate paths are removed.

    Raises:
        FileNotFoundError: If any path in data_path does not exist.
    """
    ...
    return unique_videos
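
The body of the function is elided above. As a rough illustration of the contract the docstring describes, here is a minimal sketch; it is not the PR's actual implementation. The name `collect_video_paths_sketch`, the contents of the stand-in `SUPPORTED_VIDEOS` tuple, and the use of `fnmatch` for exclude patterns are all assumptions made for this example, and the sketch does not cover the `""` backward-compatibility case.

```python
from __future__ import annotations

import random
from fnmatch import fnmatch
from pathlib import Path

# Assumption: stand-in for DeepLabCut's SUPPORTED_VIDEOS constant.
SUPPORTED_VIDEOS = (".mp4", ".avi", ".mov", ".mkv")
DEFAULT_EXCLUDES = ("*_labeled.*", "*_full.*")


def collect_video_paths_sketch(
    data_path: str | Path | list[str | Path],
    extensions: str | list[str] | None = None,
    shuffle: bool = False,
    exclude_patterns: list[str] | None = None,
) -> list[Path]:
    """Illustrative only: follows the docstring above, not the PR's code."""
    paths = data_path if isinstance(data_path, (list, tuple)) else [data_path]
    if isinstance(extensions, str):
        extensions = [extensions]
    # None means "no explicit filter"; otherwise normalize to ".ext" lowercase.
    wanted = None
    if extensions is not None:
        wanted = {
            e.lower() if e.startswith(".") else f".{e.lower()}" for e in extensions
        }
    excludes = DEFAULT_EXCLUDES if exclude_patterns is None else exclude_patterns

    found: list[Path] = []
    for p in map(Path, paths):
        if not p.exists():
            raise FileNotFoundError(f"Path does not exist: {p}")
        if p.is_dir():
            # Directory contents are always filtered by extension.
            allowed = wanted if wanted is not None else set(SUPPORTED_VIDEOS)
            found += [
                f for f in p.iterdir() if f.is_file() and f.suffix.lower() in allowed
            ]
        elif wanted is None or p.suffix.lower() in wanted:
            # Supplied files are filtered only when extensions were given.
            found.append(p)

    # Exclude patterns apply to supplied files and directory contents alike.
    kept = [f for f in found if not any(fnmatch(f.name, pat) for pat in excludes)]
    unique_videos = sorted(set(kept))  # de-duplicate, deterministic order
    if shuffle:
        random.shuffle(unique_videos)
    return unique_videos
```

With this sketch, a suffix-less file passed directly is returned as-is, while the same file inside a scanned directory is skipped unless it matches the active extension filter.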

deprecation marker:

@deprecated(replacement="deeplabcut.collect_video_paths", since="3.0.0")
def get_list_of_videos(
   videos: list[str] | str,
   videotype: list[str] | str = "",
   in_random_order: bool = True,
) -> list[str]:
   return collect_video_paths(
       data_path=videos,
       extensions=videotype,
       shuffle=in_random_order,
   )

def deprecated(
   replacement: str | None = None,
   since: str | None = None,
   removed_in: str | None = None,
) -> Callable:
   """Mark a function as deprecated.

   Args:
       replacement: Fully-qualified name of the replacement callable, e.g.
           ``"deeplabcut.utils.auxfun_videos.list_videos_in_folder"``.
       since: Version in which the function was deprecated.
       removed_in: Version in which the function will be removed.
   """
   ...
   return decorator
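
The decorator body is elided above as well. A minimal sketch of how such a marker could work, assuming it simply emits a standard `DeprecationWarning` on each call: the name `deprecated_sketch`, the message wording, and the `old_helper` example function are hypothetical, and the real decorator in the PR may differ.

```python
from __future__ import annotations

import functools
import warnings
from typing import Callable


def deprecated_sketch(
    replacement: str | None = None,
    since: str | None = None,
    removed_in: str | None = None,
) -> Callable:
    """Illustrative stand-in for the `deprecated` decorator shown above."""

    def decorator(func: Callable) -> Callable:
        # Build the message once, at decoration time.
        message = f"{func.__name__} is deprecated"
        if since:
            message += f" since {since}"
        if removed_in:
            message += f" and will be removed in {removed_in}"
        if replacement:
            message += f"; use {replacement} instead"
        message += "."

        @functools.wraps(func)  # keep the wrapped function's name and docstring
        def wrapper(*args, **kwargs):
            # stacklevel=2 points the warning at the caller, not at this wrapper.
            warnings.warn(message, DeprecationWarning, stacklevel=2)
            return func(*args, **kwargs)

        return wrapper

    return decorator


@deprecated_sketch(replacement="deeplabcut.collect_video_paths", since="3.0.0")
def old_helper(x: int) -> int:
    """Hypothetical deprecated function used only for this example."""
    return x + 1
```

Calling `old_helper` still returns the wrapped result; the only added behavior is the warning.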

deruyter92 added 2 commits May 4, 2026 22:27
…iles if filter `video_type` is set.

- Accept files without extension
- Default folder searching is kept as is (using valid video extensions)
@C-Achard
Collaborator

C-Achard commented May 5, 2026

Note: CI seems to be failing due to tf-macos, merging #3292 may potentially help

@C-Achard C-Achard linked an issue May 5, 2026 that may be closed by this pull request
2 tasks
@C-Achard C-Achard added the bug fix! fix for a real buggy one... label May 6, 2026
@C-Achard
Collaborator

@deruyter92 Great to add deprecations, definitely agree. Do you think a separate PR for that specifically would be useful, or is it more efficient to merge this directly?

Collaborator

@C-Achard C-Achard left a comment

Really nice work overall! Definitely much better to have the video loading centralized, and deprecations are nice.

I added quite a few comments that I hope will help future us avoid mistakes and make things easier to use, happy to discuss if there are any concerns.

I will open a PR for the deprecations, I have a small design suggestion to ensure we can greatly extend and automate the system later if needed, while keeping current design and lightweight code.

Comment thread deeplabcut/utils/auxfun_videos.py Outdated
Comment thread deeplabcut/utils/auxfun_videos.py Outdated
Comment thread deeplabcut/utils/auxfun_videos.py
Comment thread deeplabcut/utils/auxfun_videos.py Outdated

def collect_video_paths(
data_path: str | Path | list[str | Path],
extensions: str | list[str] | None = None,
Collaborator

Here also I'd argue for slightly clearer intent: if None, filtered by support, OK.
Currently [] will also result in default behavior. If we want [] to be "accept no extensions", changes would be needed.
Depends on downstream use.

Collaborator Author

@deruyter92 deruyter92 May 12, 2026

The extensions parameter is now updated to yield the following groups of behavior:

  1. Default
    • None (default) -> only filter directory contents using SUPPORTED_VIDEOS, but accept all provided files without filtering
    • "" -> same behavior as None (Required for backward compatibility)
  2. Only select specified extensions
    • Sequence of str, e.g. (".avi", ".mp4") -> filter both files and directories, collect only paths with ".avi" suffix or ".mp4" suffix
    • String, e.g. ".avi" -> identical to a single-element sequence: (".avi",)
  3. Only select no-extensions
    • Empty list / tuple [] -> only select files without extension
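
The three groups above can be sketched as a small normalization step. This is only an illustration under assumptions: the helper name `resolve_extension_filters`, the stand-in `SUPPORTED_VIDEOS` tuple, and the choice to apply the "no extension" filter to both supplied files and directory contents are not taken from the PR.

```python
from __future__ import annotations

from collections.abc import Sequence
from pathlib import Path

# Assumption: stand-in for DeepLabCut's SUPPORTED_VIDEOS constant.
SUPPORTED_VIDEOS = (".mp4", ".avi", ".mov", ".mkv")


def _normalize(ext: str) -> str:
    """Lowercase and add a leading dot; "" stays "" (meaning: no suffix)."""
    ext = ext.lower()
    return ext if ext == "" or ext.startswith(".") else f".{ext}"


def resolve_extension_filters(
    extensions: str | Sequence[str] | None,
) -> tuple[frozenset[str] | None, frozenset[str]]:
    """Return (file_filter, dir_filter).

    file_filter is None when supplied files should be accepted unfiltered.
    """
    # Group 1: None or "" -> default behavior.
    if extensions is None or extensions == "":
        return None, frozenset(SUPPORTED_VIDEOS)
    # Group 2: a single string is treated as a one-element sequence.
    if isinstance(extensions, str):
        extensions = (extensions,)
    normalized = frozenset(_normalize(e) for e in extensions)
    # Group 3: empty list / tuple -> match only suffix-less paths,
    # because Path("clip").suffix == "".
    if not normalized:
        normalized = frozenset({""})
    return normalized, normalized


def matches(path: str | Path, allowed: frozenset[str] | None) -> bool:
    """True if the path passes the filter (None accepts everything)."""
    return allowed is None or Path(path).suffix.lower() in allowed
```

Keeping `None` and `[]` distinct is what makes the third group expressible; treating every falsy value alike would collapse it into the default.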

Collaborator Author

My prior behavior just treated all falsy values, including "", the same as None. But this would only yield options 1 and 2.

Maybe the current update is a bit too complex, but at least it supports all cases.

Comment thread deeplabcut/utils/auxfun_videos.py
Comment thread deeplabcut/utils/deprecation.py
Comment thread deeplabcut/utils/deprecation.py Outdated
Comment thread tests/test_auxiliaryfunctions.py
deruyter92 and others added 8 commits May 12, 2026 13:34
* Add structured deprecation info and warnings

Introduce a DLCDeprecationWarning and a DeprecationInfo pydantic model to standardize deprecation metadata (kind, target, replacement, since, removed_in, renamed params) with parsing and validation of versions. Revamp deprecated and renamed_parameter decorators to build messages from DeprecationInfo, emit DLCDeprecationWarning, attach metadata to wrapped callables (__deprecated_info__, __deprecated_params__), use ParamSpec/TypeVar typing for wrappers, and enforce error when both old and new kwargs are passed. Switch to packaging.version for version parsing.

* Use DLCDeprecationWarning and add metadata tests

Replace generic DeprecationWarning checks with DLCDeprecationWarning and import packaging.version.Version. Add tests verifying deprecated decorators attach metadata (including since/removed_in parsed as Version), validate invalid version inputs, and ensure removed_in > since. Also add tests for renamed_parameter behavior (conflicting old+new raises, metadata attachment, and invalid since handling) and small docstring/name preservation assertions.

* Add packaging as core dep
@deruyter92 deruyter92 marked this pull request as ready for review May 12, 2026 15:02
@deruyter92 deruyter92 requested review from AlexEMG and MMathisLab May 12, 2026 15:03

Labels

bug fix! fix for a real buggy one...

Development

Successfully merging this pull request may close these issues.

[Bug] DLC3 cannot process suffix-less video files

2 participants