
External detectors for top-down models #3372

Draft
C-Achard wants to merge 53 commits into DeepLabCut:main from C-Achard:cy/external-detectors

Conversation

C-Achard commented Jun 15, 2026

Collaborator

Overview

This extension adds support for "external" object detectors in the PyTorch
top-down pose pipeline. The intended workflow is:

image/video frame
  -> detector
  -> bounding boxes / crops
  -> top-down pose model

The design supports two closely related use cases:

  1. Live external detector inference
    A foundation model or other pretrained detector is run at inference time
    and its detections are converted into the DeepLabCut detector output format.

  2. Offline / precomputed detector outputs
    The detector is run once, its outputs are saved to disk, and they are then
    reused for training or inference of the top-down pose model without
    rerunning the detector.

This second mode is especially useful when:

  • detector inference is expensive,
  • detector code has heavyweight dependencies,
  • detector weights should remain frozen.

Core concepts

  1. BaseExternalDetector
    External detectors implement a minimal detector API by subclassing
    BaseExternalDetector and registering themselves in EXTERNAL_DETECTORS.

    A detector must expose:

    predict(images: list[torch.Tensor]) -> list[DetectionResult]
    

    where each DetectionResult is a dictionary containing at least:

    {
        "boxes":  FloatTensor[N, 4],  # absolute XYXY pixel coordinates
        "scores": FloatTensor[N],
        "labels": LongTensor[N],
    }
    

    The detector is assumed to be inference-oriented and typically frozen.
    It does not need to implement a training loop.
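
A minimal, duck-typed sketch of the detector contract described in item 1, offered as an illustration and not as the PR's code. The names REGISTRY, register, SketchExternalDetector and CenteredBoxDetector are hypothetical stand-ins for EXTERNAL_DETECTORS and BaseExternalDetector, and plain Python lists stand in for the torch tensors a real DetectionResult would hold. The forward() shim returns an empty loss dict alongside the detections, as the commit notes in this PR describe.

```python
from types import SimpleNamespace
from typing import Any, Callable

# Hypothetical stand-in for the EXTERNAL_DETECTORS registry: name -> class.
REGISTRY: dict[str, type] = {}


def register(name: str) -> Callable[[type], type]:
    """Class decorator that records a detector class under `name`."""

    def decorator(cls: type) -> type:
        REGISTRY[name] = cls
        return cls

    return decorator


class SketchExternalDetector:
    """Inference-only base class: subclasses implement predict()."""

    def predict(self, images: list[Any]) -> list[dict[str, list]]:
        raise NotImplementedError

    def forward(self, images: list[Any], targets: Any = None):
        # Compatibility shim: there is no training, so the loss dict is empty.
        return {}, self.predict(images)


@register("centered_box")
class CenteredBoxDetector(SketchExternalDetector):
    """Returns one XYXY box covering the central half of each image."""

    def predict(self, images):
        results = []
        for image in images:
            _, height, width = image.shape  # (C, H, W), as for a torch.Tensor
            box = [width / 4, height / 4, 3 * width / 4, 3 * height / 4]
            results.append({"boxes": [box], "scores": [1.0], "labels": [1]})
        return results


detector = REGISTRY["centered_box"]()
fake_image = SimpleNamespace(shape=(3, 100, 200))  # only .shape is needed here
losses, detections = detector.forward([fake_image])
print(losses, detections[0]["boxes"])
```

In a real integration the boxes, scores and labels would be tensors with the shapes listed above, and the class would be registered in EXTERNAL_DETECTORS.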

  2. Detector inference runner compatibility
    BaseExternalDetector provides a forward() shim so it can be used by the
    existing DLC inference runner stack. This means an external detector can
    be wrapped by a standard DetectorInferenceRunner and postprocessed into
    DLC-style detector context:

    {
        "bboxes": np.ndarray[N, 4],
        "bbox_scores": np.ndarray[N],
    }
    

    In the top-down data path, boxes are expected in XYWH format after
    postprocessing.
    Note that other useful fields like "classes" or "labels" are currently accepted but not deeply integrated into the rest of the code.
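
The format change in item 2 (absolute XYXY from the detector, XYWH in the DLC-style context) can be sketched as below. This is an assumption-laden illustration: xyxy_to_xywh and to_detector_context are hypothetical helper names, and plain lists replace the np.ndarray values the real context carries.

```python
def xyxy_to_xywh(box):
    """Convert [x1, y1, x2, y2] to [x, y, width, height] (top-left origin)."""
    x1, y1, x2, y2 = box
    return [x1, y1, x2 - x1, y2 - y1]


def to_detector_context(detection):
    """Map a detector result to the DLC-style keys used by the pose pipeline."""
    return {
        "bboxes": [xyxy_to_xywh(box) for box in detection["boxes"]],
        "bbox_scores": list(detection["scores"]),
    }


detection = {"boxes": [[10.0, 20.0, 110.0, 70.0]], "scores": [0.9], "labels": [1]}
context = to_detector_context(detection)
print(context)
```

The "labels" field is dropped here on purpose, mirroring the note that class information is accepted but not yet used downstream.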

  3. PrecomputedDetectorRunner
    PrecomputedDetectorRunner is an adapter that makes saved bounding boxes
    behave like a detector runner. It implements:

    inference(images, shelf_writer=None) -> list[DetectorContext]
    

    so it can be plugged directly into dataset creation or other DLC code that
    expects a detector runner. It will load from disk and emit DLC-formatted
    detector outputs on demand.
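
A rough sketch of the adapter idea in item 3, assuming entries are looked up by image path. SketchPrecomputedRunner and normalize_path are hypothetical names; the POSIX-plus-lowercase normalization and the ValueError on a missing path follow the commit notes in this PR, but the real class may differ in detail.

```python
from pathlib import PureWindowsPath


def normalize_path(path) -> str:
    """POSIX-style, lowercase key so OS path styles compare equal."""
    # PureWindowsPath accepts both "/" and "\\" separators on any platform.
    return PureWindowsPath(str(path)).as_posix().lower()


class SketchPrecomputedRunner:
    """Serves saved boxes through a detector-runner-like inference() call."""

    def __init__(self, entries):
        self._index = {}
        for entry in entries:
            key = normalize_path(entry["image_path"])
            if key in self._index:
                raise ValueError(f"Duplicate entry for image: {key}")
            self._index[key] = entry

    def inference(self, images, shelf_writer=None):
        # shelf_writer is accepted for interface compatibility; unused here.
        contexts = []
        for image in images:
            key = normalize_path(image)
            if key not in self._index:
                raise ValueError(f"No precomputed boxes for image: {image}")
            entry = self._index[key]
            contexts.append(
                {"bboxes": entry["bboxes"], "bbox_scores": entry["bbox_scores"]}
            )
        return contexts


runner = SketchPrecomputedRunner(
    [
        {"image_path": "frames/img0.png", "bboxes": [[0.0, 0.0, 10.0, 10.0]], "bbox_scores": [0.8]},
        {"image_path": "frames/img1.png", "bboxes": [[5.0, 5.0, 20.0, 20.0]], "bbox_scores": [0.7]},
    ]
)
# Requested order is preserved, regardless of storage order or path style.
contexts = runner.inference(["frames\\IMG1.png", "frames/img0.png"])
print(contexts[0]["bbox_scores"], contexts[1]["bbox_scores"])
```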

  4. BBox schema
    Precomputed detections are stored in a JSON artifact using two schema types:

    • BBoxEntry: detections for one image
    • BBoxes: split-aware container with:
      train: list[BBoxEntry]
      test: list[BBoxEntry]

    Each BBoxEntry contains:

    • bboxes
    • bbox_scores
    • bbox_format ("xywh" or "xyxy")
    • optional image_path
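
To make the item 4 layout concrete, here is a small round-trip sketch. The PR describes Pydantic models; this version uses standard-library dataclasses as a stand-in so it runs without extra dependencies, and SketchBBoxEntry and SketchBBoxes are hypothetical names. Only the field names come from the description above.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class SketchBBoxEntry:
    """Detections for one image."""

    bboxes: list
    bbox_scores: list
    bbox_format: str = "xywh"  # "xywh" or "xyxy"
    image_path: Optional[str] = None


@dataclass
class SketchBBoxes:
    """Split-aware container of per-image entries."""

    train: list = field(default_factory=list)
    test: list = field(default_factory=list)

    def to_json(self) -> str:
        return json.dumps(asdict(self), indent=2)

    @classmethod
    def from_json(cls, text: str) -> "SketchBBoxes":
        raw = json.loads(text)
        return cls(
            train=[SketchBBoxEntry(**entry) for entry in raw["train"]],
            test=[SketchBBoxEntry(**entry) for entry in raw["test"]],
        )


boxes = SketchBBoxes(
    train=[SketchBBoxEntry([[10.0, 20.0, 100.0, 50.0]], [0.9], "xywh", "img0.png")],
    test=[SketchBBoxEntry([], [], "xywh", "img1.png")],
)
restored = SketchBBoxes.from_json(boxes.to_json())
print(restored == boxes)
```

An entry with empty bboxes, as in the test split here, is one way to represent an image with no detections.
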

  5. Bounding-box source selection
    Dataset creation now owns the decision of where bounding boxes come from.
    For top-down and detection tasks, the bbox source is resolved in this order:

    1. detector_runner is provided
      -> use detection boxes
    2. explicit config value data.bbox_source
    3. loader/task default

    For historical DLC compatibility, DLCLoader defaults to KEYPOINTS for
    top-down and detect tasks unless overridden. This preserves the previous
    behavior for projects that do not use external detectors.
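
The resolution order in item 5 can be written as a short function. This is a sketch under assumptions: BBoxSource and resolve_bbox_source are hypothetical names standing in for BBoxComputationMethod and the loader's internal resolver, and the string values are inferred from the config values mentioned in this PR.

```python
from enum import Enum


class BBoxSource(str, Enum):
    # Assumed string values; the PR mentions 'gt' and 'detection_bbox'.
    GT = "gt"
    KEYPOINTS = "keypoints"
    DETECTION_BBOX = "detection_bbox"


def resolve_bbox_source(detector_runner, data_cfg, loader_default):
    """Apply the three-step precedence described above."""
    if detector_runner is not None:
        return BBoxSource.DETECTION_BBOX  # 1. a runner always wins
    configured = data_cfg.get("bbox_source")
    if configured is not None:
        return BBoxSource(configured)  # 2. explicit config value
    return loader_default  # 3. loader/task default


dlc_default = BBoxSource.KEYPOINTS  # DLCLoader's historical default
print(resolve_bbox_source(None, {}, dlc_default))
print(resolve_bbox_source(None, {"bbox_source": "gt"}, dlc_default))
print(resolve_bbox_source(object(), {"bbox_source": "gt"}, dlc_default))
```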

  6. Multi-animal matching
    For multi-animal top-down training, detector predictions must be matched to
    annotated individuals. This implementation does that by:

    • deriving a reference box for each annotated individual
      (prefer keypoints, otherwise existing bbox),
    • computing pairwise IoU between reference boxes and detector boxes,
    • solving assignment with Hungarian matching,
    • accepting matches above bbox_match_iou_threshold,
    • optionally falling back to the original GT/reference bbox when no match
      is found.

    Notes:

    • only "real individual" annotations are matched (category_id == 1),
      which avoids assigning detector boxes to unique-bodypart-only annotations.
    • in the single-animal case, the highest-scoring detector box is used
      directly.
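
The matching steps in item 6 can be illustrated with the sketch below. The PR's commit notes mention scipy's linear_sum_assignment for the Hungarian step; to stay dependency-free, this sketch swaps in a brute-force search over permutations, which gives the same optimal assignment but is only practical for a handful of animals. The function names and the 0.5 threshold are hypothetical, and boxes are assumed to be XYWH.

```python
from itertools import permutations


def iou_xywh(a, b):
    """Intersection over union of two [x, y, w, h] boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def match_boxes(reference, detected, iou_threshold=0.5):
    """Return {reference index: detector index} maximizing total IoU."""
    n_slots = max(len(reference), len(detected))

    def pair_iou(i, j):
        # Slots beyond the detector list are dummies with zero overlap.
        return iou_xywh(reference[i], detected[j]) if j < len(detected) else 0.0

    best_total, best_perm = -1.0, ()
    for perm in permutations(range(n_slots), len(reference)):
        total = sum(pair_iou(i, j) for i, j in enumerate(perm))
        if total > best_total:
            best_total, best_perm = total, perm
    # Keep only pairs whose IoU clears the threshold.
    return {i: j for i, j in enumerate(best_perm) if pair_iou(i, j) > iou_threshold}


reference = [[0.0, 0.0, 10.0, 10.0], [50.0, 50.0, 10.0, 10.0], [200.0, 200.0, 10.0, 10.0]]
detected = [[51.0, 50.0, 10.0, 10.0], [0.0, 1.0, 10.0, 10.0]]  # reversed order
matches = match_boxes(reference, detected)
print(matches)
```

The third reference box has no overlapping detection, so it is absent from the result; that is the case where the optional GT/reference fallback would apply.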

What this does not do (yet)

  • If the particular detector outputs require custom pre- or post-processing (e.g. custom score thresholding, prompts, class filtering, NMS, etc.), it is currently expected that this be handled within the custom detector implementation or in a wrapper around it before saving precomputed outputs.
  • There is no bounding box validity checking or filtering built into the dataset creation process beyond the IoU-based matching and optional GT fallback.

Current scope

This implementation provides:

  • an external detector interface,
  • a registry for external detectors,
  • a live detector runner builder,
  • a JSON schema for precomputed detections,
  • a precomputed detector runner adapter,
  • dataset-level bbox replacement for top-down training,
  • multi-animal IoU-based assignment logic.

The most mature path at present is:

  • run detector externally or through the external detector runner,
  • save detections as a BBoxes JSON artifact,
  • train a top-down pose model using precomputed detector boxes.

Testing coverage included here

The current tests cover:

  • detector registry/build path,
  • end-to-end mock external detector inference,
  • schema round-tripping,
  • precomputed detector runner contract,
  • create_dataset() integration,
  • cached annotation immutability,
  • multi-animal matching with reversed detector order,
  • end-to-end pose training on offline/precomputed boxes.

C-Achard and others added 30 commits April 8, 2026 14:34
Introduce support for external (inference-only) detectors. Adds a new package with:

- deeplabcut/.../external/__init__.py: exports BaseExternalDetector, EXTERNAL_DETECTORS, and DetectionResult.
- base.py: defines DetectionResult TypedDict, a DetectorForwardLike Protocol, a builder wrapper, the EXTERNAL_DETECTORS registry, and BaseExternalDetector with a predict API and a forward() compatibility shim that returns empty losses and detections.
- build.py: helper get_external_detector_inference_runner to instantiate an external detector, attach preprocessing/postprocessing, and build an inference runner (always uses pre-trained snapshot=None).

These changes enable integrating frozen third-party detectors into the existing DLC inference pipeline.
Import BaseExternalDetector and add a DetectorModel type alias combining BaseDetector and BaseExternalDetector. Update DetectorInferenceRunner typing and constructor to accept DetectorModel so external detector implementations can be used for inference without changing runtime behavior.
Register a simple MockExternalDetector and add an end-to-end test for external detector inference. __init__.py now imports the mock to populate the EXTERNAL_DETECTORS registry; mock.py implements a lightweight detector that returns a centered box per image. The new test verifies building the detector from the registry, preprocessing/postprocessing, running the DetectorInferenceRunner, and basic output shape/score sanity.
Update docstring for external detectors to state they are inference-oriented and usually not trained (though a pose estimation model may be trained on top of them). Note that external detectors may lack a training loop, optimizer, snapshot loading, or target generation, clarifying expected behavior for integrators and maintainers.
Add logging and defensive checks when building external detectors. Import logging and create a module logger, then attempt to freeze model parameters (requires_grad=False) while catching AttributeError/RuntimeError and warning if the detector has no parameters; also wrap detector.eval() in a try/except to warn if eval() is missing. These changes help avoid accidental training and provide clearer messages for detectors that don't expose standard APIs.
Introduce BBoxComputationMethod enum and DetectorRunnerLike protocol; add logging and required imports. Make Loader.create_dataset accept an optional detector_runner and avoid mutating cached annotations by deep-copying them. Add logic to resolve bbox source for top-down tasks and extend _compute_bboxes to handle detection-based matching (with optional fallback to GT), keypoints-based boxes, and a placeholder for segmentation masks. Implement IoU computation, xywh->xyxy conversion, and Hungarian matching (scipy.linear_sum_assignment) to associate detector boxes to GT, plus small validation and informative logging. Overall this enables using external detector outputs to populate top-down bounding boxes robustly.
Import BBoxComputationMethod and use it in COCOLoader (replace literal "gt" with BBoxComputationMethod.GT). In DLCLoader remove the call that recomputed/overwrote annotation bboxes and add a comment explaining that create_dataset(...) now owns bbox source selection (e.g. gt, keypoints, detection), so loaders should not recompute/overwrite bboxes. Small import adjustments to support the enum.
Introduce a default_bbox_method hook on Loader to centralize legacy bbox source behavior and fall back when model_cfg has no bbox_source. ground_truth_keypoints now reads bbox_source from model_cfg and falls back to Loader.default_bbox_method(task). _resolve_bbox_method's signature and returns were updated to use the BBoxComputationMethod enum (DETECTION_BBOX / GT). DLCLoader overrides default_bbox_method to preserve historical behavior (use KEYPOINTS for TOP_DOWN and DETECT). Also updated imports to bring in BBoxComputationMethod and Task.
Introduce a new deeplabcut.pose_estimation_pytorch.data.bboxes module that defines typed DetectorContext, BBoxEntry and BBoxes models (Pydantic) with helpers to convert between xyxy/xywh, serialize/deserialize JSON, and produce DLC-style detector contexts. Update base external detector to import these types and add PrecomputedDetectorRunner: an adapter that serves precomputed BBoxEntry lists as detector outputs (supports optional image-path validation and target bbox format). This enables using saved detector outputs for top-down training/inference and simplifies bbox format handling and persistence.
Add unit tests covering bbox-source behavior and precomputed detector runner integration. New tests verify DLCLoader's backward-compatible default_bbox_method (keypoints for TOP_DOWN/DETECT), that create_dataset() derives keypoint-based bboxes by default, and that explicit bbox_source='gt' or a provided detector_runner override this behavior. They also test BBoxEntry round-trip conversion, PrecomputedDetectorRunner inference contract, and a live integration path using the MockExternalDetector -> inference runner -> BBoxEntry -> PrecomputedDetectorRunner -> create_dataset(). Several regression guards ensure create_dataset() deep-copies annotations and does not mutate cached load_data() payloads. Tests use small FakeDLCLoader/DummyDetectorRunner fixtures and patch PoseDataset to a lightweight DummyPoseDataset for focused loader logic testing.
Move the BBoxComputationMethod enum out of deeplabcut/pose_estimation_pytorch/data/base.py into a new deeplabcut/pose_estimation_pytorch/data/bboxes.py to centralize bbox-related types. Update imports in cocoloader.py, dlcloader.py and base.py to reference the new module, and adjust tests to assert against the enum members instead of string literals. Also remove the now-unused Enum import from base.py.
Call _resolve_bbox_method for TOP_DOWN and DETECT tasks (passing task and detector_runner) and avoid mutating cached annotations; shorten the related comment. Replace BBoxComputationMethod members to explicit string values instead of using auto() (and remove the unused auto import) so enum values are deterministic/serializable.
Add PrecomputedDetectorRunner to deeplabcut.pose_estimation_pytorch.models.detectors.external exports and update the test to import it from the external detectors package (instead of from data.bboxes). This centralizes the runner in the external detectors public API and keeps imports consistent in tests.
Update test config generation to use ruamel.yaml SingleQuotedScalarString for the video path key and set yaml width to avoid wrapping long paths. Use .as_posix() for project_path and video keys to ensure consistent POSIX-style strings, and open the config file with explicit utf-8 encoding. These changes make the generated YAML more robust for long or special-character paths.
Replace incorrect tuple comparison (task == (Task.TOP_DOWN, Task.DETECT)) with a single enum check (task == Task.TOP_DOWN) in tests/pose_estimation_pytorch/apis/test_apis_export.py. This restores the detector-related test branches so detector snapshots/data are properly created and validated for TOP_DOWN tasks (three occurrences updated).
Handle the common single-animal case by trusting the detector output instead of performing IoU matching. When only one candidate annotation is present, pick the highest-scoring predicted bbox (or the first if scores are unavailable), copy it into the annotation, update the area, and increment the total count. This avoids matching against stale placeholder bboxes and simplifies assignment for single-animal frames.
Add Loader._get_reference_bbox_for_matching to prefer deriving a reference bbox from visible keypoints (with margin) and fall back to annotation['bbox'], raising if neither exists. Update _compute_bboxes to coerce the bbox method earlier and to use this helper when building gt_bboxes for detection-to-annotation matching. Change BBoxComputationMethod to a str Enum so methods carry string values.
Normalize path comparison and add helpers for precomputed bboxes.

- Import Loader and DetectorRunnerLike types.
- Add PrecomputedDetectorRunner._normalize_path_for_compare to compare paths using Path.as_posix(), and use it when validating image paths to avoid mismatches across OS path styles.
- Add precompute_detector_bboxes(loader, detector_runner, output_file, modes=(...), bbox_format) to run a detector over dataset image paths, save results to a BBoxes JSON, and return the BBoxes object. Validates output length against image count.
- Add build_precomputed_detector_runner_from_config(model_cfg, mode, ...) to load a BBoxes file from model_cfg['data']['precomputed_bboxes'] and construct a PrecomputedDetectorRunner, returning None if not configured.

These changes enable caching detector outputs to reuse for top-down pose training without rerunning detectors.
Build precomputed detector runners for Task.TOP_DOWN and pass them to dataset creation so top-down training/validation use precomputed detections. Also tighten optimizer construction: resolve optimizer class earlier, use only parameters with requires_grad, raise if there are no trainable parameters, and instantiate the optimizer with those params (uses optimizer_config[type] and params).
Introduce support for precomputed detector boxes and external detector metadata in PyTorch pose configs. Adds new parameters (precomputed_bboxes, bbox_source, external_detector_metadata) to make_pytorch_pose_config, imports BBoxComputationMethod and Enum, and implements logic to set data.bbox_source, store precomputed_bboxes path, and apply safe defaults for matching/validation. Adds _yaml_safe_value helper to convert Enums, Paths, and nested containers to YAML-safe types and applies it to the final pose_config before saving.
Introduce an end-to-end test that verifies the offline / precomputed detector workflow for multi-animal top-down training. The test adds a FakeMultiAnimalDLCLoader, a PrecomputedDetectorRunner-loaded BBoxes artifact, and asserts that create_dataset correctly matches detector boxes to annotations even when detector outputs are reordered. It also builds minimal TinyTrainDataset and TinyPoseModel, runs a short training cycle via build_training_runner, ensures the detector is not invoked during training, confirms only pose model params are optimized, and checks that model parameters are updated.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Introduce DetectorToPoseInferenceRunner to compose a detector runner with a top-down pose runner, enabling detector-first -> pose inference flows. Adds DetectorRunnerLike import, implements input/context normalization, detector output normalization (bboxes, bbox_scores), and injects these into the pose runner inputs. Update build_inference_runner to accept an optional detector_runner and return the composed runner for Task.TOP_DOWN. Also tidy a one-line dict lookup formatting in training.py.
Import PrecomputedDetectorRunner in existing build tests and add a unit test to ensure build_inference_runner wraps a top-down runner when a detector_runner is provided. Add a new test module tests/pose_estimation_pytorch/models/external_detectors/test_inference_wrapper.py containing comprehensive unit and integration tests for DetectorToPoseInferenceRunner. The new tests include stubs (DummyDetectorRunner, RecordingPoseRunner, PreprocessingPoseRunner) and cover: injection of bboxes into pose runner context, defaulting bbox scores, handling of no detections, error cases (output count mismatch and invalid bbox_score length), passing shelf_writer through, and an integration check with the real top-down preprocessor to validate crop shapes. These tests ensure correct composition and data flow from external detectors to top-down pose inference.
Introduce explicit detector mode handling and demo inference.

- Add DetectorMode enum and detector_mode parameter to make_pytorch_pose_config, with coercion and validation logic for 'native' vs 'external' modes.
- Enforce/validate interactions between detector_mode, precomputed_bboxes, detector_type, bbox_source, external_detector_metadata, and top_down/backbone net types; raise clear errors for invalid combos.
- Preserve legacy behavior when detector_mode is None (infer from precomputed_bboxes), and default net_type from project config with a warning and type check.
- Serialize external detector metadata under pose_config['metadata']['detector'] when using external mode and ensure precomputed_bboxes/bbox_source are set appropriately.
- Add logger import and minor YAML-safe value typing.

- Update examples/detector_test_full_api.py to exercise external detector workflows:
  - Set detector_mode='external' in the demo pose config and require precomputed boxes.
  - Lower eval_interval for the demo, tidy docstring bullets, and add end-to-end video inference demo helpers (write_synthetic_video, build_video_context_from_detector, run_video_inference_demo).
  - Wire inference into main() via a run_inference flag and a new --no-inference CLI option; adjust progress prints accordingly.

These changes enable explicit external/precomputed detector support in top-down workflows and extend the example to demonstrate video inference using saved detector contexts.
Rewrite and clarify the detector_test_full_api example: add detailed docs, type annotations, and focused helper classes (IdentityTopDownTransform, tiny detector/model). Update pose-config handling to produce a lightweight, CPU-friendly demo (set net_type to resnet_50, colormode RGB, raise eval_interval, and add detector.train_settings.epochs stub). Ensure DLCLoader compatibility (get_image_paths fallback), serialize precomputed bboxes as JSON, and add video-inference helpers that build per-frame contexts. Switch video APIs to videos_api and improve snapshot lookup, and adjust CLI (rename script usage and add --no-inference flag). Miscellaneous formatting and import grouping for readability.
Add example script external_detector_workflow.py demonstrating an external-detector driven top-down DeepLabCut PyTorch workflow. The script provides a replaceable MyExternalDetector adapter, precomputes and saves detector boxes (precomputed_bboxes.json), generates/updates pytorch_config.yaml for external detector mode, trains pose models via train_network, and runs inference on videos or image folders (with optional per-frame cache). Includes CLI, UserSettings, and helper functions (prepare_external_topdown_pose_config, save_external_detector_bboxes, train_external_topdown_pose_model, analyze_video_with_external_boxes, analyze_image_folder_with_external_boxes). Requires an existing DLC project/config.yaml and a user-supplied detector implementation.
C-Achard and others added 18 commits April 10, 2026 18:21
Expand bbox format handling and improve annotation parsing when COCO metadata is missing. Changes:

- Add "cxcywh" to BBoxFormat and implement _cxcywh_to_xyxy conversion.
- Add docstrings to xyxy<->xywh conversion helpers clarifying top-left origin assumptions.
- Update _extract_keypoints_and_bboxes signature/formatting and make it robust to missing COCO fields by:
  - Computing a default area from bbox or visible keypoints when "area" is absent.
  - Building arrays for area, category_id, iscrowd, and individual_id with sensible defaults.
  - Applying the visibility mask to the merged annotation fields instead of raising on missing area.

These changes allow training on DLC-style annotations that may lack some COCO metadata and add support for center-based bbox inputs.
Introduce DetectorToPoseInferenceRunner with options for max_individuals, num_joints, num_unique_bodyparts and fill_value. Implement selection/ordering of detector boxes, padding utilities, and prediction normalization (_select_and_order_boxes, _pad_first_dim, _empty_prediction, _normalize_prediction). Refactor inference flow to run detector first, enrich contexts, call pose runner, and return normalized fixed-shape predictions (or write via shelf_writer). Expose new kwargs in build_inference_runner. Update training GPU reporting to check the runner device string for CUDA before querying CUDA memory. Update external_detector_workflow to use the new composite runner and initialize limits from model metadata.
Update documentation in scripts/external_detector_workflow.py: insert a reminder to create a training shuffle using DLC, adjust the numbering of subsequent steps and minor spacing. This is a non-functional README-style change to clarify setup steps for external detectors.
Change the default behavior to not fall back to ground-truth bboxes (bbox_fallback_to_gt=False) across config generation, loader defaults, example/script settings, and tests. Add a warning when bbox_fallback_to_gt is set but not applicable (only valid for detection bboxes), and log an error if detector matching leaves annotations unmatched while fallback is disabled, advising users to either improve detector performance or enable GT fallback. This prevents silently substituting GT boxes and makes unmatched-detector cases more visible to users.
Expose bbox margin as a configurable parameter and thread it through DLCLoader: read bbox_margin from model_cfg["data"], pass it into to_coco/load_ground_truth, and forward it into _add_bbox_annotations (default 20). Adds explanatory comments clarifying that to_coco initializes keypoint-derived GT bboxes for compatibility and that create_dataset still owns the effective bbox source.
Update DetectorToPoseInferenceRunner to accept None for max_individuals, num_joints, and num_unique_bodyparts by widening type hints to int | None. Initialization logic now preserves None for max_individuals, defaults num_joints to 17 if None, and defaults num_unique_bodyparts to 0 if None; when numeric values are provided the previous minimum constraints (via max(...)) are still enforced. This makes it possible to explicitly leave max_individuals unset while keeping sensible defaults/constraints for provided values.
Parse detector dicts explicitly: read 'bboxes' and optional 'bbox_scores' (defaulting to ones) and raise on length mismatch. Enforce deterministic ordering by descending bbox score, apply max_individuals only when set (don't implicitly truncate), and return arrays as float32 without unnecessary copies. Adjust empty/normalized prediction behavior: _empty_prediction emits zero rows when max_individuals is unspecified and fixed-size padded outputs when it is set; _normalize_prediction only pads bodyparts when max_individuals is provided.
Refactor DetectorToPoseInferenceRunner.inference to simplify flow and delegate responsibilities to the pose runner. Inputs are split into raw_images and incoming_contexts using clearer comprehensions; the detector now receives raw images only. Incoming context dictionaries are copied to avoid mutating caller-owned data. Error formatting was improved. Removed the wrapper's internal normalization, empty-prediction handling, and shelf_writer logic; the wrapper now returns pose_runner.inference(enriched_inputs, shelf_writer=shelf_writer) and relies on the pose runner for preprocessing, prediction, postprocessing, and shelf writing.
Update test to clarify that the detector should receive raw image inputs only while incoming context is preserved and forwarded to the pose runner after bbox injection. Adjust the assertion to expect the specific image list (including a Path for the second image) and update the explanatory comment.
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Minor style cleanup with no behavioral changes.

- Removed a stray trailing blank line in deeplabcut/pose_estimation_pytorch/data/dlcloader.py.
- Reflowed multi-line f-strings and the long type annotation in deeplabcut/pose_estimation_pytorch/runners/inference.py into single-line statements to improve readability and satisfy linters.

These edits are purely formatting and do not change runtime logic.
C-Achard requested a review from Copilot June 15, 2026 20:44
C-Achard self-assigned this Jun 15, 2026
C-Achard added the enhancement and pytorch labels Jun 15, 2026

Copilot AI left a comment

Contributor


Pull request overview

This PR adds an “external detector” pathway to the PyTorch top-down pose pipeline, enabling both live detector inference (via a lightweight external-detector API) and offline reuse of precomputed bounding boxes via a JSON artifact + adapter runner.

Changes:

  • Introduces an external detector interface + registry, plus a PrecomputedDetectorRunner adapter and JSON bbox schemas.
  • Updates dataset creation and training paths to support selecting bbox sources (GT/keypoints/detections) and matching detector boxes to multi-animal annotations via IoU/Hungarian assignment.
  • Adds an inference wrapper to compose a detector runner with a top-down pose runner, plus extensive tests and workflow examples.

Reviewed changes

Copilot reviewed 20 out of 20 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| tests/pose_estimation_pytorch/models/external_detectors/test_precomputed_bbox.py | Integration tests for bbox schema roundtrip + precomputed runner + create_dataset. |
| tests/pose_estimation_pytorch/models/external_detectors/test_inference_wrapper.py | Tests for detector→pose inference composition wrapper behavior. |
| tests/pose_estimation_pytorch/models/external_detectors/test_build.py | End-to-end test for building/running an external detector through the runner stack. |
| tests/pose_estimation_pytorch/data/test_bbox.py | Tests for bbox_source resolution + backwards-compatible defaults + non-mutation regression. |
| tests/pose_estimation_pytorch/apis/test_precomp_bbox_training.py | End-to-end test for offline multi-animal training using precomputed detector boxes. |
| scripts/external_detector_workflow.py | User-facing workflow script showing offline detector box generation + training + inference. |
| examples/detector_test_full_api.py | Fully synthetic end-to-end example demonstrating the offline precomputed boxes workflow. |
| deeplabcut/pose_estimation_pytorch/runners/train.py | Safer optimizer construction; tweak GPU usage string formatting logic. |
| deeplabcut/pose_estimation_pytorch/runners/inference.py | Support external detectors in DetectorInferenceRunner and add DetectorToPoseInferenceRunner wrapper. |
| deeplabcut/pose_estimation_pytorch/models/detectors/external/mock.py | Mock external detector implementation for tests. |
| deeplabcut/pose_estimation_pytorch/models/detectors/external/build.py | Helper to build an inference runner for external detectors. |
| deeplabcut/pose_estimation_pytorch/models/detectors/external/base.py | External detector base API, registry, precomputed runner, and bbox precompute utility. |
| deeplabcut/pose_estimation_pytorch/models/detectors/external/__init__.py | External detector public exports + registry population import. |
| deeplabcut/pose_estimation_pytorch/data/utils.py | Improves robustness when COCO metadata fields are missing during training. |
| deeplabcut/pose_estimation_pytorch/data/dlcloader.py | Preserves backward-compatible default bbox behavior; makes bbox margin config-driven in COCO conversion. |
| deeplabcut/pose_estimation_pytorch/data/cocoloader.py | Aligns bbox computation method usage with new BBoxComputationMethod enum. |
| deeplabcut/pose_estimation_pytorch/data/bboxes.py | Adds bbox schemas (BBoxEntry/BBoxes), format conversion, and bbox_source enum/types. |
| deeplabcut/pose_estimation_pytorch/data/base.py | Adds detector-runner protocol, bbox_source resolution, and detection-bbox matching logic. |
| deeplabcut/pose_estimation_pytorch/config/make_pose_config.py | Adds “detector_mode” concept and config support for external/offline bbox artifacts. |
| deeplabcut/pose_estimation_pytorch/apis/training.py | Wires precomputed detector runner creation into the high-level training API. |


Comment thread deeplabcut/pose_estimation_pytorch/models/detectors/external/base.py Outdated
Comment thread deeplabcut/pose_estimation_pytorch/runners/inference.py Outdated
Comment thread deeplabcut/pose_estimation_pytorch/runners/train.py
Comment thread deeplabcut/pose_estimation_pytorch/apis/training.py
C-Achard changed the title from "External detectors for top-down models (PyTorch)" to "External detectors for top-down models" Jun 15, 2026
C-Achard and others added 5 commits June 15, 2026 15:50
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Index precomputed bbox entries by normalized image path for fast lookup and detect duplicate entries. Normalize paths using POSIX form and lowercase for comparison, and add a suffix-based fallback matcher that errors on ambiguous matches. Add helpers to extract image paths from inputs and to find matches by suffix. Update inference to prefer path-based matching (with informative errors and an order-only fallback for ndarray inputs) and adjust target_format typing from str to BBoxFormat.
Add unit tests for PrecomputedDetectorRunner to verify path-based subset lookup, that inference preserves the requested image order, and that a ValueError is raised when a requested path is missing. Tests construct BBoxes/BBoxEntry fixtures with image_path values and call PrecomputedDetectorRunner.from_bboxes(...), then assert returned bboxes and bbox_scores match expected values and ordering.
Improve TrainingRunner._gpu_usage_str by checking torch.cuda.is_available() and passing the current device to torch.cuda.memory_reserved and torch.cuda.get_device_properties. This avoids querying GPU 0 implicitly and prevents errors on non-CUDA or multi-GPU setups; also removes leftover commented-out code.
Extract data config and add explicit handling for precomputed detector bboxes. When training a TOP_DOWN task with data.bbox_source='detection_bbox', require data.precomputed_bboxes to be configured (raise ValueError if missing) and use bbox_validate_image_paths as before. If precomputed_bboxes are present but bbox_source is not 'detection_bbox', emit an info log that the precomputed boxes will be ignored. This avoids silent misconfiguration and clarifies expected config fields.


2 participants