Addition of BUCTD Models #2952

Merged: MMathisLab merged 114 commits into main from lucas/buctd_v2 on Apr 14, 2025

@n-poulsen (Contributor) commented Apr 11, 2025

BUCTD Pose Estimation Models

BUCTD is a state-of-the-art algorithm for pose estimation of animals (and humans) in crowded scenes. This PR adds it directly to the DeepLabCut code base. Here is the standalone paper code for single-image inference. This PR also significantly expands the code to track individuals in videos.

Paper: Rethinking pose estimation in crowds: overcoming the detection information bottleneck and ambiguity

[Figure: BUCTD_fig1]

Ref:

@InProceedings{Zhou_2023_ICCV,
    author    = {Zhou, Mu and Stoffl, Lucas and Mathis, Mackenzie Weygandt and Mathis, Alexander},
    title     = {Rethinking Pose Estimation in Crowds: Overcoming the Detection Information Bottleneck and Ambiguity},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2023},
    pages     = {14689-14699}
}

Tracking Performance

The following video was produced with the example notebook examples/COLAB/Demo_BUCTD_and_CTD_tracking.ipynb. The model was trained on 78 images containing the 3 mice, using the model's default parameters.

[Video: trimouse_1DLC_CtdCoamW32_trimiceJun22shuffle11_snapshot_090_el_id_p10_labeled-ezgif com-optimize]

Models

Both CoAM and PreNet BUCTD architectures are added. Some base PreNet architectures are added (most notably ctd_prenet_cspnext_m and ctd_prenet_cspnext_x), but of course any backbone/pose head can be used as a PreNet BUCTD model.

Training and Model Configuration

BUCTD models are trained using generative sampling. The configuration for generative sampling during training is stored in the pytorch_config under the data: gen_sampling key:

data:
  gen_sampling:
    keypoint_sigmas: 0.1
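
As a rough illustration of what generative sampling does, the sketch below perturbs a ground-truth pose to produce a synthetic condition. It is not DeepLabCut's implementation: `jitter_pose` is a hypothetical helper, and the assumption that `keypoint_sigmas` scales Gaussian noise relative to the pose's bounding-box diagonal is ours, not something stated in this PR.

```python
import numpy as np


def jitter_pose(keypoints, keypoint_sigmas=0.1, rng=None):
    """Return a noisy copy of a (num_keypoints, 2) pose.

    Hypothetical helper. Assumption: the noise standard deviation for each
    keypoint is its sigma times the diagonal of the pose's bounding box.
    """
    rng = np.random.default_rng() if rng is None else rng
    keypoints = np.asarray(keypoints, dtype=float)
    # Accept a single sigma or one sigma per keypoint.
    sigmas = np.broadcast_to(
        np.asarray(keypoint_sigmas, dtype=float), (len(keypoints),)
    )
    extent = keypoints.max(axis=0) - keypoints.min(axis=0)
    scale = float(np.linalg.norm(extent))
    noise = rng.normal(size=keypoints.shape) * (sigmas[:, None] * scale)
    return keypoints + noise


pose = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
condition = jitter_pose(pose, keypoint_sigmas=0.1, rng=np.random.default_rng(0))
```

Training on conditions like `condition` instead of perfect poses is what lets the model learn to correct imperfect inputs at inference time.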

BUCTD models require conditions from a bottom-up (BU) model for evaluation. This can be configured through the data key as well:

# Example: Loading the predictions for snapshot-250.pt of shuffle 1.
data:
  conditions:
    shuffle: 1
    snapshot: snapshot-250.pt

# Example: Loading the predictions for the last snapshot of shuffle 6.
data:
  conditions:
    shuffle: 6
    snapshot_index: -1
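
The two examples above differ only in how the snapshot is chosen: by file name or by index. A minimal sketch of that selection logic, assuming a sorted list of snapshot names is available; `select_snapshot` is a hypothetical helper, and requiring exactly one of the two keys is our assumption rather than documented behavior:

```python
def select_snapshot(snapshots, snapshot=None, snapshot_index=None):
    """Pick a snapshot name from a sorted list (hypothetical helper).

    Assumption: exactly one of `snapshot` or `snapshot_index` is given.
    """
    if (snapshot is None) == (snapshot_index is None):
        raise ValueError("Set exactly one of 'snapshot' or 'snapshot_index'.")
    if snapshot is not None:
        if snapshot not in snapshots:
            raise ValueError(f"Unknown snapshot: {snapshot}")
        return snapshot
    # Negative indices count from the end, so -1 is the last snapshot.
    return snapshots[snapshot_index]


available = ["snapshot-050.pt", "snapshot-150.pt", "snapshot-250.pt"]
by_name = select_snapshot(available, snapshot="snapshot-250.pt")
by_index = select_snapshot(available, snapshot_index=-1)
```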

Tracking

One of the big advantages of having a CTD model is that it can be used to track individuals directly! Let's say you have the pose for your animals at frame T. Then you can use those poses as conditions for frame T+1, and let your CTD model simply "update" the poses depending on how much your mice moved.

In the simplest scenario, you only need to run the BU model on the first frame, and then the CTD model takes over for inference and tracking:

  1. Run the BU model to generate conditions for the 1st frame of the video
  2. For every frame after that, use the predictions from the previous frame as conditions
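
The two steps above can be sketched as a short loop. Here `bu_model` and `ctd_model` are hypothetical callables standing in for the real models (they are not DeepLabCut API): `bu_model(frame)` returns poses, and `ctd_model(frame, conditions)` returns refined poses.

```python
def track_video(frames, bu_model, ctd_model):
    """Track with a CTD model, seeding conditions from a BU model.

    Both models are hypothetical callables used for illustration only.
    """
    poses_per_frame = []
    conditions = None
    for index, frame in enumerate(frames):
        if index == 0:
            # Step 1: the BU model provides conditions for the first frame.
            conditions = bu_model(frame)
        poses = ctd_model(frame, conditions)
        poses_per_frame.append(poses)
        # Step 2: predictions become the conditions for the next frame.
        conditions = poses
    return poses_per_frame
```

Because each individual keeps its position in the list of conditions from one frame to the next, identity is carried along without a separate tracking step.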

However, this may not fit your scenario perfectly. Maybe all the mice aren't present in the first frame, and if they aren't detected by the BU model they'll never be tracked. Maybe at some point the CTD model makes an error and you lose track of a mouse. There are some options to deal with this:

  • Run the BU model every time at least one animal is not detected (if you expect N animals to be in the video and you only detect N-1 animals, run the BU model):
    • In this case, the predictions from the BU model need to be "merged in" to the existing N-1 tracks
    • We can merge them in by using a similarity score between poses (OKS) which ranges from 0 to 1
    • You likely don't want to run the BU model every frame, as this would slow down inference.
  • Run the BU model every K frames in case a new animal appears
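
A minimal sketch of the OKS-based merge described above. The `oks` function follows the COCO-style definition and is not DeepLabCut's `calc_object_keypoint_similarity`; `merge_bu_predictions` and the 0.5 threshold are hypothetical choices for illustration, and poses with missing keypoints are not handled.

```python
import numpy as np


def oks(pose_a, pose_b, sigmas=0.1):
    """COCO-style object keypoint similarity between two (K, 2) poses.

    Returns a value in [0, 1]; `sigmas` may be a scalar or one per keypoint.
    """
    a = np.asarray(pose_a, dtype=float)
    b = np.asarray(pose_b, dtype=float)
    sigmas = np.broadcast_to(np.asarray(sigmas, dtype=float), (len(a),))
    # Normalize distances by the bounding-box area of the reference pose.
    extent = a.max(axis=0) - a.min(axis=0)
    area = max(float(extent[0] * extent[1]), 1e-9)
    sq_dist = np.sum((a - b) ** 2, axis=1)
    variances = (2 * sigmas) ** 2
    return float(np.mean(np.exp(-sq_dist / (2 * area * variances))))


def merge_bu_predictions(tracks, bu_poses, max_animals, oks_threshold=0.5):
    """Add BU poses that match no existing track (hypothetical helper)."""
    merged = list(tracks)
    for pose in bu_poses:
        if len(merged) >= max_animals:
            break
        # A low OKS against every track suggests a not-yet-tracked animal.
        if all(oks(track, pose) < oks_threshold for track in merged):
            merged.append(pose)
    return merged
```

BU poses that overlap strongly with an existing track are treated as re-detections and dropped, so only missing animals are added, up to the expected count.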

Docs & Examples

  • Docs were added for different approaches to pose estimation in: docs/pytorch/architectures.md
  • A new COLAB notebook was added to train a CTD model: examples/COLAB/Demo_BUCTD_and_CTD_tracking.ipynb

Bug fixes & Improvements

  • calc_object_keypoint_similarity: allow users to pass arrays to have different OKS sigmas for each keypoint
  • users can now get the scorer from the DLCLoader and a Snapshot
  • Loaders have a method to list snapshots

@MMathisLab (Member) left a comment

🚀🔥

@MMathisLab MMathisLab merged commit cfce4ba into main Apr 14, 2025
@maximpavliv maximpavliv added the CTD Conditional Top-Down label May 15, 2025
@MMathisLab MMathisLab deleted the lucas/buctd_v2 branch June 15, 2025 13:37
@deruyter92 deruyter92 mentioned this pull request May 21, 2026