Addition of BUCTD Models #2952
Merged
AlexEMG
approved these changes
Apr 14, 2025
This was referenced May 21, 2025
Closed
BUCTD Pose Estimation Models
BUCTD is a state-of-the-art pose estimation algorithm for crowded scenes with animals (and humans). This PR adds it directly to the DeepLabCut code base. The standalone paper code covers single-image inference; this PR also significantly expands the code to track individuals in videos.
Paper: Rethinking pose estimation in crowds: overcoming the detection information bottleneck and ambiguity
Ref:
Tracking Performance
The following video was produced through the example notebook `examples/COLAB/Demo_BUCTD_and_CTD_tracking.ipynb`. The model was trained on 78 images containing the 3 mice, with the default parameters for the model.
Models
Both CoAM and PreNet BUCTD architectures are added. Some base PreNet architectures are included (most notably `ctd_prenet_cspnext_m` and `ctd_prenet_cspnext_x`), but any backbone/pose head can be used as a PreNet BUCTD model.
Training and Model Configuration
BUCTD models are trained using generative sampling. The configuration for generative sampling during training is stored in the `pytorch_config` under the `data: gen_sampling` key:
BUCTD models require conditions from a bottom-up (BU) model for evaluation. This can be configured through the `data` key as well:
Tracking
One of the big advantages of having a CTD model is that it can be used to track individuals directly! Let's say you have the pose for your animals at frame T. Then you can use those poses as conditions for frame T+1, and let your CTD model simply "update" the poses depending on how much your mice moved.
In the simplest scenario, you only need to run the BU model on the first frame, and then the CTD model takes over for inference and tracking:
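As a rough illustration of that loop (not the DeepLabCut API), here is a minimal sketch. It assumes two hypothetical callables: `bu_model(frame)` returns poses of shape `(individuals, keypoints, 2)`, and `ctd_model(frame, conditions)` returns one refined pose per condition, in the same order. The toy models are stand-ins invented for this example so that it runs on its own.

```python
import numpy as np


def track_with_ctd(frames, bu_model, ctd_model):
    """Run the BU model on the first frame only, then let the CTD model
    update the previous frame's poses. Row i of every output is the same
    individual, as long as the CTD model keeps the order of its conditions."""
    tracked = []
    conditions = None
    for frame in frames:
        if conditions is None:
            poses = bu_model(frame)  # frame 0: bottom-up detection
        else:
            poses = ctd_model(frame, conditions)  # frame T+1: update poses from T
        tracked.append(poses)
        conditions = poses
    return tracked


# Toy stand-ins (hypothetical): each "frame" is simply the array of true poses.
def toy_bu(frame):
    return frame.copy()


def toy_ctd(frame, conditions):
    # For each condition, return the pose in the frame that is closest to it.
    out = np.empty_like(conditions)
    for i, cond in enumerate(conditions):
        dists = np.linalg.norm(frame - cond, axis=-1).mean(axis=-1)
        out[i] = frame[np.argmin(dists)]
    return out


base = np.array([[[0.0, 0.0], [1.0, 1.0]], [[100.0, 100.0], [101.0, 101.0]]])
frames = [base + t for t in range(3)]
frames[1] = frames[1][::-1]  # detections arrive in a different order
tracked = track_with_ctd(frames, toy_bu, toy_ctd)
```

In this toy run, identities stay in the same rows even though the second frame lists the individuals in reverse order, because each pose is derived from its own condition rather than re-detected.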
However, this may not fit your scenario perfectly. Not all mice may be present in the first frame, and any mouse the BU model fails to detect there will never be tracked. The CTD model may also make an error at some point, so that you lose track of a mouse. There are some options to deal with this:
Docs & Examples
`docs/pytorch/architectures.md`
`examples/COLAB/Demo_BUCTD_and_CTD_tracking.ipynb`
Bug fixes & Improvements
`calc_object_keypoint_similarity`: allow users to pass arrays to have different OKS sigmas for each keypoint
`scorer` from the `DLCLoader` and a `Snapshot`
`Loader`s have a method to list snapshots
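To illustrate what per-keypoint OKS sigmas mean, here is a minimal NumPy sketch of a COCO-style object keypoint similarity. It is not the code of `calc_object_keypoint_similarity`; the function name `oks_sketch`, its signature, and the normalization `exp(-d^2 / (2 * scale^2 * (2 * sigma)^2))` are assumptions made for this example, and the DeepLabCut implementation may differ.

```python
import numpy as np


def oks_sketch(pred, gt, scale, sigmas):
    """COCO-style OKS for one individual (illustrative only).

    pred, gt: arrays of shape (num_keypoints, 2); NaN rows in gt are unlabeled.
    scale: object scale (e.g. square root of the bounding-box area).
    sigmas: a single float, or one value per keypoint.
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    # A scalar sigma is broadcast so that both input styles share one code path.
    sigmas = np.broadcast_to(np.asarray(sigmas, dtype=float), (gt.shape[0],))
    labeled = ~np.isnan(gt).any(axis=1)
    if not labeled.any():
        return float("nan")
    d2 = ((pred[labeled] - gt[labeled]) ** 2).sum(axis=1)
    k2 = (2.0 * sigmas[labeled]) ** 2
    return float(np.mean(np.exp(-d2 / (2.0 * scale**2 * k2))))


gt = np.array([[0.0, 0.0], [10.0, 10.0]])
pred = np.array([[1.0, 0.0], [10.0, 10.0]])
uniform = oks_sketch(pred, gt, scale=10.0, sigmas=0.1)
per_keypoint = oks_sketch(pred, gt, scale=10.0, sigmas=[0.2, 0.1])
```

A larger sigma makes the score more tolerant of errors on that keypoint, which is why hard-to-localize keypoints are usually given larger values than sharply defined ones.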