DLC Live Benchmark.md

Inference time

| System | Model type | Runtime | Device type | Precision | Video | Video length - # frames | Video FPS | Frame size | Display settings | Pose model backbone | Avg inference time ± std (including 1st inference) | Avg inference time ± std | Average FPS ± std | Model size |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Linux | ONNX | ONNX | CUDA | FP32 | Ventral gait | 10 s - 1.5k | 150 | (658, 302) | None | `ResNet50` (bu) | 29.02 ms ± 47.59 ms | 27.8 ms ± 2.32 ms | 36 ± 3 | 92.12 MB |
| Linux | ONNX | ONNX | CPU | FP32 | Ventral gait | 10 s - 1.5k | 150 | (658, 302) | None | `ResNet50` (bu) | 146.12 ms ± 13.26 ms | 146.11 ms ± 13.25 ms | 7 ± 1 | 92.12 MB |
| Linux | PyTorch | PyTorch | CUDA | FP32 | Ventral gait | 10 s - 1.5k | 150 | (658, 302) | None | `ResNet50` (bu) | 6.04 ms ± 7.37 ms | 5.97 ms ± 6.8 ms | 271 ± 112 | 96.5 MB |
| Linux | PyTorch | PyTorch | CPU | FP32 | Ventral gait | 10 s - 1.5k | 150 | (658, 302) | None | `ResNet50` (bu) | 365.26 ms ± 13.88 ms | 365.17 ms ± 13.44 ms | 3 ± 0 | 96.5 MB |
| Linux | ONNX | TensorRT | CUDA | FP32, no caching | Ventral gait | 10 s - 1.5k | 150 | (658, 302) | None | `ResNet50` (bu) | 55.32 ms ± 1254.16 ms | 22.93 ms ± 0.88 ms | 44 ± 2 | 92.12 MB |
| Linux | ONNX | TensorRT | CUDA | FP32, engine caching | Ventral gait | 10 s - 1.5k | 150 | (658, 302) | None | `ResNet50` (bu) | 20.8 ms ± 3.4 ms | 20.72 ms ± 1.25 ms | 48 ± 3 | 92.12 MB |
| Linux | ONNX | TensorRT | CUDA | FP16 | Ventral gait | 10 s - 1.5k | 150 | (658, 302) | None | `ResNet50` (bu) | 34.37 ms ± 858.96 ms | 12.19 ms ± 0.87 ms | 82 ± 6 | 46.16 MB |
| Linux | ONNX | ONNX | CUDA | FP16 | Ventral gait | 10 s - 1.5k | 150 | (658, 302) | None | `ResNet50` (bu) | 21.74 ms ± 43.24 ms | 20.62 ms ± 2.5 ms | 49 ± 5 | 46.16 MB |
| Linux | PyTorch | PyTorch | CUDA | FP32 | Ventral gait | 10 s - 1.5k | 150 | (164, 75) | Resize=0.25 | `ResNet50` (bu) | 22.27 ms ± 12.5 ms | 22.16 ms ± 11.65 ms | 70 ± 68 | 96.5 MB |
| Linux | ONNX | ONNX | CUDA | FP32 | Ventral gait | 10 s - 1.5k | 150 | (164, 75) | Resize=0.25 | `ResNet50` (bu) | 6.18 ms ± 37.03 ms | 5.22 ms ± 0.86 ms | 195 ± 25 | |
| Linux | ONNX | ONNX | CPU | FP32 | Ventral gait | 10 s - 1.5k | 150 | (164, 75) | Resize=0.25 | `ResNet50` (bu) | 13.17 ms ± 1.25 ms | 13.17 ms ± 1.23 ms | 76 ± 4 | |
| Linux | ONNX | TensorRT | CUDA | FP32 | Ventral gait | 10 s - 1.5k | 150 | (164, 75) | Resize=0.25 | `ResNet50` (bu) | 15.12 ms ± 458.27 ms | 3.28 ms ± 0.24 ms | 306 ± 23 | |
| Linux | ONNX | ONNX | CUDA | FP16 | Ventral gait | 10 s - 1.5k | 150 | (164, 75) | Resize=0.25 | `ResNet50` (bu) | 5.83 ms ± 33.27 ms | 4.97 ms ± 1.5 ms | 214 ± 45 | |
| Linux | ONNX | ONNX | CUDA | FP16 | Pigeon | 36 s - ~1k | 30 | (480, 270) | Resize=0.25 | `ResNet50 + SSDLite detector` (td) | 17.08 ms ± 139.91 ms | 12.82 ms ± 1.52 ms | 79 ± 8 | 45.50 MB |
| Linux | ONNX | ONNX | CUDA | FP32 | Pigeon | 36 s - ~1k | 30 | (480, 270) | Resize=0.25 | `ResNet50 + SSDLite detector` (td) | 25.06 ms ± 129.74 ms | 21.1 ms ± 0.82 ms | 47 ± 2 | 90.79 MB |
| Linux | ONNX | TensorRT | CUDA | FP32 | Pigeon | 36 s - ~1k | 30 | (480, 270) | Resize=0.25 | `ResNet50 + SSDLite detector` (td) | 6.18 ms ± 1376.44 ms | 14.22 ms ± 0.48 ms | 70 ± 3 | |
| Linux | ONNX | TensorRT | CUDA | FP16 | Pigeon | 36 s - ~1k | 30 | (480, 270) | Resize=0.25 | `ResNet50 + SSDLite detector` (td) | 49.81 ms ± 1361.7 ms | 8.3 ms ± 0.75 ms | 121 ± 11 | |
| Linux | PyTorch | PyTorch | CUDA | FP32 | Pigeon | 36 s - ~1k | 30 | (480, 270) | Resize=0.25 | `ResNet50 + SSDLite detector` (td) | 7.7 ms ± 5.38 ms | 7.78 ms ± 6.0 ms | 185 ± 96 | |
| Linux | PyTorch | PyTorch | CPU | FP32 | Pigeon | 36 s - ~1k | 30 | (480, 270) | Resize=0.25 | `ResNet50 + SSDLite detector` (td) | 167.33 ms ± 21.0 ms | 167.32 ms ± 21.01 ms | 6 ± 1 | |
| Linux | ONNX | ONNX | CPU | FP32 | Pigeon | 36 s - ~1k | 30 | (480, 270) | Resize=0.25 | `ResNet50 + SSDLite detector` (td) | 85.64 ms ± 8.23 ms | 85.65 ms ± 8.23 ms | 12 ± 1 | |
| Linux | ONNX | ONNX | CPU | FP16 | Pigeon | 36 s - ~1k | 30 | (480, 270) | Resize=0.25 | `ResNet50 + SSDLite detector` (td) | 161.32 ms ± 18.29 ms | 161.3 ms ± 18.29 ms | 6 ± 1 | |
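
The three timing columns in the table could be derived from one list of per-frame latencies, as in the minimal sketch below. It is not the benchmark's actual code: the function name `summarize`, the sample timings, and the use of the population standard deviation are all assumptions. Averaging FPS per frame (rather than taking 1000 / mean latency) is also an assumption, chosen because it matches rows such as PyTorch on CUDA, where 271 FPS is reported against a 5.97 ms mean latency (1000 / 5.97 is roughly 168).

```python
import statistics


def summarize(latencies_ms):
    """Summarize per-frame latencies (ms) with and without the first call.

    Hypothetical helper: the source does not say how its columns were computed.
    """
    if len(latencies_ms) < 2:
        raise ValueError("need at least two timings")

    # Drop the first inference, which usually carries one-off startup cost.
    steady = latencies_ms[1:]

    # Assumption: FPS is computed per frame and then averaged.
    fps = [1000.0 / t for t in steady]

    return {
        "with_first": (statistics.mean(latencies_ms), statistics.pstdev(latencies_ms)),
        "steady": (statistics.mean(steady), statistics.pstdev(steady)),
        "fps": (statistics.mean(fps), statistics.pstdev(fps)),
    }


# Made-up timings: one slow warm-up call followed by steady-state frames.
timings = [500.0, 20.0, 25.0, 20.0, 25.0]
summary = summarize(timings)

for name, (mean, std) in summary.items():
    print(f"{name}: {mean:.2f} ± {std:.2f}")
```

A single slow first call moves the mean only moderately but inflates the standard deviation a great deal, which is the pattern visible in the TensorRT rows.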

- CUDA: NVIDIA GeForce RTX 3050 (6 GB)
- CPU: 13th Gen Intel Core i7-13620H × 16
- Linux: Ubuntu 24.04 LTS

^ A TensorRT engine takes between 30 and 50 seconds to start up at the first inference, which skews the average inference time that includes the first inference. Engine caching is used to reduce that startup time.
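
As an illustration of engine caching, the sketch below builds a provider list with caching enabled. It assumes the TensorRT rows were run through ONNX Runtime's TensorRT execution provider, which the source does not confirm. The option keys `trt_engine_cache_enable`, `trt_engine_cache_path`, and `trt_fp16_enable` are ONNX Runtime provider options; the helper name `tensorrt_providers`, the cache directory `trt_cache`, and the file name `model.onnx` are hypothetical.

```python
def tensorrt_providers(cache_dir, fp16=False):
    """Build an ONNX Runtime provider list with TensorRT engine caching.

    Hypothetical helper; cache_dir is assumed to exist and be writable.
    """
    trt_options = {
        # Reuse a previously built engine instead of rebuilding on startup.
        "trt_engine_cache_enable": True,
        "trt_engine_cache_path": str(cache_dir),
        # Request FP16 kernels where the GPU supports them.
        "trt_fp16_enable": bool(fp16),
    }
    # Providers are tried in order, so CUDA and CPU act as fallbacks.
    return [
        ("TensorrtExecutionProvider", trt_options),
        "CUDAExecutionProvider",
        "CPUExecutionProvider",
    ]


providers = tensorrt_providers("trt_cache", fp16=True)

# With a TensorRT-enabled onnxruntime build installed, the list would be
# passed to a session (not executed here, so the sketch runs anywhere):
#   import onnxruntime as ort
#   session = ort.InferenceSession("model.onnx", providers=providers)
```

With caching enabled, the first run still pays the engine build cost, and later runs with the same model and settings can load the cached engine.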
