Inference time

| System | Model type | Runtime | Device type | Precision | Video | Video length (s) - # Frames | FPS | Frame size | Display settings | Pose model backbone | Avg Inference time ± Std (including 1st inference) | Avg Inference time ± Std | Average FPS ± Std | Model size |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Linux | ONNX | ONNX | CUDA | Full precision (FP32) | Ventral gait | 10s - 1.5k | 150 | (658,302) | None | ResNet50 (bu) | 29.02 ms ± 47.59 ms | 27.8 ms ± 2.32 ms | 36 ± 3 | 92.12 MB |
| Linux | ONNX | ONNX | CPU | Full precision (FP32) | Ventral gait | 10s - 1.5k | 150 | (658,302) | None | ResNet50 (bu) | 146.12 ms ± 13.26 ms | 146.11 ms ± 13.25 ms | 7 ± 1 | 92.12 MB |
| Linux | PyTorch | PyTorch | CUDA | Full precision (FP32) | Ventral gait | 10s - 1.5k | 150 | (658,302) | None | ResNet50 (bu) | 6.04 ms ± 7.37 ms | 5.97 ms ± 6.8 ms | 271 ± 112 | 96.5 MB |
| Linux | PyTorch | PyTorch | CPU | Full precision (FP32) | Ventral gait | 10s - 1.5k | 150 | (658,302) | None | ResNet50 (bu) | 365.26 ms ± 13.88 ms | 365.17 ms ± 13.44 ms | 3 ± 0 | 96.5 MB |
| Linux | ONNX | TensorRT | CUDA | Full precision (FP32) - no caching | Ventral gait | 10s - 1.5k | 150 | (658,302) | None | ResNet50 (bu) | 55.32 ms ± 1254.16 ms | 22.93 ms ± 0.88 ms | 44 ± 2 | 92.12 MB |
| Linux | ONNX | TensorRT | CUDA | Full precision (FP32) - engine caching | Ventral gait | 10s - 1.5k | 150 | (658,302) | None | ResNet50 (bu) | 20.8 ms ± 3.4 ms | 20.72 ms ± 1.25 ms | 48 ± 3 | 92.12 MB |
| Linux | ONNX | TensorRT | CUDA | FP16 | Ventral gait | 10s - 1.5k | 150 | (658,302) | None | ResNet50 (bu) | 34.37 ms ± 858.96 ms | 12.19 ms ± 0.87 ms | 82 ± 6 | 46.16 MB |
| Linux | ONNX | ONNX | CUDA | FP16 | Ventral gait | 10s - 1.5k | 150 | (658,302) | None | ResNet50 (bu) | 21.74 ms ± 43.24 ms | 20.62 ms ± 2.5 ms | 49 ± 5 | 46.16 MB |
| Linux | PyTorch | PyTorch | CUDA | FP32 | Ventral gait | 10s - 1.5k | 150 | (164,75) | Resize=0.25 | ResNet50 (bu) | 22.27 ms ± 12.5 ms | 22.16 ms ± 11.65 ms | 70 ± 68 | 96.5 MB |
| Linux | ONNX | ONNX | CUDA | FP32 | Ventral gait | 10s - 1.5k | 150 | (164,75) | Resize=0.25 | ResNet50 (bu) | 6.18 ms ± 37.03 ms | 5.22 ms ± 0.86 ms | 195 ± 25 | |
| Linux | ONNX | ONNX | CPU | FP32 | Ventral gait | 10s - 1.5k | 150 | (164,75) | Resize=0.25 | ResNet50 (bu) | 13.17 ms ± 1.25 ms | 13.17 ms ± 1.23 ms | 76 ± 4 | |
| Linux | ONNX | TensorRT | CUDA | FP32 | Ventral gait | 10s - 1.5k | 150 | (164,75) | Resize=0.25 | ResNet50 (bu) | 15.12 ms ± 458.27 ms | 3.28 ms ± 0.24 ms | 306 ± 23 | |
| Linux | ONNX | ONNX | CUDA | FP16 | Ventral gait | 10s - 1.5k | 150 | (164,75) | Resize=0.25 | ResNet50 (bu) | 5.83 ms ± 33.27 ms | 4.97 ms ± 1.5 ms | 214 ± 45 | |
| Linux | ONNX | ONNX | CUDA | FP16 | Pigeon | 36s - ~1k | 30 | (480, 270) | Resize=0.25 | ResNet50 + SSDLite detector (td) | 17.08 ms ± 139.91 ms | 12.82 ms ± 1.52 ms | 79 ± 8 | 45.50 MB |
| Linux | ONNX | ONNX | CUDA | FP32 | Pigeon | 36s - ~1k | 30 | (480, 270) | Resize=0.25 | ResNet50 + SSDLite detector (td) | 25.06 ms ± 129.74 ms | 21.1 ms ± 0.82 ms | 47 ± 2 | 90.79 MB |
| Linux | ONNX | TensorRT | CUDA | FP32 | Pigeon | 36s - ~1k | 30 | (480, 270) | Resize=0.25 | ResNet50 + SSDLite detector (td) | 6.18 ms ± 1376.44 ms | 14.22 ms ± 0.48 ms | 70 ± 3 | |
| Linux | ONNX | TensorRT | CUDA | FP16 | Pigeon | 36s - ~1k | 30 | (480, 270) | Resize=0.25 | ResNet50 + SSDLite detector (td) | 49.81 ms ± 1361.7 ms | 8.3 ms ± 0.75 ms | 121 ± 11 | |
| Linux | PyTorch | PyTorch | CUDA | FP32 | Pigeon | 36s - ~1k | 30 | (480, 270) | Resize=0.25 | ResNet50 + SSDLite detector (td) | 7.7 ms ± 5.38 ms | 7.78 ms ± 6.0 ms | 185 ± 96 | |
| Linux | PyTorch | PyTorch | CPU | FP32 | Pigeon | 36s - ~1k | 30 | (480, 270) | Resize=0.25 | ResNet50 + SSDLite detector (td) | 167.33 ms ± 21.0 ms | 167.32 ms ± 21.01 ms | 6 ± 1 | |
| Linux | ONNX | ONNX | CPU | FP32 | Pigeon | 36s - ~1k | 30 | (480, 270) | Resize=0.25 | ResNet50 + SSDLite detector (td) | 85.64 ms ± 8.23 ms | 85.65 ms ± 8.23 ms | 12 ± 1 | |
| Linux | ONNX | ONNX | CPU | FP16 | Pigeon | 36s - ~1k | 30 | (480, 270) | Resize=0.25 | ResNet50 + SSDLite detector (td) | 161.32 ms ± 18.29 ms | 161.3 ms ± 18.29 ms | 6 ± 1 | |
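
The two timing columns and the FPS column can be reproduced from a list of per-frame latencies. The sketch below is one possible way to do it, not necessarily how these numbers were produced: the function name `summarize_latencies`, the use of the population standard deviation, and computing FPS as the mean of per-frame `1000 / latency` are all assumptions, and the sample latencies are made up for illustration.

```python
import statistics


def summarize_latencies(latencies_ms):
    """Summarize per-frame latencies (in ms) the way the table columns are laid out.

    Hypothetical helper: returns (mean, std) pairs for
    - all frames, including the first inference,
    - all frames except the first inference,
    - per-frame FPS, excluding the first inference.
    """
    if len(latencies_ms) < 2:
        raise ValueError("need at least two measurements")

    def mean_std(values):
        # Population standard deviation is an assumption; a sample
        # standard deviation (statistics.stdev) would work the same way.
        return statistics.mean(values), statistics.pstdev(values)

    steady = latencies_ms[1:]  # drop the first (warm-up) inference
    return {
        "incl_first": mean_std(latencies_ms),
        "excl_first": mean_std(steady),
        "fps": mean_std([1000.0 / t for t in steady]),
    }


# Made-up latencies: one slow first inference followed by steady frames.
summary = summarize_latencies([1020.0, 20.0, 20.0, 20.0, 20.0])
print(summary["incl_first"])  # mean and std dominated by the first frame
print(summary["excl_first"])  # steady-state mean and std
print(summary["fps"])         # steady-state frames per second
```

A single slow first frame is enough to move the mean and inflate the standard deviation of the first column, which matches the large ± values in the TensorRT rows.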

** CUDA: NVIDIA GeForce RTX 3050 (6GB) ** CPU: 13th Gen Intel Core i7-13620H × 16 ** Linux: Ubuntu 24.04 LTS

^ TensorRT engine startup at the first inference takes between 30 and 50 seconds, which inflates the average and standard deviation in the column that includes the first inference. Engine caching reduces this startup time.
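
If the TensorRT runtime here is ONNX Runtime's TensorRT execution provider (an assumption; the table does not say how TensorRT was invoked), engine caching and FP16 are controlled through provider options. The sketch below only builds the provider list; the helper name `tensorrt_providers` and the cache directory `trt_cache` are hypothetical.

```python
def tensorrt_providers(cache_dir, fp16=False):
    """Build an ONNX Runtime provider list with TensorRT engine caching.

    Hypothetical helper. The option keys are ONNX Runtime TensorRT
    execution provider options; the fallback order is an assumption.
    """
    trt_options = {
        "trt_engine_cache_enable": True,          # reuse a previously built engine
        "trt_engine_cache_path": str(cache_dir),  # where the serialized engine is stored
        "trt_fp16_enable": bool(fp16),            # build the engine in FP16
    }
    return [
        ("TensorrtExecutionProvider", trt_options),
        "CUDAExecutionProvider",  # fallback for nodes TensorRT cannot handle
        "CPUExecutionProvider",
    ]


providers = tensorrt_providers("trt_cache", fp16=True)

# Usage with ONNX Runtime (requires onnxruntime-gpu; "model.onnx" is a placeholder):
#   import onnxruntime as ort
#   session = ort.InferenceSession("model.onnx", providers=providers)
```

With caching enabled, the first run still pays the engine build cost, and later runs load the cached engine from `cache_dir` instead of rebuilding it.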