
Commit 49e23b4

Adding links to performance benchmark page
1 parent: 3d8d878

52 files changed: +104 −0 lines changed


CUDA-Optimized/FastSpeech/README.md

Lines changed: 2 additions & 0 deletions

```diff
@@ -315,6 +315,8 @@ Sample result waveforms are [FP32](fastspeech/trt/samples) and [FP16](fastspeech
 
 ## Performance
 
+The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).
+
 ### Benchmarking
 
 The following section shows how to run benchmarks measuring the model performance in training and inference modes.
```

Kaldi/SpeechRecognition/README.md

Lines changed: 2 additions & 0 deletions

```diff
@@ -192,6 +192,8 @@ you can set `count` to `1` in the [`instance_group` section](https://docs.nvidia
 
 ## Performance
 
+The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).
+
 
 ### Metrics
 
```

MxNet/Classification/RN50v1.5/README.md

Lines changed: 2 additions & 0 deletions

```diff
@@ -552,6 +552,8 @@ By default:
 
 ## Performance
 
+The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).
+
 ### Benchmarking
 
 To benchmark training and inference, run:
```

PyTorch/Classification/ConvNets/efficientnet/README.md

Lines changed: 2 additions & 0 deletions

```diff
@@ -492,6 +492,8 @@ Quantized models could also be used to classify new images using the `classify.p
 
 ## Performance
 
+The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).
+
 ### Benchmarking
 
 The following section shows how to run benchmarks measuring the model performance in training and inference modes.
```

PyTorch/Classification/ConvNets/resnet50v1.5/README.md

Lines changed: 2 additions & 0 deletions

```diff
@@ -498,6 +498,8 @@ To run inference on JPEG image using pretrained weights:
 
 ## Performance
 
+The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).
+
 ### Benchmarking
 
 The following section shows how to run benchmarks measuring the model performance in training and inference modes.
```

PyTorch/Classification/ConvNets/resnext101-32x4d/README.md

Lines changed: 2 additions & 0 deletions

```diff
@@ -481,6 +481,8 @@ To run inference on JPEG image using pretrained weights:
 
 ## Performance
 
+The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).
+
 ### Benchmarking
 
 The following section shows how to run benchmarks measuring the model performance in training and inference modes.
```

PyTorch/Classification/ConvNets/se-resnext101-32x4d/README.md

Lines changed: 2 additions & 0 deletions

```diff
@@ -483,6 +483,8 @@ To run inference on JPEG image using pretrained weights:
 
 ## Performance
 
+The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).
+
 ### Benchmarking
 
 The following section shows how to run benchmarks measuring the model performance in training and inference modes.
```

PyTorch/Classification/ConvNets/triton/resnet50/README.md

Lines changed: 2 additions & 0 deletions

```diff
@@ -325,6 +325,8 @@ we can consider that all clients are local.
 
 ## Performance
 
+The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).
+
 
 ### Offline scenario
 This table lists the common variable parameters for all performance measurements:
```

PyTorch/Classification/ConvNets/triton/resnext101-32x4d/README.md

Lines changed: 2 additions & 0 deletions

```diff
@@ -194,6 +194,8 @@ To process static configuration logs, `triton/scripts/process_output.sh` script
 
 ## Performance
 
+The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).
+
 ### Dynamic batching performance
 The Triton Inference Server has a dynamic batching mechanism built-in that can be enabled. When it is enabled, the server creates inference batches from multiple received requests. This allows us to achieve better performance than doing inference on each single request. The single request is assumed to be a single image that needs to be inferenced. With dynamic batching enabled, the server will concatenate single image requests into an inference batch. The upper bound of the size of the inference batch is set to 64. All these parameters are configurable.
 
```
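The dynamic batching behavior described in the context above is driven by the model's Triton configuration file. As a rough, hedged sketch (the preferred batch sizes and queue delay below are illustrative values, not the settings shipped with this repository), a `config.pbtxt` enabling dynamic batching with a batch-size cap of 64 might look like:

```protobuf
# Illustrative Triton model configuration fragment (config.pbtxt).
# Only max_batch_size: 64 is stated in the README; the other values are examples.
max_batch_size: 64   # upper bound on the size of a dynamically formed batch

dynamic_batching {
  preferred_batch_size: [ 16, 32, 64 ]   # batch sizes the scheduler tries to form
  max_queue_delay_microseconds: 100      # how long a request may wait while a batch fills
}
```

Raising `max_queue_delay_microseconds` generally trades single-request latency for larger batches and higher throughput.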

PyTorch/Classification/ConvNets/triton/se-resnext101-32x4d/README.md

Lines changed: 2 additions & 0 deletions

```diff
@@ -195,6 +195,8 @@ To process static configuration logs, `triton/scripts/process_output.sh` script
 
 ## Performance
 
+The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).
+
 ### Dynamic batching performance
 The Triton Inference Server has a dynamic batching mechanism built-in that can be enabled. When it is enabled, the server creates inference batches from multiple received requests. This allows us to achieve better performance than doing inference on each single request. The single request is assumed to be a single image that needs to be inferenced. With dynamic batching enabled, the server will concatenate single image requests into an inference batch. The upper bound of the size of the inference batch is set to 64. All these parameters are configurable.
 
```
