Commit ced4afc

Merge pull request NVIDIA#596 from NVIDIA/gh/release
[Transformer-XL/TF] Updated perf table
2 parents 3337f72 + 40c3be6 commit ced4afc

1 file changed

Lines changed: 5 additions & 3 deletions

TensorFlow/LanguageModeling/Transformer-XL/README.md

@@ -40,7 +40,6 @@ to achieve state-of-the-art accuracy and is tested and maintained by NVIDIA.
 * [Training accuracy: NVIDIA DGX-2 (16x V100 32G)](#training-accuracy-nvidia-dgx-2-16x-v100-32gb)
 * [Base model](#base-model-2)
 * [Training loss plot](#training-loss-plot)
-* [Base model](#base-model-3)
 * [Training stability test](#training-stability-test)
 * [Base model](#base-model-4)
 * [Training performance results](#training-performance-results)
@@ -893,8 +892,11 @@ training iterations.

 |**GPUs**|**Batch Size / GPU**|**Throughput - TF32 (tok/s)**|**Throughput - Mixed precision (tok/s)**|**Throughput speedup (TF32 to Mixed precision)**|**Weak Scaling - TF32**|**Weak Scaling - Mixed precision**|
 |-------:|-------------------:|----------------------------:|---------------------------------------:|-----------------------------------------------:|----------------------:|---------------------------------:|
-| 1 | 16 | 34,244 | 36,455 | 1.065 | 1.000 | 6.555 |
-| 8 | 32 | 224,474 | 227,502 | 1.013 | 8.636 | 6.241 |
+| 1 | 16 | 25,127 | 26,130 | 1.040 | 1.000 | 1.000 |
+| 1 | 32 | 30,958 | 33,117 | 1.070 | 1.000 | 1.000 |
+| 1 | 64 | 34,244 | 36,455 | 1.065 | 1.000 | 1.000 |
+| 8 | 16 | 157,538 | 155,656 | 0.988 | 6.270 | 5.957 |
+| 8 | 32 | 224,474 | 227,502 | 1.013 | 7.251 | 6.870 |

 To achieve these same results, follow the steps in the [Quick Start Guide](#quick-start-guide).
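The derived columns in the added rows appear to follow directly from the raw throughput figures. The sketch below assumes that speedup is mixed-precision throughput divided by TF32 throughput, and that weak scaling is multi-GPU throughput divided by single-GPU throughput at the same per-GPU batch size. These definitions are inferred from the numbers, not stated in the excerpt, and the function names are hypothetical.

```python
# Throughput in tok/s from the added table rows, keyed by (GPUs, batch size per GPU).
# Each value is (TF32 throughput, mixed-precision throughput).
THROUGHPUT = {
    (1, 16): (25_127, 26_130),
    (1, 32): (30_958, 33_117),
    (1, 64): (34_244, 36_455),
    (8, 16): (157_538, 155_656),
    (8, 32): (224_474, 227_502),
}


def speedup(gpus, batch):
    """Assumed definition: mixed-precision throughput over TF32 throughput."""
    tf32, mixed = THROUGHPUT[(gpus, batch)]
    return mixed / tf32


def weak_scaling(gpus, batch):
    """Assumed definition: throughput relative to 1 GPU at the same per-GPU batch size.

    Returns a (TF32, mixed precision) pair.
    """
    base_tf32, base_mixed = THROUGHPUT[(1, batch)]
    tf32, mixed = THROUGHPUT[(gpus, batch)]
    return tf32 / base_tf32, mixed / base_mixed


if __name__ == "__main__":
    for gpus, batch in sorted(THROUGHPUT):
        ws_tf32, ws_mixed = weak_scaling(gpus, batch)
        print(f"{gpus} GPU(s), batch {batch}: "
              f"speedup={speedup(gpus, batch):.3f}, "
              f"weak scaling TF32={ws_tf32:.3f}, mixed={ws_mixed:.3f}")
```

Under these assumptions, every single-GPU row has a weak scaling of exactly 1.000 in both columns, which matches the corrected rows. The removed rows listed 6.555 for a single GPU and so cannot satisfy this definition.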
