diff --git a/TensorFlow/Segmentation/UNet_Industrial/Dockerfile b/TensorFlow/Segmentation/UNet_Industrial/Dockerfile
Lines changed: 7 additions & 3 deletions
diff --git a/TensorFlow/Segmentation/UNet_Industrial/README.md b/TensorFlow/Segmentation/UNet_Industrial/README.md
Lines changed: 17 additions & 23 deletions
diff --git a/TensorFlow/Segmentation/UNet_Industrial/datasets/core.py b/TensorFlow/Segmentation/UNet_Industrial/datasets/core.py
Lines changed: 0 additions & 2 deletions
diff --git a/TensorFlow/Segmentation/UNet_Industrial/datasets/dagm2007.py b/TensorFlow/Segmentation/UNet_Industrial/datasets/dagm2007.py
Lines changed: 43 additions & 31 deletions
diff --git a/TensorFlow/Segmentation/UNet_Industrial/dllogger/README.md b/TensorFlow/Segmentation/UNet_Industrial/dllogger/README.md
Lines changed: 0 additions & 22 deletions
diff --git a/TensorFlow/Segmentation/UNet_Industrial/dllogger/dllogger/__init__.py b/TensorFlow/Segmentation/UNet_Industrial/dllogger/dllogger/__init__.py
Lines changed: 0 additions & 19 deletions
diff --git a/TensorFlow/Segmentation/UNet_Industrial/dllogger/dllogger/autologging.py b/TensorFlow/Segmentation/UNet_Industrial/dllogger/dllogger/autologging.py
Lines changed: 0 additions & 60 deletions
@@ -16,11 +16,15 @@
 #
 # ==============================================================================
 
-FROM nvcr.io/nvidia/tensorflow:19.05-py3
+FROM nvcr.io/nvidia/tensorflow:20.01-tf1-py3
 
 LABEL version="1.0" maintainer="Jonathan DEKHTIAR <jonathan.dekhtiar@nvidia.com>"
 
+WORKDIR /opt
+COPY requirements.txt /opt/requirements_unet_tf_industrial.txt
+
+RUN python -m pip --no-cache-dir --no-cache install --upgrade pip && \
+    pip --no-cache-dir --no-cache install -r /opt/requirements_unet_tf_industrial.txt
+
 ADD . /workspace/unet_industrial
 WORKDIR /workspace/unet_industrial
-
-RUN pip install dllogger/
 
@@ -138,7 +138,7 @@ Aside from these dependencies, ensure you have the following components:
 
 * [NVIDIA Docker](https://github.com/NVIDIA/nvidia-docker)
 
-* [TensorFlow 19.03-py3 NGC container](https://ngc.nvidia.com/catalog/containers/nvidia:tensorflow)
+* [TensorFlow 19.12-tf1-py3 NGC container](https://ngc.nvidia.com/catalog/containers/nvidia:tensorflow)
 * (optional) NVIDIA Volta GPU (see section below) - for best training performance using mixed precision
 
 For more information about how to get started with NGC containers, see the
@@ -219,11 +219,6 @@ cd scripts/
 ./UNet_FP32_EVAL.sh <path to result repository> <path to dataset> <DAGM2007 classID (1-10)>
 ```
 
-If you wish to evaluate external checkpoint, make sure to put the TF ckpt files inside a folder named "checkpoints"
-and provide its parent path as `<path to result repository>` in the example above. 
-Be aware that the script will not fail if it does not find the checkpoint. 
-It will randomly initialize the weights and run performance tests.
-
 ## Advanced
 
 The following sections provide greater details of the dataset, running training and inference, and the training results.
@@ -374,7 +369,7 @@ The following sections provide details on the achieved results in training accur
 #### Training accuracy results
 
 Our results were obtained by running the `./scripts/UNet_{FP32, AMP}_{1, 4, 8}GPU.sh` training
-script in the Tensorflow:19.03-py3 NGC container on NVIDIA DGX-1 with 8x V100 16G GPUs.
+script in the Tensorflow:19.12-tf1-py3 NGC container on NVIDIA DGX-1 with 8x V100 16G GPUs.
 
 ##### Threshold = 0.75
 
@@ -481,30 +476,29 @@ script in the Tensorflow:19.03-py3 NGC container on NVIDIA DGX-1 with 8x V100 16
 <!-- Spreedsheet to Markdown: https://thisdavej.com/copy-table-in-excel-and-paste-as-a-markdown-table/ -->
 
 Our results were obtained by running the scripts
-`./scripts/benchmarking/DGX1v_trainbench_{FP16, FP32, FP32AMP, FP32FM}_{1, 4, 8}GPU.sh` training script in the
-TensorFlow 19.03-py3 NGC container on an NVIDIA DGX-1 with 8 V100 16G GPUs.
-
-
-| # GPUs | Precision                       | Throughput (Imgs/sec) | Training Time | Speedup |
-|--------|---------------------------------|-----------------------|---------------|---------|
-| 1      | FP32                            | 89                    | 7m44          | 1.00    |
-| 1      | Automatic Mixed Precision (AMP) | 104                   | 6m40          | 1.17    |
-| 4      | FP32                            | 261                   | 2m48          | 1.00    |
-| 4      | Automatic Mixed Precision (AMP) | 302                   | 2m27          | 1.16    |
-| 8      | FP32                            | 445                   | 1m44          | 1.00    |
-| 8      | Automatic Mixed Precision (AMP) | 491                   | 1m36          | 1.10    |
+`./scripts/benchmarking/DGX1v_trainbench_{FP32, AMP}_{1, 4, 8}GPU.sh` training script in the
+TensorFlow `19.12-tf1-py3` NGC container on an NVIDIA DGX-1 with 8 V100 16G GPUs.
+
+| # GPUs | Precision                       | Throughput (Imgs/sec) | AMP Speedup | Scaling efficiency |
+|--------|---------------------------------|-----------------------|-------------|--------------------|
+| 1      | FP32                            | 92                    | 1.00        | 1.00               |
+| 1      | Automatic Mixed Precision (AMP) | 167                   | 1.82        | 1.00               |
+| 4      | FP32                            | 299                   | 1.00        | 3.25               |
+| 4      | Automatic Mixed Precision (AMP) | 458                   | 1.53        | 2.74               |
+| 8      | FP32                            | 507                   | 1.00        | 5.51               |
+| 8      | Automatic Mixed Precision (AMP) | 561                   | 1.11        | 3.36               |
 
 To achieve these same results, follow the [Quick Start Guide](#quick-start-guide) outlined above.
 
 #### Inference performance results
 
-Our results were obtained by running the aforementioned scripts in the TensorFlow 
-19.03-py3 NGC container on an NVIDIA DGX-1 server with 8 V100 16G GPUs.
+Our results were obtained by running the scripts `./scripts/benchmarking/DGX1v_evalbench_{FP32, AMP}.sh`
+evaluation script in the `19.12-tf1-py3` NGC container on an NVIDIA DGX-1 server with 8 V100 16G GPUs.
 
 | # GPUs | Precision                       | Throughput (Imgs/sec) | Speedup |
 |--------|---------------------------------|-----------------------|---------|
-| 1      | FP32                            | 228                   | 1.00    |
-| 1      | Automatic Mixed Precision (AMP) | 301                   | 1.32    |
+| 1      | FP32                            | 306                   | 1.00    |
+| 1      | Automatic Mixed Precision (AMP) | 550                   | 1.80    |
 
 To achieve these same results, follow the [Quick Start Guide](#quick-start-guide) outlined above.
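
The derived columns in the updated tables can be recomputed from the throughput figures alone. The sketch below assumes that "AMP Speedup" is AMP throughput divided by FP32 throughput at the same GPU count, and that "Scaling efficiency" is throughput divided by the 1-GPU throughput at the same precision. The diff does not state these definitions; they are inferred because they reproduce the published values. The helper names are illustrative only.

```python
# Throughput (images/sec) copied from the updated training table in this diff.
TRAIN_THROUGHPUT = {
    (1, "FP32"): 92, (1, "AMP"): 167,
    (4, "FP32"): 299, (4, "AMP"): 458,
    (8, "FP32"): 507, (8, "AMP"): 561,
}

# Throughput (images/sec) copied from the updated inference table (1 GPU).
INFER_THROUGHPUT = {"FP32": 306, "AMP": 550}


def amp_speedup(gpus):
    """Assumed definition: AMP throughput over FP32 throughput, same GPU count."""
    return round(TRAIN_THROUGHPUT[(gpus, "AMP")] / TRAIN_THROUGHPUT[(gpus, "FP32")], 2)


def scaling(gpus, precision):
    """Assumed definition: throughput over 1-GPU throughput, same precision."""
    return round(TRAIN_THROUGHPUT[(gpus, precision)] / TRAIN_THROUGHPUT[(1, precision)], 2)


if __name__ == "__main__":
    for gpus in (1, 4, 8):
        print(gpus, amp_speedup(gpus), scaling(gpus, "FP32"), scaling(gpus, "AMP"))
    print("inference", round(INFER_THROUGHPUT["AMP"] / INFER_THROUGHPUT["FP32"], 2))
```

Under these assumptions, the "Scaling efficiency" column is a speedup factor over one GPU rather than a percentage, so 5.51 on 8 GPUs corresponds to roughly 69% of linear scaling.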
 
 
@@ -19,8 +19,6 @@
 #
 # ==============================================================================
 
-from __future__ import print_function
-
 import os
 from abc import ABC, abstractmethod
 
 
@@ -37,7 +37,7 @@
 
 from utils import hvd_utils
 
-from dllogger.logger import LOGGER
+from dllogger import Logger
 
 __all__ = ['DAGM2007_Dataset']
 
@@ -109,7 +109,21 @@ def dataset_fn(
 
         shuffle_buffer_size = 10000
 
-        def decode_csv(line):
+        image_dir, csv_file = self._get_data_dirs(training=training)
+
+        mask_image_dir = os.path.join(image_dir, "Label")
+
+        dataset = tf.data.TextLineDataset(csv_file)
+
+        dataset = dataset.skip(1)  # Skip CSV Header
+
+        if only_defective_images:
+            dataset = dataset.filter(lambda line: tf.not_equal(tf.strings.substr(line, -1, 1), "0"))
+
+        if hvd_utils.is_using_hvd() and training:
+            dataset = dataset.shard(hvd.size(), hvd.rank())
+
+        def _load_dagm_data(line):
 
             input_image_name, image_mask_name, label = tf.decode_csv(
                 line, record_defaults=[[""], [""], [0]], field_delim=','
@@ -156,10 +170,33 @@ def decode_image(filepath, resize_shape, normalize_data_method):
                 ),
             )
 
+            label = tf.cast(label, tf.int32)
+
+            return tf.data.Dataset.from_tensor_slices(([input_image], [mask_image], [label]))
+
+        dataset = dataset.apply(
+            tf.data.experimental.parallel_interleave(
+                _load_dagm_data,
+                cycle_length=batch_size*8,
+                block_length=4,
+                buffer_output_elements=batch_size*8
+            )
+        )
+
+        dataset = dataset.cache()
+
+        if training:
+            dataset = dataset.apply(tf.data.experimental.shuffle_and_repeat(buffer_size=shuffle_buffer_size, seed=seed))
+
+        else:
+            dataset = dataset.repeat()
+
+        def _augment_data(input_image, mask_image, label):
+
             if augment_data:
 
-                if not hvd_utils.is_using_hvd() or hvd.local_rank() == 0:
-                    LOGGER.log("Using data augmentation ...")
+                if not hvd_utils.is_using_hvd() or hvd.rank() == 0:
+                    print("Using data augmentation ...")
 
                 #input_image = tf.image.per_image_standardization(input_image)
 
@@ -173,36 +210,11 @@ def decode_image(filepath, resize_shape, normalize_data_method):
                 input_image = tf.image.rot90(input_image, k=n_rots)
                 mask_image = tf.image.rot90(mask_image, k=n_rots)
 
-            label = tf.cast(label, tf.int32)
-
             return (input_image, mask_image), label
 
-        image_dir, csv_file = self._get_data_dirs(training=training)
-
-        mask_image_dir = os.path.join(image_dir, "Label")
-
-        dataset = tf.data.TextLineDataset(csv_file)
-
-        dataset = dataset.skip(1)  # Skip CSV Header
-
-        if only_defective_images:
-            dataset = dataset.filter(lambda line: tf.not_equal(tf.strings.substr(line, -1, 1), "0"))
-
-        dataset = dataset.cache()
-
-        if training:
-
-            dataset = dataset.apply(tf.data.experimental.shuffle_and_repeat(buffer_size=shuffle_buffer_size, seed=seed))
-
-            if hvd_utils.is_using_hvd():
-                dataset = dataset.shard(hvd.size(), hvd.rank())
-
-        else:
-            dataset = dataset.repeat()
-
         dataset = dataset.apply(
             tf.data.experimental.map_and_batch(
-                map_func=decode_csv,
+                map_func=_augment_data,
                 num_parallel_calls=num_threads,
                 batch_size=batch_size,
                 drop_remainder=True,
@@ -212,7 +224,7 @@ def decode_image(filepath, resize_shape, normalize_data_method):
         dataset = dataset.prefetch(buffer_size=tf.contrib.data.AUTOTUNE)
 
         if use_gpu_prefetch:
-            dataset.apply(tf.data.experimental.prefetch_to_device(device="/gpu:0", buffer_size=batch_size * 8))
+            dataset.apply(tf.data.experimental.prefetch_to_device(device="/gpu:0", buffer_size=4))
 
         return dataset
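
The reordered pipeline in `dataset_fn` skips the CSV header, optionally filters out non-defective rows, and shards across Horovod workers before any image is decoded, cached, shuffled, or repeated. The following is a minimal pure-Python sketch of those first three steps, not TensorFlow code. It assumes `Dataset.shard(num_shards, index)` keeps the elements whose position modulo `num_shards` equals `index`, which is how tf.data documents it. The CSV header and row contents are hypothetical.

```python
def shard(items, num_shards, index):
    """Keep items whose position modulo num_shards equals index."""
    return [item for pos, item in enumerate(items) if pos % num_shards == index]


# Hypothetical CSV content: image name, mask name, label (0 = no defect).
csv_lines = [
    "image_name,lbl_image_name,defective",  # header row
    "0001.png,0001_label.png,0",
    "0002.png,0002_label.png,1",
    "0003.png,0003_label.png,1",
    "0004.png,0004_label.png,0",
    "0005.png,0005_label.png,1",
]

rows = csv_lines[1:]  # mirrors dataset.skip(1)

# Mirrors the filter on the last character of each line being different from "0".
defective_rows = [line for line in rows if line[-1] != "0"]

# Mirrors dataset.shard(hvd.size(), hvd.rank()) for a hypothetical 2 workers.
num_workers = 2
shards = [shard(defective_rows, num_workers, rank) for rank in range(num_workers)]

if __name__ == "__main__":
    for rank, part in enumerate(shards):
        print(rank, part)
```

Because sharding happens on the finite list of CSV lines, each worker gets a disjoint subset and only decodes and caches its own images. In the previous ordering, the shard was applied after `shuffle_and_repeat`, that is, to an already shuffled and repeated stream.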