PaddlePaddle/Classification/RN50v1.5/README.md: 13 additions & 13 deletions
@@ -530,10 +530,10 @@ The model will be stored in the directory specified with `--output-dir`, includi
- `.pdopts`: The optimizer information contains all the Tensors used by the optimizer. For the Adam optimizer, it contains beta1, beta2, momentum, and so on. All the information is saved to a file with the suffix “.pdopt”. (If the optimizer has no Tensors that need to be saved, as with SGD, the file is not generated.)
- `.pdmodel`: The network description is the description of the program. It is only used for deployment. The description is saved to a file with the suffix “.pdmodel”.
-The default prefix of model files is `paddle_example`. Model of each epoch would be stored in directory `./output/ResNet/epoch_id/` with three files by default, including `paddle_example.pdparams`, `paddle_example.pdopts`, `paddle_example.pdmodel`. Note that `epoch_id` is 0-based, which means `epoch_id` is from 0 to 89 for a total of 90 epochs. For example, the model of the 89th epoch would be stored in `./output/ResNet/89/paddle_example`
+The default prefix of model files is `resnet_50_paddle`. Model of each epoch would be stored in directory `./output/ResNet/epoch_id/` with three files by default, including `resnet_50_paddle.pdparams`, `resnet_50_paddle.pdopts`, `resnet_50_paddle.pdmodel`. Note that `epoch_id` is 0-based, which means `epoch_id` is from 0 to 89 for a total of 90 epochs. For example, the model of the 89th epoch would be stored in `./output/ResNet/89/resnet_50_paddle`
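The per-epoch layout described above can be sketched as a small path helper. This is only an illustration and not part of the repository: the function names `checkpoint_prefix` and `checkpoint_files` are hypothetical, and the default prefix and suffixes are copied from the paths quoted in this README.

```python
import posixpath

# File suffixes this README lists for each saved epoch.
SUFFIXES = (".pdparams", ".pdopts", ".pdmodel")


def checkpoint_prefix(output_dir, epoch_id, prefix="resnet_50_paddle"):
    """Return the path prefix for a 0-based epoch_id (hypothetical helper)."""
    if epoch_id < 0:
        raise ValueError("epoch_id is 0-based and must be non-negative")
    return posixpath.join(output_dir, str(epoch_id), prefix)


def checkpoint_files(output_dir, epoch_id, prefix="resnet_50_paddle"):
    """List the three files expected for one epoch (hypothetical helper)."""
    base = checkpoint_prefix(output_dir, epoch_id, prefix)
    return [base + suffix for suffix in SUFFIXES]


# The last of 90 epochs has epoch_id 89.
print(checkpoint_prefix("./output/ResNet", 89))
```

Passing the prefix as a parameter keeps the sketch usable if the default prefix is renamed again.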
Assume you want to train the ResNet for 90 epochs, but the training process aborts during the 50th epoch due to infrastructure faults. To resume training from the checkpoint, specify `--from-checkpoint` and `--last-epoch-of-checkpoint` by following these steps:
-- Set `./output/ResNet/49/paddle_example` to `--from-checkpoint`.
+- Set `./output/ResNet/49/resnet_50_paddle` to `--from-checkpoint`.
- Set `--last-epoch-of-checkpoint` to `49`.
Then rerun the training to resume training from the 50th epoch to the 89th epoch.
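The resume steps above can be sketched as a helper that derives both flag values from the last saved epoch. This is a rough sketch under stated assumptions: `resume_settings` and `TOTAL_EPOCHS` are hypothetical names, the 90-epoch total comes from the example in this README, and the code only builds the values; it does not launch training.

```python
import posixpath

TOTAL_EPOCHS = 90  # assumption from the README example: epoch ids 0 to 89


def resume_settings(output_dir, last_saved_epoch, prefix="resnet_50_paddle"):
    """Return (flags, remaining_epoch_ids) for resuming (hypothetical helper)."""
    if not 0 <= last_saved_epoch < TOTAL_EPOCHS:
        raise ValueError("last_saved_epoch must be between 0 and 89")
    flags = {
        # The path must end with the file prefix, not with the directory.
        "--from-checkpoint": posixpath.join(
            output_dir, str(last_saved_epoch), prefix
        ),
        "--last-epoch-of-checkpoint": str(last_saved_epoch),
    }
    # Training continues with the epoch after the last saved one.
    remaining = list(range(last_saved_epoch + 1, TOTAL_EPOCHS))
    return flags, remaining


flags, remaining = resume_settings("./output/ResNet", 49)
for name, value in flags.items():
    print(name, value)
print("epochs left:", len(remaining))
```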
-- Resume from checkpoints: Both `paddle_example.pdopts` and `paddle_example.pdparams` must be in the given path.
-- Start from pretrained weights: `paddle_example.pdparams` must be in the given path.
-- The prefix `paddle_example` must be added to the end of the given path. For example: set path as `./output/ResNet/89/paddle_example` instead of `./output/ResNet/89/`
+- Resume from checkpoints: Both `resnet_50_paddle.pdopts` and `resnet_50_paddle.pdparams` must be in the given path.
+- Start from pretrained weights: `resnet_50_paddle.pdparams` must be in the given path.
+- The prefix `resnet_50_paddle` must be added to the end of the given path. For example: set path as `./output/ResNet/89/resnet_50_paddle` instead of `./output/ResNet/89/`
- Don't set `--from-checkpoint` and `--from-pretrained-params` at the same time.
The difference between the two is that `--from-pretrained-params` contains only model weights, whereas `--from-checkpoint` contains the model weights plus the optimizer state and LR scheduler state.
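The rules in the list above can be expressed as a small pre-flight check. This is a minimal sketch, assuming the file suffixes quoted in this README; `required_files` and `missing_files` are hypothetical helpers and do not reflect how the training script validates its arguments.

```python
import os


def required_files(from_checkpoint=None, from_pretrained_params=None):
    """Return the files that must exist for the chosen option (hypothetical)."""
    if from_checkpoint and from_pretrained_params:
        raise ValueError(
            "set only one of --from-checkpoint and --from-pretrained-params"
        )
    path = from_checkpoint or from_pretrained_params
    if path is None:
        return []
    if path.endswith("/"):
        # The file prefix must be appended to the directory path.
        raise ValueError("path must end with the file prefix, not a directory")
    if from_checkpoint:
        # Resuming needs the model weights and the optimizer state.
        return [path + ".pdparams", path + ".pdopts"]
    # Pretrained weights need only the model weights.
    return [path + ".pdparams"]


def missing_files(paths):
    """Return the subset of paths that are not existing files."""
    return [p for p in paths if not os.path.isfile(p)]


needed = required_files(from_checkpoint="./output/ResNet/49/resnet_50_paddle")
print("missing:", missing_files(needed))
```

Running a check like this before launching a long job surfaces a wrong path early rather than after startup.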
@@ -596,18 +596,18 @@ Note that automatic sparsity (ASP) requires a pretrained model to initialize par
You can use the provided `scripts/training/train_resnet50_AMP_ASP_90E_DGXA100.sh` script to launch ASP + AMP training.
```bash
-# Default path to pretrained parameters is ./output/ResNet50/89/paddle_example
+# Default path to pretrained parameters is ./output/ResNet50/89/resnet_50_paddle
Or follow the steps below to manually launch ASP + AMP training.
-First, set `--from-pretrained-params` to a pretrained model file. For example, if you have trained the ResNet for 90 epochs following [Training process](#training-process), the final pretrained weights would be stored in `./output/ResNet50/89/paddle_example.pdparams` by default, and set `--from-pretrained-params` to `./output/ResNet/89/paddle_example`.
+First, set `--from-pretrained-params` to a pretrained model file. For example, if you have trained the ResNet for 90 epochs following [Training process](#training-process), the final pretrained weights would be stored in `./output/ResNet50/89/resnet_50_paddle.pdparams` by default, and set `--from-pretrained-params` to `./output/ResNet/89/resnet_50_paddle`.