[[INFO][PYTHON] step:][29][max diff: ][4.529953e-06][ op val: ][2.88768935][ tf val: ][2.88769388][True]
+[[INFO][PYTHON] step:][30][max diff: ][4.17232513e-06][ op val: ][-1.28717053][ tf val: ][-1.2871747][True]
+[[INFO][PYTHON] step:][31][max diff: ][4.05311584e-06][ op val: ][-1.01830876][ tf val: ][-1.01831281][True]
 ```

-The results show that the differences between the decoder of TensorFlow and decoder are smaller than threshold.
+The results show that the differences between the TensorFlow decoder and the FasterTransformer decoder are smaller than the threshold. Note that these are absolute differences, so they may be large when the op val is large. In that case the differences exceed the threshold and the check returns "False", but this may not affect the final results.
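
The effect of an absolute threshold can be illustrated with a small sketch. The helper names and the threshold value `1e-5` are assumptions made for illustration; they are not taken from the FasterTransformer sample scripts. Only the step 29 values come from the log above.

```python
# Illustrative sketch only: function names and the threshold are hypothetical.

def max_abs_diff(op_vals, tf_vals):
    """Largest element-wise absolute difference between two sequences."""
    return max(abs(a - b) for a, b in zip(op_vals, tf_vals))


def max_rel_diff(op_vals, tf_vals, eps=1e-12):
    """Largest element-wise difference relative to the reference magnitude."""
    return max(abs(a - b) / max(abs(b), eps) for a, b in zip(op_vals, tf_vals))


threshold = 1e-5  # assumed value, for illustration only

# Small magnitude: the op val and tf val reported for step 29.
small_op, small_tf = [2.88768935], [2.88769388]
# Large magnitude: the same values scaled by 10000 (same relative error).
large_op, large_tf = [28876.8935], [28876.9388]

small_abs_ok = max_abs_diff(small_op, small_tf) < threshold
large_abs_ok = max_abs_diff(large_op, large_tf) < threshold
large_rel = max_rel_diff(large_op, large_tf)

print("small values pass absolute check:", small_abs_ok)
print("large values pass absolute check:", large_abs_ok)
print("large values relative difference:", large_rel)
```

With these inputs the absolute check passes for the small values and fails for the scaled ones, even though the relative difference of the scaled values is still tiny. This is the situation in which a "False" result need not indicate a real problem.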

 The `decoder_type` option selects whether to use the TensorFlow decoder or the FasterTransformer decoder. `decoder_type 2` uses both decoders and compares their results.

@@ -606,15 +610,13 @@ python decoding_sample.py \
 The outputs should be similar to the following:

 ```bash
-[INFO] Before finalize:
-result before finalize cross-check: True
+Output ids cross-check: True

 Parent ids cross-check: True

-sequence lengths cross-check: True
+Sequence lengths cross-check: True

-[INFO] After finalize:
-result after cross-check: True
+Finalized output ids cross-check: True
 ```

 Note that the results of OP and TensorFlow often differ when random inputs and weights are used.
For translation, we need to use some tools and libraries of OpenNMT-tf to preprocess the source sentence and build the encoder.
+Because the FasterTransformer encoder is based on BERT, it cannot restore the pretrained OpenNMT-tf model, so translation requires the OpenNMT-tf encoder.
+
+1. Prepare the pretrained model and the data for translation.
+
+```bash
+bash utils/translation/download_model_data.sh
+```
+
+`download_model_data.sh` prepares the `opennmt` folder, which contains the input embedding and the encoder, downloads the pretrained model, and downloads the test data into the `translation` folder. The OpenNMT-tf encoder is needed because the FasterTransformer encoder is based on BERT rather than OpenNMT-tf, so it cannot restore the pretrained OpenNMT-tf encoder.
+
+Another issue is that our tf_decoding implementation differs slightly from OpenNMT-tf decoding. For example, OpenNMT-tf computes the query, key, and value with a single GEMM, whereas tf_decoding uses three separate GEMMs. The tool `utils/dump_model.py` therefore converts the pretrained model to match the structure of the FasterTransformer decoder.
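
The difference between one fused GEMM and three separate GEMMs can be sketched as follows. This is a toy pure-Python illustration and not the logic of `utils/dump_model.py`: the function names, the tiny matrix sizes, and the assumption that the fused weight stores the query, key, and value columns side by side are all hypothetical.

```python
# Illustrative sketch only: names, sizes, and weight layout are assumptions.

def matmul(x, w):
    """Multiply a (rows x inner) matrix by an (inner x cols) matrix (nested lists)."""
    return [[sum(x_row[k] * w[k][j] for k in range(len(w)))
             for j in range(len(w[0]))]
            for x_row in x]


def split_fused_qkv(w_fused, hidden):
    """Split a (hidden x 3*hidden) fused weight into three (hidden x hidden) weights.

    Assumes the columns are laid out as [query | key | value].
    """
    w_q = [row[0:hidden] for row in w_fused]
    w_k = [row[hidden:2 * hidden] for row in w_fused]
    w_v = [row[2 * hidden:3 * hidden] for row in w_fused]
    return w_q, w_k, w_v


hidden = 2
x = [[1.0, 2.0]]  # one token with hidden size 2
w_fused = [[1.0, 0.0, 2.0, 0.0, 3.0, 0.0],
           [0.0, 1.0, 0.0, 2.0, 0.0, 3.0]]

# One GEMM that produces query, key, and value together.
fused_out = matmul(x, w_fused)[0]

# Three GEMMs on the split weights.
w_q, w_k, w_v = split_fused_qkv(w_fused, hidden)
q = matmul(x, w_q)[0]
k = matmul(x, w_k)[0]
v = matmul(x, w_v)[0]

# List concatenation of q, k, v should reproduce the fused output.
same = fused_out == q + k + v
print("fused and split results match:", same)
```

Under the assumed layout, slicing the fused weight by columns gives three weights whose outputs, concatenated, equal the fused output. A real conversion would also have to handle biases and the actual variable layout of the checkpoint.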
+
+```bash
+./bin/decoding_gemm 1 4 8 64 32001 100 512 0
+python translate_sample.py
+```
+
+The outputs should be similar to the following:
+
+```bash
+[INFO] opennmt: ▁28 - jährige r ▁Chef koch ▁to t ▁in ▁San ▁Francisco </s>
+[INFO] tf : ▁28 - jährige r ▁Chef koch ▁to t ▁in ▁San ▁Francisco </s>
+[INFO] op : ▁28 - jährige r ▁Chef koch ▁to t ▁in ▁San ▁Francisco </s>
- Fix the bug that the maximum sequence length of the decoder cannot be larger than 128.
+- Add `translate_sample.py` to demonstrate how to translate a sentence by restoring the pretrained model of OpenNMT-tf.
+- Fix the bug that decoding does not check whether it has finished after each step.
+- Fix a decoder bug related to max_seq_len.
+- Modify the decoding model structure to fit the OpenNMT-tf decoding model.
+- Add a layer normalization layer after the decoder.
+- Add a normalization for the inputs of the decoder.
+
 February 2020
 - Release the FasterTransformer 2.0
 - Provide a highly optimized OpenNMT-tf based decoder and decoding, including C++ API and TensorFlow op.
@@ -764,10 +804,8 @@ July 2019

 ### Known issues

-- sequence length of Decoder and Decoding should be smaller than 128.
-- batch_size should be smaller than 1024 in Decoder.
-- batch_size x beam_width should be smaller than 1024 in Decoding.
-- Results of TensorFlow and OP would be different in decoding. This problem is caused by the accumulated log probability, and we do not avoid this problem.
+- batch_size should be smaller than or equal to 1024 in Decoder.
+- batch_size x beam_width should be smaller than or equal to 1024 in Decoding.
+- Results of TensorFlow and OP may differ in decoding. This is caused by the accumulated log probability, and we do not avoid this problem.
 - CMake 15 and CMake 16 fail to build this project; CMake 14 works.
-- Max sequence length of encoder and decoder should be the same.
-
+- Max sequence length of encoder and decoder should be the same.
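
The batch-size limits listed above could be checked before launching a run. The sketch below is a hypothetical helper and not part of FasterTransformer: the function names and error messages are assumptions, and only the 1024 limits come from the list.

```python
# Illustrative sketch only: these helpers are hypothetical.

MAX_BATCH = 1024  # limit stated in the known issues


def check_decoder_batch(batch_size):
    """Raise ValueError if batch_size exceeds the Decoder limit."""
    if batch_size > MAX_BATCH:
        raise ValueError(
            f"batch_size={batch_size} exceeds the Decoder limit of {MAX_BATCH}")


def check_decoding_batch(batch_size, beam_width):
    """Raise ValueError if batch_size x beam_width exceeds the Decoding limit."""
    total = batch_size * beam_width
    if total > MAX_BATCH:
        raise ValueError(
            f"batch_size x beam_width={total} exceeds the Decoding limit of {MAX_BATCH}")


check_decoder_batch(1024)       # allowed: equal to the limit
check_decoding_batch(256, 4)    # allowed: 256 x 4 = 1024

try:
    check_decoding_batch(512, 4)  # 512 x 4 = 2048, above the limit
    rejected = False
except ValueError:
    rejected = True
```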