Commit c4b3e63

tanmayv25, krishung5, nnshah1 authored
Document TF platform handler (triton-inference-server#276)
* Document TF platform handler
* Move the documentation on TF platform handler
* Update src/resources/platform_handlers/tensorflow_savedmodel/README.md
  Co-authored-by: Kris Hung <krish@nvidia.com>
* Update src/resources/platform_handlers/tensorflow_savedmodel/README.md
  Co-authored-by: Kris Hung <krish@nvidia.com>
* Address review comments
* Fix
* Add a disclaimer note
* Update src/resources/platform_handlers/tensorflow_savedmodel/README.md
  Co-authored-by: Neelay Shah <neelays@nvidia.com>

---------

Co-authored-by: Kris Hung <krish@nvidia.com>
Co-authored-by: Neelay Shah <neelays@nvidia.com>
1 parent 23d1a21 commit c4b3e63

1 file changed

Lines changed: 87 additions & 0 deletions

File tree

  • src/resources/platform_handlers/tensorflow_savedmodel
@@ -0,0 +1,87 @@
<!--
# Copyright 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
#  * Redistributions of source code must retain the above copyright
#    notice, this list of conditions and the following disclaimer.
#  * Redistributions in binary form must reproduce the above copyright
#    notice, this list of conditions and the following disclaimer in the
#    documentation and/or other materials provided with the distribution.
#  * Neither the name of NVIDIA CORPORATION nor the names of its
#    contributors may be used to endorse or promote products derived
#    from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS ``AS IS'' AND ANY
# EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
# PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE COPYRIGHT OWNER OR
# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
# EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
# PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
# OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-->

# Serving TensorFlow SavedModels using Python Backend \[Experimental\]

*NOTE*: This feature is subject to change or removal and should not
be used in production.

Starting from 23.07, the Python backend has experimental support for loading
and serving models in the [TensorFlow SavedModel](https://www.tensorflow.org/guide/saved_model)
format. The `model.savedmodel` can be placed in the Triton server model
repository without a `model.py`, and the backend will automatically use a
pre-built Python model, [`model.py`](model.py), to load and serve the provided
TF SavedModel. The handler can [auto-complete](../../../../README.md#auto_complete_config)
the missing model configuration.

The model repository structure can look like:

```
model_repository/
`-- resnet_v1_50_savedmodel
    |-- 1
    |   `-- model.savedmodel
    |       |-- saved_model.pb
    |       `-- variables
    |-- config.pbtxt
    `-- resnet50_labels.txt
```

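As a quick sanity check before starting the server, the layout above can be verified with a few lines of Python. This is only a sketch: the helper name `find_savedmodel_versions` is hypothetical and not part of Triton, and it assumes every model follows the `<model>/<version>/model.savedmodel/saved_model.pb` pattern shown in the tree.

```python
from pathlib import Path


def find_savedmodel_versions(repo_root):
    """Return sorted (model_name, version) pairs that contain a SavedModel.

    Hypothetical helper: it only checks that
    <repo_root>/<model>/<version>/model.savedmodel/saved_model.pb exists.
    """
    found = []
    pattern = "*/*/model.savedmodel/saved_model.pb"
    for pb_file in Path(repo_root).glob(pattern):
        version_dir = pb_file.parent.parent  # .../<model>/<version>
        found.append((version_dir.parent.name, version_dir.name))
    return sorted(found)
```

For the example repository, `find_savedmodel_versions("model_repository")` would be expected to list the `resnet_v1_50_savedmodel` model with version `1`.
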
To use this feature, make sure that the [TensorFlow pip package](https://pypi.org/project/tensorflow/2.13.0/)
is available in the same Python environment that the Python backend uses.

```
pip install tensorflow==2.13.0
```

Alternatively, you can create a
[Python Execution Environment](#using-custom-python-execution-environments)
with the TensorFlow dependency.

By default, Triton will use the [TensorFlow backend](https://github.com/triton-inference-server/tensorflow_backend)
to load and serve the saved model. To use the Python backend with a
TensorFlow SavedModel, the [model configuration](https://github.com/triton-inference-server/server/blob/main/docs/user_guide/model_configuration.md)
should explicitly provide the following settings:

```
backend: "python"
platform: "tensorflow_savedmodel"
```

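If a model directory has no `config.pbtxt` yet, a file containing just these two settings can be generated with a short script. The sketch below assumes that auto-complete fills in the remaining fields for your model, which should be verified; the helper name `write_minimal_config` is hypothetical.

```python
from pathlib import Path

# The two settings named above; every other field is left to auto-complete
# (an assumption that should be checked for your model).
REQUIRED_SETTINGS = 'backend: "python"\nplatform: "tensorflow_savedmodel"\n'


def write_minimal_config(model_dir):
    """Write a config.pbtxt with only the two required settings.

    Hypothetical helper. It overwrites any existing config.pbtxt in model_dir.
    """
    config_path = Path(model_dir) / "config.pbtxt"
    config_path.write_text(REQUIRED_SETTINGS)
    return config_path
```

For the example repository, `model_dir` would be `model_repository/resnet_v1_50_savedmodel`.
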
It has been observed that certain deep learning frameworks (DLFW), such as
TensorFlow, do not release all of the memory allocated for loading a model back
to the system when the model is unloaded. This can be problematic when working
with a large number of models and dynamically loading and unloading them. Using
the Python backend for TF SavedModel serving allows the models to be loaded in a
separate process, which ensures that all memory allocated within the process is
released to the system when a model is unloaded.

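The general idea of process isolation can be illustrated outside Triton. The sketch below is not how the Python backend launches its process; it only shows the operating-system behaviour the paragraph relies on: memory allocated in a child process is returned to the system when that process exits. The names `CHILD_CODE` and `run_in_child_process` are hypothetical, and the 50 MiB buffer is a stand-in for a loaded model.

```python
import subprocess
import sys

# Code executed in the child interpreter. The bytearray stands in for the
# memory a framework would allocate while loading a model.
CHILD_CODE = """
buffer = bytearray(50 * 1024 * 1024)
print(len(buffer))
"""


def run_in_child_process(code):
    """Run `code` in a fresh Python interpreter and return its stdout.

    When the child exits, the operating system reclaims all of its memory,
    regardless of what the code inside it chose to free.
    """
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Calling `run_in_child_process(CHILD_CODE)` returns the buffer size reported by the child, while the parent process never holds the 50 MiB allocation itself.
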
The following are a few known limitations of this feature:
- GPU execution is not supported.
- The list of requests received in the model [`execute`](../../../../README.md#execute) function is
  not run as a single batch; the requests are run one after the other.
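
To make the second limitation concrete, the following pure-Python sketch runs a model once per request rather than once for the whole list. It does not use the Triton Python backend API and is not the handler's actual code; `CountingModel` and `execute_one_by_one` are hypothetical names, and plain lists stand in for request tensors.

```python
class CountingModel:
    """Hypothetical stand-in for a loaded SavedModel that counts its runs."""

    def __init__(self):
        self.calls = 0

    def __call__(self, inputs):
        self.calls += 1
        # Placeholder computation in place of real inference.
        return [value * 2 for value in inputs]


def execute_one_by_one(model, requests):
    """Run the model separately for each request, in order.

    Returns one response per request; with N requests the model runs N times.
    """
    return [model(request) for request in requests]
```

With two requests the model is invoked twice, so throughput does not benefit from batching across the requests passed to a single call.
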
