Commit 8466d33

Fix broken links Python BE (triton-inference-server#193)
1 parent 6327d48 commit 8466d33

4 files changed

Lines changed: 14 additions & 12 deletions

examples/auto_complete/README.md

Lines changed: 5 additions & 5 deletions
@@ -31,15 +31,15 @@
 This example shows how to implement
 [`auto_complete_config`](https://github.com/triton-inference-server/python_backend/#auto_complete_config)
 function in Python backend to provide
-[`max_batch_size`](https://github.com/triton-inference-server/server/blob/main/docs/model_configuration.md#maximum-batch-size),
-[`input`](https://github.com/triton-inference-server/server/blob/main/docs/model_configuration.md#inputs-and-outputs)
-and [`output`](https://github.com/triton-inference-server/server/blob/main/docs/model_configuration.md#inputs-and-outputs)
+[`max_batch_size`](https://github.com/triton-inference-server/server/blob/main/docs/user_guide/model_configuration.md#maximum-batch-size),
+[`input`](https://github.com/triton-inference-server/server/blob/main/docs/user_guide/model_configuration.md#inputs-and-outputs)
+and [`output`](https://github.com/triton-inference-server/server/blob/main/docs/user_guide/model_configuration.md#inputs-and-outputs)
 properties. These properties will allow Triton to load the Python model with
-[Minimal Model Configuration](https://github.com/triton-inference-server/server/blob/main/docs/model_configuration.md#minimal-model-configuration)
+[Minimal Model Configuration](https://github.com/triton-inference-server/server/blob/main/docs/user_guide/model_configuration.md#minimal-model-configuration)
 in absence of a configuration file.
 
 The
-[model repository](https://github.com/triton-inference-server/server/blob/main/docs/model_repository.md)
+[model repository](https://github.com/triton-inference-server/server/blob/main/docs/user_guide/model_repository.md)
 should contain [nobatch_auto_complete](./nobatch_model.py), and
 [batch_auto_complete](./batch_model.py) models.
 The max_batch_size of [nobatch_auto_complete](./nobatch_model.py) model is set
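
For readers who have not seen the function this README describes, here is a minimal sketch of an `auto_complete_config` implementation that supplies `max_batch_size`, one input and one output. The method names `set_max_batch_size`, `add_input` and `add_output` follow the Python backend README as best recalled and should be treated as assumptions. `RecordingConfig` is a hypothetical stand-in for the object Triton passes in, and the tensor names, data types and dims are purely illustrative.

```python
class RecordingConfig:
    """Hypothetical stand-in for the config object Triton hands to the model."""

    def __init__(self):
        self.max_batch_size = None
        self.inputs = []
        self.outputs = []

    def set_max_batch_size(self, value):
        self.max_batch_size = value

    def add_input(self, spec):
        self.inputs.append(spec)

    def add_output(self, spec):
        self.outputs.append(spec)


class TritonPythonModel:
    @staticmethod
    def auto_complete_config(auto_complete_model_config):
        # Illustrative values only: a non-batching model with one FP32
        # input and one FP32 output of shape [4].
        auto_complete_model_config.set_max_batch_size(0)
        auto_complete_model_config.add_input(
            {"name": "INPUT0", "data_type": "TYPE_FP32", "dims": [4]}
        )
        auto_complete_model_config.add_output(
            {"name": "OUTPUT0", "data_type": "TYPE_FP32", "dims": [4]}
        )
        return auto_complete_model_config


# Exercise the sketch against the stand-in config object.
config = TritonPythonModel.auto_complete_config(RecordingConfig())
```

In a real model file only the `TritonPythonModel` class would be present; the stand-in exists so the sketch can run outside Triton.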

examples/bls/README.md

Lines changed: 1 addition & 1 deletion
@@ -30,7 +30,7 @@
 
 In this section we demonstrate an end-to-end example for
 [BLS](../../README.md#business-logic-scripting) in Python backend. The
-[model repository](https://github.com/triton-inference-server/server/blob/main/docs/model_repository.md)
+[model repository](https://github.com/triton-inference-server/server/blob/main/docs/user_guide/model_repository.md)
 should contain [pytorch](../pytorch), [addsub](../add_sub). The
 [pytorch](../pytorch) and [addsub](../add_sub) models calculate the sum and
 difference of the `INPUT0` and `INPUT1` and put the results in `OUTPUT0` and

examples/decoupled/README.md

Lines changed: 3 additions & 2 deletions
@@ -36,7 +36,8 @@ how to write a decoupled model where each request can generate 0 to many respons
 These files are heavily commented to describe each function call.
 These example models are designed to show the flexibility available to decoupled models
 and in no way should be used in production. These examples circumvents
-the restriction placed by the [instance count](https://github.com/triton-inference-server/server/blob/main/docs/model_configuration.md#instance-groups)
+the restriction placed by the
+[instance count](https://github.com/triton-inference-server/server/blob/main/docs/user_guide/model_configuration.md#instance-groups)
 and allows multiple requests to be in process even for single instance. In
 real deployment, the model should not allow the caller thread to return from
 `execute` until that instance is ready to handle another set of requests.
@@ -341,4 +342,4 @@ stream stopped...
 
 Look how responses were delivered out-of-order of requests.
 The generated responses can be tracked to their request using
-the `id` field.
+the `id` field.
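
The last context lines of this hunk note that decoupled responses can arrive out of order and are traced back through the `id` field. A minimal sketch of that bookkeeping follows. It assumes each response is a plain dict with `id` and `value` keys, which is a hypothetical shape chosen for illustration and not the Triton client's actual response type.

```python
def group_responses_by_request(request_ids, responses):
    """Group out-of-order responses under the request id that produced them.

    A decoupled model may return zero, one, or many responses per request,
    so every request id maps to a (possibly empty) list.
    """
    grouped = {request_id: [] for request_id in request_ids}
    for response in responses:
        request_id = response["id"]
        if request_id not in grouped:
            raise KeyError(f"response for unknown request id: {request_id!r}")
        grouped[request_id].append(response["value"])
    return grouped


# Hypothetical stream: responses interleaved across three requests,
# one of which ("c") produced no response at all.
stream = [
    {"id": "b", "value": 10},
    {"id": "a", "value": 1},
    {"id": "b", "value": 20},
]
grouped = group_responses_by_request(["a", "b", "c"], stream)
```

Responses keep their arrival order within each request, and a request with no responses remains visible as an empty list.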

inferentia/README.md

Lines changed: 5 additions & 4 deletions
@@ -239,22 +239,23 @@ their need.
 
 To enable dynamic batching, `--enable_dynamic_batching`
 flag needs to be specified. `gen_triton_model.py` supports following three
-options for configuring [Triton's dynamic batching](https://github.com/triton-inference-server/server/blob/main/docs/model_configuration.md):
+options for configuring [Triton's dynamic batching](https://github.com/triton-inference-server/server/blob/main/docs/user_guide/model_configuration.md):
 
-1. `--preferred_batch_size`: Please refer to [model configuration documentation](https://github.com/triton-inference-server/server/blob/main/docs/model_configuration.md#preferred-batch-sizes) for details on preferred batch size. To optimize
+1. `--preferred_batch_size`: Please refer to [model configuration documentation](https://github.com/triton-inference-server/server/blob/main/docs/user_guide/model_configuration.md#preferred-batch-sizes) for details on preferred batch size. To optimize
 performance, this is recommended to be multiples of engaged neuron cores.
 For example, if each instance is using 2 neuron cores, `preferred_batch_size`
 could be 2, 4 or 6.
 2. `--max_queue_delay_microseconds`: Please refer to
-[model configuration documentation](https://github.com/triton-inference-server/server/blob/main/docs/model_configuration.md#delayed-batching) for details.
+[model configuration documentation](https://github.com/triton-inference-server/server/blob/main/docs/user_guide/model_configuration.md#delayed-batching) for details.
 3. `--disable_batch_requests_to_neuron`: Enable the non-default way for Triton to
 handle batched requests. Triton backend will send each request to neuron
 separately, irrespective of if the Triton server requests are batched.
 This flag is recommended when users want to optimize performance with models
 that do not perform well with batching without the flag.
 
 Additionally, `--max_batch_size` will affect the maximum batching limit. Please
-refer to the [model configuration documentation](https://github.com/triton-inference-server/server/blob/main/docs/model_configuration.md#maximum-batch-size)
+refer to the
+[model configuration documentation](https://github.com/triton-inference-server/server/blob/main/docs/user_guide/model_configuration.md#maximum-batch-size)
 for details.
 
 ## Testing Inferentia Setup for Accuracy
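
The `--preferred_batch_size` guidance in the inferentia README recommends multiples of the engaged neuron cores, with `--max_batch_size` as the upper limit. A minimal sketch of that arithmetic follows. The helper name `preferred_batch_sizes` is hypothetical and is not part of `gen_triton_model.py`.

```python
def preferred_batch_sizes(neuron_cores, max_batch_size):
    """List the multiples of `neuron_cores` that do not exceed `max_batch_size`."""
    if neuron_cores < 1:
        raise ValueError("neuron_cores must be at least 1")
    return list(range(neuron_cores, max_batch_size + 1, neuron_cores))


# The README's example: 2 neuron cores per instance, capped here at 6.
sizes = preferred_batch_sizes(2, 6)  # [2, 4, 6]
```

With 2 cores and a limit of 6 this yields the same 2, 4 and 6 the README gives as candidates. A limit smaller than the core count yields an empty list.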
