Add request statistic reporting for decoupled mode #163

Merged: Tabrizian merged 1 commit into main from imant-decoupled-stats on Jun 6, 2022
@Tabrizian Tabrizian commented Jun 3, 2022

After:

root@itabrizian-dt:/opt/tritonserver/qa/L0_backend_python/io# perf_analyzer -m repeat_int32 --shape INPUT0:1 --shape DELAY:1 --shape IN:1 -i grpc -z --streaming
*** Measurement Settings ***
  Batch size: 1
  Using "time_windows" mode for stabilization
  Measurement window: 5000 msec
  Using asynchronous calls for inference
  Detected decoupled model, using the first response for measuring latency
  Stabilizing using average latency

Request concurrency: 1
  Client:
    Request count: 4873
    Throughput: 324.867 infer/sec
    Avg latency: 3013 usec (standard deviation 402 usec)
    p50 latency: 2936 usec
    p90 latency: 3920 usec
    p95 latency: 4003 usec
    p99 latency: 4071 usec

  Server:
    Inference count: 5867
    Execution count: 5867
    Successful request count: 5867
    Avg request latency: 2229 usec (overhead 3 usec + queue 59 usec + compute input 163 usec + compute infer 1984 usec + compute output 19 usec)

Inferences/Second vs. Client Average Batch Latency
Concurrency: 1, throughput: 324.867 infer/sec, latency 3013 usec
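The server-side latency line in this run can be sanity-checked by summing its components. The sketch below is only an illustration: `parse_breakdown` is a hypothetical helper, not part of perf_analyzer, and it assumes the line keeps the exact wording shown in the output above.

```python
import re

# Server latency line copied verbatim from the "After" run in this PR.
line = (
    "Avg request latency: 2229 usec (overhead 3 usec + queue 59 usec + "
    "compute input 163 usec + compute infer 1984 usec + compute output 19 usec)"
)


def parse_breakdown(text):
    """Hypothetical helper: return (reported_total_usec, {component: usec})."""
    total = int(re.search(r"Avg request latency: (\d+) usec", text).group(1))
    # Everything between the outer parentheses is the per-component breakdown.
    inner = text[text.index("(") + 1 : text.rindex(")")]
    parts = {}
    for chunk in inner.split(" + "):
        match = re.fullmatch(r"(.+?) (\d+) usec", chunk.strip())
        parts[match.group(1)] = int(match.group(2))
    return total, parts


total, parts = parse_breakdown(line)
component_sum = sum(parts.values())
print(f"reported total: {total} usec, sum of components: {component_sum} usec")
```

The components add up to within 1 usec of the reported total; the small gap is presumably rounding of the individual averages, although the output itself does not say so.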

Before:

*** Measurement Settings ***
  Batch size: 1
  Using "time_windows" mode for stabilization
  Measurement window: 5000 msec
  Using asynchronous calls for inference
  Detected decoupled model, using the first response for measuring latency
  Stabilizing using average latency

Request concurrency: 1
  Client:
    Request count: 4929
    Throughput: 328.6 infer/sec
    Avg latency: 2981 usec (standard deviation 421 usec)
    p50 latency: 2936 usec
    p90 latency: 3872 usec
    p95 latency: 3998 usec
    p99 latency: 4089 usec

  Server:
    Request count: 0
Inferences/Second vs. Client Average Batch Latency
Concurrency: 1, throughput: 328.6 infer/sec, latency 2981 usec
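To put the two runs side by side, the client-side numbers can be compared directly. This is a minimal sketch using only the values printed above; `pct_change` is a hypothetical helper, and a single run per configuration is not enough to tell a real difference from measurement noise.

```python
# Client-side numbers copied from the two perf_analyzer runs in this PR.
before = {"throughput": 328.6, "avg_latency_usec": 2981}
after = {"throughput": 324.867, "avg_latency_usec": 3013}


def pct_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


throughput_change = pct_change(before["throughput"], after["throughput"])
latency_change = pct_change(before["avg_latency_usec"], after["avg_latency_usec"])

print(f"throughput:  {throughput_change:+.2f}%")
print(f"avg latency: {latency_change:+.2f}%")
```

Both figures move by roughly 1% between the two runs, while the server section goes from `Request count: 0` to populated statistics.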

@Tabrizian Tabrizian force-pushed the imant-decoupled-stats branch from 2c14bd1 to c77adbc June 3, 2022 21:55
@Tabrizian Tabrizian requested review from krishung5 and tanmayv25 and removed request for tanmayv25 June 3, 2022 21:55
@Tabrizian Tabrizian merged commit 73fdcda into main Jun 6, 2022
@Tabrizian Tabrizian deleted the imant-decoupled-stats branch June 6, 2022 16:41