Collect Inference Metrics from Triton Inference Server

Contributors netapp-dorianh

The Triton Inference Server exposes Prometheus metrics that report GPU and request statistics.

By default, these metrics are available at http://[triton_inference_server_IP]:8002/metrics.

The Triton Inference Server IP is the LoadBalancer IP that was recorded earlier.

The metrics are available only by querying this endpoint; Triton does not push or publish them to any remote server.
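Because the metrics must be pulled from the endpoint, a client or a Prometheus scraper has to request them. The following is a minimal Python sketch of such a pull, not an official client. The function names `fetch_metrics` and `parse_prometheus_text` are hypothetical helpers introduced here for illustration. The parser assumes the endpoint returns the standard Prometheus text format without timestamps and without commas inside label values.

```python
import urllib.request


def fetch_metrics(host, port=8002, timeout=5):
    """Return the raw text served at http://<host>:<port>/metrics.

    `host` is assumed to be the LoadBalancer IP of the Triton service.
    """
    url = f"http://{host}:{port}/metrics"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def parse_prometheus_text(text):
    """Parse Prometheus text format into (name, labels, value) tuples.

    Simplified on purpose: assumes no timestamps and no commas
    inside label values.
    """
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # The sample value is the last whitespace-separated token.
        series, _, value = line.rpartition(" ")
        series = series.strip()
        name, labels = series, {}
        if "{" in series:
            name, _, rest = series.partition("{")
            for pair in rest.rstrip("}").split(","):
                if pair:
                    key, _, val = pair.partition("=")
                    labels[key.strip()] = val.strip().strip('"')
        samples.append((name, labels, float(value)))
    return samples
```

To use it, call `parse_prometheus_text(fetch_metrics("<LoadBalancer IP>"))` and filter the returned tuples by metric name. For continuous collection, configuring a Prometheus server to scrape the same URL is usually the more practical choice.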
