NetApp HCI Solutions

Collect Inference Metrics from Triton Inference Server


The Triton Inference Server provides Prometheus metrics indicating GPU and request statistics.

By default, these metrics are available at http://[triton_inference_server_IP]:8002/metrics.

The Triton Inference Server IP is the LoadBalancer IP that was recorded earlier.

The metrics are available only at this endpoint. Triton does not push or publish them to any remote server, so a client or monitoring system must request them.
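
Because the metrics have to be pulled, a small client can request the endpoint and read the values. The following is a minimal Python sketch, not an official client: it assumes the default port 8002 and the /metrics path described above, and the function names fetch_metrics and parse_metrics are hypothetical helpers introduced here for illustration. The parser handles only the basic "name{labels} value" lines of the Prometheus text format and ignores comment lines.

```python
import re
import urllib.request

# Matches label pairs such as model="resnet50" inside the {...} section.
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def fetch_metrics(host, port=8002, timeout=5.0):
    """Return the raw metrics text served at http://<host>:<port>/metrics.

    host is the Triton Inference Server LoadBalancer IP (an assumption:
    the endpoint is reachable from where this script runs).
    """
    url = f"http://{host}:{port}/metrics"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def parse_metrics(text):
    """Parse Prometheus text lines into (name, labels, value) tuples."""
    samples = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        if "{" in line:
            name, _, rest = line.partition("{")
            label_text, _, value_text = rest.rpartition("}")
            labels = dict(LABEL_RE.findall(label_text))
        else:
            name, _, value_text = line.partition(" ")
            labels = {}
        # The first field after the name/labels is the sample value;
        # an optional trailing timestamp is ignored.
        value = float(value_text.split()[0])
        samples.append((name.strip(), labels, value))
    return samples
```

To use it, pass the recorded LoadBalancer IP as the host, for example parse_metrics(fetch_metrics("[triton_inference_server_IP]")), and call it on a schedule if you want a time series. In practice a Prometheus server configured to scrape the same URL does this job; the sketch only illustrates the pull model.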
