NetApp HCI Solutions

Collect Inference Metrics from Triton Inference Server


The Triton Inference Server provides Prometheus metrics indicating GPU and request statistics.

By default, these metrics are available at http://[triton_inference_server_IP]:8002/metrics.

The Triton Inference Server IP is the LoadBalancer IP that was recorded earlier.

The metrics are available only by querying this endpoint; they are not pushed or published to any remote server.
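
Because the endpoint must be polled, a client has to fetch and parse the metrics itself. The following is a minimal Python sketch of one way to do that, not an official client. It assumes the endpoint returns the standard Prometheus text exposition format; the helper names `parse_metrics` and `fetch_metrics` are hypothetical and chosen for illustration only.

```python
from urllib.request import urlopen


def parse_metrics(text):
    """Parse Prometheus text exposition format into {series: value}.

    The series key keeps the metric name together with its label set,
    exactly as it appears in the endpoint output.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        if "}" in line:
            # Labeled sample: everything up to the closing brace is the series.
            end = line.rindex("}") + 1
            series, rest = line[:end], line[end:]
        else:
            # Unlabeled sample: the metric name ends at the first space.
            series, _, rest = line.partition(" ")
        fields = rest.split()
        if fields:
            # The first field is the value; an optional timestamp is ignored.
            samples[series] = float(fields[0])
    return samples


def fetch_metrics(host, port=8002, timeout=5.0):
    """Poll the metrics endpoint once (hypothetical helper).

    `host` is the LoadBalancer IP of the Triton Inference Server;
    8002 is the default metrics port stated above.
    """
    url = f"http://{host}:{port}/metrics"
    with urlopen(url, timeout=timeout) as resp:
        return parse_metrics(resp.read().decode("utf-8"))
```

Calling `fetch_metrics` with the recorded LoadBalancer IP returns a dictionary that can be inspected or logged on a schedule. For continuous collection, pointing a Prometheus server at the same URL as a scrape target is the more common choice.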
