Test plan

09/12/2024 Contributors

This document follows the MLPerf Inference v0.7 code, the MLPerf Inference v1.1 code, and the MLPerf rules. We ran the MLPerf benchmarks designed for inference at the edge, as defined in the following table.

Area	Task	Model	Dataset	QSL size	Quality	Multistream latency constraint
Vision	Image classification	ResNet50-v1.5	ImageNet (224x224)	1024	99% of FP32	50ms
Vision	Object detection (large)	SSD-ResNet34	COCO (1200x1200)	64	99% of FP32	66ms
Vision	Object detection (small)	SSD-MobileNetsv1	COCO (300x300)	256	99% of FP32	50ms
Vision	Medical image segmentation	3D UNET	BraTS 2019 (224x224x160)	16	99% and 99.9% of FP32	n/a
Speech	Speech-to-text	RNNT	Librispeech dev-clean	2513	99% of FP32	n/a
Language	Language processing	BERT	SQuAD v1.1	10833	99% of FP32	n/a

The following table lists the scenarios run for each edge benchmark.

Area	Task	Scenarios
Vision	Image classification	Single stream, offline, multistream
Vision	Object detection (large)	Single stream, offline, multistream
Vision	Object detection (small)	Single stream, offline, multistream
Vision	Medical image segmentation	Single stream, offline
Speech	Speech-to-text	Single stream, offline
Language	Language processing	Single stream, offline
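
The task-to-scenario mapping above can also be written down as data, which makes it easy to enumerate the individual benchmark runs. The following is a minimal sketch, not part of the MLPerf code; the names `SCENARIOS` and `enumerate_runs` are hypothetical and used only for illustration.

```python
# Scenarios per task, transcribed from the table above.
# SCENARIOS and enumerate_runs are illustrative names, not MLPerf identifiers.
SCENARIOS = {
    "Image classification": ["single stream", "offline", "multistream"],
    "Object detection (large)": ["single stream", "offline", "multistream"],
    "Object detection (small)": ["single stream", "offline", "multistream"],
    "Medical image segmentation": ["single stream", "offline"],
    "Speech-to-text": ["single stream", "offline"],
    "Language processing": ["single stream", "offline"],
}


def enumerate_runs(scenarios):
    """Return every (task, scenario) pair as a flat list."""
    return [(task, mode) for task, modes in scenarios.items() for mode in modes]


if __name__ == "__main__":
    for task, mode in enumerate_runs(SCENARIOS):
        print(f"{task}: {mode}")
```

Counting the pairs gives 15 task and scenario combinations: three vision tasks with three scenarios each, and three tasks with two scenarios each.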

We ran these benchmarks on the networked storage architecture developed in this validation and compared the results with those from local runs on the edge servers that were previously submitted to MLPerf. The comparison shows how much the shared storage affects inference performance.
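
One simple way to express that impact is the percent change of the networked-storage result relative to the local-run result. The sketch below assumes this metric; the function name `relative_change` and the sample numbers are hypothetical and are not measured results from this validation.

```python
def relative_change(local, networked):
    """Percent change of the networked-storage result relative to the local run.

    Both arguments must be the same metric for the same task and scenario.
    """
    if local == 0:
        raise ValueError("local result must be non-zero")
    return (networked - local) / local * 100.0


if __name__ == "__main__":
    # Hypothetical placeholder values, NOT results from this validation.
    local_result = 1000.0
    networked_result = 990.0
    change = relative_change(local_result, networked_result)
    print(f"Change relative to local run: {change:.2f}%")
```

How to read the sign depends on the metric: for a throughput metric, a negative change means the shared storage slowed inference, whereas for a latency metric a positive change does.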
