Test plan
This document follows the MLPerf Inference v0.7 and v1.1 code and rules. We ran the MLPerf benchmarks designed for inference at the edge, as defined in the following table.
Area | Task | Model | Dataset | QSL size | Quality | Multistream latency constraint |
---|---|---|---|---|---|---|
Vision | Image classification | ResNet50-v1.5 | ImageNet (224x224) | 1024 | 99% of FP32 | 50ms |
Vision | Object detection (large) | SSD-ResNet34 | COCO | 64 | 99% of FP32 | 66ms |
Vision | Object detection (small) | SSD-MobileNet-v1 | COCO | 256 | 99% of FP32 | 50ms |
Vision | Medical image segmentation | 3D UNET | BraTS 2019 | 16 | 99% and 99.9% of FP32 | n/a |
Speech | Speech-to-text | RNNT | Librispeech dev-clean | 2513 | 99% of FP32 | n/a |
Language | Language processing | BERT | SQuAD v1.1 | 10833 | 99% of FP32 | n/a |
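The QSL sizes and multistream latency constraints in the table above correspond to settings that LoadGen reads from its configuration files. The excerpt below is an illustrative sketch in the mlperf.conf/user.conf syntax (`model.scenario.key = value`), not the official file; the exact keys and default values vary between MLPerf Inference versions, and the mlperf.conf shipped with the benchmark code is authoritative.

```
# Illustrative excerpt in mlperf.conf syntax; values mirror the table above.
# The official mlperf.conf in the MLPerf Inference repo may differ by version.
resnet50.*.performance_sample_count_override = 1024
ssd-resnet34.*.performance_sample_count_override = 64
ssd-mobilenet.*.performance_sample_count_override = 256
3d-unet.*.performance_sample_count_override = 16
rnnt.*.performance_sample_count_override = 2513
bert.*.performance_sample_count_override = 10833

# Multistream latency constraints (milliseconds)
*.MultiStream.target_latency = 50
ssd-resnet34.MultiStream.target_latency = 66
```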
The following table presents Edge benchmark scenarios.
Area | Task | Scenarios |
---|---|---|
Vision | Image classification | Single stream, offline, multistream |
Vision | Object detection (large) | Single stream, offline, multistream |
Vision | Object detection (small) | Single stream, offline, multistream |
Vision | Medical image segmentation | Single stream, offline |
Speech | Speech-to-text | Single stream, offline |
Language | Language processing | Single stream, offline |
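Each scenario in the table above is driven by MLPerf LoadGen. The sketch below, written against the mlperf_loadgen Python bindings, shows the general shape of a single performance run; `backend_infer`, `load_samples`, and `unload_samples` are hypothetical placeholders for the real backend and dataset code, and binding signatures can differ slightly between LoadGen versions.

```python
# Minimal sketch of driving one edge scenario through the mlperf_loadgen
# Python bindings. backend_infer/load_samples/unload_samples are hypothetical
# placeholders for the real model backend and dataset loader.
import array
import mlperf_loadgen as lg

def backend_infer(sample_index):
    # Placeholder: run the model on a preloaded sample and return raw result bytes.
    return b"\x00"

def issue_queries(query_samples):
    # LoadGen calls this with one sample (single stream) or a batch (offline/multistream).
    responses = []
    buffers = []  # keep result buffers alive until QuerySamplesComplete returns
    for qs in query_samples:
        result = array.array("B", backend_infer(qs.index))
        buffers.append(result)
        addr, length = result.buffer_info()
        responses.append(lg.QuerySampleResponse(qs.id, addr, length))
    lg.QuerySamplesComplete(responses)

def flush_queries():
    pass

def load_samples(sample_indices):
    pass  # Placeholder: stage the listed samples (e.g., ImageNet images) in memory.

def unload_samples(sample_indices):
    pass  # Placeholder: release the staged samples.

settings = lg.TestSettings()
settings.scenario = lg.TestScenario.SingleStream  # or Offline / MultiStream, per the table above
settings.mode = lg.TestMode.PerformanceOnly

sut = lg.ConstructSUT(issue_queries, flush_queries)
# 1024 is the ResNet50 QSL size from the benchmark table above.
qsl = lg.ConstructQSL(1024, 1024, load_samples, unload_samples)
lg.StartTest(sut, qsl, settings)
lg.DestroyQSL(qsl)
lg.DestroySUT(sut)
```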
We performed these benchmarks using the networked storage architecture developed in this validation and compared the results to those from local runs on the edge servers that were previously submitted to MLPerf. The goal of the comparison is to determine how much impact shared storage has on inference performance.