NetApp Solutions

Test configuration

Contributors kevin-hoke

This section describes the tested configurations, the network infrastructure, the SR670 V2 server, and the NetApp storage provisioning details.

Solution architecture

We used the solution components listed in the following table for this validation.

Solution components Details

Lenovo ThinkSystem servers

  • Two SR670 V2 servers each with eight NVIDIA A100 80GB GPU cards

  • Each server contains 2 Intel Xeon Platinum 8360Y CPUs (28 physical cores) and 1TB RAM

Linux (Ubuntu 20.04 with CUDA 11.8)

NetApp AFF storage system (HA pair)

  • NetApp ONTAP 9.10.1 software

  • 24x 960GB SSDs

  • NFS protocol

  • 1 interface group (ifgrp) per controller, with four logical IP addresses for mount points
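As a rough illustration of how NFS mounts might be spread across the four logical IP addresses, the following Python sketch builds one mount command per address. The IP addresses (taken from the 192.0.2.0/24 documentation range), the local mount directories, and the helper name `build_mount_commands` are hypothetical; only the path /a400-100g comes from this validation. No NFS version or mount options are assumed.

```python
# Sketch only: the logical IP addresses and local directories below are
# placeholders, not values from the tested configuration.
EXPORT_PATH = "/a400-100g"  # FlexGroup volume path listed in this document

# Hypothetical logical IP addresses (RFC 5737 documentation range).
LIF_ADDRESSES = ["192.0.2.11", "192.0.2.12", "192.0.2.13", "192.0.2.14"]


def build_mount_commands(lifs, export_path, mount_root="/mnt"):
    """Return one 'mount -t nfs' command string per logical IP address."""
    commands = []
    for index, lif in enumerate(lifs, start=1):
        # Hypothetical local directory naming: /mnt/a400-100g-1, -2, ...
        local_dir = f"{mount_root}{export_path}-{index}"
        commands.append(f"mount -t nfs {lif}:{export_path} {local_dir}")
    return commands


if __name__ == "__main__":
    for command in build_mount_commands(LIF_ADDRESSES, EXPORT_PATH):
        print(command)
```

The script only prints the commands, so they can be reviewed before anything is mounted.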

In this validation, we used ResNet v2.0 with the ImageNet dataset as specified by MLPerf v2.0. The dataset is stored on a NetApp AFF storage system and accessed over the NFS protocol. The SR670 V2 servers were connected to the NetApp AFF A400 storage system through a 100GbE switch.

ImageNet is a frequently used image dataset. It contains almost 1.3 million images for a total size of 144GB. The average image size is 108KB.
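The figures above can be cross-checked with a few lines of Python. This is only a sanity check under stated assumptions: the image count is taken as exactly 1.3 million (the text says "almost"), GB is read as 10^9 bytes, and KB as 1,024 bytes. The source does not state which unit convention it uses, and the function name is illustrative.

```python
# Sanity check of the ImageNet figures quoted above.
# Assumptions (not stated in the source): 1 GB = 10**9 bytes, 1 KB = 1024 bytes,
# and the image count is rounded to exactly 1.3 million.
TOTAL_BYTES = 144 * 10**9
IMAGE_COUNT = 1_300_000


def average_image_size_kb(total_bytes, image_count, bytes_per_kb=1024):
    """Return the mean image size in KB for the given unit convention."""
    return total_bytes / image_count / bytes_per_kb


if __name__ == "__main__":
    print(f"{average_image_size_kb(TOTAL_BYTES, IMAGE_COUNT):.1f} KB per image")
```

Under these assumptions the result rounds to 108KB, in line with the quoted average image size.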

The following figure depicts the network topology of the tested configuration.

This graphic depicts the compute layer (Lenovo ThinkSystem SR670 V2), the network layer (a Lenovo Ethernet switch), and the storage layer (a NetApp AFF A400 storage controller). All network connections are included.

Storage controller

The following table lists the storage configuration.

| Controller | Aggregate | FlexGroup volume | Aggregate size | Volume size | Operating system mount point |
|---|---|---|---|---|---|
| Controller1 | Aggr1 | /a400-100g | 9.9TB | 19TB | /a400-100g |
| Controller2 | Aggr2 | /a400-100g | 9.9TB | | /a400-100g |

Note: The /a400-100g folder contains the dataset used for ResNet validation.
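The storage layout in the table above can also be written down as a small data structure for quick capacity checks. The sketch below assumes that the single 19TB FlexGroup volume spans both aggregates (suggested by the same volume name appearing under both controllers, but not stated explicitly) and uses the 144GB ImageNet size given earlier. The class and function names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Aggregate:
    controller: str
    name: str
    size_tb: float


# Values from the storage configuration table.
AGGREGATES = [
    Aggregate("Controller1", "Aggr1", 9.9),
    Aggregate("Controller2", "Aggr2", 9.9),
]
VOLUME_SIZE_TB = 19.0    # FlexGroup volume /a400-100g
DATASET_SIZE_TB = 0.144  # 144GB ImageNet dataset, decimal units assumed


def total_aggregate_tb(aggregates):
    """Sum the capacity of all aggregates, in TB."""
    return sum(a.size_tb for a in aggregates)


def volume_fits(volume_tb, aggregates):
    """Assumption: the volume is spread across every listed aggregate."""
    return volume_tb <= total_aggregate_tb(aggregates)


if __name__ == "__main__":
    total = total_aggregate_tb(AGGREGATES)
    print(f"Combined aggregate capacity: {total:.1f} TB")
    print(f"Volume fits in aggregates: {volume_fits(VOLUME_SIZE_TB, AGGREGATES)}")
    print(f"Dataset share of volume: {DATASET_SIZE_TB / VOLUME_SIZE_TB:.2%}")
```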