
DEEP LEARNING BENCHMARKS ON SUPERMICRO’S 4U 8 GPU SYSTEM BASED ON DUAL 3RD GEN AMD EPYC™ PROCESSORS

16-03-2021
by GSC

Benchmarks on the Supermicro AS-4124GS-TNR, a 4U dual-processor 8-GPU server with up to 8TB of memory and 160 lanes of PCI-E 4.0, show the generation-over-generation performance improvement that the new 3rd Gen AMD EPYC 7003 Series processors deliver on Deep Learning workloads.

Artificial Intelligence is being adopted in various industries worldwide. The choice of systems to perform these complex tasks is critical and requires understanding how the different system components act together. A series of benchmarks has been created that allows those who evaluate systems and architectures to determine which combination of CPUs and GPUs is the best fit for their workloads.

AI workloads require optimized systems that combine the proper hardware with tuned software to deliver maximum performance at a given price point. A solution that provides value to end-users depends on the choice of CPUs, GPUs, and the proper software stack. The number of cores, the communication latency between cores, the clock speed (GHz), and the generation of the CPU architecture can all influence the benchmark performance of real-world AI applications.

This benchmark compares 2nd Gen AMD EPYC™ processors with 3rd Gen AMD EPYC processors. AMD provides a wide range of processors with different core counts and clock speeds. Any AI/DL/ML application will depend heavily on the GPUs selected. Supermicro has run benchmarks that use different CPU generations with NVIDIA V100 and A100 GPUs. The CPU manages and assigns work to the GPUs, while the GPUs do the heavy lifting of transforming, loading, and analyzing the data. This division of labor applies to the training phase of deep learning as well as to inferencing.

Supermicro servers are designed for maximum application performance while minimizing power consumption.

• Multi-GPU optimized thermal designs for highest performance and reliability

• Advanced GPU interconnect options for best efficiency and lowest latency

• Leading GPU architectures including NVIDIA® HGX platform with NVLink™ and NVSwitch™

 

Supermicro designs and delivers a wide variety of servers and storage systems to enterprises worldwide. For these benchmarks, the AS-4124GS-TNR was the system of choice. It features dual 2nd Gen or 3rd Gen AMD EPYC processors, up to 160 PCI-E Gen 4 lanes, and up to 8 PCI-E GPUs. The benchmarks used in this paper are widely available, as is the software stack.

 

 

Software Specifications

 

 

Hardware Specifications

 

 

Supermicro first benchmarked the system's performance under different Neural Network applications, then benchmarked the GPU system with a real dataset. For a comprehensive and more controlled comparison of Deep Learning workloads, an increasing number of manufacturers and end customers are adopting the MLPerf suite, which covers a wide variety of AI/ML/DL workloads. Supermicro is committed to making the MLPerf benchmark part of the specifications for all GPU-capable systems. The benchmarks that Supermicro ran are varied and need to be discussed separately.

1) Deep Learning performance with different Neural Network applications

Figure 1 shows the throughput of different Deep Learning Neural Network applications. The benchmark was run on bare-metal systems using the NVIDIA NGC container supplied for the Deep Learning platform. Further technical information about these Neural Network applications is available separately.

 

 

There are many ways to benchmark a system in a given domain. Synthetic benchmarks are constructed to generate a specific workload on the underlying system, and the benchmark application generates its own data. Real-world workload benchmarks load actual data into an application to produce results.
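To make the idea of a synthetic benchmark concrete, the following is a minimal sketch in Python rather than the harness Supermicro used. It generates random input data instead of reading a dataset and reports throughput in samples per second. The function names (`make_synthetic_batch`, `measure_throughput`, `toy_step`) and the toy workload are hypothetical and exist only for illustration.

```python
import random
import time
from typing import Callable, List


def make_synthetic_batch(batch_size: int, features: int, seed: int = 0) -> List[List[float]]:
    """Generate a batch of random samples; no real dataset is read."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(features)] for _ in range(batch_size)]


def measure_throughput(step: Callable[[List[List[float]]], object],
                       batch: List[List[float]],
                       warmup: int = 2,
                       iters: int = 10) -> float:
    """Return samples processed per second over `iters` timed steps."""
    for _ in range(warmup):  # untimed warm-up iterations
        step(batch)
    start = time.perf_counter()
    for _ in range(iters):
        step(batch)
    elapsed = time.perf_counter() - start
    return (len(batch) * iters) / elapsed


def toy_step(batch: List[List[float]]) -> float:
    """Stand-in workload (hypothetical): sum every feature of every sample."""
    return sum(sum(sample) for sample in batch)


if __name__ == "__main__":
    batch = make_synthetic_batch(batch_size=64, features=128)
    print(f"{measure_throughput(toy_step, batch):.1f} samples/sec")
```

A real-world benchmark would keep the same timing loop but replace `make_synthetic_batch` with a loader that reads actual data, so that storage and preprocessing costs are included in the measurement.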

 

 

 

 

The 3rd Gen AMD EPYC 7313 has many new BIOS options that are critical to the processor's high performance. As in previous generations of AMD EPYC processors, the IOMMU and NPS (NUMA nodes per socket) settings are two that can significantly impact OS installation and overall performance. Please refer to the NVIDIA design guide for PCIe servers, DG-10105-001. Tuning the CPUs to get datasets ready for Deep Learning applications is critical for both system designers and end-user customers.
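On a Linux host, one way to see the effect of the NPS and IOMMU settings is to check how many NUMA nodes the OS exposes and which IOMMU-related options were passed to the kernel. The sketch below assumes a Linux system with the standard `/sys/devices/system/node` and `/proc/cmdline` interfaces. The helper names `numa_nodes` and `iommu_params` are hypothetical, and the sketch does not recommend any particular value; the appropriate settings should come from the vendor guidance cited above.

```python
import re
from pathlib import Path
from typing import Dict, List


def numa_nodes(sysfs_root: str = "/sys/devices/system/node") -> List[int]:
    """List NUMA node IDs the OS exposes (directories named node<N>)."""
    root = Path(sysfs_root)
    if not root.is_dir():
        return []
    ids = []
    for entry in root.iterdir():
        match = re.fullmatch(r"node(\d+)", entry.name)
        if match and entry.is_dir():
            ids.append(int(match.group(1)))
    return sorted(ids)


def iommu_params(cmdline: str) -> Dict[str, str]:
    """Extract iommu-related key=value options from a kernel command line."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep and key in ("iommu", "amd_iommu"):
            params[key] = value
    return params


if __name__ == "__main__":
    print("NUMA nodes:", numa_nodes())
    cmdline_file = Path("/proc/cmdline")
    text = cmdline_file.read_text() if cmdline_file.exists() else ""
    print("IOMMU kernel options:", iommu_params(text))
```

Comparing the reported node count before and after a BIOS change is a quick sanity check that the NPS setting took effect as intended.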

Conclusion

The benchmark results clearly show that the AMD EPYC 7313 processors improve the NVIDIA GPU system's throughput on Deep Learning workloads. The synthetic benchmark test shows a 15-30% generational increase for EPYC with the same NVIDIA A100 GPUs in a similar Supermicro chassis. The results also demonstrate that the A100 PCIe can outperform the V100 SXM2 by up to 40% in comparable GPU systems. Combined with the NVIDIA A100 GPU, Supermicro AMD CPU-based GPU systems are very flexible and competitive, and offer exceptional customer experiences.
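For readers who want to reproduce this kind of comparison on their own measurements, a generational increase is conventionally computed as the relative change in throughput. The short Python sketch below shows that arithmetic. The function name `generational_gain` is hypothetical, and the throughput values are placeholders chosen for illustration, not results from these benchmarks.

```python
def generational_gain(baseline: float, new: float) -> float:
    """Percentage throughput increase of `new` over `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline throughput must be positive")
    return (new - baseline) / baseline * 100.0


if __name__ == "__main__":
    # Hypothetical placeholder throughputs (images/sec), NOT measured results.
    prev_gen, next_gen = 1000.0, 1200.0
    print(f"{generational_gain(prev_gen, next_gen):.1f}% increase")
```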

Supermicro (Nasdaq: SMCI), the leading innovator in high-performance, high efficiency server and storage technology, is a premier provider of advanced Server Building Block Solutions® for Enterprise Data Center, Cloud Computing, Artificial Intelligence, and Edge Computing Systems worldwide. Supermicro is committed to protecting the environment through its "We Keep IT Green®" initiative and provides customers with the most energy efficient, environmentally-friendly solutions available on the market. Supermicro, Server Building Block Solutions, and We Keep IT Green are trademarks and/or registered trademarks of Super Micro Computer, Inc. AMD, the AMD logo, EPYC, and combinations thereof are trademarks of Advanced Micro Devices, Inc.

All other brands, names, and trademarks are the property of their respective owners.