AMD EPYC Milan

 

The compute nodes, FPGA nodes and GPU nodes of Noctua 2 are equipped with dual-socket AMD EPYC Milan CPUs.

Specification

The specification is as follows:

| Property | normal, largemem, gpu | hugemem, fpga-nodes |
|---|---|---|
| CPUs | 2x AMD EPYC Milan 7763 | 2x AMD EPYC Milan 7713 |
| Microarchitecture | Zen3 | Zen3 |
| Cores per node in total | 128 | 128 |
| Cores per socket | 64 | 64 |
| Sockets per node | 2 | 2 |
| SMT | off | off |
| L3 cache per socket | 256 MB | 256 MB |
| TDP per socket | 280 W | 225 W |
| Base frequency | 2.45 GHz | 2.0 GHz |
| Max. boost frequency | 3.5 GHz | 3.675 GHz |
| Main memory | 256/1024/512 GB DDR4 3200 MHz | 2048/512 GB DDR4 3200 MHz |
| Memory channels per socket | 8 | 8 |
| Memory bandwidth (Stream triad) | ~370 GB/s | ~370 GB/s |
| Floating-point performance (single-node HPL) | ~4.0 TFLOP/s | ~3.7 TFLOP/s |
| Floating-point performance (single-node HPCG) | ~62 GFLOP/s | ~62 GFLOP/s |
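
As a rough plausibility check of the measured Stream triad value, the theoretical peak memory bandwidth of a node can be estimated from the numbers above. The following is only a sketch: it assumes that each DDR4 channel is 64 bits (8 bytes) wide and performs 3200 million transfers per second, and all variable names are illustrative.

```python
# Rough estimate of the theoretical peak memory bandwidth of one node.
# Assumption: each DDR4 channel is 64 bits (8 bytes) wide and runs at 3200 MT/s.
SOCKETS_PER_NODE = 2
CHANNELS_PER_SOCKET = 8
TRANSFERS_PER_SECOND = 3200e6   # DDR4-3200
BYTES_PER_TRANSFER = 8          # 64-bit channel (assumption stated above)

peak_gb_per_s = (SOCKETS_PER_NODE * CHANNELS_PER_SOCKET
                 * TRANSFERS_PER_SECOND * BYTES_PER_TRANSFER) / 1e9
stream_triad_gb_per_s = 370.0   # approximate measured value from the specification

print(f"theoretical peak: {peak_gb_per_s:.1f} GB/s")
print(f"Stream triad reaches about {stream_triad_gb_per_s / peak_gb_per_s:.0%} of peak")
```

Under these assumptions the measured value lies reasonably close to the theoretical limit, which is typical for a well-configured Stream run.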

Important Properties

SMT (also known as Hyper-Threading)

SMT is disabled on Noctua 2.

Instruction Set

The AMD Milan (Zen3) CPUs support SIMD instructions with a width of up to 256 bits, i.e. AVX2. AVX-512 is not available.
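
The SIMD width also sets the theoretical floating-point peak, which helps to put the HPL numbers from the specification into context. The sketch below assumes that each Zen3 core can issue two 256-bit FMA instructions per cycle and that all cores run at the base frequency. Real clock speeds under load differ, so treat the result as an order-of-magnitude estimate only.

```python
# Rough theoretical double-precision peak per node at base frequency.
# Assumption: each Zen3 core can issue two 256-bit FMA instructions per cycle.
SIMD_BITS = 256
DOUBLE_BITS = 64
FMA_UNITS_PER_CORE = 2      # assumption, not stated on this page
FLOPS_PER_FMA = 2           # one multiply and one add
CORES_PER_NODE = 128

flops_per_cycle = (SIMD_BITS // DOUBLE_BITS) * FMA_UNITS_PER_CORE * FLOPS_PER_FMA


def peak_tflops(base_ghz: float) -> float:
    """Peak TFLOP/s of one node if all cores ran at base_ghz."""
    return CORES_PER_NODE * base_ghz * flops_per_cycle / 1000.0


for name, base_ghz in [("EPYC 7763", 2.45), ("EPYC 7713", 2.0)]:
    print(f"{name}: {peak_tflops(base_ghz):.2f} TFLOP/s at base frequency")
```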

NUMA-Domains

The CPUs run in NPS=4 mode (NPS = NUMA nodes per socket). Each socket is therefore split into 4 NUMA domains, so a single node has 8 NUMA domains in total.
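
To check the NUMA layout on a node, the information that Linux exposes under `/sys/devices/system/node` can be read directly. The following is a minimal sketch, not an official tool; the function names `parse_cpulist` and `numa_domains` are made up for this example.

```python
# Sketch: list the NUMA domains Linux exposes and the cores in each one.
import glob
import os
import re


def parse_cpulist(text: str) -> list[int]:
    """Expand a Linux cpulist string such as '0-3,8' into [0, 1, 2, 3, 8]."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def numa_domains(sysfs_root: str = "/sys/devices/system/node") -> dict[int, list[int]]:
    """Map NUMA node id -> core ids, read from sysfs (empty dict if unavailable)."""
    domains = {}
    for path in glob.glob(os.path.join(sysfs_root, "node[0-9]*")):
        node_id = int(re.search(r"node(\d+)$", path).group(1))
        with open(os.path.join(path, "cpulist")) as f:
            domains[node_id] = parse_cpulist(f.read())
    return domains


if __name__ == "__main__":
    for node_id, cpus in sorted(numa_domains().items()):
        print(f"NUMA domain {node_id}: {len(cpus)} cores -> {cpus}")
```

On a Noctua 2 node in NPS=4 mode, this should list 8 NUMA domains with 16 cores each, provided the job has access to the whole node.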

Topology

The basic unit of an AMD EPYC Milan CPU is a CCD (Core Compute Die) with 8 CPU cores and 32 MB of L3 cache. In the NPS=4 configuration mentioned above, two such CCDs make up a NUMA domain (16 cores, 64 MB of L3 cache), and each socket contains 4 NUMA domains.
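
The hierarchy of CCDs, NUMA domains and sockets can be written down as simple integer arithmetic. The sketch below assumes that core ids are numbered consecutively through CCDs, NUMA domains and sockets. This page does not confirm that numbering, so check it on the node (e.g. with `lscpu`) before relying on it. The function name `locate` is illustrative.

```python
# Sketch of the node topology described above.
# Assumption: core ids are numbered consecutively through CCDs, NUMA domains
# and sockets. Verify the real numbering on the node before relying on this.
CORES_PER_CCD = 8
CCDS_PER_NUMA_DOMAIN = 2
NUMA_DOMAINS_PER_SOCKET = 4
SOCKETS_PER_NODE = 2

CORES_PER_NUMA_DOMAIN = CORES_PER_CCD * CCDS_PER_NUMA_DOMAIN
CORES_PER_SOCKET = CORES_PER_NUMA_DOMAIN * NUMA_DOMAINS_PER_SOCKET
CORES_PER_NODE = CORES_PER_SOCKET * SOCKETS_PER_NODE


def locate(core: int) -> dict[str, int]:
    """Return node-wide socket, NUMA domain and CCD indices for a core id."""
    if not 0 <= core < CORES_PER_NODE:
        raise ValueError(f"core id must be in 0..{CORES_PER_NODE - 1}")
    return {
        "socket": core // CORES_PER_SOCKET,
        "numa_domain": core // CORES_PER_NUMA_DOMAIN,
        "ccd": core // CORES_PER_CCD,
    }


print(locate(0))
print(locate(71))
```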

 

Figure (N2_arch.png): Noctua 2 CPU architecture with 2 AMD EPYC Milan CPU sockets

 

Thus, we recommend using cores in chunks of 8, so that the threads of one task share the L3 cache of a single CCD. For example, a calculation with 32 cores could use 4 tasks (MPI ranks) with 8 threads per task.
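
One way to apply this recommendation is to list all task/thread combinations for a given core count in which every task occupies whole 8-core chunks. The sketch below is one possible reading of the recommendation, not a prescribed method; the helper `layouts` is hypothetical.

```python
# Sketch: enumerate task/thread layouts in which each task uses whole 8-core chunks.
CHUNK = 8  # cores per CCD, which share one L3 cache


def layouts(total_cores: int) -> list[tuple[int, int]]:
    """Return (tasks, threads_per_task) pairs where threads_per_task is a
    multiple of 8 and tasks * threads_per_task == total_cores."""
    if total_cores <= 0 or total_cores % CHUNK != 0:
        raise ValueError("total_cores should be a positive multiple of 8")
    result = []
    for threads in range(CHUNK, total_cores + 1, CHUNK):
        if total_cores % threads == 0:
            result.append((total_cores // threads, threads))
    return result


for tasks, threads in layouts(32):
    print(f"{tasks} tasks x {threads} threads per task")
```

For 32 cores this includes the layout from the example above (4 tasks with 8 threads each). Which layout performs best depends on the application.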

The layout of a compute node is depicted below. For other node types see

 
