AMD EPYC Milan
The compute nodes, FPGA nodes and GPU nodes of Noctua 2 are equipped with dual-socket AMD EPYC Milan CPUs.
Specification
The specification is as follows:
Property | normal, largemem, gpu | hugemem, fpga-nodes |
---|---|---|
CPUs | 2xAMD EPYC Milan 7763 | 2xAMD EPYC Milan 7713 |
Microarchitecture | Zen3 | Zen3 |
Cores per node in total | 128 | 128 |
Cores per socket | 64 | 64 |
Sockets per node | 2 | 2 |
SMT | off | off |
L3 Cache per socket | 256 MB | 256 MB |
TDP per socket | 280 W | 225 W |
Base frequency | 2.45 GHz | 2.0 GHz |
Max. Boost frequency | 3.5 GHz | 3.675 GHz |
Main memory | 256 GB (normal), 1024 GB (largemem), 512 GB (gpu), DDR4 3200 MHz | 2048 GB (hugemem), 512 GB (fpga), DDR4 3200 MHz |
Memory channels per socket | 8 | 8 |
Memory bandwidth (Stream triad) | ~370 GB/s | ~370 GB/s |
Floating-Point Performance (single-node HPL) | ~4.0 TFLOP/s | ~3.7 TFLOP/s |
Floating-Point Performance (single-node HPCG) | ~62 GFLOP/s | ~62 GFLOP/s |
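To put the Stream triad value into context, the following minimal sketch estimates the theoretical peak memory bandwidth of a node from the channel count in the table. It assumes that "DDR4 3200 MHz" denotes DDR4-3200 (3200 million transfers per second) and that each channel is 64 bits wide; neither detail is stated on this page, and the function name is purely illustrative.

```python
# Theoretical peak memory bandwidth of one node vs. the measured Stream triad value.
SOCKETS_PER_NODE = 2            # from the table
CHANNELS_PER_SOCKET = 8         # from the table
TRANSFERS_PER_SECOND = 3200e6   # assumption: "DDR4 3200 MHz" means DDR4-3200 (3200 MT/s)
BYTES_PER_TRANSFER = 8          # assumption: 64-bit wide DDR4 channel
STREAM_TRIAD_GBS = 370.0        # measured value from the table


def peak_bandwidth_gbs(sockets=SOCKETS_PER_NODE, channels=CHANNELS_PER_SOCKET):
    """Return the theoretical peak memory bandwidth in GB/s."""
    bytes_per_second = sockets * channels * TRANSFERS_PER_SECOND * BYTES_PER_TRANSFER
    return bytes_per_second / 1e9


if __name__ == "__main__":
    peak = peak_bandwidth_gbs()
    print(f"theoretical peak: {peak:.1f} GB/s")
    print(f"Stream triad efficiency: {STREAM_TRIAD_GBS / peak:.0%}")
```

Under these assumptions the peak is about 410 GB/s per node, so the measured ~370 GB/s corresponds to roughly 90% of the theoretical value.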
Important Properties
SMT (aka Hyper-Threading)
SMT is disabled on Noctua 2.
Instruction Set
The AMD Milan (Zen3) CPUs support SIMD instructions up to 256 bits wide, i.e. AVX2. AVX-512 is not available.
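The SIMD width can be related to the HPL numbers in the specification table with a rough estimate of the theoretical double-precision peak. The sketch below assumes two 256-bit FMA units per Zen3 core and a clock equal to the base frequency; both are assumptions that are not stated on this page, and the real clock under load may differ.

```python
# Rough theoretical double-precision peak of one node at base frequency.
SIMD_BITS = 256          # from the text: 256-bit SIMD (AVX2)
DOUBLE_BITS = 64         # size of a double-precision value
FMA_UNITS_PER_CORE = 2   # assumption: two 256-bit FMA pipes per Zen3 core
FLOPS_PER_FMA = 2        # a fused multiply-add counts as two operations
CORES_PER_NODE = 128     # from the table


def flops_per_cycle_per_core():
    """Double-precision FLOP per cycle and core under the assumptions above."""
    return FMA_UNITS_PER_CORE * (SIMD_BITS // DOUBLE_BITS) * FLOPS_PER_FMA


def peak_tflops(base_ghz, cores=CORES_PER_NODE):
    """Theoretical peak in TFLOP/s, assuming all cores run at base_ghz."""
    return cores * base_ghz * flops_per_cycle_per_core() / 1000.0


if __name__ == "__main__":
    # (CPU model, base frequency in GHz, measured single-node HPL in TFLOP/s)
    for model, ghz, hpl in [("EPYC 7763", 2.45, 4.0), ("EPYC 7713", 2.0, 3.7)]:
        peak = peak_tflops(ghz)
        print(f"{model}: peak {peak:.2f} TFLOP/s, HPL efficiency {hpl / peak:.0%}")
```

With these assumptions the estimate is about 5.0 TFLOP/s for the 7763 nodes and about 4.1 TFLOP/s for the 7713 nodes, which puts the measured HPL values at roughly 80% and 90% of the base-frequency peak.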
NUMA-Domains
The CPUs run in NPS=4 mode (NPS: NUMA nodes per socket). This means that there are 4 NUMA domains per socket, so a single node has 8 NUMA domains in total.
Topology
The basic unit of an AMD EPYC Milan CPU is a CCD (Core Complex Die) with 8 CPU cores and 32 MB of L3 cache. In the above-mentioned NPS=4 configuration, two such CCDs make up a NUMA domain, and there are 4 NUMA domains on one socket.
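The hierarchy described above can be written down as a small model to check that the numbers are consistent with the specification table. This is only a sketch: the class and method names are hypothetical, and `locate` assumes that core IDs are numbered contiguously CCD by CCD, which should be verified on the actual system (for example with `lscpu` or `numactl --hardware`).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeTopology:
    """Illustrative model of a Noctua 2 node in NPS=4 mode (values from the text)."""
    sockets: int = 2
    numa_per_socket: int = 4
    ccds_per_numa: int = 2
    cores_per_ccd: int = 8
    l3_per_ccd_mb: int = 32

    @property
    def cores_per_numa(self) -> int:
        return self.ccds_per_numa * self.cores_per_ccd

    @property
    def numa_domains(self) -> int:
        return self.sockets * self.numa_per_socket

    @property
    def total_cores(self) -> int:
        return self.numa_domains * self.cores_per_numa

    @property
    def l3_per_socket_mb(self) -> int:
        return self.numa_per_socket * self.ccds_per_numa * self.l3_per_ccd_mb

    def locate(self, core_id: int) -> tuple[int, int, int]:
        """Return (socket, NUMA domain, CCD) of a core.

        Assumption: cores are numbered contiguously, CCD by CCD.
        """
        if not 0 <= core_id < self.total_cores:
            raise ValueError(f"core_id must be in [0, {self.total_cores})")
        ccd = core_id // self.cores_per_ccd
        numa = ccd // self.ccds_per_numa
        socket = numa // self.numa_per_socket
        return socket, numa, ccd


if __name__ == "__main__":
    node = NodeTopology()
    print(node.total_cores, node.numa_domains, node.cores_per_numa, node.l3_per_socket_mb)
    print(node.locate(17))
```

The derived totals (128 cores, 8 NUMA domains, 256 MB L3 per socket) match the specification table, and each NUMA domain holds 16 cores.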
Thus, we recommend using chunks of 8 cores, so that the threads of one task share a single CCD and its L3 cache. For example, a calculation with 32 cores would use 4 tasks (MPI ranks) and 8 threads per task.
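The recommendation can be turned into a small helper that splits a requested core count into tasks and threads. This is a minimal sketch with a hypothetical function name; it only encodes the "chunks of 8 cores" rule from the text and does not talk to any scheduler.

```python
CORES_PER_CCD = 8  # from the text: one CCD has 8 cores sharing 32 MB of L3 cache


def suggest_layout(total_cores: int, threads_per_task: int = CORES_PER_CCD) -> tuple[int, int]:
    """Split total_cores into (tasks, threads_per_task) using CCD-sized chunks.

    Raises ValueError if the request cannot be divided into whole chunks.
    """
    if total_cores <= 0:
        raise ValueError("total_cores must be positive")
    if total_cores % threads_per_task != 0:
        raise ValueError(
            f"{total_cores} cores cannot be split into chunks of {threads_per_task}"
        )
    return total_cores // threads_per_task, threads_per_task


if __name__ == "__main__":
    tasks, threads = suggest_layout(32)
    print(f"{tasks} tasks (MPI ranks) with {threads} threads each")
```

For 32 cores this yields 4 tasks with 8 threads each, as in the example above. In a hybrid MPI/OpenMP program, the thread count would typically be passed on via `OMP_NUM_THREADS`.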
The layout of a compute node is depicted below. For other node types see