FPGA-to-FPGA Networking
Overview
All boards in Noctua 2 can be configured with customized FPGA-to-FPGA networking setup within each user job.
The Alveo U280 boards (each providing 2 QSFP28 connections with 100GbE support) can be configured with direct point-to-point links or connected to an Ethernet switch. The networking subsystem on the FPGAs needs to be configured as part of the user design, for example
for streaming connections via point-to-point links with AuroraFlow GitHub - papeg/AuroraFlow: Ready-to-link, packaged Aurora IP on four QSFP28 lanes, providing 100Gb/s throughput
for Ethernet connections via the Ethernet switch or point-to-point Ethernet links with ACCL GitHub - Xilinx/ACCL: Alveo Collective Communication Library: MPI-like communication operations for Xilinx Alveo accelerators.
The Bittware 520N boards (each providing 4 QSFP+ connections with 40GbE support) can only be configured with direct point-to-point links. The networking subsystem on the FPGAs is a direct streaming interface provided by the shell (BSP) when selecting the
_max_
flavor of the BSP. (See also Constraint Overview )
The --fpgalink
syntax is used to describe the FPGA-to-FPGA network topology for all nodes and FPGAs involved in a specific job.
The FPGA-Link GUI online editor can be used to create, visualize and export desired network topologies in the --fpgalink
syntax.
The changeFPGALinkscommand line tool is used to configure and verify network topologies within a job.
--fpgalink Syntax
The notation nXX:aclY:chZ
describes a unique serial channel endpoint within a job allocation according to the following pattern
nXX
, e.g.n02
specifies the node ID within your allocation, starting withn00
for the first node,n02
will specify the third node of your allocation. You can not use higher node IDs than the number of nodes requested by the allocation. At allocation time, the node ID is translated to a concrete node name, e.g.fpga-0008
.aclY
, e.i.acl0, acl1
(and optionallyacl2
) describe the first, second (and optionally third) FPGA board within each node.chZ
, e.i.ch0
,ch1
, (and optionallych2, ch3
) describe up to 4 external channel connections per FPGA.
The syntax of complete--fpgalink
argument is--fpgalink=nXX:aclY:chZ-nXX:aclY:chZ
to connect the thus specified pair of unique serial channel endpoints, or --fpgalink=nXX:aclY:chZ-eth
to connect the specified unique serial channel endpoint to the Ethernet switch (only available for Alveo U280 cards).
A complete FPGA-to-FPGA network topology is described by a space separated list of--fpgalink
arguments that may contain each unique serial channel endpoint at most once.
FPGA-Link GUI Online Editor and Simple Example
The required topology descriptions in the--fpgalink
syntax can be generated and visualized with the FPGA-Link GUI online editor. The topology with two point-to-point links shown at the right side can be accessed and edited in the FPGA-Link GUI with this direct link. It also supports importing textual descriptions and exporting them for usage in a job script.
--fpgalink=n00:acl0:ch0-n00:acl1:ch0 --fpgalink=n00:acl1:ch1-n00:acl2:ch1
changeFPGALinks command line tool
In this section, we provide a simple example of using the tool. For more details, refer to changeFPGALinks .
Within a job on an FPGA node, the tool is loaded and used to configure the desired topology from above.
Sample jobscript saved as fpgaLinkExample.sh
:
#!/bin/bash
#SBATCH -t 2:00:00
#SBATCH -N 1
#SBATCH -J "fpgalink-example"
#SBATCH -p fpga
#SBATCH --constraint=xilinx_u280_xrt2.15
#SBATCH -A <your_project>
## load module
ml reset
ml fpga/changeFPGAlinks
## invoke tool to set topology
changeFPGAlinks --fpgalink=n00:acl0:ch0-n00:acl1:ch0 --fpgalink=n00:acl1:ch1-n00:acl2:ch1
Invocation and output:
[tester@n2login5 test-changeFPGALinks]$ sbatch fpgaLinkExample.sh
There are currently no links set up.
Your nodes in this Job (19114376):
n2fpga15
Started changing link-config with ID 7a1a065d-47ef-4e06-a193-cdf2da485122 and test links after setup.
START: Tue Feb 4 13:50:38 CET 2025
INFO: Request from user "tester" from job "19114376"
INFO: Nodelist of job: n2fpga15
INFO: Setting SPANK_FPGALINK0=n00:acl0:ch0-n00:acl1:ch0
INFO: Setting SPANK_FPGALINK1=n00:acl1:ch1-n00:acl2:ch1
Host list
n2fpga15
Generated connections
fpgalink n2fpga15:acl0:ch0-n2fpga15:acl1:ch0
fpgalink n2fpga15:acl1:ch1-n2fpga15:acl2:ch1
Topology configuration request accepted after 0.3800797462463379s
[{"in":"4.3.5","out":"4.3.7","response":{"status":"1","msg":"OK","description":"Cross Connection added successfully!"}},{"in":"4.3.8","out":"4.4.2","response":{"status":"1","msg":"OK","description":"Cross Connection added successfully!"}}]
Begin link tests
Iteration 1
n2fpga15:acl0:ch0:4.3.5>4.3.7 input: 6.85 output: 4.95 loss: 1.89
n2fpga15:acl1:ch0:4.3.7>4.3.5 input: 5.09 output: 3.36 loss: 1.73
n2fpga15:acl1:ch1:4.3.8>4.4.2 input: 6.62 output: 4.98 loss: 1.64
n2fpga15:acl2:ch1:4.4.2>4.3.8 input: 5.53 output: 3.72 loss: 1.81
To visualize this configuration click here:
https://pc2.github.io/fpgalink-gui/index.html?import=--fpgalink%3Dn2fpga15%3Aacl0%3Ach0-n2fpga15%3Aacl1%3Ach0%20--fpgalink%3Dn2fpga15%3Aacl1%3Ach1-n2fpga15%3Aacl2%3Ach1
Note that the final output of the tool is a link to the FPGA-Link GUI with pre-populated topology that can be used to vizualize and verify the setup.
Using Serial Channels in Design Flows
Xilinx Alveo U280
Refer to our AuroraFlow project for an example design implementing serial communication channels on the Alveo U280. Alternatively, you can use Xilinx ACCL for an Ethernet-based communication scheme.
Intel Stratix 10
All Intel Stratix 10 boards on Noctua 2 offer 4 point-to-point connections to other FPGA boards when the node is configured with a p520_max_sg280l BSP. Their use differs based on the used development flow.
OneAPI
Refer to the documentation on I/O Pipes for details on how to use the external serial channels. The channel IDs are mapped as follows:
Port 0: Channels 0 (read) and 1 (write)
Port 1: Channels 2 (read) and 3 (write)
Port 2: Channels 4 (read) and 5 (write)
Port 3: Channels 6 (read) and 7 (write)
The pipes need to be configured for a data type of width 256 bits. This could, for example, be a std::array<int, 8>
. You may use a small C++ header like the following to bundle read and write pipes into a single channel
class that has both read
and write
operations:
#include <sycl/ext/intel/fpga_extensions.hpp>
template <int portnum>
struct read_channel_id {
static constexpr unsigned id = portnum * 2;
};
template <int portnum>
struct write_channel_id {
static constexpr unsigned id = portnum * 2 + 1;
};
template <int portnum, class T, std::size_t min_capacity = 0>
// requires((portnum >= 0) && (portnum < 4) && (sizeof(T) == 32)) // C++20 only
struct external_channel
: private sycl::ext::intel::kernel_readable_io_pipe<read_channel_id<portnum>, T, min_capacity>,
private sycl::ext::intel::kernel_writeable_io_pipe<write_channel_id<portnum>, T, min_capacity> {
using read_pipe = sycl::ext::intel::kernel_readable_io_pipe<read_channel_id<portnum>, T, min_capacity>;
using write_pipe = sycl::ext::intel::kernel_writeable_io_pipe<write_channel_id<portnum>, T, min_capacity>;
using read_pipe::read;
using write_pipe::write;
};
OpenCL
From the OpenCL environment, these links are used as external serial channels. A status reg value of 0xfff11ff1 in the diagnose indicates an active connection.