Overview

All boards in Noctua 2 can be configured with customized FPGA-to-FPGA networking setup within each user job.

The Alveo U280 boards (each providing 2 QSFP28 connections with 100GbE support) can be configured with direct point-to-point links or connected to an Ethernet switch. The networking subsystem on the FPGAs needs to be configured as part of the user design, for example
- for streaming connections via point-to-point links with AuroraFlow GitHub - papeg/AuroraFlow: Ready-to-link, packaged Aurora IP on four QSFP28 lanes, providing 100Gb/s throughput
- for Ethernet connections via the Ethernet switch or point-to-point Ethernet links with ACCL GitHub - Xilinx/ACCL: Alveo Collective Communication Library: MPI-like communication operations for Xilinx Alveo accelerators.
The Bittware 520N boards (each providing 4 QSFP+ connections with 40GbE support) can only be configured with direct point-to-point links. The networking subsystem on the FPGAs is a direct streaming interface provided by the shell (BSP) when selecting the _max_ flavor of the BSP. (See also Constraint Overview )

The --fpgalinksyntax is used to describe the FPGA-to-FPGA network topology for all nodes and FPGAs involved in a specific job.

The FPGA-Link GUI online editor can be used to create, visualize and export desired network topologies in the --fpgalinksyntax.

The changeFPGALinkscommand line tool is used to configure and verify network topologies within a job.

--fpgalink Syntax

The notation nXX:aclY:chZ describes a unique serial channel endpoint within a job allocation according to the following pattern

nXX, e.g. n02 specifies the node ID within your allocation, starting with n00 for the first node, n02 will specify the third node of your allocation. You can not use higher node IDs than the number of nodes requested by the allocation. At allocation time, the node ID is translated to a concrete node name, e.g. fpga-0008.
aclY, e.i. acl0, acl1(and optionallyacl2) describe the first, second (and optionally third) FPGA board within each node.
chZ, e.i. ch0, ch1, (and optionally ch2, ch3) describe up to 4 external channel connections per FPGA.

Sample topology with two fpgalinks.

The syntax of complete--fpgalinkargument is
--fpgalink=nXX:aclY:chZ-nXX:aclY:chZ
to connect the thus specified pair of unique serial channel endpoints, or
--fpgalink=nXX:aclY:chZ-eth
to connect the specified unique serial channel endpoint to the Ethernet switch (only available for Alveo U280 cards).

A complete FPGA-to-FPGA network topology is described by a space separated list of--fpgalinkarguments that may contain each unique serial channel endpoint at most once.

FPGA-Link GUI Online Editor and Simple Example

The required topology descriptions in the--fpgalinksyntax can be generated and visualized with the FPGA-Link GUI online editor. The topology with two point-to-point links shown at the right side can be accessed and edited in the FPGA-Link GUI with this direct link. It also supports importing textual descriptions and exporting them for usage in a job script.

--fpgalink=n00:acl0:ch0-n00:acl1:ch0 --fpgalink=n00:acl1:ch1-n00:acl2:ch1

changeFPGALinks command line tool

In this section, we provide a simple example of using the tool. For more details, refer to changeFPGALinks .

Within a job on an FPGA node, the tool is loaded and used to configure the desired topology from above.

Sample jobscript saved as fpgaLinkExample.sh:

#!/bin/bash
#SBATCH -t 2:00:00
#SBATCH -N 1
#SBATCH -J "fpgalink-example"
#SBATCH -p fpga
#SBATCH --constraint=xilinx_u280_xrt2.15
#SBATCH -A <your_project>

## load module
ml reset
ml fpga/changeFPGAlinks
## invoke tool to set topology
changeFPGAlinks --fpgalink=n00:acl0:ch0-n00:acl1:ch0 --fpgalink=n00:acl1:ch1-n00:acl2:ch1

Invocation and output:

[tester@n2login5 test-changeFPGALinks]$ sbatch fpgaLinkExample.sh
There are currently no links set up.
Your nodes in this Job (19114376):
        n2fpga15
Started changing link-config with ID 7a1a065d-47ef-4e06-a193-cdf2da485122 and test links after setup.
START: Tue Feb  4 13:50:38 CET 2025
INFO: Request from user "tester" from job "19114376"
INFO: Nodelist of job: n2fpga15
INFO: Setting SPANK_FPGALINK0=n00:acl0:ch0-n00:acl1:ch0
INFO: Setting SPANK_FPGALINK1=n00:acl1:ch1-n00:acl2:ch1
Host list
n2fpga15
Generated connections
fpgalink n2fpga15:acl0:ch0-n2fpga15:acl1:ch0
fpgalink n2fpga15:acl1:ch1-n2fpga15:acl2:ch1
Topology configuration request accepted after 0.3800797462463379s
[{"in":"4.3.5","out":"4.3.7","response":{"status":"1","msg":"OK","description":"Cross Connection added successfully!"}},{"in":"4.3.8","out":"4.4.2","response":{"status":"1","msg":"OK","description":"Cross Connection added successfully!"}}]
Begin link tests
Iteration 1
n2fpga15:acl0:ch0:4.3.5>4.3.7 input: 6.85 output: 4.95 loss: 1.89
n2fpga15:acl1:ch0:4.3.7>4.3.5 input: 5.09 output: 3.36 loss: 1.73
n2fpga15:acl1:ch1:4.3.8>4.4.2 input: 6.62 output: 4.98 loss: 1.64
n2fpga15:acl2:ch1:4.4.2>4.3.8 input: 5.53 output: 3.72 loss: 1.81

To visualize this configuration click here:
https://pc2.github.io/fpgalink-gui/index.html?import=--fpgalink%3Dn2fpga15%3Aacl0%3Ach0-n2fpga15%3Aacl1%3Ach0%20--fpgalink%3Dn2fpga15%3Aacl1%3Ach1-n2fpga15%3Aacl2%3Ach1

Note that the final output of the tool is a link to the FPGA-Link GUI with pre-populated topology that can be used to vizualize and verify the setup.

Using Serial Channels in Design Flows

Xilinx Alveo U280

Refer to our AuroraFlow project for an example design implementing serial communication channels on the Alveo U280. Alternatively, you can use Xilinx ACCL for an Ethernet-based communication scheme.

Intel Stratix 10

All Intel Stratix 10 boards on Noctua 2 offer 4 point-to-point connections to other FPGA boards when the node is configured with a p520_max_sg280l BSP. Their use differs based on the used development flow.

OneAPI

Refer to the documentation on I/O Pipes for details on how to use the external serial channels. The channel IDs are mapped as follows:

Port 0: Channels 0 (read) and 1 (write)
Port 1: Channels 2 (read) and 3 (write)
Port 2: Channels 4 (read) and 5 (write)
Port 3: Channels 6 (read) and 7 (write)

The pipes need to be configured for a data type of width 256 bits. This could, for example, be a std::array<int, 8>. You may use a small C++ header like the following to bundle read and write pipes into a single channel class that has both read and write operations:

#include <sycl/ext/intel/fpga_extensions.hpp>
  
template <int portnum>
struct read_channel_id {
    static constexpr unsigned id = portnum * 2;
};

template <int portnum>
struct write_channel_id {
    static constexpr unsigned id = portnum * 2 + 1;
};

template <int portnum, class T, std::size_t min_capacity = 0>
// requires((portnum >= 0) && (portnum < 4) && (sizeof(T) == 32)) // C++20 only
struct external_channel
    : private sycl::ext::intel::kernel_readable_io_pipe<read_channel_id<portnum>, T, min_capacity>,
      private sycl::ext::intel::kernel_writeable_io_pipe<write_channel_id<portnum>, T, min_capacity> {

    using read_pipe = sycl::ext::intel::kernel_readable_io_pipe<read_channel_id<portnum>, T, min_capacity>;
    using write_pipe = sycl::ext::intel::kernel_writeable_io_pipe<write_channel_id<portnum>, T, min_capacity>;

    using read_pipe::read;
    using write_pipe::write;
};

OpenCL

From the OpenCL environment, these links are used as external serial channels. A status reg value of 0xfff11ff1 in the diagnose indicates an active connection.

PC2-Documentation

FPGA-to-FPGA Networking