Sample Topologies and Legacy fpgalink Interface to Slurm

Sample Topologies

[Figure: screenshot of the Pair topology in the FPGALink-GUI editor]

A number of predefined topologies can be created in the FPGALink-GUI online editor. Click on the advanced tutorial button for examples.

On the right, you see a topology called Pair, which connects each channel of one FPGA in a node to the corresponding channel of the second FPGA in the same node. It was created with a single click in the node context menu of the editor (import link to FPGALink-GUI).

Legacy fpgalink Interface to Slurm

As an alternative to the changeFPGALinks command line tool (described on the FPGA-to-FPGA Networking overview page, with further examples and details under changeFPGALinks), --fpgalink arguments can be passed directly to srun or salloc when targeting the Bittware 520N nodes. To do so, append the --fpgalink arguments from the editor to the Slurm command instead of using them as input to changeFPGALinks, and put quotation marks around the individual connection strings. The configuration is then applied at the start of the job.

srun -A pc2-mitarbeiter --constraint=bittware_520n_20.4.0_max -N 1 -t 10:00 -p fpga --fpgalink="n00:acl0:ch0-n00:acl1:ch0" --fpgalink="n00:acl0:ch1-n00:acl1:ch1" --fpgalink="n00:acl0:ch2-n00:acl1:ch2" --fpgalink="n00:acl0:ch3-n00:acl1:ch3" --pty bash

srun: Warning: The --fpgalink Slurm argument is deprecated and will be removed in future. Please use our dedicated changeFPGAlinks script instead. For more information, refer to our documentation at https://doku.pc2.uni-paderborn.de/pages/1903821/changeFPGALinks.
...
Summarizing most recent topology information and exporting FPGALINK variables:
Host list
  fpga-0004
Generated connections
  FPGALINK0=n2fpga28:acl0:ch0-n2fpga28:acl1:ch0
  FPGALINK1=n2fpga28:acl0:ch1-n2fpga28:acl1:ch1
  FPGALINK2=n2fpga28:acl0:ch2-n2fpga28:acl1:ch2
  FPGALINK3=n2fpga28:acl0:ch3-n2fpga28:acl1:ch3

We recommend using srun or sbatch, because salloc does not print this topology summary automatically (the configuration itself is still applied). When using salloc, you can still recover the information and set up your environment variables by invoking the changeFPGALinks command line tool.
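The exported FPGALINK variables can be read back inside the job. The following is a minimal sketch, assuming the variables are visible as environment variables named FPGALINK0, FPGALINK1, ... and that every value follows the host:aclN:chN-host:aclN:chN pattern seen in the output above. The function names parse_link and links_from_env are hypothetical helpers, not part of any PC2 tool. Because host names such as fpga-0004 contain a hyphen themselves, the sketch uses a regular expression instead of splitting on "-".

```python
import os
import re

# Endpoint pattern as it appears in the examples on this page: <host>:acl<N>:ch<N>
ENDPOINT = r"([^:]+):acl(\d+):ch(\d+)"
LINK = re.compile(rf"^{ENDPOINT}-{ENDPOINT}$")


def parse_link(link):
    """Split one connection string into two (host, acl, channel) tuples."""
    match = LINK.match(link)
    if match is None:
        raise ValueError(f"not a connection string: {link!r}")
    h0, a0, c0, h1, a1, c1 = match.groups()
    return (h0, int(a0), int(c0)), (h1, int(a1), int(c1))


def links_from_env(env=os.environ):
    """Collect FPGALINK0, FPGALINK1, ... in numeric order (assumed naming)."""
    found = {}
    for key, value in env.items():
        match = re.fullmatch(r"FPGALINK(\d+)", key)
        if match:
            found[int(match.group(1))] = parse_link(value)
    return [found[index] for index in sorted(found)]


if __name__ == "__main__":
    for source, target in links_from_env():
        print(source, "<->", target)
```

Sorting by the numeric suffix keeps FPGALINK10 after FPGALINK9, which a plain string sort would not.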

Predefined Topologies

The Slurm --fpgalink interface provides a set of predefined topologies as follows.

| Topology type | Invocation | Min-Max number of nodes | Brief description |
|---|---|---|---|
| pair | --fpgalink="pair" | 1-N | Pairwise connect the 2 FPGAs within each node |
| clique | --fpgalink="clique" | 2 | All-to-all connection for 2 nodes, 4 FPGAs |
| ring | --fpgalink="ringO" | 1-N | Ring with two links per direction, acl0 down, acl1 up |
| ring | --fpgalink="ringN" | 1-N | Ring with two links per direction, acl0 down, acl1 down |
| ring | --fpgalink="ringZ" | 1-N | Ring with two links per direction, acl0 and acl1 neighbors |
| torus | --fpgalink="torus2" | 1-N | Torus with 2 FPGAs per row |
| torus | --fpgalink="torus3" | 2-N | Torus with 3 FPGAs per row |
| torus | --fpgalink="torus4" | 2-N | Torus with 4 FPGAs per row |
| torus | --fpgalink="torus5" | 3-N | Torus with 5 FPGAs per row |
| torus | --fpgalink="torus6" | 3-N | Torus with 6 FPGAs per row |

Pair topology

Within each node, all channels of one FPGA board are connected to the respective channel of the other FPGA board. No connections between nodes are made.

The following example uses three nodes n00-n02 and connects within each node all four channels from the first FPGA board acl0 to the four channels of the second FPGA board acl1 (see figure). The pair topology example can be directly used in the FPGA-Link GUI using this link.

srun -p fpga -A pc2-mitarbeiter --constraint=19.2.0_max -N 3 --fpgalink=pair --pty bash

...
Summarizing most recent topology information and exporting FPGALINK variables:
Host list
  fpga-0001
  fpga-0002
  fpga-0003
Pair topology
Generated connections
  FPGALINK0=fpga-0001:acl0:ch0-fpga-0001:acl1:ch0
  FPGALINK1=fpga-0001:acl0:ch1-fpga-0001:acl1:ch1
  FPGALINK2=fpga-0001:acl0:ch2-fpga-0001:acl1:ch2
  FPGALINK3=fpga-0001:acl0:ch3-fpga-0001:acl1:ch3
  FPGALINK4=fpga-0002:acl0:ch0-fpga-0002:acl1:ch0
  FPGALINK5=fpga-0002:acl0:ch1-fpga-0002:acl1:ch1
  FPGALINK6=fpga-0002:acl0:ch2-fpga-0002:acl1:ch2
  FPGALINK7=fpga-0002:acl0:ch3-fpga-0002:acl1:ch3
  FPGALINK8=fpga-0003:acl0:ch0-fpga-0003:acl1:ch0
  FPGALINK9=fpga-0003:acl0:ch1-fpga-0003:acl1:ch1
  FPGALINK10=fpga-0003:acl0:ch2-fpga-0003:acl1:ch2
  FPGALINK11=fpga-0003:acl0:ch3-fpga-0003:acl1:ch3
Topology configuration request accepted after 0.297791957855s
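The pair rule is simple enough to reproduce by hand. The sketch below illustrates the rule and is not the code Slurm runs: it generates the same connection strings as the example output for a given host list. The function name pair_links and the default of four channels per FPGA are assumptions taken from the examples on this page.

```python
def pair_links(hosts, channels=4):
    """Connect channel i of acl0 to channel i of acl1 within every node."""
    links = []
    for host in hosts:
        for ch in range(channels):
            links.append(f"{host}:acl0:ch{ch}-{host}:acl1:ch{ch}")
    return links


if __name__ == "__main__":
    hosts = ["fpga-0001", "fpga-0002", "fpga-0003"]
    for index, link in enumerate(pair_links(hosts)):
        # Same KEY=VALUE shape as the "Generated connections" listing.
        print(f"FPGALINK{index}={link}")
```

For three nodes this yields twelve links, four per node, and no link ever crosses a node boundary.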

Clique topology

Within a pair of 2 nodes, each of the 4 FPGAs is connected to all 3 other FPGAs.

  • channel 0: to the same FPGA in the other node

  • channel 1: to the other FPGA in the same node

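  • channel 2: to the other FPGA in the other node

  • channel 3: also to the other FPGA in the other node, as a second link in parallel to channel 2 (see the generated connections in the example below).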

The following example uses two nodes n00-n01 and connects their four FPGAs as described above: channel 0 and channels 2 and 3 cross between the nodes, while channel 1 stays within each node (see figure). The clique topology example can be directly used in the FPGA-Link GUI using this link.

srun -p fpga -A pc2-mitarbeiter --constraint=19.2.0_max -N 2 --fpgalink=clique --pty bash

...
Summarizing most recent topology information and exporting FPGALINK variables:
Host list
  fpga-0013
  fpga-0014
Clique topology
Generated connections
  FPGALINK0=fpga-0013:acl0:ch0-fpga-0014:acl0:ch0
  FPGALINK1=fpga-0013:acl1:ch0-fpga-0014:acl1:ch0
  FPGALINK2=fpga-0013:acl0:ch1-fpga-0013:acl1:ch1
  FPGALINK3=fpga-0014:acl0:ch1-fpga-0014:acl1:ch1
  FPGALINK4=fpga-0013:acl0:ch2-fpga-0014:acl1:ch2
  FPGALINK5=fpga-0013:acl1:ch2-fpga-0014:acl0:ch2
  FPGALINK6=fpga-0013:acl0:ch3-fpga-0014:acl1:ch3
  FPGALINK7=fpga-0013:acl1:ch3-fpga-0014:acl0:ch3
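The clique wiring can be written down as a small table of endpoint pairs per channel. The following sketch reproduces the connection list from the example output above; the function name clique_links is hypothetical, and the channel 3 rows simply mirror what that output shows.

```python
def clique_links(node_a, node_b):
    """Return the eight clique connection strings for two nodes."""
    a0, a1 = f"{node_a}:acl0", f"{node_a}:acl1"
    b0, b1 = f"{node_b}:acl0", f"{node_b}:acl1"
    pairs = [
        (a0, b0, 0), (a1, b1, 0),  # ch0: same FPGA in the other node
        (a0, a1, 1), (b0, b1, 1),  # ch1: other FPGA in the same node
        (a0, b1, 2), (a1, b0, 2),  # ch2: other FPGA in the other node
        (a0, b1, 3), (a1, b0, 3),  # ch3: same partners as ch2 in the output shown
    ]
    return [f"{src}:ch{ch}-{dst}:ch{ch}" for src, dst, ch in pairs]


if __name__ == "__main__":
    for index, link in enumerate(clique_links("fpga-0013", "fpga-0014")):
        print(f"FPGALINK{index}={link}")
```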

Ring topology

This setup arranges all FPGAs in a ring that defines a "north" and a "south" neighbor for each FPGA. Channels 0 and 2 of each FPGA connect to the "north" neighbor, and channels 1 and 3 connect to the "south" neighbor. Thus, the local perspective of each FPGA within the topology is:

// local view from FPGA "local" to neighbors "north" and "south"
// ch0 and ch2 connect to neighbor "north"
local:ch0 <-> north:ch1
local:ch2 <-> north:ch3
// ch1 and ch3 connect to neighbor "south"
local:ch1 <-> south:ch0
local:ch3 <-> south:ch2

Three different variants define how the FPGAs are arranged into the ring:

// --fpgalink="ringO"
// ringO, going down in acl0 column and back up in acl1 column
// Column from north to south, end connected back to start
fpga-0001:acl0
fpga-0002:acl0
fpga-0003:acl0
fpga-0004:acl0
fpga-0004:acl1
fpga-0003:acl1
fpga-0002:acl1
fpga-0001:acl1

// --fpgalink="ringN"
// ringN, going down in acl0 column then down in acl1 column
// Column from north to south, end connected back to start
fpga-0001:acl0
fpga-0002:acl0
fpga-0003:acl0
fpga-0004:acl0
fpga-0001:acl1
fpga-0002:acl1
fpga-0003:acl1
fpga-0004:acl1

// --fpgalink="ringZ"
// ringZ, going down through nodes, zigzagging between acl0 and acl1
// Column from north to south, end connected back to start
fpga-0001:acl0
fpga-0001:acl1
fpga-0002:acl0
fpga-0002:acl1
fpga-0003:acl0
fpga-0003:acl1
fpga-0004:acl0
fpga-0004:acl1
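The three variants differ only in the order in which the FPGAs are visited. As a rough illustration, the orderings listed above can be derived from the host list as follows; ring_order is a hypothetical helper, not part of the Slurm interface.

```python
def ring_order(hosts, variant):
    """Return the FPGAs from north to south for ringO, ringN or ringZ."""
    acl0 = [f"{host}:acl0" for host in hosts]
    acl1 = [f"{host}:acl1" for host in hosts]
    if variant == "ringO":
        # Down the acl0 column, then back up the acl1 column.
        return acl0 + acl1[::-1]
    if variant == "ringN":
        # Down the acl0 column, then down the acl1 column.
        return acl0 + acl1
    if variant == "ringZ":
        # Node by node, alternating between acl0 and acl1.
        return [fpga for pair in zip(acl0, acl1) for fpga in pair]
    raise ValueError(f"unknown ring variant: {variant!r}")


if __name__ == "__main__":
    hosts = ["fpga-0001", "fpga-0002", "fpga-0003", "fpga-0004"]
    for variant in ("ringO", "ringN", "ringZ"):
        print(variant, ring_order(hosts, variant))
```

In every variant the last entry is connected back to the first, which closes the ring.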

Full example for a ringO with 4 nodes. See this example in the FPGA-Link GUI using this link.

srun -p fpga -A pc2-mitarbeiter --constraint=19.2.0_max -N 4 --fpgalink=ringO --pty bash

Summarizing most recent topology information and exporting FPGALINK variables:
Host list
  fpga-0009
  fpga-0010
  fpga-0011
  fpga-0012
Ring topology information: column from north to south, end connected back to start
  fpga-0009:acl0
  fpga-0010:acl0
  fpga-0011:acl0
  fpga-0012:acl0
  fpga-0012:acl1
  fpga-0011:acl1
  fpga-0010:acl1
  fpga-0009:acl1
Generated connections
  FPGALINK0=fpga-0009:acl0:ch1-fpga-0010:acl0:ch0
  FPGALINK1=fpga-0009:acl0:ch3-fpga-0010:acl0:ch2
  FPGALINK2=fpga-0010:acl0:ch1-fpga-0011:acl0:ch0
  FPGALINK3=fpga-0010:acl0:ch3-fpga-0011:acl0:ch2
  FPGALINK4=fpga-0011:acl0:ch1-fpga-0012:acl0:ch0
  FPGALINK5=fpga-0011:acl0:ch3-fpga-0012:acl0:ch2
  FPGALINK6=fpga-0012:acl0:ch1-fpga-0012:acl1:ch0
  FPGALINK7=fpga-0012:acl0:ch3-fpga-0012:acl1:ch2
  FPGALINK8=fpga-0012:acl1:ch1-fpga-0011:acl1:ch0
  FPGALINK9=fpga-0012:acl1:ch3-fpga-0011:acl1:ch2
  FPGALINK10=fpga-0011:acl1:ch1-fpga-0010:acl1:ch0
  FPGALINK11=fpga-0011:acl1:ch3-fpga-0010:acl1:ch2
  FPGALINK12=fpga-0010:acl1:ch1-fpga-0009:acl1:ch0
  FPGALINK13=fpga-0010:acl1:ch3-fpga-0009:acl1:ch2
  FPGALINK14=fpga-0009:acl1:ch1-fpga-0009:acl0:ch0
  FPGALINK15=fpga-0009:acl1:ch3-fpga-0009:acl0:ch2
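Given a ring order, each FPGA contributes two links to its "south" neighbor (ch1 to ch0 and ch3 to ch2). The matching "north" links are the same cables seen from the other end. The sketch below applies this rule to the ring order printed in the output above and yields the same sixteen connection strings; ring_links is a hypothetical name used for illustration only.

```python
def ring_links(order):
    """Link every FPGA to its south neighbor; the last wraps to the first."""
    links = []
    count = len(order)
    for index, fpga in enumerate(order):
        south = order[(index + 1) % count]
        links.append(f"{fpga}:ch1-{south}:ch0")
        links.append(f"{fpga}:ch3-{south}:ch2")
    return links


if __name__ == "__main__":
    # ringO order as printed in the example output above.
    order = [
        "fpga-0009:acl0", "fpga-0010:acl0", "fpga-0011:acl0", "fpga-0012:acl0",
        "fpga-0012:acl1", "fpga-0011:acl1", "fpga-0010:acl1", "fpga-0009:acl1",
    ]
    for index, link in enumerate(ring_links(order)):
        print(f"FPGALINK{index}={link}")
```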

Torus topology

This setup puts all FPGAs in a torus topology that defines for each FPGA the neighbor FPGAs "north", "south", "west", "east". It connects each FPGA's

  • channel 0 to the "north" direction,

  • channel 1 to the "south" direction,

  • channel 2 to the "west" direction and

  • channel 3 to the "east" direction.

Thus, the local perspective of each FPGA within the topology is:

// local view from FPGA "local" to neighbors "north", "south", "west", "east"
// ch0 connects to neighbor "north"
local:ch0 <-> north:ch1
// ch1 connects to neighbor "south"
local:ch1 <-> south:ch0
// ch2 connects to neighbor "west"
local:ch2 <-> west:ch3
// ch3 connects to neighbor "east"
local:ch3 <-> east:ch2

The torus topology can be instantiated with a configurable width, that is, the number of FPGAs connected in the "west-east" direction. With an odd width, the two FPGAs of one node can belong to consecutive rows of the torus. The number of FPGAs used is rounded down to the largest full torus for the given width, so some FPGAs may remain unused. The following block illustrates 3 different torus topologies on nodes fpga-[0001-0005].

// --fpgalink="torus2"
// Torus with width 2 and height 5
// Columns from north to south, rows from west to east, end connected back to start
fpga-0001:acl0 - fpga-0001:acl1
fpga-0002:acl0 - fpga-0002:acl1
fpga-0003:acl0 - fpga-0003:acl1
fpga-0004:acl0 - fpga-0004:acl1
fpga-0005:acl0 - fpga-0005:acl1

// --fpgalink="torus3"
// Torus with width 3 and height 3
// Columns from north to south, rows from west to east, end connected back to start
fpga-0001:acl0 - fpga-0001:acl1 - fpga-0002:acl0
fpga-0002:acl1 - fpga-0003:acl0 - fpga-0003:acl1
fpga-0004:acl0 - fpga-0004:acl1 - fpga-0005:acl0

// --fpgalink="torus4"
// Torus with width 4 and height 2
// Columns from north to south, rows from west to east, end connected back to start
fpga-0001:acl0 - fpga-0001:acl1 - fpga-0002:acl0 - fpga-0002:acl1
fpga-0003:acl0 - fpga-0003:acl1 - fpga-0004:acl0 - fpga-0004:acl1
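The row layout and the rounding-down behavior can be illustrated with integer division. This is a minimal sketch under the assumption that FPGAs are filled into rows in host order, acl0 before acl1, as in the listings above; torus_grid is a hypothetical helper.

```python
def torus_grid(hosts, width):
    """Arrange FPGAs row by row; FPGAs beyond the last full row are dropped."""
    fpgas = [f"{host}:acl{acl}" for host in hosts for acl in (0, 1)]
    height = len(fpgas) // width  # round down to the largest full torus
    return [fpgas[row * width:(row + 1) * width] for row in range(height)]


if __name__ == "__main__":
    hosts = [f"fpga-{number:04d}" for number in range(1, 6)]  # fpga-[0001-0005]
    for width in (2, 3, 4):
        print(f"torus{width}")
        for row in torus_grid(hosts, width):
            print("  " + " - ".join(row))
```

With five nodes (ten FPGAs), width 3 uses nine FPGAs and leaves fpga-0005:acl1 unused, while width 4 uses eight and leaves both FPGAs of fpga-0005 unused.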

Full example for a torus4 with 8 nodes. See this example in the FPGA-Link GUI using this link.

srun -p fpga -A pc2-mitarbeiter --constraint=19.2.0_max -N 8 --fpgalink=torus4 --pty bash

Summarizing most recent topology information and exporting FPGALINK variables:
Host list
  fpga-0001
  fpga-0002
  fpga-0003
  fpga-0004
  fpga-0005
  fpga-0006
  fpga-0007
  fpga-0008
Torus topology with width 4 and height 4
Torus topology information: columns from north to south, rows from west to east, end connected back to start
  fpga-0001:acl0 - fpga-0001:acl1 - fpga-0002:acl0 - fpga-0002:acl1
  fpga-0003:acl0 - fpga-0003:acl1 - fpga-0004:acl0 - fpga-0004:acl1
  fpga-0005:acl0 - fpga-0005:acl1 - fpga-0006:acl0 - fpga-0006:acl1
  fpga-0007:acl0 - fpga-0007:acl1 - fpga-0008:acl0 - fpga-0008:acl1
Generated connections
  FPGALINK0=fpga-0001:acl0:ch1-fpga-0003:acl0:ch0
  FPGALINK1=fpga-0001:acl0:ch3-fpga-0001:acl1:ch2
  FPGALINK2=fpga-0001:acl1:ch1-fpga-0003:acl1:ch0
  FPGALINK3=fpga-0001:acl1:ch3-fpga-0002:acl0:ch2
  FPGALINK4=fpga-0002:acl0:ch1-fpga-0004:acl0:ch0
  FPGALINK5=fpga-0002:acl0:ch3-fpga-0002:acl1:ch2
  FPGALINK6=fpga-0002:acl1:ch1-fpga-0004:acl1:ch0
  FPGALINK7=fpga-0002:acl1:ch3-fpga-0001:acl0:ch2
  FPGALINK8=fpga-0003:acl0:ch1-fpga-0005:acl0:ch0
  FPGALINK9=fpga-0003:acl0:ch3-fpga-0003:acl1:ch2
  FPGALINK10=fpga-0003:acl1:ch1-fpga-0005:acl1:ch0
  FPGALINK11=fpga-0003:acl1:ch3-fpga-0004:acl0:ch2
  FPGALINK12=fpga-0004:acl0:ch1-fpga-0006:acl0:ch0
  FPGALINK13=fpga-0004:acl0:ch3-fpga-0004:acl1:ch2
  FPGALINK14=fpga-0004:acl1:ch1-fpga-0006:acl1:ch0
  FPGALINK15=fpga-0004:acl1:ch3-fpga-0003:acl0:ch2
  FPGALINK16=fpga-0005:acl0:ch1-fpga-0007:acl0:ch0
  FPGALINK17=fpga-0005:acl0:ch3-fpga-0005:acl1:ch2
  FPGALINK18=fpga-0005:acl1:ch1-fpga-0007:acl1:ch0
  FPGALINK19=fpga-0005:acl1:ch3-fpga-0006:acl0:ch2
  FPGALINK20=fpga-0006:acl0:ch1-fpga-0008:acl0:ch0
  FPGALINK21=fpga-0006:acl0:ch3-fpga-0006:acl1:ch2
  FPGALINK22=fpga-0006:acl1:ch1-fpga-0008:acl1:ch0
  FPGALINK23=fpga-0006:acl1:ch3-fpga-0005:acl0:ch2
  FPGALINK24=fpga-0007:acl0:ch1-fpga-0001:acl0:ch0
  FPGALINK25=fpga-0007:acl0:ch3-fpga-0007:acl1:ch2
  FPGALINK26=fpga-0007:acl1:ch1-fpga-0001:acl1:ch0
  FPGALINK27=fpga-0007:acl1:ch3-fpga-0008:acl0:ch2
  FPGALINK28=fpga-0008:acl0:ch1-fpga-0002:acl0:ch0
  FPGALINK29=fpga-0008:acl0:ch3-fpga-0008:acl1:ch2
  FPGALINK30=fpga-0008:acl1:ch1-fpga-0002:acl1:ch0
  FPGALINK31=fpga-0008:acl1:ch3-fpga-0007:acl0:ch2
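The connection list of a torus follows from two rules per FPGA: ch1 goes to ch0 of the "south" neighbor and ch3 goes to ch2 of the "east" neighbor, with wrap-around at the edges. The sketch below is an illustration that reproduces the 32 links of the torus4 example, not the actual Slurm plugin code; torus_links is a hypothetical name, and the row-major fill order is inferred from the output above.

```python
def torus_links(hosts, width):
    """Generate torus connection strings for the given hosts and row width."""
    fpgas = [f"{host}:acl{acl}" for host in hosts for acl in (0, 1)]
    height = len(fpgas) // width  # incomplete last row is dropped
    grid = [fpgas[row * width:(row + 1) * width] for row in range(height)]

    links = []
    for row in range(height):
        for col in range(width):
            fpga = grid[row][col]
            south = grid[(row + 1) % height][col]  # wraps bottom to top
            east = grid[row][(col + 1) % width]    # wraps right to left
            links.append(f"{fpga}:ch1-{south}:ch0")
            links.append(f"{fpga}:ch3-{east}:ch2")
    return links


if __name__ == "__main__":
    hosts = [f"fpga-{number:04d}" for number in range(1, 9)]  # 8 nodes
    for index, link in enumerate(torus_links(hosts, 4)):
        print(f"FPGALINK{index}={link}")
```

Only the south and east links are emitted, because the north and west links of an FPGA are the south and east links of its neighbors.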
