Otus FPGA Pilot Phase
We are currently in the pilot phase for the FPGA partition. In this stage, we have acquired a small number of FPGA boards and are evaluating their suitability for our requirements. More FPGAs from different vendors will be added in the near future.
- 1 Hardware Setup
- 2 Software Stacks
- 2.1 SLASH VRT (V80 Runtime)
- 2.1.1 Example Applications
- 2.1.1.1 Load required VRT module
- 2.1.1.2 Get example repository
- 2.1.1.3 Build examples for emulation
- 2.1.2 Hardware Execution
- 2.1.2.1 Allocate node with V80 card
- 2.1.2.2 Load the required modules
- 2.1.2.3 Check FPGA status
- 2.1.2.4 Execute example in hardware
- 2.1.2.5 (Optional) Repeat steps for simulation and hardware build
- 3 Early Access and Troubleshooting
Hardware Setup
Currently, the FPGA pilot phase of Otus comprises:

- 3x Alveo V80 FPGAs (see the official product page)
An overview of the current hardware setup is given in the following table:
| Hostname | Accelerator in PCIe slot 61:00.0 | Accelerator in PCIe slot 71:00.0 | Accelerator in PCIe slot 91:00.0 | Note |
|---|---|---|---|---|
| fpga1611 | Nvidia A40 GPU | Nvidia A40 GPU | Alveo V80 FPGA | Task Parallel System Composer (TaPaSCo) |
| fpga1612 | Alveo V80 FPGA | Alveo U55C | | Currently configured with AVED 25.1 without VRT |
| fpga1613 | Nvidia A40 GPU | Nvidia A40 GPU | Alveo V80 FPGA | Currently configured with VRT based on AVED 24.1 |
Please note:

- you only need to allocate one of these nodes for hardware execution; you can compile, emulate, simulate, and synthesize on any other Otus node.
- the Nvidia A40 GPUs were used for thermal evaluations of the nodes and will be removed in the future.
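The PCIe IDs in the table columns are BDF addresses (bus:device.function). As a quick illustration, a BDF can be split into its components with POSIX parameter expansion (a minimal sketch of our own; the variable names are not part of any tool):

```shell
# Split a PCIe BDF (bus:device.function) into its parts.
# The address below is one of the slots from the table above.
bdf="91:00.0"
bus=${bdf%%:*}        # "91" (PCI bus)
rest=${bdf#*:}
dev=${rest%%.*}       # "00" (device)
fn=${rest#*.}         # "0"  (function)
echo "bus=$bus device=$dev function=$fn"
```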
Software Stacks
Currently, we test the following software stacks in our pilot phase.
SLASH VRT (V80 Runtime)
The V80 cards can be used via VRT (a similar abstraction to Xilinx XRT):

- see the technical details in the official repository: https://github.com/Xilinx/SLASH
- we maintain a RHEL9/Rocky 9 fork of SLASH/VRT at: https://github.com/pc2/SLASH

The VRT tool flow can be used with modules on any node of Otus:
```shell
module reset
module load fpga
module load fpga/xilinx/vrt/0.1
```

Please note:

- the Lmod warning regarding `ncurses` is not critical.
Example Applications
VRT comes with example applications (see https://github.com/Xilinx/SLASH/tree/dev/examples) to test the basic functionality.
To run them, follow these steps:
Load required VRT module
```shell
module reset
module load fpga
module load fpga/xilinx/vrt/0.1
```

- loads the main VRT tool flow and all required dependencies
- sets up the required environment variables (mainly `AMI_HOME`)
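As a quick sanity check after loading the modules, you can verify that the expected environment variables are present. The helper below is our own sketch (the function name `check_vrt_env` is an assumption, not part of the VRT tool flow):

```shell
# check_vrt_env: hypothetical helper that reports whether an
# environment variable expected by the tool flow is set.
check_vrt_env() {
  eval "val=\${$1:-}"
  if [ -n "$val" ]; then
    echo "$1 OK"
  else
    echo "$1 missing"
  fi
}

# On the cluster you would check the variable set by the module:
#   check_vrt_env AMI_HOME
# Illustration with a variable we set ourselves:
DEMO_HOME=/opt/demo
check_vrt_env DEMO_HOME   # prints "DEMO_HOME OK"
```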
Get example repository
```shell
git clone https://github.com/pc2/SLASH
cd SLASH
export SLASHBASE=`pwd`
git switch dev
git submodule update --init --recursive --remote
```

- uses the tested VRT/SLASH version maintained by PC2
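The build steps below rely on `SLASHBASE`, so it is worth confirming it points at the clone before continuing. A minimal check of our own (not part of SLASH):

```shell
# Our own sanity check: SLASHBASE should point at the cloned
# repository, since the examples below use it in their paths.
SLASHBASE="${SLASHBASE:-$PWD}"   # fallback only so the snippet runs anywhere
if [ -d "$SLASHBASE" ]; then
  echo "SLASHBASE OK: $SLASHBASE"
else
  echo "SLASHBASE is not a directory: $SLASHBASE" >&2
fi
```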
Build examples for emulation
```shell
cd $SLASHBASE/examples/00_axilite
# build for emulation
make emu_all
# execute example
cd build
./00_axilite 61:00.0 00_axilite_emu.vrtbin
# expected output
VRT Version: v1.0.0
Generating data...
Time taken for waits: 0 us
Expected: 1541.83
Got: 1541.83
Test passed!
```

- `00_axilite`: example to test linking and AXI-Lite control
- emulation: everything runs on the CPU; see below for the simulation and hardware builds.
- `61:00.0`: usually the PCIe ID (BDF) of the V80 card to use. Can be any reasonable value for emulation.
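Since the examples print `Test passed!` on success, a small wrapper can turn that into an exit code, e.g. for scripted smoke tests. This is our own sketch (the function name and usage are assumptions, not part of VRT):

```shell
# check_example: succeed only if the example's output contains the
# success marker printed by the VRT examples.
check_example() {
  printf '%s\n' "$1" | grep -q "Test passed!"
}

# On the cluster you would capture real output, e.g.:
#   out=$(./00_axilite 61:00.0 00_axilite_emu.vrtbin)
# Here we reuse the expected output from above for illustration:
out="VRT Version: v1.0.0
Generating data...
Test passed!"
if check_example "$out"; then
  echo "example succeeded"
else
  echo "example failed" >&2
fi
```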
Hardware Execution
If you want to execute a design in hardware, you need to allocate an FPGA node that has at least one V80 card attached.
Allocate node with V80 card
You can use this command to get node fpga1612 for one hour:
```shell
srun --partition=fpga -t 01:00:00 -w fpga1612 --pty bash
```

Load the required modules
```shell
module reset
module load fpga
module load fpga/xilinx/vrt/0.1
```

Check FPGA status
Get the current status of the FPGAs via ami_tool. For example on fpga1612 with two V80 cards:
```
$ ami_tool overview

AMI
-------------------------------------------------------------
Version         | 2.3.0  (0)
Branch          |
Hash            | 0bab29e568f64a25f17425c0ffd1c0e89609b6d1
Hash Date       | 20240307
Driver Version  | 2.3.0  (0)

BDF     | Device       | UUID                             | AMC        | State
----------------------------------------------------------------------------------------
61:00.0 | ALVEO V80 PQ | c451a8335000954c2f45abc32d98c87e | 2.3.0 (0)  | READY
91:00.0 | ALVEO V80 PQ | 4a424e194c90ae9bc94fdc95d3c191fe | 2.3.0 (0)  | READY
```

- the most important output is the `State` column; only FPGAs in `READY` state should be used.
- you can use the BDF `61:00.0` or `91:00.0` to identify one of the cards
- use `ami_tool -h` for all options
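If you want to pick a usable card programmatically, the `State` column can be filtered from the overview output. A minimal sketch (our own awk one-liner, not an `ami_tool` feature), shown here on a captured sample instead of a live call; the second card is deliberately given a non-READY state for illustration:

```shell
# Print the BDFs of all cards whose State column is READY.
# 'sample' mimics the table part of `ami_tool overview`.
sample='61:00.0 | ALVEO V80 PQ | c451a8335000954c2f45abc32d98c87e | 2.3.0 (0)  | READY
91:00.0 | ALVEO V80 PQ | 4a424e194c90ae9bc94fdc95d3c191fe | 2.3.0 (0)  | MISSING'

printf '%s\n' "$sample" | awk -F'|' '
  { state = $NF; gsub(/ /, "", state) }   # strip padding from the State cell
  state == "READY" { bdf = $1; gsub(/ /, "", bdf); print bdf }
'
```

On the cluster you would pipe the live output instead: `ami_tool overview | awk …` (skipping the header lines as needed).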
Execute example in hardware
To speed up the process and avoid spending resources on unnecessary synthesis, we have pre-synthesized the design for example 00_axilite. The vrtbin file with the hardware design is located at:
```
/opt/software/FPGA/Xilinx/VRT/vrt_0.1/examples/00_axilite_hw.vrtbin
```

You can run the design in hardware with

```shell
./00_axilite 91:00.0 /opt/software/FPGA/Xilinx/VRT/vrt_0.1/examples/00_axilite_hw.vrtbin
```

- uses device `91:00.0` and the pre-synthesized design `00_axilite_hw.vrtbin`
- if you do not have the host code, see the emulation build and execution above
If you want to synthesize a design yourself, see the description below.
(Optional) Repeat steps for simulation and hardware build
Hardware simulation can be performed with
```shell
# Simulation
make sim_all
# execute example
cd build
./00_axilite 61:00.0 00_axilite_sim.vrtbin
```

To synthesize a design yourself instead of using the pre-synthesized version, use these steps:
```shell
#!/bin/sh
# synthesis_script.sh
#SBATCH -t 24:00:00
#SBATCH --cpus-per-task=8
#SBATCH --mem=64G
#SBATCH -A <your_project_acronym>
#SBATCH -p normal

module reset
module load fpga
module load fpga/xilinx/vrt/0.1

# Hardware
make hw_all
```

Then, we submit synthesis_script.sh to the Slurm workload manager:

```shell
sbatch ./synthesis_script.sh
```

Afterwards, you can use the generated 00_axilite_hw.vrtbin as described above for hardware execution.
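Synthesis can take many hours, so it is worth checking that the job actually produced the vrtbin before allocating an FPGA node to run it. A small helper of our own (the function name and the output path in the usage comment are assumptions):

```shell
# check_vrtbin: hypothetical helper that verifies a vrtbin exists and
# is readable before you allocate an FPGA node to run it.
check_vrtbin() {
  if [ -r "$1" ]; then
    echo "found: $1"
  else
    echo "missing: $1 - check the Slurm job log" >&2
    return 1
  fi
}

# Usage on the cluster (the output location is an assumption on our side):
#   check_vrtbin build/00_axilite_hw.vrtbin
```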
Early Access and Troubleshooting
If you are interested in getting early access to the FPGA partition during the pilot phase, or have questions or issues with the current setup, please contact us via email.