Otus FPGA Pilot Phase
We are currently in the pilot phase for the FPGA partition. In this stage, we have acquired a small number of FPGA boards and are evaluating their suitability for our requirements. More FPGAs from different vendors will be added in the near future.
Hardware Setup
The FPGA pilot phase of Otus currently includes 3x Alveo V80 FPGAs (see the official product page).
An overview of the current setup is given in the following table:
| Hostname | Accelerator in PCIe slot 61:00.0 | Accelerator in PCIe slot 71:00.0 | Accelerator in PCIe slot 91:00.0 | Note |
|---|---|---|---|---|
| fpga1611 | Nvidia A40 GPU | Alveo V80 FPGA | Nvidia A40 GPU | |
| fpga1612 | Alveo V80 FPGA | Alveo V80 FPGA | Nvidia A40 GPU | |
Please note: the Nvidia A40 GPUs were used for thermal evaluations of the nodes and will be removed in the future.
Software Stacks
Currently we test the following software stacks in our pilot phase.
SLASH VRT (V80 Runtime)
The V80 cards can be used via VRT, which provides an abstraction similar to Xilinx XRT.
For technical details, see the official repository: https://github.com/Xilinx/SLASH
We maintain a RHEL 9/Rocky 9 fork of SLASH/VRT at: https://github.com/pc2/SLASH
System Status
VRT is deployed on both nodes (fpga1611 and fpga1612).
Get the current status of the FPGAs via ami_tool. For example, on fpga1612 with two V80 cards:
$ ami_tool overview
AMI
-------------------------------------------------------------
Version | 2.3.0 (0)
Branch
Hash | 0bab29e568f64a25f17425c0ffd1c0e89609b6d1
Hash Date | 20240307
Driver Version | 2.3.0 (0)
BDF | Device | UUID | AMC | State
----------------------------------------------------------------------------------------
71:00.0 | ALVEO V80 PQ | 451a004ee295528fa752794a6f9fbbff | 2.3.0 (0) | READY
61:00.0 | ALVEO V80 PQ | bf8583506e178f1ca1e495b83575a0c1 | 2.3.0 (0) | READY
Use ami_tool -h for all options.
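When scripting against several cards, the BDFs of usable cards can be extracted from the overview table instead of being copied by hand. The following is a minimal sketch, assuming the pipe-separated, five-column device table shown above. The name ready_bdfs is a hypothetical helper and not part of ami_tool.

```shell
# ready_bdfs: hypothetical helper, not part of ami_tool.
# Reads `ami_tool overview` text on stdin and prints the BDF of every
# device row whose State column is READY.
# Assumption: device rows have the five pipe-separated columns shown above
# (BDF | Device | UUID | AMC | State).
ready_bdfs() {
    awk -F'|' '
        NF == 5 {
            bdf = $1
            state = $5
            # Strip the padding around the column values.
            gsub(/[ \t]/, "", bdf)
            gsub(/[ \t]/, "", state)
            # Accept "bus:device.function", optionally prefixed by a PCI domain.
            if (bdf ~ /^([0-9a-fA-F]+:)?[0-9a-fA-F]+:[0-9a-fA-F]+\.[0-7]$/ && state == "READY")
                print bdf
        }'
}

# Usage on a node with VRT deployed:
#   ami_tool overview | ready_bdfs
```

The header row and the version lines are skipped because their first column does not look like a BDF or because they have fewer than five columns.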
Example Applications
VRT comes with example applications (see https://github.com/Xilinx/SLASH/tree/dev/examples) to test the basic functionality. In order to run them you can do the following:
Allocate an FPGA node that has V80 cards attached. Here fpga1611 for one hour:
srun --partition=fpga -t 01:00:00 -w fpga1611 --pty bash
Load required modules
module reset
ml fpga
ml xilinx/vitis/24.2
ml devel/CMake/3.29.3-GCCcore-13.3.0
Set up the environment
export AMI_HOME=$HOME/.ami
export LD_LIBRARY_PATH=/opt/software/FPGA/Xilinx/Vivado/2024.2/lib/lnx64.o/:$LD_LIBRARY_PATH
Get example repository
git clone https://github.com/pc2/SLASH
cd SLASH
export SLASHBASE=`pwd`
git switch dev
git submodule update --init --recursive --remote
Build examples, here done for
00_axilite: example to test linking and AXI-Lite control
emulation: everything runs on the CPU. See below for simulation and hardware build.
61:00.0: PCIe ID (BDF) of the V80 card to use. Run ami_tool overview on the allocated node to get the right BDF (the hardware table above lists the V80 in fpga1611 at 71:00.0).
cd $SLASHBASE/examples/00_axilite
# build for emulation
make emu_all
# execute example
cd build
./00_axilite 61:00.0 00_axilite_emu.vrtbin
# expected output
VRT Version: v1.0.0
Generating data...
Time taken for waits: 0 us
Expected: 1541.83
Got: 1541.83
Test passed!
Repeat for simulation and hardware build
# Simulation.
make sim_all
# Hardware
make hw_all
Early Access and Troubleshooting
If you would like early access to the FPGA partition during the pilot phase, or have issues or questions about the current setup, please contact us via email.
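Before reporting an issue, it can help to confirm that the tools and variables from the setup steps on this page are in place. The following is a minimal sketch. The helper names missing_tools and unset_vars are hypothetical and not shipped with SLASH/VRT, and the tool and variable lists in the usage comment are assumptions based on the commands used on this page.

```shell
# Hypothetical helpers, not shipped with SLASH/VRT.

# Print every command from the argument list that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Print every variable name from the argument list that is unset or empty.
unset_vars() {
    for var in "$@"; do
        # Indirect lookup in POSIX sh: copy the value of the variable named by $var.
        eval "value=\${$var:-}"
        [ -n "$value" ] || echo "$var"
    done
}

# Usage after loading the modules and exporting the variables
# (the tool and variable lists are assumptions taken from the steps on this page):
#   missing_tools git make cmake ami_tool
#   unset_vars AMI_HOME SLASHBASE
```

Both helpers print nothing when everything is in place, so any output can be included directly in a support request.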