FINN on Noctua 2
When setting up FINN on Noctua 2, there are two possible ways to do it. The first is the easy, automated way, recommended for users who only want to try out FINN. The second is the manual way, recommended either if the automated script does not work or if you need to do active development with your own project structure.
Automated Approach
Environment Setup
First we clone the main repository that helps with an automated setup of FINN on the Noctua 2 cluster.
git clone https://github.com/eki-project/finn-on-n2.git
cd finn-on-n2
It is best practice to set up a virtual environment with a specific Python version and the required dependencies. You do not have to use an environment, but it is recommended, as it makes the setup much easier to manage. In this guide we will use conda. Therefore, we first load the module:
module load lang/Anaconda3/2022.05
All the required dependencies are defined in the environment.yml file that comes with the repository. We can simply create a new virtual environment with this command:
conda env create -f environment.yml
The name of the virtual environment will be finn-on-n2. If you want another name, edit the environment.yml file.
After the installation of the dependencies you activate the environment with
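conda activate finn-on-n2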
If you use conda for the first time, you might run into a CommandNotFoundError. This usually means that conda has not been initialized for your shell yet; initializing it once and re-opening the shell should resolve the error:
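conda init bash
(Replace bash with the shell you actually use.)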
Installation
The repository contains a dodo.py file required by the program doit (which was automatically installed as a dependency). It defines the tasks that can be run to install FINN. A complete list of all tasks can be looked up using doit list, and an overview of the file structure can be found in the readme of the repository itself.
The other files in the repository are a build template, which can execute the default FINN dataflow, as well as a settings file in TOML format. Before running doit, you should first check the settings in config.toml and change any fields that are configured incorrectly for your setup. Most importantly, you might have to change the paths to your Xilinx toolchains, the environment to use, the FINN repository to pull from, and others. Most variables should be self-explanatory; additionally, some of the most important fields are described specifically in the readme.
The remaining files, especially the shell scripts, are used to run the slurm jobs on the cluster. These are all managed by doit.
By simply running the setup task now, FINN will be installed. The exact task name is defined in dodo.py and listed by doit list; the invocation below is only a hypothetical example:
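doit setup    # hypothetical task name; check doit list for the actual one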
This involves cloning the FINN repository from the specified remote, at the specified branch (everything can be set in the configuration file), and, if the appropriate flag is set, downloading and setting up the FINN driver and its dependencies too. It will also fill in the job script templates with the environment variables configured in the TOML configuration file beforehand.
If you change your configuration, run the setup task again. This will update the run and build scripts accordingly.
For the cluster specifically you need to work with Singularity instead of Docker. To make this work you need to have the Singularity patch installed. This patch was recently merged into the main branch of FINN; however, if you work on an older version, you still need to install it. To point FINN to where you have put your Singularity image, you need to set the FINN_SINGULARITY environment variable. (Remember to rerun the setup after changing your configuration: if you have set the field pointing to the .sif file in your config and use the cluster environment, that variable is set automatically.)
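For example (the image path below is a placeholder):
export FINN_SINGULARITY=/path/to/finn.sif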
After this, FINN will be set up and ready to use. In case the default repository points to our internal one, change it to the official one for stable usage.
Usage
To use FINN, you first need a network to start with. These networks come in the form of an ONNX file, which is a data format for sharing and distributing neural network architectures independently of the framework they are used by. Building your own quantized network is done with Brevitas. However, since getting into Brevitas and the topic of Quantization Aware Training (QAT) is also fairly complicated, you can use the sample ONNX model and build.py file supplied by the FINN developers themselves. The sample model can be found at this path.
You can visualize the sample ONNX file using Netron. When viewing a network exported by Brevitas, you should be able to observe hints of the quantization that was done during training and how it is reflected in the network's structure.
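Assuming you want to run Netron locally, it can be installed as a Python package and pointed at the file (model.onnx stands in for the sample model's actual path):
pip install netron
netron model.onnx
This serves a local viewer that you can open in your browser.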
The aforementioned build.py file is a Python script that contains instructions and arguments for starting FINN. A template of such a script is provided in the finn-on-n2 repository; however, it is only suited for building simple networks and has to be manually adjusted for more complex architectures. Important settings to look at in this file are:
- output_dir: the location where everything is built and output to
- vitis_platform and board: the platform string and the board name, so that the synthesis can target the correct device
- shell_flow_type: set this to Alveo or Zynq, depending on your device
It is also useful to know that certain build steps are only executed if the output they produce is also listed in the generate_outputs list. There are still many more options to set, but since this is beyond the scope of this guide, refer to the official FINN documentation for further information. A minimal sketch of such a build.py follows below.
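The following sketch shows how these settings fit together, assuming the standard FINN dataflow builder API; the board, clock period and file names are example values, and the template from the finn-on-n2 repository remains the authoritative starting point.

# build.py - minimal sketch of a FINN dataflow build (example values only)
import finn.builder.build_dataflow as build
import finn.builder.build_dataflow_config as build_cfg

cfg = build_cfg.DataflowBuildConfig(
    output_dir="out_dir",          # where everything is built and output to
    synth_clk_period_ns=10.0,      # target clock period in nanoseconds
    board="U280",                  # board name so synthesis targets the right device
    shell_flow_type=build_cfg.ShellFlowType.VITIS_ALVEO,  # or VIVADO_ZYNQ for Zynq devices
    generate_outputs=[
        # build steps are only executed if their output is requested here
        build_cfg.DataflowOutputType.BITFILE,
        build_cfg.DataflowOutputType.PYNQ_DRIVER,
    ],
)

# run the FINN builder on the input model
build.build_dataflow_cfg("model.onnx", cfg)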
Building Neural Network Hardware Design
To create a FINN project with the sample model, use the corresponding doit task. The exact task name is defined in dodo.py and shown by doit list; the invocation below is only a hypothetical example:
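doit create model.onnx    # hypothetical task and file names; check doit list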
This command will create a folder whose name is derived from the input file you gave it (for example simply model in this case) and generate the required build scripts, but it will not start the hardware generation process (FPGA synthesis).
In order to start building a hardware design for the model, use the corresponding doit build task (again, the exact name is shown by doit list; the following invocation is only a hypothetical example):
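doit build model    # hypothetical task name; check doit list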
Caution: this step starts a possibly long (2 to 4 hours) synthesis slurm job. To speed up the process and avoid spending resources on unnecessary synthesis, we have pre-synthesized the design for the model example, which you can copy for hardware execution instead.
This will look for a directory called model and try to execute the build file within it. To do this, a slurm job is started, which converts the model to HLS and synthesizes it (broadly speaking; for an exact overview of what FINN does, refer to its documentation linked above).
Tips
When starting a run, the script might warn you that VIVADO_PATH, VITIS_PATH and HLS_PATH are not set or are set incorrectly. When working on the cluster this warning can be ignored, since the module system takes care of the paths. If the tools are not found later on, or if you are working locally, you can still set these variables as a possible fix.
Remember that to use FINN on the cluster you need to be in a compute time project. If the job does not start properly for that reason, you can either set the corresponding sbatch flag in the build script manually or, in case you don't want to change the script after every update of your config, set the relevant environment variable which defines the standard project to use for slurm jobs. For example (the project name below is a placeholder):
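#SBATCH -A hpc-prf-example               # account flag inside the job script
export SBATCH_ACCOUNT=hpc-prf-example    # or: environment variable read by sbatch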
The results of the synthesis can be found in <project-directory>/<output-directory>/ (in the sample case model/out_dir/). Make sure to take a good look around the output folder and note all files residing there, as this will help you quite a lot when debugging your own networks and FINN runs.
Execute the Neural Network Hardware Design on an FPGA
After the hardware synthesis, we can execute the network inference with the corresponding doit task (once more, the exact name is shown by doit list; the invocation below is only a hypothetical example):
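doit run model    # hypothetical task name; check doit list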
This command will automatically allocate a suitably configured FPGA node for execution.
Plan
The automated script does not yet support every possible workflow and is therefore undergoing regular change (for example, support for resuming cancelled runs is planned). If any issues arise, send us an email.
Manual Approach
Install
To install FINN manually, follow these steps:
1. Clone the official FINN repository, which can be found at FINN GitHub. (Since this is as of now not officially supported, replace run-docker.sh with the updated version from this pull request.)
2. Either copy the finn_build_single_job.sh script from the above-mentioned finn-noctua-install git, or write your own runner script. This would have to include slurm job information, module load commands for all modules required for the FINN flow and synthesis, and it would have to set environment variables for licenses and FINN-relevant paths.
   - Modules to load: fpga, xilinx/xrt/2.xx, xilinx/vitis/xx.yy
   - In case you want to build the C++ driver while running FINN, you also need doxygen, gcc, cmake
   - Required working directories and environment variables in the finn-noctua-install directory: SINGULARITY_CACHEDIR=SINGULARITY_CACHE, SINGULARITY_TMPDIR=SINGULARITY_TMP, FINN_HOST_BUILD_DIR=FINN_TMP
3. Set various other environment variables:
   export FINN_XILINX_PATH=...    (pointing to the main Xilinx directory)
   export FINN_XILINX_VERSION=20XX.YY
   export FINN_DOCKER_PREBUILT=1
   export FINN_DOCKER_GPU=0
   export LC_ALL="C"
   export PYTHONUNBUFFERED=1
   export NUM_DEFAULT_WORKERS=28
   These are of course all example values and can/should be adapted by you!
4. Execute run-docker.sh with the appropriate parameters.
5. After synthesis, you need to load the appropriate XRT shell and constrain your slurm job, which executes the driver, to the correct FPGA that you built it for.
   - This requires modules for fpga, xrt, python, boost, gcc
   - It has proven helpful for us to force a reset of the FPGA before executing the driver to avoid errors. To do so, execute xbutil reset -d <board-identifier> before running the driver.
6. Finally, start the driver via sbatch.
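To tie these steps together, a minimal runner script could look roughly like the following sketch. The job parameters, module versions and all paths are assumptions that must be adapted to your environment; the finn_build_single_job.sh script mentioned above remains the reference.

#!/bin/bash
#SBATCH -J finn-build
#SBATCH -A <your-project>              # compute time project
#SBATCH -t 04:00:00                    # synthesis may run for hours
#SBATCH --cpus-per-task=28

# modules required for the FINN flow and synthesis (versions are placeholders)
module load fpga
module load xilinx/xrt/2.xx
module load xilinx/vitis/xx.yy

# working directories and FINN-relevant paths (see the list above)
export SINGULARITY_CACHEDIR=$PWD/SINGULARITY_CACHE
export SINGULARITY_TMPDIR=$PWD/SINGULARITY_TMP
export FINN_HOST_BUILD_DIR=$PWD/FINN_TMP
export FINN_XILINX_PATH=/opt/xilinx    # placeholder: main Xilinx directory
export FINN_XILINX_VERSION=20XX.YY
export FINN_SINGULARITY=/path/to/finn.sif
export FINN_DOCKER_PREBUILT=1
export FINN_DOCKER_GPU=0
export LC_ALL="C"
export PYTHONUNBUFFERED=1
export NUM_DEFAULT_WORKERS=28

# start the FINN build for a project directory containing build.py
cd finn
./run-docker.sh build_custom /path/to/your/project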