Overview

FRED runtime

The FRED runtime is the reference implementation of the FRED framework for the GNU/Linux operating system. It is designed to run on Xilinx Zynq-7000 SoC FPGA platforms and consists of a system support design and a set of software support components.

System support design

The system support design is a reference design for the FPGA side of the SoC that supports the deployment of dynamically reconfigured hardware accelerators. It divides the FPGA into two regions: a static region and a reconfigurable region. The static region contains the logic that implements the communication infrastructure, namely a set of AXI Interconnects, and the user can extend it with other support modules as needed. The reconfigurable region is organized into a set of statically defined slots that are logically grouped into partitions.
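The relationship between regions, partitions, and slots can be pictured with a small data model. The sketch below is illustrative only: the class names, the `find_free_slot` helper, and the example layout (two partitions with two and one slots) are assumptions made for this example and are not taken from the FRED sources.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Slot:
    """One statically defined slot of the reconfigurable region."""
    slot_id: int
    loaded: Optional[str] = None  # name of the accelerator currently loaded, if any


@dataclass
class Partition:
    """A logical group of slots."""
    name: str
    slots: List[Slot] = field(default_factory=list)

    def find_free_slot(self) -> Optional[Slot]:
        # Return the first slot that holds no accelerator, or None if all are busy.
        for slot in self.slots:
            if slot.loaded is None:
                return slot
        return None


@dataclass
class Fabric:
    """Static region (support modules) plus the partitioned reconfigurable region."""
    static_modules: List[str]
    partitions: List[Partition]

    def slot_count(self) -> int:
        return sum(len(p.slots) for p in self.partitions)


def build_example_fabric() -> Fabric:
    # Hypothetical layout, chosen only to exercise the model.
    return Fabric(
        static_modules=["axi_interconnect_0"],
        partitions=[
            Partition("p0", [Slot(0), Slot(1)]),
            Partition("p1", [Slot(0)]),
        ],
    )
```

In this model a partition is only a container of slots, which mirrors the statement that the grouping is logical rather than physical.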

System support design

Software support

The software support comprises a set of software components in charge of managing the FPGA and implementing the FRED scheduling policy on top of the system support design. It has been designed in a modular fashion, relying as much as possible on user-space implementations to improve maintainability, safety, and expandability. The central component of the software support is a user-space server process, named the FRED server, which manages acceleration requests from Linux processes (and threads) according to the FRED scheduling policy. Linux processes and hardware accelerators share data through a zero-copy mechanism implemented using physically contiguous (uncached) memory buffers. The FRED server relies on two custom kernel modules and the UIO framework for controlling the hardware accelerators.
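The zero-copy idea can be illustrated in plain Python: producer and consumer operate on the same memory instead of passing copies. This is only an analogy under stated assumptions. An anonymous `mmap` region stands in for a buffer exported by a kernel module, the "accelerator" is an ordinary loop, and physical contiguity and cache attributes cannot be reproduced from user-space Python. The function name and buffer size are hypothetical.

```python
import mmap

BUF_SIZE = 4096  # hypothetical buffer size, one page


def demo_zero_copy() -> bytes:
    # Anonymous mapping used as a stand-in for a shared data buffer.
    buf = mmap.mmap(-1, BUF_SIZE)

    # Two views over the same bytes: no data is duplicated.
    software_view = memoryview(buf)
    accelerator_view = memoryview(buf)

    # The software side writes its input directly into the buffer.
    software_view[0:4] = b"\x01\x02\x03\x04"

    # Stand-in accelerator: processes the data in place (adds 1 to each byte).
    for i in range(4):
        accelerator_view[i] = accelerator_view[i] + 1

    # The software side reads the result from the very same memory.
    result = bytes(software_view[0:4])

    # Views must be released before the mapping can be closed.
    software_view.release()
    accelerator_view.release()
    buf.close()
    return result
```

Calling `demo_zero_copy()` returns the four bytes as modified in place, showing that the result is visible to the writer without any intermediate copy.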

Kernel space

The first custom kernel module allocates the contiguous memory buffers used to share data between software processes and dynamically reconfigured hardware accelerators. The second custom kernel module manages device reconfiguration and is optimized with respect to the stock Xilinx driver.

User space

The FRED server initializes the FPGA support during the initialization phase and then manages requests coming from Linux processes and threads. Internally, the FRED server uses I/O multiplexing to monitor all hardware and software component events from a single event loop. The FRED server communicates with the software processes through a Unix domain socket using a simple messaging protocol. From the user's perspective, the interactions between a software process and the FRED server are abstracted by a client support library, which is available in C and Python.
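A single event loop that multiplexes several Unix domain socket connections can be sketched as follows. This is a minimal illustration, not the FRED server's code: the message format (`ACCEL <name>` answered with `ACK ...`), the function names, and the use of `socket.socketpair()` in place of a listening socket are all assumptions made to keep the example self-contained and single-threaded.

```python
import selectors
import socket


def handle_request(message: bytes) -> bytes:
    # Hypothetical protocol: acknowledge the request by echoing it back.
    return b"ACK " + message


def run_event_loop(sel: selectors.BaseSelector, expected: int) -> int:
    """Dispatch ready connections until `expected` requests have been answered."""
    handled = 0
    while handled < expected:
        events = sel.select(timeout=1.0)
        if not events:
            break  # nothing pending; do not block forever in this sketch
        for key, _mask in events:
            conn = key.fileobj
            data = conn.recv(64)
            if data:
                conn.sendall(handle_request(data))
                handled += 1
            else:
                # Peer closed the connection.
                sel.unregister(conn)
                conn.close()
    return handled


def demo():
    sel = selectors.DefaultSelector()
    server_ends, client_ends = [], []
    for name in (b"sobel", b"mult"):
        # socketpair() yields two connected Unix domain sockets on Linux.
        server_end, client_end = socket.socketpair()
        server_end.setblocking(False)
        sel.register(server_end, selectors.EVENT_READ)
        client_end.sendall(b"ACCEL " + name)  # client issues a request
        server_ends.append(server_end)
        client_ends.append(client_end)

    handled = run_event_loop(sel, expected=len(client_ends))
    replies = [c.recv(64) for c in client_ends]

    for s in server_ends + client_ends:
        s.close()
    sel.close()
    return handled, replies
```

The same loop could also watch other file descriptors, since a selector treats every registered descriptor uniformly; that is the property that lets one loop cover both hardware and software events.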

Software support

Case study

A case study application has been designed to test the FRED runtime in a realistic scenario. The application uses the virtualized FPGA support to speed up the processing of live images acquired by a USB webcam and the multiplication of integer matrices. The set of hardware image filters includes a Sobel filter, a FAST corner detection filter, and a color map filter. These filters have been implemented both as HLS hardware accelerators and as equivalent software procedures using the popular OpenCV library, so that the speedup factors can be measured.

Case study internal architecture

In this application, the reconfigurable region is divided into two partitions containing a single slot each. These two slots are shared at runtime by four hardware accelerators (Sobel, FAST, Gmap, and Mult) using the FPGA virtualization mechanism offered by the FRED server.
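To make the time-sharing of two slots among four accelerators concrete, here is a small simulation. It rests on several assumptions that the text does not state: the assignment of accelerators to partitions, a FIFO service order within each partition, and the rule that a slot is reconfigured only when the requested accelerator differs from the one already loaded. None of this is meant to describe the actual FRED scheduling policy.

```python
from collections import deque

# Assumed assignment of accelerators to the two single-slot partitions.
PARTITIONS = {
    "p0": ["sobel", "fast"],
    "p1": ["gmap", "mult"],
}


def simulate(requests):
    """Serve requests FIFO per partition and count slot reconfigurations."""
    accel_to_part = {a: p for p, accels in PARTITIONS.items() for a in accels}

    # One request queue per partition.
    queues = {p: deque() for p in PARTITIONS}
    for accel in requests:
        queues[accel_to_part[accel]].append(accel)

    loaded = {p: None for p in PARTITIONS}  # accelerator currently in each slot
    reconfigs = 0
    for part, queue in queues.items():
        while queue:
            accel = queue.popleft()
            if loaded[part] != accel:
                # The slot holds a different accelerator: reconfigure it.
                loaded[part] = accel
                reconfigs += 1
    return reconfigs, loaded
```

For example, `simulate(["sobel", "fast", "gmap", "mult", "sobel"])` reconfigures the first slot three times and the second slot twice, because the two accelerators mapped to each slot alternate.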

FPGA fabric partitioning

The following video shows the case-study application operating in hardware mode. Note that the processing of each sub-image triggers a partial reconfiguration of the FPGA fabric, resulting in a rate higher than 50 reconfigurations per second. It is also worth noting that the FPGA has enough resources to statically host only two of the four hardware accelerators, and that a pure software implementation is considerably slower. Only by leveraging resource virtualization through partial reconfiguration can all four tasks achieve reasonable performance.



Try the case study

If you have a ZYBO board, you can try the case study application on your own. In addition to the board, you need a USB webcam capable of acquiring 640x480 frames and a 5 V power supply rated for at least 2 A. To prepare the application, download the image file available here. Then unzip the archive and copy the image to a micro SD card of at least 2 GB using dd or an equivalent tool.

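# Extract the SD card image from the archive
unzip fred_cs.zip
# Write the image to the micro SD card (/dev/mmcblk0 is an example; use your card's device, which will be overwritten)
dd if=fred_sd.img of=/dev/mmcblk0 bs=8M conv=fsync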

Once the micro SD card is ready, insert it into the board, connect the external power supply, and set the ZYBO for SD boot and external power using the board’s jumpers. Then connect the USB webcam and an HDMI monitor, and start the ZYBO. Once the boot process has completed, log in using root as both username and password. Then launch the FRED runtime and the Qt client application.

./start_fred.sh
./fredVideoApp -qws