Accelerating Face Detection on GMSL-based Camera with Vitis AI Part 2

Brian Wiec

Published: Jul 09, 2020

AI - Machine Learning
Automotive
Demo and Example Designs
ZCU102
Vitis

Please read Accelerating Face Detection on GMSL-based Camera with Vitis AI Part 1 before continuing.

Prototype System

The Xilinx ZCU102 platform is used together with the Avnet FMC-MULTI-CAM4 FMC card. The default system uses only one of the four cameras for processing. Frames are captured from the camera using V4L2 under OpenCV. These frames are then passed to the DPU for face detection using the DenseBox neural network, and OpenCV is used to draw bounding boxes. Finally, the output frames are passed to a GStreamer pipeline which renders the display on the PS DisplayPort interface.
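The data flow described above can be sketched in Python with OpenCV. This is only an illustrative outline, not the application used in this design: the `detect` callback stands in for the DPU DenseBox inference, the `clamp_box` helper is hypothetical, and the GStreamer sink string is a placeholder assumption (the pipeline that actually drives the PS DisplayPort is not shown in this part).

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels


def clamp_box(box: Box, width: int, height: int) -> Box:
    """Clip a box to the frame so drawing stays inside the image."""
    x1, y1, x2, y2 = box
    return (
        max(0, min(x1, width - 1)),
        max(0, min(y1, height - 1)),
        max(0, min(x2, width - 1)),
        max(0, min(y2, height - 1)),
    )


def run(
    detect: Callable[[object], List[Box]],
    device: int = 0,
    sink: str = "appsrc ! videoconvert ! autovideosink",  # placeholder sink
) -> None:
    """Capture -> detect -> draw -> display loop (hypothetical sketch)."""
    import cv2  # imported here so clamp_box can be used without OpenCV

    # Open the camera through the V4L2 backend.
    cap = cv2.VideoCapture(device, cv2.CAP_V4L2)
    if not cap.isOpened():
        raise RuntimeError("cannot open V4L2 device")
    width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))

    # Hand frames to a GStreamer pipeline; 30 fps is an assumed rate.
    out = cv2.VideoWriter(sink, cv2.CAP_GSTREAMER, 0, 30.0, (width, height))
    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            # 'detect' is where the DPU face detection would be called.
            for box in detect(frame):
                x1, y1, x2, y2 = clamp_box(box, width, height)
                cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
            out.write(frame)
    finally:
        cap.release()
        out.release()
```

Passing the detector in as a callback keeps the capture and display code independent of how the DPU is invoked.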

Project Organization

The flow for this design involves many tools and repositories that interact by passing design files among them. Keeping the design files well organized is therefore critical to avoid confusion. This tutorial uses the following organization for the source files, configuration files, and build scripts.

Generally, the design sources are divided between the hardware and petalinux directories, each containing the source, scripts, and configuration files relevant to its portion of the build. In addition, the boot directory contains the .bif files used to create BOOT.BIN. The doc directory contains this document. The dpu directory contains files relevant to DPU configuration, which are ingested at various times during building and execution. The repos directory contains local copies of dependent repositories that are used during the build. Finally, the scripts directory contains other miscellaneous scripts.

During the build, the following directories are also created:

vitis_platform_workspace – Vitis workspace where the platform is created
sd_card – Final output directory where all the binaries are placed

FPGA Design

The FPGA design implements two main functions: (1) an image capture pipeline with pre-processing and (2) DPU acceleration.

To build the FPGA design, follow these steps:

1. Set up the environment

source <vitis_install>/2019.2/settings64.sh

2. Run the script that creates the Vivado project, imports the sources, builds the IP Integrator design, and generates the bitstream

make hardware

The main makefile calls a sub-makefile inside the hardware directory. The target in the sub-makefile is shown below.

The primary task is to call Vivado with hardware/tcl/build_hw.tcl which does the following:

1. Sets the correct board/part number

2. Creates a project in hardware

3. Imports HDL sources

4. Imports XDC constraints

5. Sources hardware/tcl/bd/bd_zcu102_avnet_gmsl_2019.2.tcl to build the block design

6. Generates wrapper and output products for the block design

7. Runs synthesis

8. Tags the design with additional metadata properties for Vitis

9. Attaches dynamic_postlink.tcl

Because Vitis expects certain relative file locations, hardware/tcl/dynamic_postlink.tcl is temporarily copied to the hardware root directory. The purpose and functionality of hardware/tcl/dynamic_postlink.tcl are described later.
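For readers who want to drive the same step outside of make, the sketch below shows one way to launch a Tcl script in Vivado batch mode from Python. The sub-makefile itself is not reproduced here, so this is an assumption about the kind of invocation involved rather than a copy of it. The function names and the choice of working directory are hypothetical; `-mode batch` and `-source` are standard Vivado command-line options.

```python
import subprocess
from pathlib import Path
from typing import List


def vivado_batch_command(repo_root: str) -> List[str]:
    """Build the argv that runs build_hw.tcl in Vivado batch mode."""
    script = Path(repo_root) / "hardware" / "tcl" / "build_hw.tcl"
    return ["vivado", "-mode", "batch", "-source", str(script)]


def build_hardware(repo_root: str) -> None:
    """Run the build from the hardware directory (assumed working dir)."""
    subprocess.run(
        vivado_batch_command(repo_root),
        cwd=Path(repo_root) / "hardware",
        check=True,  # raise if Vivado exits with an error
    )
```

`build_hardware` requires that the Vitis/Vivado settings script has already been sourced so that `vivado` is on the PATH.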

Vivado Project Overview

The Vivado project now exists in hardware/proj and can be opened with:

vivado proj.xpr

The top-level Block Design in Vivado is shown below

There are hierarchical blocks for the image capture pipeline (vid_capture), control logic (control), and reset generation (rst_gen). The Zynq UltraScale+ MPSoC block has the following notable configurations:

- M_AXI_HPM0_FPD is enabled and connected to the control block, which contains the AXI Interconnect for accessing all the control interfaces used in the design
- pl_clk0 is a 100 MHz general-purpose clock that drives the control logic
- pl_clk1 is a 250 MHz clock that drives the AXI Stream video logic
- pl_clk2 is a 200 MHz clock for the MIPI D-PHY
- S_AXI_HP0_FPD is enabled and connected to the DMA (framebuffer write) for the video capture pipeline
- pl_ps_irq0 is enabled for the DPU interrupt logic
- pl_ps_irq1 is used by the other interrupts in the system

The subset of logic shown again below is added for use in the platform design.

It consists of clock, reset, and interrupt setup which are identified by Vitis and automatically connected to hardware accelerators (i.e. DPU) later in the flow. At this stage, we just need to make sure they are included in the design and marked with appropriate metadata so that Vitis can discover them and their properties. This will be discussed further in a subsequent subsection.

Image Capture Pipeline with Pre-Processing

The vid_capture block contains the image capture pipeline and associated pre-processing that is to be done on the images for use by the DPU.

The MIPI CSI-2 RX Subsystem connects to the FPGA pins and implements the MIPI CSI-2 receiver controller. It converts the video data into an AXI Stream output. Since the image sensors output 12-bit data, an AXIS Subset Converter is first used to select the desired 8 MSBs. The sensors send RAW Bayer data, which is converted to RGB in the v_demosaic_0 block. Next, some minimal ISP functionality is inserted to improve the image quality. Finally, a Framebuffer Write IP is used to DMA the data into memory.
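The bit selection done by the AXIS Subset Converter amounts to dropping the four least significant bits of each 12-bit sample. A minimal software model is shown below. The function name is hypothetical, and the real operation happens in FPGA logic rather than in software.

```python
def top_8_bits(sample_12bit: int) -> int:
    """Keep the 8 most significant bits of a 12-bit sensor sample."""
    if not 0 <= sample_12bit <= 0xFFF:
        raise ValueError("sample must fit in 12 bits")
    # Shifting right by 4 discards the 4 LSBs and keeps bits [11:4].
    return sample_12bit >> 4


if __name__ == "__main__":
    for sample in (0x000, 0x00F, 0xABC, 0xFFF):
        print(f"0x{sample:03X} -> 0x{top_8_bits(sample):02X}")
```

Truncation keeps the full dynamic range of the sensor while discarding the finest intensity steps.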

DPU Acceleration

The DPU IP is not included at this stage of the design, but is instead inserted automatically by Vitis later on in the design flow. While this may seem odd, the idea is that DPU configuration can be iterated/changed in the Vitis environment later without having to touch the Vivado design at all. This gives a more flexible design environment, especially for those who are less experienced in traditional FPGA design.

To facilitate this flow, parts of the design are ‘tagged’ with certain properties that become metadata in the .xsa file. Vitis parses this metadata to allow the deployment of accelerators. The tagged items include the interfaces, clocks, resets, and interrupts that are available for use by accelerators. To view these metadata ‘tags’ in Vivado, select Window -> Platform Interfaces, which displays the following tab.

Each selected interface has properties associated with it. The modified interfaces and properties are summarized below:

clk_wiz_0
- clk_out1
  - enabled – true
  - id – 0
  - is_default – true
  - proc_sys_reset - /proc_sys_reset_0
- clk_out2
  - enabled – true
  - id – 1
  - is_default – false
  - proc_sys_reset - /proc_sys_reset_1
zynq_ultra_ps_e_0
- M_AXI_HPM1_FPD
  - enabled – true
- S_AXI_HP1_FPD
  - enabled – true
  - sptag – HP1
- S_AXI_HP2_FPD
  - enabled – true
  - sptag – HP2
- S_AXI_HP3_FPD
  - enabled – true
  - sptag – HP3

The clk_wiz_0 block generates the clocks that will be used by the DPU. The DPU has two clock inputs that need to be driven: s_axi_aclk for the S_AXI interface and dpu_2x_clk for the internal processing logic. The dpu_2x_clk frequency should be twice that of s_axi_aclk. The clock wizard for this design has the following configuration

Note the ‘Actual Output Freq (MHz)’ setting of the output clocks, as these values will be used later in Vitis to determine the clocks used by the DPU.
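Because the achieved frequencies can differ slightly from the requested ones, it can be useful to confirm the 2x relationship using the ‘Actual Output Freq (MHz)’ values. The helper below is a small sketch of such a check. The function name, the tolerance, and the example frequencies are illustrative assumptions, not values taken from this design.

```python
import math


def dpu_clocks_consistent(
    base_mhz: float, doubled_mhz: float, rel_tol: float = 1e-3
) -> bool:
    """Return True if doubled_mhz is 2x base_mhz within a relative tolerance."""
    return math.isclose(doubled_mhz, 2.0 * base_mhz, rel_tol=rel_tol)


if __name__ == "__main__":
    # Example values only; substitute the actual clock wizard outputs.
    print(dpu_clocks_consistent(150.0, 300.0))
    print(dpu_clocks_consistent(150.0, 250.0))
```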

The platform also needs a reset associated with each of these clocks, which is the purpose of proc_sys_reset_0 and proc_sys_reset_1. Since these clocks and resets are only connected to the DPU (which does not yet exist in the design), they are otherwise left floating.
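The clock properties listed earlier can be written down as plain data and sanity-checked before export. The sketch below mirrors the clk_wiz_0 settings from the summary above. The checks themselves (unique ids, exactly one default clock, a reset for every enabled clock) are reasonable assumptions about a consistent platform, not rules quoted from the Vitis documentation, and the function name is hypothetical.

```python
from typing import Dict, List

# Platform clock metadata for clk_wiz_0, copied from the summary above.
PLATFORM_CLOCKS: Dict[str, dict] = {
    "clk_out1": {"enabled": True, "id": 0, "is_default": True,
                 "proc_sys_reset": "/proc_sys_reset_0"},
    "clk_out2": {"enabled": True, "id": 1, "is_default": False,
                 "proc_sys_reset": "/proc_sys_reset_1"},
}


def check_platform_clocks(clocks: Dict[str, dict]) -> List[str]:
    """Return a list of problems found; an empty list means all checks pass."""
    problems: List[str] = []
    enabled = {name: c for name, c in clocks.items() if c["enabled"]}

    ids = [c["id"] for c in enabled.values()]
    if len(ids) != len(set(ids)):
        problems.append("clock ids are not unique")

    defaults = [name for name, c in enabled.items() if c["is_default"]]
    if len(defaults) != 1:
        problems.append(f"expected one default clock, found {len(defaults)}")

    for name, c in enabled.items():
        if not c.get("proc_sys_reset"):
            problems.append(f"{name} has no associated proc_sys_reset")
    return problems


if __name__ == "__main__":
    print(check_platform_clocks(PLATFORM_CLOCKS))
```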

The axi_intc_0 interrupt controller is added for use by the DPU. It is connected to the Zynq UltraScale+ MPSoC pl_ps_irq0 input and is configured for level-sensitive interrupts.

The intr input of axi_intc_0 is connected to xlconcat_interrupt_0 which is configured for 8 inputs. Notably, all inputs are grounded at this stage by xlconstant_gnd.

As noted previously, there is also a script attached to the design called dynamic_postlink.tcl. This script is responsible for removing the connection from xlconstant_gnd to xlconcat_interrupt_0. It is not executed now, but during the Vitis link stage. However, it is attached to the design now so that Vitis knows about it. It is very important that the names xlconstant_gnd and xlconcat_interrupt_0 in the design match the names used in the script.
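Conceptually, the post-link step swaps a grounded concat input for a real interrupt source. The toy model below illustrates that idea only. It is not a translation of dynamic_postlink.tcl, whose contents are not shown here, and the net name `dpu_interrupt`, the function names, and the "first free input" policy are all hypothetical.

```python
from typing import List

GND = "xlconstant_gnd"  # net name used in the block design


def initial_inputs(width: int = 8) -> List[str]:
    """All inputs of xlconcat_interrupt_0 start tied to ground."""
    return [GND] * width


def attach_interrupt(inputs: List[str], source: str) -> List[str]:
    """Return a copy with the first grounded input driven by 'source'."""
    result = list(inputs)
    for index, net in enumerate(result):
        if net == GND:
            result[index] = source  # ground removed, interrupt connected
            return result
    raise ValueError("no free interrupt input left")


if __name__ == "__main__":
    print(attach_interrupt(initial_inputs(), "dpu_interrupt"))
```

The model also shows why the names matter: the swap can only be made if the script can find the grounded inputs by name.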

Continue to Accelerating Face Detection on GMSL-based Camera with Vitis AI Part 3

About Brian Wiec

Brian Wiec is a Field Applications Engineer in the Detroit area serving AMD's automotive customer base, supporting applications in ADAS, autonomous driving, infotainment, and powertrain control. He has worked at AMD for eight years, in both field and factory support roles, with experience in video, signal processing, and embedded systems design and implementation. Brian is always happy to partner with customers to help them solve their technical challenges and enjoys participating in their innovations. In his free time, Brian likes spending time with his family, hiking, listening to music, playing hockey, and watching college football (Go Blue!).
