DNNDK is a full-stack toolkit for deploying neural networks on Xilinx edge devices. Users can refer to UG1327 for detailed instructions. This article focuses on troubleshooting with the toolkit.


Quantization

Quantization is the first step of deployment; it converts a floating-point model into a quantized one. The process is rarely blocked by an error, so the question is usually how much accuracy is lost after quantization. For that reason, please make sure the accuracy of the floating-point model is tested in advance and that the results are reproducible. The float accuracy then becomes the baseline for the whole deployment workflow.

Please be aware that the quantization process needs at least 100 images for calibration. The image directory and format are the first places to check. There are some differences between the Caffe and TensorFlow frameworks, summarized in the table below: the image directory is defined in the prototxt file in Caffe but in the input_fn file in TensorFlow, and labels are required during calibration in Caffe but not in TensorFlow.

                                      Caffe            TensorFlow
Image directory defined in            prototxt file    input_fn file
Labels required during calibration    Yes              No
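In TensorFlow, the input_fn is a small Python function that decent_q calls once per calibration iteration; it returns a dict mapping input node names to a batch of preprocessed images. Below is a minimal sketch, assuming an input node named "input", 224x224 RGB inputs, and simple scaling as preprocessing; adapt all of these to your own model.

# calib_input.py -- a minimal sketch of a decent_q calibration input_fn.
# The node name "input", the image size, and the preprocessing are
# assumptions; adapt them to your own model.
import os
import cv2
import numpy as np

CALIB_DIR = "calib_images"   # directory holding the >=100 calibration images
IMAGES = sorted(os.listdir(CALIB_DIR))
BATCH_SIZE = 10

def calib_input(iter):
    """Called once per calibration iteration; returns a dict mapping
    input node names to a batch of preprocessed images."""
    batch = []
    for i in range(BATCH_SIZE):
        name = IMAGES[(iter * BATCH_SIZE + i) % len(IMAGES)]
        img = cv2.imread(os.path.join(CALIB_DIR, name))
        img = cv2.resize(img, (224, 224)).astype(np.float32) / 255.0
        batch.append(img)
    return {"input": np.array(batch)}  # key must match --input_nodes

The function is then passed to decent_q in module.function form, e.g. --input_fn calib_input.calib_input.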

Sometimes the accuracy loss after quantization is larger than expected (please refer to UG1327 Chapter 5 for the quantized model testing flow). The loss could be due to either the quantizer or the model itself. To isolate the issue, it is suggested to set the quantization bit width to 32 instead of 8 and re-quantize the model. The result should then match the accuracy of the float model. If it does not, please file an SR or drop us an email for tool debugging.

[Figure: setting the quantization bit width]

After accuracy validation, the quantized model is considered the golden reference for further hardware implementation. There is a 'dump' function in both the Caffe and TensorFlow frameworks to generate the input and output data for each operator.

In Caffe, simply type

$export DECENT_DEBUG=5

then redo quantization, and you will find a dump folder generated with data inside.

It is slightly different in TensorFlow: you need to run quantization first and then dump the data with the dump command.

    
$decent_q dump \ 
       --input_frozen_graph quantize_results/quantize_eval_model.pb \ 
       --input_fn dump_input_fn \ 
       --max_dump_batches 1 \ 
       --dump_float 0 \ 
       --output_dir quantize_results
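The dump_input_fn above follows the same convention as the calibration input_fn; since --max_dump_batches is 1, it typically feeds one fixed batch so that the dumped tensors are reproducible. A sketch, with the image file name and node name as placeholders:

# dump_input_fn.py -- a sketch of an input_fn for the dump step. Same
# convention as the calibration input_fn, but it feeds one fixed image
# so the dumped tensors are reproducible. File name, node name, and
# preprocessing below are placeholders.
import cv2
import numpy as np

def dump_input(iter):
    img = cv2.imread("golden_input.jpg")           # assumed test image
    img = cv2.resize(img, (224, 224)).astype(np.float32) / 255.0
    return {"input": img[np.newaxis, ...]}         # key must match the model input node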


Compilation

Compilation is the process of converting the quantized model into the DPU instruction set, which can then be deployed on the FPGA for the real application. Compilation errors usually happen when an operator or sub-graph pattern is not recognized or cannot be handled due to hardware constraints.

In most cases the error messages provide valuable information. However, if the message is still confusing, please add two more parameters

     $dnnc --dump=all --save_kernel

and redo compilation. This will compile the model layer by layer and generate intermediate files during the process, making it straightforward to identify the layer that causes the failure. Please send us an email with the dump folder attached for further investigation.


Implementation

Finally, there is a common scenario where everything goes fine without any error messages, but the results while running on the FPGA look wrong. The incorrect results may be due to either the compiled instructions on the DPU or the pre- and post-processing code on the ARM core.

Again, to isolate the issue, enter debug mode to run the model layer by layer and compare the output with the golden references generated at the quantization stage. The two results should be identical if there are no compilation issues. To enter debug mode, please run

$dexplorer -m debug

Then, redo hybrid compilation and execution to get a dump folder with the name dump_xxxx. Please make sure the same image or data is used as input here. Use a diff command such as the following to check the differences between the two dumps.

            $vimdiff file1 file2
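vimdiff is convenient for text dumps. If the files being compared are raw binary tensors, a short script can quantify the mismatch instead. The sketch below assumes both files hold raw int8 data for the same layer; that format is an assumption, so adjust it to what your DNNDK version actually dumps.

# compare_dump.py -- a hedged sketch for numerically comparing a DPU layer
# dump against the quantizer's golden dump. Assumes both files contain raw
# int8 tensors of the same layer; paths are passed on the command line.
import sys
import numpy as np

def compare(golden_path, dpu_path):
    golden = np.fromfile(golden_path, dtype=np.int8)
    dpu = np.fromfile(dpu_path, dtype=np.int8)
    if golden.size != dpu.size:
        print("size mismatch: %d vs %d elements" % (golden.size, dpu.size))
        return
    diff = np.abs(golden.astype(np.int32) - dpu.astype(np.int32))
    print("mismatched elements: %d / %d, max abs diff: %d"
          % (int(np.count_nonzero(diff)), diff.size, int(diff.max())))

if __name__ == "__main__":
    compare(sys.argv[1], sys.argv[2])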

Similarly, if any differences are observed, please send us an email with the dump information; if the dumps match, the mistake is likely within the deployment code on the ARM side.

The basic idea of troubleshooting is to isolate issues. The debug and dump functions we provide will not solve the issue for you, but they will help locate it more efficiently. We appreciate your feedback, and we will continue working on the maintenance and ease of use of DNNDK.


About Fan Zhang

Dr. Fan Zhang serves as an AI Specialist in the AMD Tech Marketing Department and is responsible for the innovation and promotion of AMD AI solutions to differentiate vertical markets worldwide. Before joining AMD in July 2018, Dr. Zhang served as tech lead of the DeePhi business department. Before that, Dr. Zhang was a core R&D researcher at an overseas company and has nearly ten years of experience in project development. Fan Zhang received his Ph.D. and M.Sc. degrees in EE from the University of Nottingham.