Running an ONNX Model on an FPGA with a Gemmini SoC

We have successfully run an ONNX model on an FPGA board with an SoC that combines the Gemmini DNN accelerator and the Rocket RISC-V CPU.

Gemmini

Gemmini is one of the RTL generators included in Chipyard, an agile RISC-V SoC design framework. It generates a systolic-array-based DNN accelerator.

The figure below, quoted from the Gemmini repository, gives an overview of the Gemmini system.

[Figure: overview of the Gemmini system, from the Gemmini repository]
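
To make the term "systolic array" concrete, here is a toy Python model of how such an array multiplies two matrices. It is only a sketch of the general idea, using an output-stationary dataflow that we chose for simplicity. It does not model Gemmini's RTL, its configuration options, or its software interface, and the function name systolic_matmul is our own.

```python
def systolic_matmul(A, B):
    """Toy output-stationary systolic array computing C = A x B.

    PE(i, j) keeps the running sum for C[i][j]. Rows of A enter from the
    left edge and move right; columns of B enter from the top edge and
    move down. Row i and column j are delayed by i and j cycles so that
    matching operands meet in the right processing element (PE).
    """
    M, K, N = len(A), len(A[0]), len(B[0])
    acc = [[0] * N for _ in range(M)]    # accumulator in each PE
    a_reg = [[0] * N for _ in range(M)]  # operand travelling right
    b_reg = [[0] * N for _ in range(M)]  # operand travelling down

    for t in range(M + N + K - 2):       # cycles until the last PE finishes
        # Shift A operands one PE to the right, then feed the left edge.
        for i in range(M):
            for j in range(N - 1, 0, -1):
                a_reg[i][j] = a_reg[i][j - 1]
            k = t - i                    # row i is skewed by i cycles
            a_reg[i][0] = A[i][k] if 0 <= k < K else 0
        # Shift B operands one PE down, then feed the top edge.
        for j in range(N):
            for i in range(M - 1, 0, -1):
                b_reg[i][j] = b_reg[i - 1][j]
            k = t - j                    # column j is skewed by j cycles
            b_reg[0][j] = B[k][j] if 0 <= k < K else 0
        # Every PE performs one multiply-accumulate per cycle.
        for i in range(M):
            for j in range(N):
                acc[i][j] += a_reg[i][j] * b_reg[i][j]
    return acc


result = systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(result)  # [[19, 22], [43, 50]]
```

The point of the structure is that each operand is read from memory once and then reused as it passes from PE to PE, which is what makes the arrangement attractive for the matrix multiplications and convolutions in DNNs.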

ONNX

Open Neural Network Exchange (ONNX) is an open format for representing neural network models. ONNX Runtime is an engine that runs inference on models in the ONNX format.

For this work, we used onnxruntime-riscv, a port of ONNX Runtime for Gemmini.

Running the ResNet-50 ONNX Model on the FPGA

We used the Gemmini system built on the Digilent Nexys Video FPGA board, as introduced in the previous article.
We loaded the Gemmini SoC gateware onto the board and ran the ResNet-50 ONNX model.

The following shows the console output when running ort_test with resnet50_opt_quant.onnx as the ONNX model.

# ./ort_test -m resnet50_opt_quant.onnx \
  -i images/dog.jpg \
  -p caffe2 -x 2 -O 99
Loaded runner program
Using systolic in mode 2
Using Onnxruntime C++ API
Number of inputs = 1
Input 0 : name=gpu_0/data_0, type=1, num_dims=4: [1, 3, 224, 224, ]
Number of outputs = 1
Output 0 : name=gpu_0/softmax_1, type=1, num_dims=2: [1, 1000, ]
Loading image
Image dimensions: 224 224 3
First few image values 130.061005 126.060997 123.060997
Called into systolic conv
Using systolic pooling
Called into systolic conv
Called into systolic conv
Called into systolic conv
Called into systolic conv
Called into systolic add
Called into systolic conv
Called into systolic conv
Called into systolic conv
Called into systolic add
Called into systolic conv
Called into systolic conv
Called into systolic conv
Called into systolic add
Called into systolic conv
Called into systolic conv
Called into systolic conv
Called into systolic conv
Called into systolic add
Called into systolic conv
Called into systolic conv
Called into systolic conv
Called into systolic add
Called into systolic conv
Called into systolic conv
Called into systolic conv
Called into systolic add
Called into systolic conv
Called into systolic conv
Called into systolic conv
Called into systolic add
Called into systolic conv
Called into systolic conv
Called into systolic conv
Called into systolic conv
Called into systolic add
Called into systolic conv
Called into systolic conv
Called into systolic conv
Called into systolic add
Called into systolic conv
Called into systolic conv
Called into systolic conv
Called into systolic add
Called into systolic conv
Called into systolic conv
Called into systolic conv
Called into systolic add
Called into systolic conv
Called into systolic conv
Called into systolic conv
Called into systolic add
Called into systolic conv
Called into systolic conv
Called into systolic conv
Called into systolic add
Called into systolic conv
Called into systolic conv
Called into systolic conv
Called into systolic conv
Called into systolic add
Called into systolic conv
Called into systolic conv
Called into systolic conv
Called into systolic add
Called into systolic conv
Called into systolic conv
Called into systolic conv
Called into systolic add
Element count 1000. Top 5 classes:
0.031456 giant schnauzer
0.075702 curly-coated retriever
0.087432 Great Dane
0.271946 Labrador retriever
0.361813 Rottweiler
Done! Inference took 495827001 cycles
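
A log like this can be summarized mechanically. The Python sketch below counts the operations that were dispatched to the accelerator and converts the reported cycle count into wall-clock time. The function names summarize_log and cycles_to_seconds are our own, and the clock frequency is not stated in this article, so the 50 MHz used here is purely a placeholder to replace with the actual SoC clock.

```python
import re
from collections import Counter

# Matches "Called into systolic conv", "Called into systolic add",
# and "Using systolic pooling", but not "Using systolic in mode 2".
OP_LINE = re.compile(r"^(?:Called into|Using) systolic (conv|add|pooling)$")
CYCLES_LINE = re.compile(r"^Done! Inference took (\d+) cycles$")


def summarize_log(text):
    """Return (Counter of accelerated ops, cycle count or None)."""
    ops = Counter()
    cycles = None
    for raw in text.splitlines():
        line = raw.strip()
        m = OP_LINE.match(line)
        if m:
            ops[m.group(1)] += 1
            continue
        m = CYCLES_LINE.match(line)
        if m:
            cycles = int(m.group(1))
    return ops, cycles


def cycles_to_seconds(cycles, clock_hz):
    """Convert a cycle count to seconds for a given clock frequency."""
    return cycles / clock_hz


# Short excerpt of the console output, for illustration.
sample = """\
Using systolic in mode 2
Called into systolic conv
Using systolic pooling
Called into systolic conv
Called into systolic add
Done! Inference took 495827001 cycles
"""

ops, cycles = summarize_log(sample)
ASSUMED_CLOCK_HZ = 50_000_000  # placeholder, NOT taken from the article
print(dict(ops), cycles, cycles_to_seconds(cycles, ASSUMED_CLOCK_HZ))
```

Counting the full log above gives 53 conv calls, 16 add calls, and 1 pooling call, which matches the 53 convolution layers and 16 residual additions of ResNet-50. At the placeholder 50 MHz, 495,827,001 cycles would correspond to roughly 9.9 seconds; the real figure depends on the clock the design actually runs at.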

Summary

We have successfully run the ResNet-50 ONNX model on Digilent’s Nexys Video with Gemmini and Rocket. Inference on the sample image took 495,827,001 cycles.