Running ONNX Model on FPGA with Gemmini SoC
At Luffca, we implemented the DNN accelerator Gemmini and the RISC-V CPU Rocket on an FPGA board and succeeded in running an ONNX model on it.
Related articles:
- Running Test Programs on Gemmini Simulators
- Running ResNet-50 on FPGA with Gemmini SoC
- Running ONNX Model on FPGA with Gemmini SoC (this article)
Gemmini
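Gemmini is one of the RTL generators included in Chipyard, an agile RISC-V SoC design framework, and it generates systolic-array-based DNN accelerators.
The figure below, taken from the Gemmini repository, shows an overview of the Gemmini system.
[Figure: overview of the Gemmini system, from the Gemmini repository]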
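To make the term "systolic array" concrete, here is a minimal software sketch of an output-stationary array computing a matrix product. It is a toy model for illustration only, not Gemmini's RTL or its programming interface; the function name `systolic_matmul` and the register layout are assumptions introduced for this example. Each processing element (PE) keeps one output value, while values of A flow left to right and values of B flow top to bottom, one hop per cycle.

```python
def systolic_matmul(A, B):
    """Toy cycle-by-cycle model of an output-stationary systolic array.

    PE(i, j) accumulates C[i][j]. Row i of A enters from the left edge
    delayed by i cycles; column j of B enters from the top edge delayed
    by j cycles, so matching operands meet in the right PE.
    """
    n, k, m = len(A), len(A[0]), len(B[0])
    acc = [[0] * m for _ in range(n)]    # stationary partial sums
    a_reg = [[0] * m for _ in range(n)]  # A operands moving right
    b_reg = [[0] * m for _ in range(n)]  # B operands moving down

    # The last operand pair reaches PE(n-1, m-1) at cycle n + m + k - 3.
    for t in range(n + m + k - 2):
        # Sweep from bottom-right to top-left so that each PE still sees
        # its left/upper neighbor's value from the previous cycle.
        for i in reversed(range(n)):
            for j in reversed(range(m)):
                if j > 0:
                    a_in = a_reg[i][j - 1]
                else:  # left edge: skewed feed of row i of A
                    a_in = A[i][t - i] if 0 <= t - i < k else 0
                if i > 0:
                    b_in = b_reg[i - 1][j]
                else:  # top edge: skewed feed of column j of B
                    b_in = B[t - j][j] if 0 <= t - j < k else 0
                acc[i][j] += a_in * b_in  # multiply-accumulate in place
                a_reg[i][j] = a_in
                b_reg[i][j] = b_in
    return acc


print(systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
# [[19, 22], [43, 50]]
```

In hardware all PEs update in parallel every cycle; the nested loops here only emulate that by reading neighbor registers before they are overwritten.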
ONNX
Open Neural Network Exchange (ONNX) is a format for neural network models. ONNX Runtime is software for running inference on ONNX models (models in the ONNX format).
For this work, we used onnxruntime-riscv, a port of ONNX Runtime for Gemmini.
Running ONNX Model ResNet-50 on FPGA
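We used the Gemmini system built on the Nexys Video, a Digilent FPGA board, which was introduced in a previous article.
We loaded the Gemmini SoC gateware onto the FPGA board and ran the ResNet-50 ONNX model.
The following is the console output from running ort_test with resnet50_opt_quant.onnx specified as the ONNX model.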
# ./ort_test -m resnet50_opt_quant.onnx \
    -i images/dog.jpg \
    -p caffe2 -x 2 -O 99
Loaded runner program
Using systolic in mode 2
Using Onnxruntime C++ API
Number of inputs = 1
Input 0 : name=gpu_0/data_0, type=1, num_dims=4: [1, 3, 224, 224, ]
Number of outputs = 1
Output 0 : name=gpu_0/softmax_1, type=1, num_dims=2: [1, 1000, ]
Loading image
Image dimensions: 224 224 3
First few image values 130.061005 126.060997 123.060997
Called into systolic conv
Using systolic pooling
Called into systolic conv
Called into systolic conv
Called into systolic conv
Called into systolic conv
Called into systolic add
Called into systolic conv
Called into systolic conv
Called into systolic conv
Called into systolic add
Called into systolic conv
Called into systolic conv
Called into systolic conv
Called into systolic add
Called into systolic conv
Called into systolic conv
Called into systolic conv
Called into systolic conv
Called into systolic add
Called into systolic conv
Called into systolic conv
Called into systolic conv
Called into systolic add
Called into systolic conv
Called into systolic conv
Called into systolic conv
Called into systolic add
Called into systolic conv
Called into systolic conv
Called into systolic conv
Called into systolic add
Called into systolic conv
Called into systolic conv
Called into systolic conv
Called into systolic conv
Called into systolic add
Called into systolic conv
Called into systolic conv
Called into systolic conv
Called into systolic add
Called into systolic conv
Called into systolic conv
Called into systolic conv
Called into systolic add
Called into systolic conv
Called into systolic conv
Called into systolic conv
Called into systolic add
Called into systolic conv
Called into systolic conv
Called into systolic conv
Called into systolic add
Called into systolic conv
Called into systolic conv
Called into systolic conv
Called into systolic add
Called into systolic conv
Called into systolic conv
Called into systolic conv
Called into systolic conv
Called into systolic add
Called into systolic conv
Called into systolic conv
Called into systolic conv
Called into systolic add
Called into systolic conv
Called into systolic conv
Called into systolic conv
Called into systolic add
Element count 1000. Top 5 classes:
0.031456 giant schnauzer
0.075702 curly-coated retriever
0.087432 Great Dane
0.271946 Labrador retriever
0.361813 Rottweiler
Done! Inference took 495827001 cycles
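The console output above is long, so a short script can help summarize it. The following is a minimal sketch, assuming the log has been captured as text; the helper names `summarize_log` and `cycles_to_seconds` are hypothetical and are not part of ort_test or onnxruntime-riscv. It tallies the operations dispatched to the accelerator and extracts the reported cycle count, and it works whether the log is split into lines or flattened into one.

```python
import re
from collections import Counter


def summarize_log(text):
    """Tally accelerator calls and extract the cycle count from ort_test output."""
    ops = Counter(re.findall(r"Called into systolic (\w+)", text))
    # Pooling is logged with different wording, so count it separately.
    ops["pooling"] = len(re.findall(r"Using systolic pooling", text))
    match = re.search(r"Inference took (\d+) cycles", text)
    cycles = int(match.group(1)) if match else None
    return dict(ops), cycles


def cycles_to_seconds(cycles, clock_hz):
    """Convert a cycle count to seconds. clock_hz must be supplied by the
    caller; the SoC clock frequency is not stated in this article."""
    return cycles / clock_hz


# Excerpt of the output above: first convolution, pooling,
# the first residual block, and the final line.
excerpt = """
Called into systolic conv
Using systolic pooling
Called into systolic conv
Called into systolic conv
Called into systolic conv
Called into systolic conv
Called into systolic add
Done! Inference took 495827001 cycles
"""

ops, cycles = summarize_log(excerpt)
print(ops)     # {'conv': 5, 'add': 1, 'pooling': 1}
print(cycles)  # 495827001
```

Applied to the full output, the tally is 53 conv, 16 add, and 1 pooling call, which is consistent with the ResNet-50 structure: one initial convolution followed by 16 residual blocks of three convolutions each, four of which add a projection convolution on the shortcut path.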
Summary
At Luffca, we implemented Gemmini and Rocket on the Digilent Nexys Video and succeeded in running the ResNet-50 ONNX model.