Benchmarks on RISC-V Out-of-Order Simulator
We built a simulator for NaxRiscv, a RISC-V out-of-order core, and ran the CoreMark and Dhrystone benchmarks on it.
The NaxRiscv repository is still a work in progress, but Linux already runs on the simulator. The core is also integrated into LiteX, so gateware for FPGA boards can be generated from it.
The featured image shows the simulator's log output visualized with Konata, an instruction pipeline visualizer.
Note: The content was updated on July 16, 2022.
NaxRiscv
NaxRiscv is a RISC-V core being developed by Charles Papon, the developer of VexRiscv, a 32-bit RISC-V core. Like VexRiscv, it is written in a hardware description language called SpinalHDL.
We initially assumed that VexRiscv and NaxRiscv differ in the same way as Rocket and BOOM (Berkeley Out-of-Order Machine) from the University of California, Berkeley (UCB): one is an in-order scalar core and the other is an out-of-order superscalar core. However, NaxRiscv also takes on features that VexRiscv did not have, such as 64-bit (RV64) support.
The performance of NaxRiscv’s default RV32IMA configuration is as follows.
- CoreMark: 5.00 CoreMark/MHz (-O3 and so many more random flags)
- Dhrystone: 2.94 DMIPS/MHz (-O3 -fno-common -fno-inline)
Benchmarks on NaxRiscv Simulator using Verilator
The NaxRiscv repository on GitHub describes how to build an RV32IMA simulator using Verilator. Because CoreMark and Dhrystone are built as well, we ran both to see whether we could reproduce the figures above.
CoreMark
The following shows the console output when running CoreMark.
$ ./sim/VNaxRiscv32ima --name coremark \
    --load-elf $NAXSOFTWARE/baremetal/coremark/build/rv32ima/coremark.elf \
    --pass-symbol pass
2K performance run parameters for coremark.
CoreMark Size : 666
Total ticks : 2001972
Total time (secs): 2001972.000000
Iterations/Sec : 0.000005
Iterations : 10
Compiler version : GCC11.1.0
Compiler flags : -DPERFORMANCE_RUN=1 -march=rv32ima -mabi=ilp32 -mcmodel=medany -Wno-pointer-to-int-cast -Wno-int-to-pointer-cast -I../driver -O3 -fno-common -funroll-loops -finline-functions -falign-functions=16 -falign-jumps=4 -falign-loops=4 -finline-limit=1000 -fno-if-conversion2 -fselective-scheduling -fno-crossjumping -freorder-blocks-and-partition -DCORE_DEBUG=0 -lgcc -lc -nostartfiles -ffreestanding -Wl,-Bstatic,-T,../common/app.ld,-Map,coremark.map,--print-memory-usage
Memory location : STACK
seedcrc : 0xe9f5
[0]crclist : 0xe714
[0]crcmatrix : 0x1fd7
[0]crcstate : 0x8e3a
[0]crcfinal : 0xfcaf
Correct operation validated. See README.md for run and reporting rules.
CoreMark 1.0 : 0.000005 / GCC11.1.0 -DPERFORMANCE_RUN=1 -march=rv32ima -mabi=ilp32 -mcmodel=medany -Wno-pointer-to-int-cast -Wno-int-to-pointer-cast -I../driver -O3 -fno-common -funroll-loops -finline-functions -falign-functions=16 -falign-jumps=4 -falign-loops=4 -finline-limit=1000 -fno-if-conversion2 -fselective-scheduling -fno-crossjumping -freorder-blocks-and-partition -DCORE_DEBUG=0 -lgcc -lc -nostartfiles -ffreestanding -Wl,-Bstatic,-T,../common/app.ld,-Map,coremark.map,--print-memory-usage / STACK
5.00 Coremark/MHz
SUCCESS coremark
The simulator scores 5.00 CoreMark/MHz, which matches the figure listed above.
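The 5.00 figure can be cross-checked against the raw counters in the log. The sketch below assumes that one tick in this bare-metal build corresponds to one clock cycle, which would explain why "Total time (secs)" is identical to "Total ticks" and why "Iterations/Sec" is so small. Under that assumption, iterations per tick multiplied by 10^6 gives CoreMark/MHz. The function name is ours and is not part of the NaxRiscv software.

```python
def coremark_per_mhz(iterations: int, total_ticks: int) -> float:
    """Iterations per million ticks.

    Assumption: one tick equals one clock cycle, so iterations per
    million cycles is the same as CoreMark/MHz.
    """
    return iterations / total_ticks * 1_000_000


# Values copied from the console output above.
ITERATIONS = 10
TOTAL_TICKS = 2_001_972

score = coremark_per_mhz(ITERATIONS, TOTAL_TICKS)
print(f"{score:.2f} CoreMark/MHz")
```

The unrounded value is slightly below 5 (about 4.995), so the reported 5.00 is the result of rounding to two decimal places.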
Dhrystone
The following shows the console output when running Dhrystone.
$ ./sim/VNaxRiscv32ima --name dhrystone \
    --load-elf $NAXSOFTWARE/baremetal/dhrystone/build/rv32ima/dhrystone.elf \
    --pass-symbol pass
Dhrystone Benchmark, Version C, Version 2.2
Program compiled without 'register' attribute
Using time(), HZ=12000000
...
Microseconds for one run through Dhrystone: 16
Dhrystones per Second: 62169
User_Time : 965109
Number_Of_Runs : 5000
HZ : 12000000
DMIPS per Mhz: 2.94
SUCCESS dhrystone
The simulator scores 2.94 DMIPS/MHz, which matches the figure listed above.
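The Dhrystone numbers in the log can be re-derived in the same way. The sketch below assumes that User_Time is counted in ticks of the HZ timer and that HZ is also the core clock frequency. It uses the conventional baseline of 1757 Dhrystones per second for 1 DMIPS (the VAX 11/780 score). The function names are ours, for illustration only.

```python
import math

# Conventional baseline: 1757 Dhrystones/s corresponds to 1 DMIPS.
VAX_DHRYSTONES_PER_SECOND = 1757


def dhrystones_per_second(runs: int, user_time_ticks: int, hz: int) -> float:
    """Runs per second, assuming user_time_ticks is counted at hz ticks/s."""
    return runs * hz / user_time_ticks


def dmips_per_mhz(runs: int, user_time_ticks: int, hz: int) -> float:
    """DMIPS normalized by clock frequency, assuming hz is the core clock."""
    dmips = dhrystones_per_second(runs, user_time_ticks, hz) / VAX_DHRYSTONES_PER_SECOND
    return dmips / (hz / 1_000_000)


# Values copied from the console output above.
NUMBER_OF_RUNS = 5000
USER_TIME = 965_109
HZ = 12_000_000

dps = dhrystones_per_second(NUMBER_OF_RUNS, USER_TIME, HZ)
result = dmips_per_mhz(NUMBER_OF_RUNS, USER_TIME, HZ)
print(f"Dhrystones per second: {int(dps)}")
print(f"DMIPS/MHz: {result:.4f}")
```

The Dhrystones-per-second value agrees with the log (62169). The DMIPS/MHz value comes out at roughly 2.949, so the 2.94 in the log looks like a truncation to two decimal places rather than a rounding. That reading is our inference from the numbers, not something the log states.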
Summary
We built a NaxRiscv simulator using Verilator and ran the CoreMark and Dhrystone benchmarks on it.