Running CoreMark on SonicBOOM Simulator
We built a Verilator simulator of SonicBOOM, an out-of-order superscalar RISC-V CPU, with the short-forwards branch (SFB) optimization enabled, and ran CoreMark on it.
With CoreMark built with -O2, the results were 6.89 CoreMark/MHz when ee_u32 was changed from unsigned int to signed int, and 6.45 CoreMark/MHz with the default unsigned int.
These results show that SonicBOOM can achieve the nominal value of 6.2 CoreMark/MHz.
BOOM
Berkeley Out-of-Order Machine (BOOM) is one of the RTL generators included in Chipyard introduced in the previous article, and can generate RISC-V out-of-order execution superscalar CPUs.
The current version is BOOM version 3 (BOOMv3), also known as SonicBOOM.
The SonicBOOM nominal CoreMark/MHz is 6.2.
SFB optimization
The short-forwards branch (SFB) optimization is described in "SonicBOOM: The 3rd Generation Berkeley Out-of-Order Machine" as follows:
As an example, SonicBOOM achieves 6.15 CoreMark/MHz with the SFB optimization enabled, compared to 4.9 CoreMark/MHz without.
However, this SFB optimization is not enabled by default.
BOOM Simulator
We created the following two BOOM simulators, both built with Verilator:
simulator-chipyard-MegaBoomConfig: Default (SFB optimization disabled)
simulator-chipyard-MegaBoomConfig-SFB: SFB optimization enabled
There are several types of BOOM configurations. We have created MegaBoom simulators with a 4-wide BOOM configuration.
CoreMark
The CoreMark used here is based on the riscv-coremark included in Chipyard.
As shown in the table below, we built four variants of CoreMark. They combine two CFLAGS settings, -O2 and -O3, with two definitions of ee_u32, the default unsigned int and signed int.
CFLAGS | ee_u32: unsigned int | ee_u32: signed int |
---|---|---|
-O2 | coremark.o2-u32 | coremark.o2-s32 |
-O3 | coremark.o3-u32 | coremark.o3-s32 |
In addition, ITERATIONS of CoreMark is set to 10, and GCC 11.1.0 is used to build CoreMark.
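The four build variants can be enumerated with a short script. This is only an illustrative sketch: the binary names come from the table above, while the helper `binary_name` and the variable names are hypothetical and not part of riscv-coremark.

```python
from itertools import product

# Build matrix taken from the table above.
OPT_LEVELS = ["-O2", "-O3"]
# Suffix used in the binary name -> C type chosen for ee_u32.
EE_U32_TYPES = {"u32": "unsigned int", "s32": "signed int"}


def binary_name(opt: str, suffix: str) -> str:
    """Return the binary name for one variant (hypothetical helper)."""
    return f"coremark.{opt.lstrip('-').lower()}-{suffix}"


variants = [
    (opt, ctype, binary_name(opt, suffix))
    for opt, (suffix, ctype) in product(OPT_LEVELS, EE_U32_TYPES.items())
]

for opt, ctype, name in variants:
    print(f"{name}: CFLAGS {opt}, ee_u32 = {ctype}")
```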
Running CoreMark on BOOM Simulator
Default ee_u32
The table below shows the results for the default CoreMark, which uses unsigned int for ee_u32.
CFLAGS | CoreMark/MHz, SFB disabled | CoreMark/MHz, SFB enabled |
---|---|---|
-O2 | 5.49 | 6.45 |
-O3 | 5.94 | 5.97 |
The BOOM simulator with SFB optimization achieves 6.45 CoreMark/MHz when running the CoreMark built with -O2, which exceeds the nominal value of 6.2 CoreMark/MHz.
In contrast, the BOOM simulator without SFB optimization reaches at most 5.94 CoreMark/MHz, which it achieves with the CoreMark built with -O3.
The following shows the output of the BOOM simulator with SFB optimization when running the CoreMark built with -O2.
$ ./simulator-chipyard-MegaBoomConfig-SFB coremark.o2-u32
This emulator compiled with JTAG Remote Bitbang client. To enable, use +jtag_rbb_enable=1.
Listening on port 37541
[UART] UART0 is here (stdin/stdout).
2K performance run parameters for coremark.
CoreMark Size : 666
Total ticks : 1551464
Total time (secs): 1551464
Iterations/Sec : 0
Iterations : 10
Compiler version : GCC11.1.0
Compiler flags : -O2 -fno-builtin -mcmodel=medany -static -std=gnu99 -fno-common -nostdlib -nostartfiles -lm -lgcc -T ../riscv64-baremetal/link.ld
Memory location : Please put data memory location here (e.g. code in flash, data on heap etc)
seedcrc : 0xe9f5
[0]crclist : 0xe714
[0]crcmatrix : 0x1fd7
[0]crcstate : 0x8e3a
[0]crcfinal : 0xfcaf
Correct operation validated. See README.md for run and reporting rules.
mcycle = 1589202
minstret = 3569373
The Total ticks line shows that 10 iterations of CoreMark took 1,551,464 cycles, so CoreMark/MHz is 10 × 1,000,000 / 1,551,464 ≈ 6.45, as shown in the table above.
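The calculation can be reproduced from the simulator output with a few lines of Python. This is a minimal sketch that assumes one tick equals one clock cycle, as the text does. The `field` helper is hypothetical, and the instructions-per-cycle figure is an extra derived value that assumes mcycle and minstret cover the whole program rather than only the timed region.

```python
import re

# Excerpt of the simulator output above (only the lines used here).
LOG = """\
Total ticks : 1551464
Iterations : 10
mcycle = 1589202
minstret = 3569373
"""


def field(log: str, name: str) -> int:
    """Extract the integer that follows 'name :' or 'name =' on its own line."""
    pattern = rf"^{re.escape(name)}\s*[:=]\s*(\d+)\s*$"
    match = re.search(pattern, log, re.MULTILINE)
    if match is None:
        raise ValueError(f"field {name!r} not found")
    return int(match.group(1))


ticks = field(LOG, "Total ticks")
iterations = field(LOG, "Iterations")

# At 1 MHz the run would take ticks / 1e6 seconds, so
# CoreMark/MHz = iterations / (ticks / 1e6).
coremark_per_mhz = iterations * 1_000_000 / ticks

# Whole-run instructions per cycle (assumption: counters span the full program).
ipc = field(LOG, "minstret") / field(LOG, "mcycle")

print(f"CoreMark/MHz = {coremark_per_mhz:.2f}")  # 6.45
print(f"IPC          = {ipc:.2f}")
```

Working from Total ticks rather than the Iterations/Sec line is necessary here, because the simulator output reports Iterations/Sec as 0.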
Modified ee_u32
The table below shows the results for the modified CoreMark, which uses signed int for ee_u32.
CFLAGS | CoreMark/MHz, SFB disabled | CoreMark/MHz, SFB enabled |
---|---|---|
-O2 | 5.84 | 6.89 |
-O3 | 6.27 | 6.31 |
When running the CoreMark built with -O2, the BOOM simulator with SFB optimization achieves 6.89 CoreMark/MHz, which exceeds the nominal value of 6.2.
The figure below shows the output of the BOOM simulator with SFB optimization when running the CoreMark built with -O2.
The Total ticks line shows that 10 iterations of CoreMark took 1,451,480 cycles, so CoreMark/MHz is 10 × 1,000,000 / 1,451,480 ≈ 6.89, as shown in the table above.
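To put the eight measurements side by side, the relative gains can be computed directly from the two result tables. The sketch below only does arithmetic on the published values; the dictionary layout and the `gain_percent` helper are hypothetical.

```python
# CoreMark/MHz copied from the two result tables:
# RESULTS[ee_u32 type][CFLAGS] = (SFB disabled, SFB enabled)
RESULTS = {
    "unsigned int": {"-O2": (5.49, 6.45), "-O3": (5.94, 5.97)},
    "signed int": {"-O2": (5.84, 6.89), "-O3": (6.27, 6.31)},
}


def gain_percent(before: float, after: float) -> float:
    """Relative improvement of `after` over `before`, in percent."""
    return (after / before - 1.0) * 100.0


# Gain from enabling SFB, per (ee_u32 type, CFLAGS).
sfb_gain = {
    (ctype, flags): gain_percent(off, on)
    for ctype, by_flags in RESULTS.items()
    for flags, (off, on) in by_flags.items()
}

# Gain from switching ee_u32 to signed int, per (CFLAGS, SFB state).
signed_gain = {
    (flags, state): gain_percent(
        RESULTS["unsigned int"][flags][i], RESULTS["signed int"][flags][i]
    )
    for flags in ("-O2", "-O3")
    for i, state in enumerate(("disabled", "enabled"))
}

for key, value in sfb_gain.items():
    print(f"SFB gain {key}: {value:.1f}%")
for key, value in signed_gain.items():
    print(f"signed int gain {key}: {value:.1f}%")
```

On these numbers, enabling SFB improves the -O2 builds by roughly 17 to 18 percent but the -O3 builds by less than 1 percent, while switching ee_u32 to signed int adds roughly 6 to 7 percent in every configuration.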
Summary
We built a SonicBOOM simulator with the short-forwards branch (SFB) optimization enabled and ran CoreMark on it.
With CoreMark built with -O2, the results were 6.89 CoreMark/MHz when ee_u32 was changed from unsigned int to signed int, and 6.45 CoreMark/MHz with the default unsigned int.
These results show that SonicBOOM can achieve the nominal value of 6.2 CoreMark/MHz.