Vortex: OpenCL Compatible RISC-V Based GPGPU (Part 1)
This article gives an overview of Vortex, an open-source RISC-V-based GPGPU, and shows how to run an OpenCL program on the Vortex simulator.
Vortex
Vortex is a GPGPU processor built on the single instruction, multiple threads (SIMT) execution model. It extends the RISC-V ISA with custom instructions for GPGPU execution. The README.md of the Vortex repository lists the following specifications.
Specifications
- Support RISC-V RV32IMF ISA
- Performance:
    - 1024 total threads running at 250 MHz
    - 128 Gflops of compute bandwidth
    - 16 GB/s of memory bandwidth
- Scalability: up to 64 cores with optional L2 and L3 caches
- Software: OpenCL 1.2 Support
- Supported FPGAs:
    - Intel Arria 10
    - Intel Stratix 10
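As a rough sanity check, the performance figures quoted above can be related to each other with simple arithmetic. The sketch below only divides the README numbers by one another; it assumes decimal units (1 Gflops = 10^9 FLOP/s, 1 GB = 10^9 bytes) and does not claim anything about how the hardware actually reaches these figures.

```python
# Figures quoted from the Vortex README (decimal units assumed).
TOTAL_THREADS = 1024
CLOCK_HZ = 250 * 10**6          # 250 MHz
PEAK_FLOPS = 128 * 10**9        # 128 Gflops
MEM_BYTES_PER_S = 16 * 10**9    # 16 GB/s

# Derived ratios: plain division of the quoted figures.
flop_per_clock = PEAK_FLOPS / CLOCK_HZ                   # FLOP per clock cycle
flop_per_thread_clock = flop_per_clock / TOTAL_THREADS   # FLOP per thread per cycle
bytes_per_clock = MEM_BYTES_PER_S / CLOCK_HZ             # bytes per clock cycle
flop_per_byte = PEAK_FLOPS / MEM_BYTES_PER_S             # FLOP per byte of memory traffic

print(flop_per_clock)         # 512.0
print(flop_per_thread_clock)  # 0.5
print(bytes_per_clock)        # 64.0
print(flop_per_byte)          # 8.0
```

The last ratio says that, at these quoted peaks, a kernel needs roughly 8 floating-point operations per byte of memory traffic before compute rather than memory bandwidth becomes the limit.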
Microarchitecture
The docs/microarchitecture.md file in the Vortex repository presents the following microarchitecture diagram. The upper part of the figure below shows the Vortex core: each stage is extended with support for threads and warps, and the GPGPU unit in the Execute stage handles the GPGPU instructions.
A group of Vortex cores forms a Vortex cluster, and a group of Vortex clusters forms a Vortex processor. The cores in a cluster can share an L2 cache, and the clusters in a processor can share an L3 cache.
Vortex Simulation Methods
Vortex provides four simulation methods: vlsim and rtlsim for RTL simulation using Verilator, simx for cycle-approximate simulation, and fpga for running on an FPGA board.
Simulation runs are driven by the shell script blackbox.sh in the ci directory. A command-line argument selects which of the four simulation methods to use.
Similarly, command-line arguments to blackbox.sh set the configuration of the Vortex processor: the number of clusters, cores, warps, and threads, and whether the L2 and L3 caches are enabled. The default configuration is 1 cluster, 4 cores, 4 warps, and 4 threads, with the L2 and L3 caches disabled.
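To make the configuration parameters concrete, here is a minimal sketch that models them as a small data structure and counts the hardware threads they imply. The class name `VortexConfig` and its methods are hypothetical, and the sketch assumes that cores are counted per cluster, warps per core, and threads per warp; the article itself does not spell out that nesting.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VortexConfig:
    """Hypothetical model of a Vortex processor configuration."""
    clusters: int = 1       # clusters in the processor
    cores: int = 4          # cores per cluster (assumed)
    warps: int = 4          # warps per core (assumed)
    threads: int = 4        # threads per warp (assumed)
    l2cache: bool = False   # L2 cache enabled?
    l3cache: bool = False   # L3 cache enabled?

    def total_cores(self) -> int:
        return self.clusters * self.cores

    def total_threads(self) -> int:
        return self.total_cores() * self.warps * self.threads


default = VortexConfig()
print(default.total_cores())    # 4
print(default.total_threads())  # 64
```

Under these assumptions the default configuration exposes 64 hardware threads, and doubling any one of the four counts doubles that total.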
Running sgemm on the Vortex RTL Simulator
We ran the OpenCL program sgemm from the tests/opencl directory with different configurations of the Vortex processor. sgemm is a simplified version of single-precision GEMM (GEneral Matrix-to-matrix Multiply).
$ cd $VORTEX
$ ./ci/blackbox.sh --driver=rtlsim --cores=[1|2|4|8] [--l2cache] \
    --app=sgemm --args="-n[4|8|16|32|64|128]"
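The bracketed alternatives in the command stand for a sweep over core counts, the optional L2 cache, and matrix sizes. The following sketch enumerates that sweep as a dry run, printing one command line per combination. It uses only the flags shown in the command above; the helper names `blackbox_command` and `sweep` are illustrative and not part of Vortex.

```python
from itertools import product

CORES = [1, 2, 4, 8]
L2CACHE = [False, True]
SIZES = [4, 8, 16, 32, 64, 128]


def blackbox_command(cores, l2cache, n):
    """Build the argv list for one blackbox.sh run (flags as shown in the article)."""
    cmd = ["./ci/blackbox.sh", "--driver=rtlsim", f"--cores={cores}"]
    if l2cache:
        cmd.append("--l2cache")
    cmd += ["--app=sgemm", f"--args=-n{n}"]
    return cmd


def sweep():
    """Return every combination of core count, L2 setting, and matrix size."""
    return [blackbox_command(c, l2, n) for c, l2, n in product(CORES, L2CACHE, SIZES)]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for cmd in sweep():
        print(" ".join(cmd))
```

The sweep contains 4 × 2 × 6 = 48 runs. To execute them rather than print them, each list could be passed to `subprocess.run(cmd, cwd=..., check=True)` with the working directory set to the Vortex checkout.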
The featured image shows performance (FLOP/cycle) calculated from the simulation results.
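One plausible way to derive a FLOP/cycle figure from a simulation run is sketched below. It assumes that the -n argument is the dimension of square matrices and that the multiply is counted as n³ multiplications plus n³ additions; the article does not state how its FLOP count was obtained, so both are assumptions. The cycle count in the example is a made-up placeholder, not a measured Vortex result.

```python
def sgemm_flops(n):
    """FLOP count for an n x n by n x n multiply: n^3 multiplies plus n^3 adds."""
    return 2 * n ** 3


def flop_per_cycle(n, cycles):
    """Performance in FLOP/cycle, given the cycle count reported by the simulator."""
    if cycles <= 0:
        raise ValueError("cycles must be positive")
    return sgemm_flops(n) / cycles


# Hypothetical cycle count, for illustration only.
example_cycles = 1_000_000
print(flop_per_cycle(64, example_cycles))  # 0.524288
```

Because the metric is normalized by cycles rather than seconds, it compares configurations independently of the clock frequency at which the design would actually run.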
Summary
This article gave an overview of Vortex, an open-source RISC-V-based GPGPU, and showed how to run the OpenCL program sgemm from the tests/opencl directory on the Vortex simulator.