Design and Implementation of Core Micro-architecture Based on RISC-V ISA

The demand for low-power and low-cost CPUs is increasing due to cloud computing and IoT endpoint applications. In this work, three 32-bit core micro-architectures are proposed to meet the requirements of modern devices. The cores target three different applications: low-power devices, data-centric IoT devices, and high-security systems. The cores are based on the RISC-V ISA, with support for new custom instructions that help enhance core performance. The low-power core is a four-stage pipelined micro-architecture that supports the RV32IM subset of the RISC-V instruction set. The core is verified in simulation as well as on a Xilinx Virtex-7 FPGA platform. This core achieves a Dhrystone benchmark score of 1.71 DMIPS per MHz, which is higher than the ARM Cortex-M3 (1.50 DMIPS per MHz) and the ARM Cortex-M4 (1.52 DMIPS per MHz). The CoreMark benchmark is also run on this core, giving 4.13 CoreMark per MHz. The physical design result of the core using commercial tools shows that it can achieve a maximum frequency of 198.02 MHz with an area of 0.036 mm² and a power requirement of 17.36 μW/MHz at the UMC 40 nm technology node. The core consumes a dynamic power of 19.75 μW/MHz at UMC 90 nm, which is 36% and 40% better than the ARM Cortex-M3 and Cortex-M4, respectively, and also lower than many other cores. The results show that this core can outperform many existing commercial and open-source cores. The IoT core is based on the RISC-V ISA with support for eleven proposed custom instructions that enhance performance in data processing algorithms such as convolution, matrix multiplication, digital filters, and cryptographic kernels. The synthesis result at 65 nm shows that the core requires an additional 7% resources due to the integration of the proposed custom instructions. However, the core performance is improved by 1.74× through the use of these instructions. The core is also capable of working up to a maximum frequency of 328 MHz.
The high-security core is integrated with a dedicated hardware unit for AES encryption/decryption. Six instructions are proposed to make use of this hardware unit. The core can perform an AES encryption in 63 clock cycles, whereas the same encryption requires 54265 and 19258 clock cycles on the proposed low-power and IoT cores, respectively. At the 65 nm technology node, the AES throughput of the core is found to be 0.71 Gbps. The core clock frequency is limited to 352 MHz. During the implementation of the cores, it was observed that there is very little research on the hardware implementation of binary dividers. Therefore, the design and implementation of a novel binary divider are also presented in this work. The work presents an analysis of the proposed divider for different data widths (8, 16, 32, 64, 128) and different radix values (4, 8, 16). The area, power, delay and latency of the designs are estimated at the 40 nm technology node. The analysis is helpful for designers in determining the correct divider circuit for their designs.
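The AES figures above can be cross-checked with simple arithmetic. The sketch below is an illustration only, under the assumption that one 128-bit AES block is completed every 63 cycles at the 352 MHz maximum clock, with no overlap between blocks; the function names are hypothetical and are not taken from the thesis.

```python
AES_BLOCK_BITS = 128  # AES operates on fixed 128-bit blocks


def aes_throughput_gbps(clock_hz, cycles_per_block):
    """Throughput in Gbps, assuming one block finishes every cycles_per_block cycles."""
    return AES_BLOCK_BITS * clock_hz / cycles_per_block / 1e9


def speedup(baseline_cycles, accelerated_cycles):
    """Ratio of cycle counts for the same encryption on two cores."""
    return baseline_cycles / accelerated_cycles


# Figures quoted in the abstract.
throughput = aes_throughput_gbps(352e6, 63)
vs_low_power = speedup(54265, 63)
vs_iot = speedup(19258, 63)

print(f"AES throughput: {throughput:.3f} Gbps")
print(f"Cycle-count speedup vs low-power core: {vs_low_power:.0f}x")
print(f"Cycle-count speedup vs IoT core: {vs_iot:.0f}x")
```

Under these assumptions the throughput comes out at roughly 0.715 Gbps, in line with the reported 0.71 Gbps, and the dedicated AES unit needs roughly 861 times fewer cycles than the low-power core and roughly 306 times fewer than the IoT core. These ratios compare cycle counts only; the three cores have different maximum clock frequencies.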
Supervisor: Roy Paily