An out-of-order, superscalar core is the highest-performance class of RISC-V ISA-based core IP. It exploits instruction-level parallelism and hides latencies (such as cache misses) well by dynamically resolving dependencies and scheduling instructions.
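As a rough illustration of dynamic scheduling, the following Python sketch issues instructions as soon as their source operands are ready rather than in program order. It is a toy model, not the customer's microarchitecture: the `Instr` type, the `schedule` function, the latencies, and the issue width are all hypothetical, and it assumes each register is written at most once (as after register renaming) and every latency is at least one cycle.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    name: str
    dest: str      # destination register (assumed written only once)
    srcs: tuple    # source registers
    latency: int   # cycles until the result is available (>= 1)


def schedule(program, issue_width=2):
    """Return {instruction name: issue cycle} for a toy out-of-order model."""
    # None means "a producer exists but has not issued yet".
    # Registers absent from this map hold initial values and are always ready.
    ready_at = {ins.dest: None for ins in program}
    pending = list(program)
    issued = {}
    cycle = 0
    while pending:
        slots = issue_width
        for ins in list(pending):  # oldest first, but younger ones may pass
            if slots == 0:
                break
            operands_ready = all(
                r not in ready_at
                or (ready_at[r] is not None and ready_at[r] <= cycle)
                for r in ins.srcs
            )
            if operands_ready:
                issued[ins.name] = cycle
                ready_at[ins.dest] = cycle + ins.latency
                pending.remove(ins)
                slots -= 1
        cycle += 1
    return issued


if __name__ == "__main__":
    program = [
        Instr("load", "x1", ("x10",), 4),      # long-latency load
        Instr("add", "x2", ("x1",), 1),        # depends on the load
        Instr("mul", "x3", ("x11", "x12"), 2), # independent work
        Instr("sub", "x4", ("x3",), 1),        # depends on the mul
    ]
    print(schedule(program))
```

In this sketch the independent `mul` and its dependent `sub` issue while `add` is still waiting for the load, which is the latency-hiding effect described above.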
Verifying such a core, built on a superscalar out-of-order pipeline with configurable pipeline depth and issue queue width, is a challenging task. This core IP is designed for performance- and latency-sensitive markets such as automotive, data center, and edge or endpoint deep learning SoCs. It was implemented on a 7nm process technology node and is capable of operating at clock speeds of up to 2.6GHz.
Customer Request
One of our customers, a world-leading fabless semiconductor company and provider of commercial RISC-V processor IP and silicon solutions, requested verification and coverage closure for their latest out-of-order core IP. The core adds RISC-V extensions and features such as bit manipulation, a 48-bit virtual memory system, x-propagation, half-precision floating point, and trace and debug, which make it suitable for both small and large enterprises.
Our Approach
Our team of engineers began by studying the customer's unique requirements for their core IP microarchitecture, drawing on our previous experience with out-of-order microarchitectures such as BOOM (Berkeley Out-of-Order Machine).
Working closely with our customer's design and design-verification engineers, we laid out a verification plan for the core, dividing the team's effort across functional units such as Fetch, Reorder, the Load-Store unit, and the memory subsystem.
Along with the verification flow, we wrote several constrained-random stimulus and C-based directed tests. These uncovered many RTL correctness and performance bugs in the design, particularly in the Load-Store unit and in a newly added feature, the data prefetcher.
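To give a sense of what constrained-random stimulus for a load-store unit can look like, here is a minimal Python sketch. It is illustrative only and is not the stimulus used in this project. The function name `gen_mem_ops`, the base address, the window size, and the load/store weighting are all assumptions. The constraints shown are natural alignment and a small address window, which concentrates traffic on a few cache lines.

```python
import random

SIZES = (1, 2, 4, 8)  # access sizes in bytes (assumed set)


def gen_mem_ops(n, seed, base=0x8000_0000, window=256):
    """Generate n random memory operations as (kind, address, size) tuples.

    Constraints (hypothetical, for illustration):
      - every address is naturally aligned to its access size
      - every access falls inside [base, base + window)
      - loads are roughly three times as likely as stores
    """
    rng = random.Random(seed)  # a seeded generator keeps failures reproducible
    ops = []
    for _ in range(n):
        size = rng.choice(SIZES)
        # Stepping by the access size keeps the offset aligned.
        offset = rng.randrange(0, window, size)
        kind = rng.choices(("load", "store"), weights=(3, 1))[0]
        ops.append((kind, base + offset, size))
    return ops


if __name__ == "__main__":
    for kind, addr, size in gen_mem_ops(5, seed=1):
        print(f"{kind:5s} addr=0x{addr:08x} size={size}")
```

Seeding the generator means any failing sequence can be replayed exactly, which matters when a random test exposes a bug that then has to be debugged.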
To make the verification flow more effective and cover some corner-case scenarios, we wrote a UVM-based AHB VIP and integrated it into the customer's environment, where we used it to drive external traffic to the SRAM attached to the memory port. We also optimized the RTL by moving verification-related code out of the design, which cut the total number of RTL code lines almost in half.
By following a systematic approach, we met the required coverage targets ahead of the deadline. This enabled our customer to add major extra features to the core IP generator, such as a private Level-2 cache in the memory hierarchy, which we verified as well.