Chang Xu successfully defended her Ph.D. Thesis at CECA on June 5, 2017. Congratulations!
For more information (in Chinese), please refer to: http://ceca.pku.edu.cn/news.php?action=detail&article_id=688
Title: Automatic Analysis and Optimization for Complicated Hardware Design Flow
Abstract: According to the 2015 International Technology Roadmap for Semiconductors (ITRS), after 50 years of continuous miniaturization, transistor scaling will likely come to a halt within five years. What’s worse, improvements in chip frequency (commonly driven by transistor size reduction) have become less effective. To keep improving computing capacity, people are seeking solutions along another dimension. Customizable domain-specific computing, which uses heterogeneous computing resources such as Field-Programmable Gate Arrays (FPGAs) to accelerate domain-specific applications, has attracted considerable attention. Thanks to their inherent parallelism, programmability, low power, and low latency, FPGAs deliver strong performance in areas such as data centers, cloud computing, and 5G networks. Implementing hardware designs on FPGAs is becoming popular even among software engineers.
Hardware design is a complicated process that relies heavily on Electronic Design Automation (EDA) technology. EDA plays an important role in improving design efficiency and feasibility, in achieving design objectives under complex design constraints, and in providing a good user experience. It comprises a series of complicated optimization steps such as high-level synthesis, logic synthesis, and physical design. The complexity of the EDA process comes not only from the fact that each step has complex optimization goals and constraints and is usually an NP-hard problem, but also from the interaction among the steps. An earlier stage sometimes has to anticipate the objectives of a later stage, so information must be passed both top-down and bottom-up.
The complexity of EDA problems poses great difficulty for EDA experts, who sometimes have to rely on experience or heuristics to make decisions. At the same time, the complex design process opens up an enormous optimization space. Previous research lacks quantitative analysis and optimization of the design flow, whereas we believe that such analysis and optimization can bring substantial performance benefits. On the one hand, quantitative analysis and optimization can help EDA specialists make decisions. On the other hand, automatic tools can help users without much hardware design experience achieve better performance.
Specifically, the contributions and innovations of this thesis are summarized as follows:
• Quantitative Analysis of the Impacts of FPGA Heterogeneity on Placement Algorithms. Modern FPGAs are heterogeneous, containing more than one type of resource, yet the effects of this heterogeneity on placement algorithms are unknown. Because placement is an NP-hard problem, the optimal wirelength of a real design is not known. In this work, we present a method to construct synthetic benchmarks with known optimal wirelength, and we use them to quantify the effects of architecture heterogeneity and netlist heterogeneity on placement algorithms. We also evaluate existing placement tools and show that there is room for improvement. Our synthetic benchmarks are released as open source so that researchers can evaluate the quality of their own placement algorithms.
• Post-placement Multi-bit Flip-flop Clustering. Power consumption is a critical metric in hardware design, especially for mobile platforms. The Multi-bit Flip-flop (MBFF) is an emerging technique that has proven quite promising for power reduction. In this work, we propose an analytical method for MBFF clustering that can be integrated into the post-placement step. With this method, we achieve a significant reduction in power consumption and wirelength without affecting timing performance.
• A Parallel Autotuner for EDA Flow. The EDA flow contains a series of complex steps that expose a huge parameter space for users to tune. These parameters have a significant impact on timing performance, resource utilization, power consumption, and routability. The search space is enormous, which makes manual exploration unrealistic. In this work, we propose an automatic tuning method to assist users in setting parameters. Since the EDA process is time-consuming, we further propose a parallel search technique based on search space partitioning and intelligent allocation of computing resources. The experimental results show that parameter autotuning can bring a significant improvement in performance without much effort spent on code optimization.
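The idea of partitioning a parameter space and searching the partitions in parallel can be illustrated with a minimal Python sketch. It is not the tool described in the thesis: the parameter names, the toy cost function run_flow, and the exhaustive per-partition search are all assumptions made for illustration, and the intelligent resource allocation mentioned above is not modeled (each partition simply gets one worker).

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

# Hypothetical parameter space; names and values are illustrative only.
PARAM_SPACE = {
    "effort": [1, 2, 3],
    "seed": [0, 1, 2, 3],
    "retiming": [False, True],
}


def run_flow(params):
    """Stand-in for a real EDA run; returns a toy cost (lower is better)."""
    cost = abs(params["effort"] - 2) * 3 + abs(params["seed"] - 3)
    return cost + (0 if params["retiming"] else 2)


def partition(space, key):
    """Split the space into sub-spaces, one per value of `key`."""
    return [{**space, key: [value]} for value in space[key]]


def search(subspace):
    """Exhaustively evaluate one sub-space; return its best (cost, params)."""
    names = list(subspace)
    best = None
    for values in product(*(subspace[name] for name in names)):
        params = dict(zip(names, values))
        cost = run_flow(params)
        if best is None or cost < best[0]:
            best = (cost, params)
    return best


def autotune(space, key, workers=3):
    """Search every partition in parallel and keep the overall best result."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(search, partition(space, key)))
    return min(results, key=lambda result: result[0])


best_cost, best_params = autotune(PARAM_SPACE, "effort")
```

In a real flow, run_flow would launch the EDA tool and parse its reports, and the exhaustive loop would normally be replaced by a smarter search strategy, since each evaluation is expensive.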
In this thesis, we present several automatic analysis and optimization tools that help not only EDA experts make decisions but also users improve design quality. The research results of this thesis have been published at several well-known international conferences in the EDA field, and several of them have been released as open source. We expect this work to promote the development of the current hardware design process.