FPGA CAD Research: An Introduction
(Report 1: 6/4/05~13/4/05)
Xiaoxiang Shi ([email protected])
Department of Computer Science, Xidian University
Abstract
In this report, we first introduce several CAD tools used in FPGA architecture and CAD research. We then outline a typical CAD flow and use these tools to evaluate it. In the demo experiment, we feed a small MCNC benchmark circuit into the CAD flow and present our results. Finally, we draw some conclusions and make plans for our future research.
1 Introduction
In FPGA research, one must typically evaluate the utility of new architectural features experimentally. That is, benchmark circuits are synthesized, technology mapped, packed, placed and routed onto the FPGA architectures of interest, and measures of each architecture’s quality, such as speed or area, can then readily be extracted. Accordingly, there is considerable need for flexible CAD tools that can target a wide variety of FPGA architectures efficiently, and hence allow fair comparisons of the architectures.
This report describes several FPGA CAD tools that are well known in both academic research and industry. It also presents a typical CAD flow that is widely used in FPGA research, followed by a demo experiment on this flow.
The report is organized as follows. Section 2 briefly introduces the FPGA CAD tools used in the demo experiment. Section 3 outlines the typical CAD flow used in the community. Section 4 describes the demo experiment and its results. Section 5 concludes and outlines some future work in this field.
2 Background
In this section, we give a brief introduction to the tools used in the demo experiment: SIS [1] synthesizes a circuit, either sequential or combinational; FlowMap [2] technology maps the circuit into LUTs and flip flops; T-VPack [3, 4, 5, 6] packs LUTs and flip flops into Basic Logic Elements (BLEs); and VPR [3, 4, 7, 8, 9, 10, 11] places and routes the circuit and reports the statistics used in the next step.

2.1 Synthesis
SIS is an interactive tool for synthesis and optimization of sequential circuits. Given a state transition table, a signal transition graph, or a logic-level description of a sequential circuit, it produces an optimized netlist in the target technology while preserving the sequential input-output behavior. Many different programs and algorithms have been integrated into SIS, allowing the user to choose among a variety of techniques at each stage of the process. It is built on top of MISII [12] and includes all of the (combinational) optimization techniques therein as well as many enhancements. SIS serves both as a framework within which various algorithms can be tested and compared, and as a tool for automatic synthesis and optimization of sequential circuits, including those targeting FPGAs (field-programmable gate arrays).

2.2 Technology Mapping
FlowMap is a technology mapping algorithm for depth minimization in LUT-based FPGA designs; it is depth-optimal for any K-bounded Boolean network. It is based on efficient computation of minimum-height K-feasible cuts in a network. A number of area optimization techniques also allow FlowMap to reduce the number of K-LUTs significantly. Compared to the existing LUT-based FPGA technology mapping algorithms for delay optimization, FlowMap reduces the number of LUTs by up to 50%. FlowMap takes less than one minute of CPU time for each of the MCNC benchmarks in our test suite.

2.3 Packing
T-VPack is a packing program which can be used with or without VPR. It takes a technology-mapped netlist (in .blif format) consisting of lookup tables (LUTs) and flip flops (FFs) and packs the LUTs and FFs together to form more coarse-grained logic blocks. The netlist it outputs is in the .net format required by VPR, and hence can be fed directly into VPR.

2.4 Place and Route
VPR (Versatile Place and Route) is an FPGA placement and routing tool developed at the University of Toronto. VPR is capable of targeting a broad range of FPGA architectures, and its source code is publicly available. VPR and the associated netlist translation / clustering tool T-VPack have already been used in a number of research projects worldwide, and should be useful in many areas of FPGA architecture and CAD research. We use VPR because it has been designed to be flexible enough to allow comparison of many different FPGA architectures.
VPR can perform placement and either global routing or combined global and detailed routing. The inputs to VPR consist of a technology-mapped netlist and a text file describing the FPGA architecture (possibly comparable to the “device database” in Altera tools). VPR can place the circuit, or a pre-existing placement can be read in.
VPR can then perform either a global route or a combined global/detailed route of the placement. VPR’s output consists of the placement and routing, as well as statistics useful in assessing the utility of an FPGA architecture, such as routed wirelength, track count, and maximum net length.
Some of the architectural parameters that can be specified in the architecture description file are:
• the number of logic block inputs and outputs,
• the side(s) of the logic block from which each input and output is accessible,
• the logical equivalence between various input and output pins (e.g. all LUT inputs are functionally equivalent),
• the number of I/O pads that fit into one row or one column of the FPGA, and
• the dimensions of the logic block array (e.g. 23 x 30 logic blocks).
In addition, if global routing is to be performed, one can also specify:
• the relative widths of horizontal and vertical channels, and
• the relative widths of the channels in different regions of the FPGA.
Finally, if combined global and detailed routing is to be performed, one also specifies:
• the switch block [3] architecture (i.e. how the routing tracks are interconnected),
• the number of tracks to which each logic block input pin connects (Fc [3]),
• the Fc value for logic block outputs, and
• the Fc value for I/O pads.
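The parameters listed above could be gathered in a small record before an architecture description is written out. The Python sketch below is illustrative only: the class name ArchParams, its field names, and the validate helper are assumptions made for this example and do not reflect VPR's actual architecture file format.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ArchParams:
    """Hypothetical container for the parameters above (not VPR's file format)."""
    num_inputs: int                      # logic block input pins
    num_outputs: int                     # logic block output pins
    io_pads_per_row_col: int             # I/O pads fitting into one row or column
    array_size: Tuple[int, int]          # logic block array dimensions, e.g. (23, 30)
    # Global routing: relative widths of horizontal and vertical channels.
    rel_width_horizontal: float = 1.0
    rel_width_vertical: float = 1.0
    # Needed only for combined global and detailed routing.
    switch_block: Optional[str] = None   # how routing tracks are interconnected
    fc_input: Optional[int] = None       # tracks each input pin connects to
    fc_output: Optional[int] = None      # Fc value for logic block outputs
    fc_pad: Optional[int] = None         # Fc value for I/O pads

    def validate(self, detailed_routing: bool) -> List[str]:
        """Return a list of human-readable problems; empty means consistent."""
        errors = []
        if self.num_inputs < 1 or self.num_outputs < 1:
            errors.append("logic block needs at least one input and one output")
        if self.io_pads_per_row_col < 1:
            errors.append("at least one I/O pad per row or column is required")
        if min(self.array_size) < 1:
            errors.append("array dimensions must be positive")
        if detailed_routing:
            for name in ("switch_block", "fc_input", "fc_output", "fc_pad"):
                if getattr(self, name) is None:
                    errors.append(f"{name} is required for detailed routing")
        return errors


# A 4-input, 1-output block on a 23 x 30 array, global routing only.
arch = ArchParams(num_inputs=4, num_outputs=1,
                  io_pads_per_row_col=2, array_size=(23, 30))
problems = arch.validate(detailed_routing=False)
```

Keeping the detailed-routing fields optional mirrors the text: they only need values when combined global and detailed routing is requested.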
3 Typical CAD Flow
Figure 1 illustrates the CAD flow we typically use. First, the SIS synthesis package is used to perform technology-independent logic optimization of each circuit. Next, each circuit is technology mapped into 4-LUTs and flip flops by FlowMap. The output of FlowMap is a .blif format netlist of LUTs and flip flops. The T-VPack program then packs this netlist of 4-LUTs and flip flops into more coarse-grained logic blocks, and outputs a netlist in the .net format that VPR uses. VPR can then place the circuit and either globally route it or perform combined global and detailed routing on it. The output of VPR consists of a file describing the circuit placement, another file describing the circuit’s routing, and various statistics such as the minimum number of tracks per channel required for a successful routing and the total wirelength. To find the minimum number of tracks required for successful routing, VPR attempts to route the circuit several times, allowing a different number of tracks per channel in each attempt.
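The repeated routing attempts can be pictured as a search over the channel width. The text only says that several widths are tried, so the strategy below (doubling until a routing succeeds, then binary search) is one plausible sketch, not a description of VPR's internals. It assumes that routability is monotonic in the channel width, and can_route is a hypothetical stand-in for one full routing attempt.

```python
from typing import Callable


def min_channel_width(can_route: Callable[[int], bool],
                      start: int = 12,
                      max_width: int = 1024) -> int:
    """Smallest width W for which can_route(W) is True.

    Assumes monotonic routability: if a width routes, every larger one does.
    can_route is a hypothetical callback that runs one routing attempt.
    """
    if start < 1:
        raise ValueError("start must be at least 1")
    low, high = 0, start          # low: largest width known (or assumed) to fail
    # Phase 1: double the width until some routing attempt succeeds.
    while not can_route(high):
        low = high
        high *= 2
        if high > max_width:
            raise ValueError("circuit did not route within max_width tracks")
    # Phase 2: binary search in (low, high] for the smallest routable width.
    while high - low > 1:
        mid = (low + high) // 2
        if can_route(mid):
            high = mid
        else:
            low = mid
    return high


# Toy predicate standing in for a router: pretend the circuit needs 7 tracks.
width = min_channel_width(lambda w: w >= 7)
print(width)  # 7
```

With this scheme the number of routing attempts grows only logarithmically with the final width, which matters because each attempt is a complete routing run.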
4 Demo Experiment Results
These results are for the MCNC benchmark circuit e64. This is one of the smallest circuits used to benchmark FPGAs: it contains 230 four-input look-up tables. Pictures of a circuit this size are, however, faster to download than those of a larger one, and e64 is still large enough to be interesting.
Initial random placement
Final placement
Completely (Detailed) Routed Circuit
The minimum channel width for successful routing is 7, and the routing shown here uses that width. We have highlighted one block (in green) by clicking on it. Its fanout is shown in red, and its fanin is shown in blue.
Close-up View of the FPGA Routing Architecture
This picture shows the various wire segments and the potential connections between wire segments and logic block pins. We selected the green block; the routing of its fanout is highlighted in red and the routing of its inputs is highlighted in blue.
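The fanin and fanout highlighted in these pictures can be derived directly from the netlist. The sketch below is a minimal illustration under an assumed representation: each net is a hypothetical (driver, sinks) pair keyed by net name, which is not the .net format itself, and the block names are made up.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical netlist: net name -> (driving block, list of sink blocks).
Netlist = Dict[str, Tuple[str, List[str]]]


def fanin_fanout(nets: Netlist, block: str) -> Tuple[Set[str], Set[str]]:
    """Return (fanin blocks, fanout blocks) of the selected block."""
    fanin: Set[str] = set()
    fanout: Set[str] = set()
    for driver, sinks in nets.values():
        if driver == block:
            fanout.update(sinks)   # blocks reached by nets this block drives
        if block in sinks:
            fanin.add(driver)      # blocks driving this block's inputs
    return fanin, fanout


# Tiny made-up example with three blocks.
nets = {
    "n1": ("a", ["b", "c"]),
    "n2": ("b", ["c"]),
    "n3": ("c", ["a"]),
}
fanin, fanout = fanin_fanout(nets, "c")
```

In the pictures, the fanout set corresponds to the connections drawn in red and the fanin set to those drawn in blue.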
5 Conclusions and Future Work
Lv, Wang, Zhao and I made considerable efforts to compile these CAD tools (especially SIS), as they are distributed only as source code; however, we still ran into challenges and obstacles. We later obtained a tool from UCLA that incorporates SIS and FlowMap, and completed our demo experiment with it. Although we succeeded in conducting the experiment on the small MCNC benchmark circuit e64, the question remains whether the flow works equally well on larger circuits. As a next step, we would like to compile these tools successfully and do some research on them.
Since most academic FPGA research has been done on Sun SPARCstations running Solaris, we hope to do our research on this platform; some of the errors we encountered may disappear on the new OS.
6 References
[1] E. M. Sentovich et al., “SIS: A System for Sequential Circuit Synthesis,” Tech. Report No. UCB/ERL M92/41, University of California, Berkeley, 1992.
[2] J. Cong and Y. Ding, “FlowMap: An Optimal Technology Mapping Algorithm for Delay Optimization in Lookup-Table Based FPGA Designs,” IEEE Trans. CAD, Jan. 1994, pp. 1 - 12.
[3] V. Betz, J. Rose and A. Marquardt, Architecture and CAD for Deep-Submicron FPGAs, Kluwer Academic Publishers, 1999.
[4] V. Betz, “Architecture and CAD for the Speed and Area Optimization of FPGAs,” Ph.D. Dissertation, University of Toronto, 1998.
[5] V. Betz and J. Rose, “Cluster-Based Logic Blocks for FPGAs: Area-Efficiency vs. Input Sharing and Size,” CICC, 1997, pp. 551 - 554.
[6] A. Marquardt, V. Betz and J. Rose, “Using Cluster-Based Logic Blocks and Timing-Driven Packing to Improve FPGA Speed and Density,” ACM/SIGDA Int. Symp. on FPGAs, 1999, pp. 37 - 46.
[7] V. Betz and J. Rose, “Directional Bias and Non-Uniformity in FPGA Global Routing Architectures,” ICCAD, 1996, pp. 652 - 659.
[8] V. Betz and J. Rose, “On Biased and Non-Uniform Global Routing Architectures and CAD Tools for FPGAs,” CSRI Technical Report #358, Department of Electrical and Computer Engineering, University of Toronto, 1996.
[9] V. Betz and J. Rose, “VPR: A New Packing, Placement and Routing Tool for FPGA Research,” Seventh International Workshop on Field-Programmable Logic and Applications, 1997, pp. 213 - 222.
[10] A. Marquardt, V. Betz and J. Rose, “Timing-Driven Placement for FPGAs,” ACM/SIGDA Int. Symp. on FPGAs, 2000, pp. 203 - 213.
[11] V. Betz and J. Rose, “Automatic Generation of FPGA Routing Architectures from High-Level Descriptions,” ACM/SIGDA Int. Symp. on FPGAs, 2000, pp. 175 - 184.
[12] R. K. Brayton, R. Rudell, A. Sangiovanni-Vincentelli and A. R. Wang, “MIS: A Multiple-Level Logic Optimization System,” IEEE Trans. CAD, vol. CAD-6, no. 6, Nov. 1987, pp. 1062 - 1081.