Proposal for Google Summer of Code 2007
Project Proposal: Graph Coloring Register Allocator for Jikes RVM
Alexey Gorodilov ([email protected]), Bauman Moscow State Technical University, Moscow, Russia. Major: Informatics and Control Systems. Department: High Performance Systems and Technologies
Project description:
Register allocation is one of the most important optimizations in any modern optimizing compiler. The goal of a register allocator is to map an unbounded number of temporary variables onto a finite number of physical machine registers such that temporary variables with interfering live ranges are assigned different registers. At the same time, the register allocator must keep as many operands as possible in registers to maximize the speed of the generated code. Many sophisticated techniques have been proposed for solving this problem, e.g.:
• Linear Scan Register Allocator: this algorithm is of interest where compile time is critical, such as in dynamic compilation.
• Graph Coloring Register Allocator
A register allocator may work on basic blocks (local register allocator) or on a whole function (global register allocator).
The Jikes RVM optimizing compiler was originally designed for PowerPC. With 32 registers available, register pressure on PowerPC was not a big issue, but the performance of the register allocator itself was. Given that spills and fills were rare, the trade-off suggested that linear scan was a good register allocator for the Jikes RVM. When the Jikes RVM was made open source, a new Intel back-end was added. On x86 the number of registers decreased from 32 to 8.
x86 is an architecture where graph-coloring register allocation is complicated by irregularities in the organization and use of the architecture's register resources. Several irregularities can be found:
• multi-register: ax is an example of the two-element multi-register ah:al
• multi-bank register class, e.g.:
  cmps, can use only esi/edi
  mov eax, [ebx + esi + 2], can use any register
• instructions may require a predefined subset of registers, e.g. the x86 multiply instruction uses only the register pair edx:eax to store the result:
  imul ebx // edx:eax = eax * ebx
The register allocator must take these issues into account.
Figure 1. Iterated Register Coalescing

I propose to implement the Iterated-Coalescing Graph-Color Allocator (pseudo code for this algorithm can be found in [2]) with modifications for irregular architectures [1,3]. This allocator uses an interference graph to represent all program constraints. Main steps (blocks) of this algorithm:
Renumber phase identifies live ranges in the code and assigns a register class to each live range. A register class is a set of hardware registers.
Build phase constructs the interference graph and categorizes each node as either move-related or not. A move-related node is one that is either the source or the destination of a move instruction. Each node represents a variable's live range. An edge represents competition between two live ranges for the same register resources.
Simplify phase removes non-move-related nodes of low degree from the graph, one at a time.
Coalesce phase performs Briggs-style conservative coalescing on the reduced graph obtained in the simplification phase. Since the degrees of many nodes have been reduced by simplify, the conservative strategy is likely to find many more moves to coalesce than it would have in the initial interference graph. After two nodes have been coalesced (and the move instruction deleted), if the resulting node is no longer move-related it becomes available for the next round of simplification. Simplify and Coalesce are repeated until only significant-degree or move-related nodes remain.
Freeze. If neither simplify nor coalesce applies, look for a move-related node of low degree. Freeze the moves in which this node is involved: that is, give up hope of coalescing those moves. This causes the node (and perhaps other nodes related to the frozen moves) to be considered not move-related. Now simplify and coalesce are resumed.
Select phase tries to color the interference graph; if no coloring can be found, the allocator spills a chosen variable and tries to color again.
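The conservative test used in the Coalesce phase can be sketched as follows. This is only an illustration under stated assumptions: the class name BriggsCoalescing, the methods graph and canCoalesce, and the string-keyed adjacency map are hypothetical and are not part of Jikes RVM or of the pseudo code in [2]. Two move-related nodes are merged only if the combined node would have fewer than k neighbors of significant degree (degree >= k), so the merge cannot turn a colorable graph into an uncolorable one. The test is conservative, so it may also reject merges that would have been safe.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Illustrative sketch only; all names are hypothetical, not Jikes RVM code. */
final class BriggsCoalescing {

    /** Builds an undirected interference graph from pairs of node names. */
    static Map<String, Set<String>> graph(String[][] edges) {
        Map<String, Set<String>> adj = new HashMap<>();
        for (String[] e : edges) {
            adj.computeIfAbsent(e[0], key -> new HashSet<>()).add(e[1]);
            adj.computeIfAbsent(e[1], key -> new HashSet<>()).add(e[0]);
        }
        return adj;
    }

    /**
     * Briggs-style test: merging a and b is accepted only if the merged node
     * has fewer than k neighbors of significant degree (degree >= k).
     * Assumes both a and b are present in the adjacency map.
     */
    static boolean canCoalesce(Map<String, Set<String>> adj, String a, String b, int k) {
        if (adj.get(a).contains(b)) {
            return false; // interfering live ranges can never share a register
        }
        Set<String> neighbors = new HashSet<>(adj.get(a));
        neighbors.addAll(adj.get(b));
        int significant = 0;
        for (String n : neighbors) {
            Set<String> adjN = adj.get(n);
            int degree = adjN.size();
            if (adjN.contains(a) && adjN.contains(b)) {
                degree--; // a common neighbor loses one edge after the merge
            }
            if (degree >= k) {
                significant++;
            }
        }
        return significant < k;
    }

    public static void main(String[] args) {
        // a and b are the two ends of a move; x, y, z are other live ranges.
        Map<String, Set<String>> g = graph(new String[][] {
            {"a", "x"}, {"b", "y"}, {"x", "z"}, {"y", "z"}});
        System.out.println(canCoalesce(g, "a", "b", 2)); // false
        System.out.println(canCoalesce(g, "a", "b", 3)); // true
    }
}
```

With k = 2 both x and y have degree 2 and count as significant, so the merge is rejected; with k = 3 no neighbor is significant and the move can be coalesced.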
During the Renumber phase, all live ranges must be classified by the set of hardware resources that would satisfy the live range's allocation needs. x86 integer registers can be divided into 4 classes:
Ci = { DI, EDI, SI, ESI }
Cb = { AL, AH, BL, BH, CL, CH, DL, DH }
Ca = { AX, EAX, BX, EBX, CX, ECX, DX, EDX }
Cm = { AX, EAX, BX, EBX, CX, ECX, DX, EDX, DI, EDI, SI, ESI }
E.g.: the live ranges of variables used by the cmps instruction belong to class Ci, and those used by the mov instruction belong to class Cm.
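One possible way to represent these classes in Java is with EnumSet. The sketch below is an assumption for illustration: the names X86RegisterClasses, Reg, union and intersect are hypothetical, and the rule that a live range used by several instructions receives the intersection of their classes is my reading of the Renumber phase, not something taken from Jikes RVM.

```java
import java.util.EnumSet;

/** Illustrative sketch only; all names are hypothetical, not Jikes RVM code. */
final class X86RegisterClasses {

    enum Reg {
        AL, AH, BL, BH, CL, CH, DL, DH,
        AX, EAX, BX, EBX, CX, ECX, DX, EDX,
        DI, EDI, SI, ESI
    }

    // The four classes listed in the proposal text.
    static final EnumSet<Reg> CI = EnumSet.of(Reg.DI, Reg.EDI, Reg.SI, Reg.ESI);
    static final EnumSet<Reg> CB = EnumSet.range(Reg.AL, Reg.DH);
    static final EnumSet<Reg> CA = EnumSet.range(Reg.AX, Reg.EDX);
    static final EnumSet<Reg> CM = union(CA, CI);

    /** Returns a new set holding the registers of both classes. */
    static EnumSet<Reg> union(EnumSet<Reg> x, EnumSet<Reg> y) {
        EnumSet<Reg> result = EnumSet.copyOf(x);
        result.addAll(y);
        return result;
    }

    /** Returns a new set holding only the registers allowed by both classes. */
    static EnumSet<Reg> intersect(EnumSet<Reg> x, EnumSet<Reg> y) {
        EnumSet<Reg> result = EnumSet.copyOf(x);
        result.retainAll(y);
        return result;
    }

    public static void main(String[] args) {
        // A live range used by both cmps (class Ci) and mov (class Cm)
        // is restricted to the registers common to both classes.
        System.out.println(intersect(CI, CM)); // [DI, EDI, SI, ESI]
    }
}
```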
The main problem with irregular architectures is that they make it difficult for a graph-coloring register allocator to determine the colorability of a node in the interference graph. On a regular architecture with k registers, the colorability heuristic for a register candidate n is:
degree(n) < k
On architectures with irregular register resources, the degree of a node n in the interference graph being less than k is no longer a sufficient condition for determining that node's colorability. One approach to adapting a standard graph-coloring allocator to architectures with irregular register resources is to use a WIG (Weighted Interference Graph). With a WIG, the colorability heuristic becomes:
$$\sum_{j \in Adj(n)} \left\lceil \frac{w_j}{w_n} \right\rceil < p_n$$
This formula works for architectures with both regular and irregular register resources.
Regular architecture: $w_j = w_n = 1$ and $p_n = k$, the number of available registers, so the test reduces to degree(n) < k.
Irregular architecture: $p_n$ is the number of placements (alternatives) available to the members of the class of n, and $w_j$ is the number of registers required to hold variable j.
Every register candidate in register class C uses the same values of w and p:

Class   w   p
Cm      2   6
Ca      2   4
Cb      1   8
Ci      2   2

E.g.: a register candidate from class Ca blocks two registers from class Cb.
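The weighted test can be sketched in Java as follows, using the w and p values of the four classes. The names WeightedColorability, RegClass and isColorable are hypothetical. The comparison is strict so that the regular case (all w equal to 1, p equal to k) reduces to degree(n) < k. The sketch checks a single node against the classes of its neighbors and is not a complete allocator.

```java
import java.util.Collections;
import java.util.List;

/** Illustrative sketch only; all names are hypothetical, not Jikes RVM code. */
final class WeightedColorability {

    /** Register classes with their width w and number of placements p. */
    enum RegClass {
        CM(2, 6), CA(2, 4), CB(1, 8), CI(2, 2);

        final int w;
        final int p;

        RegClass(int w, int p) {
            this.w = w;
            this.p = p;
        }
    }

    /**
     * Returns true if a node of class n is trivially colorable given the
     * classes of its neighbors: sum over j of ceil(w_j / w_n) < p_n.
     */
    static boolean isColorable(RegClass n, List<RegClass> neighbors) {
        int sum = 0;
        for (RegClass j : neighbors) {
            sum += (j.w + n.w - 1) / n.w; // integer form of ceil(w_j / w_n)
        }
        return sum < n.p;
    }

    public static void main(String[] args) {
        // A byte-register candidate (Cb) with three neighbors from Ca: 3 * 2 = 6 < 8.
        System.out.println(isColorable(RegClass.CB, Collections.nCopies(3, RegClass.CA))); // true
        // With four Ca neighbors all eight byte registers may be blocked: 4 * 2 = 8.
        System.out.println(isColorable(RegClass.CB, Collections.nCopies(4, RegClass.CA))); // false
    }
}
```

The second call shows why plain degree is not enough: the Cb node has only four neighbors, yet each Ca neighbor can block two byte registers.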
Interference graphs of real programs can be of several types, e.g. perfect graphs, chordal graphs, etc. Chordal graphs have several useful properties. Problems such as minimum coloring, which is NP-complete in general, can be solved in polynomial time for chordal graphs. In particular, an optimal coloring of a chordal graph G = (V, E) can be found in O(|V|+|E|) time. It will therefore be interesting to gather statistics on real applications showing what percentage of interference graphs are chordal.
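The chordality check for the statistics module could be sketched as below. The technique shown is maximum cardinality search followed by a verification of the resulting elimination order; it is one standard way to test chordality, not something prescribed by this proposal. The names ChordalCheck, graph and isChordal are hypothetical, and for brevity the sketch uses an adjacency matrix and runs in quadratic time rather than O(|V|+|E|).

```java
/** Illustrative sketch only; all names are hypothetical, not Jikes RVM code. */
final class ChordalCheck {

    /** Builds a symmetric adjacency matrix for n nodes from a list of edges. */
    static boolean[][] graph(int n, int[][] edges) {
        boolean[][] adj = new boolean[n][n];
        for (int[] e : edges) {
            adj[e[0]][e[1]] = true;
            adj[e[1]][e[0]] = true;
        }
        return adj;
    }

    /** Returns true if the undirected graph given by adj is chordal. */
    static boolean isChordal(boolean[][] adj) {
        int n = adj.length;
        int[] weight = new int[n];   // number of already visited neighbors
        int[] position = new int[n]; // visit index, -1 while unvisited
        java.util.Arrays.fill(position, -1);

        // Maximum cardinality search: always visit the unvisited node
        // that has the most visited neighbors.
        for (int i = 0; i < n; i++) {
            int best = -1;
            for (int v = 0; v < n; v++) {
                if (position[v] < 0 && (best < 0 || weight[v] > weight[best])) {
                    best = v;
                }
            }
            position[best] = i;
            for (int u = 0; u < n; u++) {
                if (adj[best][u] && position[u] < 0) {
                    weight[u]++;
                }
            }
        }

        // Verification: for each node v, take its most recently visited
        // earlier neighbor (the parent); every other earlier neighbor of v
        // must be adjacent to that parent.
        for (int v = 0; v < n; v++) {
            int parent = -1;
            for (int u = 0; u < n; u++) {
                if (adj[v][u] && position[u] < position[v]
                        && (parent < 0 || position[u] > position[parent])) {
                    parent = u;
                }
            }
            if (parent < 0) {
                continue;
            }
            for (int u = 0; u < n; u++) {
                if (u != parent && adj[v][u] && position[u] < position[v]
                        && !adj[parent][u]) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[][] cycle = {{0, 1}, {1, 2}, {2, 3}, {3, 0}};
        int[][] cycleWithChord = {{0, 1}, {1, 2}, {2, 3}, {3, 0}, {0, 2}};
        System.out.println(isChordal(graph(4, cycle)));          // false
        System.out.println(isChordal(graph(4, cycleWithChord))); // true
    }
}
```

A cycle of four live ranges with no chord is the smallest non-chordal interference graph; adding the edge 0-2 makes it chordal.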
Results:
The final package delivered at the end of the summer will include:
1. A Graph-Color allocator implemented and integrated in Jikes RVM.
2. A module for gathering statistics on interference graphs: graph types, node counts, number of colors, and chordal graph determination.
3. A comparison of the Graph Coloring Allocator vs. the Linear Scan Allocator.
4. Code and algorithm documentation.
Schedule:
• April – 27 May:
  Jikes RVM source code investigation
  Existing Linear Scan register allocator source code analysis
  x86 architecture instruction set classification
  Graph-Coloring Register allocator design in pseudo code
• 28 May – 9 July:
  Weekly report
  Get an initial version of the Graph-Color Register Allocator operational within Jikes RVM
• 10 July – 20 August:
  Weekly report
  Get the final version of the Graph-Color Register Allocator
  Final testing, bug fixing
  Module for gathering statistics
  Some real examples that show the advantage of the Graph-Color Register Allocator over the Linear Scan Allocator
  Documentation
Resource requirements:
1. Jikes RVM source code (from Jikes RVM website)
2. x86 Architecture instruction set manual (from www.intel.com)
Source code management:
Main or side branch of the Jikes RVM (under Common Public License 1.0)
Personal Information:
My research activities are connected with optimizing compilers. I currently work on a research compiler for the Itanium processor (under the supervision of Intel Corporation employees from the Moscow Compiler Team and of faculty members). My parts in this project: compiler infrastructure design (C++), control flow graph analysis, instruction scheduling, software pipelining (SWP), if-conversion. More information can be found in my resume: http://alexeigor.googlepages.com/resume.pdf
References:
1. Smith, M. D., Ramsey, N., and Holloway, G. 2004. A generalized algorithm for graph-coloring register allocation. In Proceedings of the ACM SIGPLAN 2004 Conference on Programming Language Design and Implementation.
2. George, L. and Appel, A. W. 1996 (May). Iterated Register Coalescing. ACM Transactions on Programming Languages and Systems, 18(3):300-324.
3. Smith, M. D. and Holloway, G. Graph-Coloring Register Allocation for Architectures with Irregular Register Resources. Harvard University.
4. Briggs, P. 1992 (April). Register Allocation via Graph Coloring. PhD thesis, Rice University, Houston, Texas.
5. Briggs, P., Cooper, K., and Torczon, L. 1992 (March). Coloring Register Pairs. ACM Letters on Programming Languages and Systems, 1(1):3-13.