Proposal for Google Summer of Code 2007

Project Proposal: Graph Coloring Register Allocator for Jikes RVM
Alexey Gorodilov ([email protected]), Bauman Moscow Technical State University, Moscow, Russia
Major: Informatics and Control Systems
Department: High performance systems and technologies

Project description:

Register allocation is one of the most important optimizations in any modern optimizing compiler. The goal of a register allocator is to map an unbounded number of temporary variables onto a finite number of physical machine registers such that temporary variables with interfering live ranges are assigned different registers. At the same time, the register allocator must keep as many operands as possible in registers to maximize program speed. Many sophisticated techniques have been proposed for solving this problem, e.g.:

• Linear Scan Register Allocator – this algorithm is of interest where compile time is critical, such as in dynamic compilation.
• Graph Coloring Register Allocator

A register allocator may work on basic blocks (local register allocator) or on a whole function (global register allocator).

The Jikes RVM optimizing compiler was originally designed for PowerPC. With 32 registers available, register pressure on PowerPC wasn't a big issue, but the performance of the register allocator was. Given that spills and fills were rare, the trade-off suggested that linear scan was a good register allocator for the Jikes RVM. When the Jikes RVM was made open source, a new Intel back-end was added. On x86 the number of registers decreased from 32 to 8. x86 is an architecture where the problem of graph-coloring register allocation is complicated by irregularities in the organization and use of the architecture's register resources. Several irregularities can be found:

• multi-register:
  - ax is an example of the two-element multi-register ah:al
• multi-bank register class, e.g.:
  - cmps can use only esi/edi
  - mov eax, [ebx + esi + 2] can use any register
• instructions may require a predefined subset of registers, e.g.:
  - the x86 multiply instruction uses only the register pair edx:eax to store the result: imul ebx // edx:eax = eax * ebx

The register allocator must take these issues into account.

Figure 1. Iterated Register Coalescing

I propose to implement an Iterated-Coalescing Graph-Color Allocator (pseudo code for this algorithm can be found in [2]) with modifications for irregular architectures [1,3]. This allocator uses an interference graph to represent all program constraints. The main steps (blocks) of this algorithm are:

Renumber. Identifies live ranges in the code and assigns a register class to each live range. A register class is a set of hardware registers.

Build. Constructs the interference graph and categorizes each node as being either move related or not. A move-related node is one that is either the source or the destination of a move instruction. Each node represents the live range of a variable. An edge represents competition between two live ranges for the same register resources.

Simplify. Removes non-move-related nodes of low degree from the graph, one at a time.

Coalesce. Performs Briggs-style conservative coalescing on the reduced graph obtained in the simplification phase. Since the degrees of many nodes have been reduced by simplify, the conservative strategy is likely to find many more moves to coalesce than it would have in the initial interference graph. After two nodes have been coalesced (and the move instruction deleted), if the resulting node is no longer move related it will be available for the next round of simplification. Simplify and Coalesce are repeated until only significant-degree or move-related nodes remain.

Freeze. If neither simplify nor coalesce applies, look for a move-related node of low degree. Freeze the moves in which this node is involved: that is, give up hope of coalescing those moves. This causes the node (and perhaps other nodes related to the frozen moves) to be considered not move related. Now simplify and coalesce are resumed.

Select. Tries to color the interference graph; if no coloring can be found, the allocator spills a chosen variable and tries to color again.
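The simplify and select phases, together with the Briggs conservative coalescing test, can be sketched as follows. This is a minimal illustration and not the proposed Jikes RVM implementation: it assumes a regular architecture with k interchangeable registers, omits the freeze phase and spill-code insertion, and picks the potential spill by highest degree rather than by spill cost. The names `color_graph` and `briggs_can_coalesce` are hypothetical.

```python
def color_graph(adj, k):
    """Simplify/select colouring without coalescing or spill-code insertion.

    adj: dict mapping each node to the set of nodes it interferes with
         (the relation must be symmetric).
    k:   number of available registers (colours).
    Returns (colours, spilled): a dict node -> colour and a list of nodes
    for which no colour could be found.
    """
    degree = {n: len(adj[n]) for n in adj}
    remaining = set(adj)
    stack = []

    # Simplify: remove a node of degree < k if one exists; otherwise push
    # a high-degree node optimistically as a potential spill.
    while remaining:
        low = [n for n in remaining if degree[n] < k]
        if low:
            node = min(low)
        else:
            node = max(remaining, key=lambda n: (degree[n], n))
        remaining.remove(node)
        stack.append(node)
        for m in adj[node]:
            if m in remaining:
                degree[m] -= 1

    # Select: pop nodes and give each the lowest colour not used by its
    # already-coloured neighbours; a node with no free colour is spilled.
    colours, spilled = {}, []
    while stack:
        node = stack.pop()
        used = {colours[m] for m in adj[node] if m in colours}
        free = [c for c in range(k) if c not in used]
        if free:
            colours[node] = free[0]
        else:
            spilled.append(node)
    return colours, spilled


def briggs_can_coalesce(adj, a, b, k):
    """Briggs test: merging a and b is safe if the merged node would have
    fewer than k neighbours of significant degree (degree >= k)."""
    neighbours = (adj[a] | adj[b]) - {a, b}
    significant = 0
    for n in neighbours:
        # A neighbour of both a and b loses one edge when they are merged.
        d = len(adj[n]) - (1 if a in adj[n] and b in adj[n] else 0)
        if d >= k:
            significant += 1
    return significant < k
```

For a four-node cycle and k = 2, no node has degree below k, yet the optimistic push still lets select find a valid 2-coloring without spilling; a triangle with k = 2 forces one spill.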


During the Renumber phase all live ranges must be classified by the set of hardware resources that would satisfy the live range's allocation needs. x86 integer registers can be divided into 4 classes:

Ci = { DI, EDI, SI, ESI }
Cb = { AL, AH, BL, BH, CL, CH, DL, DH }
Ca = { AX, EAX, BX, EBX, CX, ECX, DX, EDX }
Cm = { AX, EAX, BX, EBX, CX, ECX, DX, EDX, DI, EDI, SI, ESI }

E.g.: the live ranges of variables used by the cmps instruction belong to class Ci, and those used by the mov instruction belong to class Cm.

The main problem with irregular architectures is that they make it difficult for a graph-coloring register allocator to determine the colorability of a node in the interference graph. On a regular architecture, the colorability heuristic for a register candidate n is:

degree(n) < k

where k is the number of available registers. On architectures with irregular register resources, the degree of a node n in the interference graph being less than k is no longer a sufficient condition for determining that node's colorability. One approach to adapting the standard graph-coloring allocator to architectures with irregular register resources is to use a WIG (Weighted Interference Graph). With a WIG, the standard coloring heuristic becomes:

$$\sum_{j \in Adj(n)} \left\lceil \frac{w_j}{w_n} \right\rceil < p_n$$

This formula works for architectures with both regular and irregular register resources.

Regular architecture: w_j = w_n = 1 and p_n = k, the number of available registers, so the test reduces to degree(n) < k.

Irregular architecture: p_n is the number of placements (alternatives) available to the members of the class of n, and w_j is the number of registers required for holding variable j.

Every register candidate in register class C uses the same values of w and p:

Class   w   p
Cm      2   6
Ca      2   4
Cb      1   8
Ci      2   2

E.g.: a register candidate from class Ca blocks two registers from class Cb.
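As a small worked illustration, the weighted test can be evaluated directly from the w and p values listed for the four classes. The sketch below uses the strict form of the inequality so that it reduces to degree(n) < k on a regular architecture; the names `CLASS_PARAMS` and `is_trivially_colorable` are hypothetical.

```python
from math import ceil

# (w, p) for each x86 integer register class, as listed in the proposal.
CLASS_PARAMS = {"Cm": (2, 6), "Ca": (2, 4), "Cb": (1, 8), "Ci": (2, 2)}


def is_trivially_colorable(node, adj, reg_class):
    """Weighted colourability test for one node of the interference graph.

    adj:       dict node -> set of interfering nodes.
    reg_class: dict node -> register class name (a key of CLASS_PARAMS).
    """
    w_n, p_n = CLASS_PARAMS[reg_class[node]]
    # Each neighbour j can block ceil(w_j / w_n) placements of the node.
    blocked = sum(ceil(CLASS_PARAMS[reg_class[j]][0] / w_n) for j in adj[node])
    return blocked < p_n
```

For example, a Cb candidate with four Ca neighbours is not trivially colorable, because each neighbour blocks two byte registers (4 × 2 = 8, which is not less than p = 8), while the same candidate with three Ca neighbours is (6 < 8).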

The interference graphs of real programs can be of several types, e.g. perfect graphs, chordal graphs, etc. Chordal graphs have several useful properties. Problems such as minimum coloring, which is NP-complete in general, can be solved in polynomial time for chordal graphs. In particular, an optimal coloring of a chordal graph G = (V, E) can be found in O(|V|+|E|) time. It will therefore be interesting to gather statistics on real applications showing what percentage of interference graphs are chordal.
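One way the statistics module could decide whether an interference graph is chordal is maximum cardinality search followed by a check that the reverse visit order is a perfect elimination ordering. The sketch below is an assumption about how this might be done, not part of the proposal; the name `is_chordal` is hypothetical, and this simple version selects the next vertex with a linear scan, so it does not reach the O(|V|+|E|) bound.

```python
def is_chordal(adj):
    """Return True if the undirected graph is chordal.

    adj: dict mapping each vertex to the set of its neighbours
         (the relation must be symmetric).
    """
    # Maximum cardinality search: always visit the unvisited vertex that
    # has the largest number of already-visited neighbours.
    weight = {v: 0 for v in adj}
    unvisited = set(adj)
    order = []
    while unvisited:
        v = max(unvisited, key=lambda u: weight[u])
        unvisited.remove(v)
        order.append(v)
        for u in adj[v]:
            if u in unvisited:
                weight[u] += 1

    position = {v: i for i, v in enumerate(order)}

    # The graph is chordal iff, for every vertex, its earlier-visited
    # neighbours form a clique. It suffices to check that they are all
    # adjacent to the most recently visited one among them.
    for v in order:
        earlier = [u for u in adj[v] if position[u] < position[v]]
        if not earlier:
            continue
        parent = max(earlier, key=lambda u: position[u])
        for u in earlier:
            if u != parent and u not in adj[parent]:
                return False
    return True
```

A cycle of four nodes without a chord is the smallest non-chordal graph; adding one diagonal edge makes it chordal.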

Results:

The final package delivered at the end of the summer will include:

1. A Graph-Color allocator implemented and integrated in Jikes RVM.
2. A module for gathering statistics on interference graphs: graph types, node counts, number of colors, and chordal graph detection.
3. A comparison of the Graph Coloring Allocator vs. the Linear Scan Allocator.
4. Code and algorithm documentation.

Schedule:

• April – 27 May:
  - Jikes RVM source code investigation
  - Analysis of the existing Linear Scan register allocator source code
  - x86 architecture instruction set classification
  - Graph-Coloring Register allocator design in pseudo code
• 28 May – 9 July: Weekly report
  - Get an initial version of Graph-Color Register Allocator operational within Jikes RVM
• 10 July – 20 August: Weekly report
  - Get final version of Graph-Color Register Allocator
  - Final testing, bug fixing
  - Module for gathering statistics
  - Some real examples that can show the advantage of the Graph Coloring Register Allocator over the Linear Scan Allocator
  - Documentation

Resource requirements:

1. Jikes RVM source code (from Jikes RVM website)
2. x86 Architecture instruction set manual (from www.intel.com)

Source code management:

Main or side branch of the Jikes RVM (under Common Public License 1.0)

Personal Information:

My research activities are connected with optimizing compilers. I currently work on a research compiler for the Itanium processor (under the supervision of Intel Corporation employees from the Moscow Compiler Team and faculty members). My parts in this project are: compiler infrastructure design (C++), control flow graph analysis, instruction scheduling, software pipelining (SWP), and if-conversion. More information can be found in my resume: http://alexeigor.googlepages.com/resume.pdf

References:

1. Smith, M. D., Ramsey, N., and Holloway, G. 2004. A generalized algorithm for graph-coloring register allocation. In Proceedings of the ACM SIGPLAN 2004 Conference on Programming Language Design and Implementation.
2. Lal George and Andrew W. Appel. 1996 (May). Iterated Register Coalescing. ACM Transactions on Programming Languages and Systems, 18(3):300-324.
3. Michael D. Smith, Glenn Holloway. Graph-Coloring Register Allocation for Architectures with Irregular Register Resources. Harvard University.
4. Preston Briggs. 1992 (April). Register Allocation via Graph Coloring. PhD thesis, Rice University, Houston, Texas.
5. P. Briggs, K. Cooper, and L. Torczon. 1992 (March). Coloring Register Pairs. ACM Letters on Programming Languages and Systems, 1(1):3-13.
