Guest lecture for Compiler Construction, Spring 2015

Verified compilers
Magnus Myréen
Chalmers University of Technology
Mentions joint work with Ramana Kumar, Michael Norrish, Scott Owens and many more


Verified compilers

What? (Sometimes called certified compilers, but that’s misleading…)

CompCert 2005 –

Program verification
• For safety-critical software, formal verification of program correctness may be worth the cost.
• Such verification is typically done on the source program. So what if the compiler is buggy?

Use a certified compiler!
• CompCert is a compiler for a large subset of C, with PowerPC assembler as target language.
• Written in Coq, a proof assistant for formal proofs.
• Comes with a machine-checked proof that for any program which does not generate a compilation error, the source and target programs behave identically. (Precise statement needs more details.)

Trusting the compiler

Establishing Compiler Correctness

Bugs
• When finding a bug, we go to great lengths to find it in our own code.
• Most programmers trust the compiler to generate correct code.
• The most important task of the compiler is to generate correct code.

Alternatives
• Proving the correctness of a compiler is prohibitively expensive (however, see the CompCert project).
  Maybe it is worth the cost? Cost reduction?
• Testing is the only viable option.
  … but with testing you never know you caught all bugs!

All (unverified) compilers have bugs

“Every compiler we tested was found to crash and also to silently generate wrong code when presented with valid input.” (PLDI’11)

Finding and Understanding Bugs in C Compilers
Xuejun Yang, Yang Chen, Eric Eide, John Regehr
University of Utah, School of Computing
{jxyang, chenyang, eeide, regehr}@cs.utah.edu

“[The verified part of] CompCert is the only compiler we have tested for which Csmith cannot find wrong-code errors. This is not for lack of trying: we have devoted about six CPU-years to the task.”

Abstract: Compilers should be correct. To improve the quality of C compilers, we created Csmith, a randomized test-case generation tool, and spent three years using it to find compiler bugs. During this period we reported more than 325 previously unknown bugs to compiler developers. Every compiler we tested was found to crash and also to silently generate wrong code when presented with valid input. In this paper we present our compiler-testing tool and the results of our bug-hunting study. Our first contribution is to advance the state of the art in compiler testing. Unlike previous tools, Csmith generates programs that cover a large subset of C while avoiding … that would destroy its ability …

    int foo (void) {
      signed char x = 1;
      unsigned char y = 255;
      return x > y;
    }

Figure 1. We found a bug in the version of GCC that shipped with Ubuntu Linux 8.04.1 for x86. At all optimization levels it compiles this function to return 1; the correct result is 0. The Ubuntu compiler was heavily patched; the base version of GCC did not have this bug.
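Why is 0 the correct result for Figure 1? In C, both char operands are promoted to int before the comparison, so the test is 1 > 255. The sketch below mimics that arithmetic in Python using ctypes to model the 8-bit types. The second function is purely hypothetical: it shows one way a result of 1 could arise (comparing at signed 8-bit width), and is not a diagnosis of the actual GCC bug, whose cause the slide does not state.

```python
import ctypes


def foo_promoted() -> int:
    """What C's integer promotions require for Figure 1's foo."""
    x = ctypes.c_byte(1).value      # signed char x = 1
    y = ctypes.c_ubyte(255).value   # unsigned char y = 255
    # Both operands are promoted to int before '>', so this is 1 > 255.
    return int(x > y)


def foo_compared_as_signed_char() -> int:
    """Hypothetical miscompilation: compare at 8-bit signed width."""
    x = ctypes.c_byte(1).value
    y = ctypes.c_byte(255).value    # the bit pattern 0xFF read as signed char is -1
    return int(x > y)               # 1 > -1


print(foo_promoted(), foo_compared_as_signed_char())
```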

This lecture: Verified compilers

What? Proof that compiler produces good code.

Why? To avoid bugs, to avoid testing.

How? By mathematical proof… (rest of this lecture)

Proving a compiler correct

Ingredients:
• a formal logic for the proofs (like first-order logic, or higher-order logic)
• accurate models of
  • the source language
  • the target language
  • the compiler algorithm

Proofs are only about things that live within the logic, i.e. we need to represent the relevant artefacts in the logic. That means a lot of details… (to get wrong).

Tools:
• a proof assistant (software)

… necessary to use a mechanised proof assistant (think, ‘Eclipse for logic’) to avoid mistakes and missing details.

Accurate model of prog. language

Model of programs:
• syntax: what it looks like
• semantics: how it behaves, e.g. an interpreter for the syntax

Major styles of (operational, relational) semantics:
• big-step: this style for structured source semantics
• small-step: this style for unstructured target semantics

… next slides provide examples.

Syntax

Source:
  exp = Num num | Var name | Plus exp exp

Target ‘machine code’:
  inst = Const name num | Move name name | Add name name name

Target program consists of a list of inst.
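As a rough illustration, the two syntaxes could be encoded as Python dataclasses. This is only a sketch: the field names (dst, src, left, right) are invented here, and names are assumed to be natural numbers, which is what the later compiler slide suggests when it computes n+1.

```python
from dataclasses import dataclass
from typing import List, Union


# Source: exp = Num num | Var name | Plus exp exp
@dataclass(frozen=True)
class Num:
    n: int

@dataclass(frozen=True)
class Var:
    name: int   # assumption: names are natural numbers

@dataclass(frozen=True)
class Plus:
    left: "Exp"
    right: "Exp"

Exp = Union[Num, Var, Plus]


# Target: inst = Const name num | Move name name | Add name name name
@dataclass(frozen=True)
class Const:
    dst: int
    n: int

@dataclass(frozen=True)
class Move:
    dst: int
    src: int

@dataclass(frozen=True)
class Add:
    dst: int
    src1: int
    src2: int

Inst = Union[Const, Move, Add]
Program = List[Inst]   # a target program is a list of inst

# The source expression 1 + v0, and a hand-written target program.
example_exp = Plus(Num(1), Var(0))
example_prog = [Const(1, 1), Move(2, 0), Add(1, 1, 2)]
```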

Source semantics (big-step)

Big-step semantics as relation ↓ defined by rules, e.g.

  ----------------
  (Num n, env) ↓ n

  lookup s in env finds v
  -----------------------
  (Var s, env) ↓ v

  (x1, env) ↓ v1    (x2, env) ↓ v2
  --------------------------------
  (Plus x1 x2, env) ↓ v1 + v2

called “big-step”: each step ↓ describes complete evaluation
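Because this particular relation is deterministic, it can be read as a recursive interpreter, one branch per rule. The following is a minimal sketch under that reading, using the constructor the syntax slide calls Plus; the Python class and field names are assumptions, and the environment is modelled as a dict from names to values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Num:
    n: int

@dataclass(frozen=True)
class Var:
    name: int

@dataclass(frozen=True)
class Plus:
    left: object
    right: object


def evaluate(x, env):
    """Return the v with (x, env) ↓ v; raise KeyError if a variable is unbound."""
    if isinstance(x, Num):    # (Num n, env) ↓ n
        return x.n
    if isinstance(x, Var):    # lookup s in env finds v
        return env[x.name]
    if isinstance(x, Plus):   # evaluate both premises, then add
        return evaluate(x.left, env) + evaluate(x.right, env)
    raise TypeError(f"not an expression: {x!r}")


print(evaluate(Plus(Num(1), Var(0)), {0: 41}))
```

When a variable is unbound, no rule applies and the relation simply has no result; the sketch surfaces that as a KeyError.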

Target semantics (small-step)

“small-step”: transitions describe parts of executions

We model the state as a mapping from names to values here.

  step (Const s n) state = state[s ↦ n]
  step (Move s1 s2) state = state[s1 ↦ state s2]
  step (Add s1 s2 s3) state = state[s1 ↦ state s2 + state s3]

  steps [] state = state
  steps (x::xs) state = steps xs (step x state)
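The step and steps functions translate almost line for line into Python. In this sketch the state is a dict, and state[s ↦ n] becomes a copy of the dict with one entry replaced. One assumed difference from the logic: a dict is a partial mapping, so reading a name that was never written raises KeyError, whereas the state on the slide is a total function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Const:
    dst: int
    n: int

@dataclass(frozen=True)
class Move:
    dst: int
    src: int

@dataclass(frozen=True)
class Add:
    dst: int
    src1: int
    src2: int


def step(inst, state):
    """Execute one instruction, returning an updated copy of the state."""
    if isinstance(inst, Const):
        return {**state, inst.dst: inst.n}
    if isinstance(inst, Move):
        return {**state, inst.dst: state[inst.src]}
    if isinstance(inst, Add):
        return {**state, inst.dst: state[inst.src1] + state[inst.src2]}
    raise TypeError(f"not an instruction: {inst!r}")


def steps(prog, state):
    """Execute a list of instructions from left to right."""
    for inst in prog:
        state = step(inst, state)
    return state


print(steps([Const(1, 1), Move(2, 0), Add(1, 1, 2)], {0: 41}))
```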

Compiler function

  compile (Num k) n = [Const n k]
  compile (Var v) n = [Move n v]
  compile (Plus x1 x2) n = compile x1 n ++ compile x2 (n+1) ++ [Add n n (n+1)]

• The generated code stores the result in the register name (n) given to the compiler.
• Relies on variable names in the source to match variable names in the target.
• Uses names above n as temporaries.
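A direct transcription of the compiler into Python might look as follows. The function is called compile_exp here only to avoid shadowing Python's built-in compile; the class and field names are again assumptions of this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Num:
    n: int

@dataclass(frozen=True)
class Var:
    name: int

@dataclass(frozen=True)
class Plus:
    left: object
    right: object

@dataclass(frozen=True)
class Const:
    dst: int
    n: int

@dataclass(frozen=True)
class Move:
    dst: int
    src: int

@dataclass(frozen=True)
class Add:
    dst: int
    src1: int
    src2: int


def compile_exp(x, n):
    """Emit code that leaves the value of x in register n."""
    if isinstance(x, Num):
        return [Const(n, x.n)]
    if isinstance(x, Var):
        return [Move(n, x.name)]
    if isinstance(x, Plus):
        # The left operand lands in n, the right in n+1; registers above n are scratch.
        return compile_exp(x.left, n) + compile_exp(x.right, n + 1) + [Add(n, n, n + 1)]
    raise TypeError(f"not an expression: {x!r}")


print(compile_exp(Plus(Num(1), Var(0)), 1))
```

For the expression 1 + v0 with target register 1, this yields Const 1 1, Move 2 0, Add 1 1 2. Register 1 is chosen because it lies above the only variable name in use (0).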

Correctness statement

Proved using proof assistant (demo!)

For every evaluation in the source …

  ∀x env res. (x, env) ↓ res ⇒

… for target state and k, such that k is greater than all var names and state is in sync with source env …

  ∀state k. (∀i v. (lookup env i = SOME v) ⇒ (state i = v) ∧ i < k) ⇒

… in that case, the result res will be stored at location k in the target state after execution, and the lower part of the state is left untouched.

  (let state' = steps (compile x k) state in
     (state' k = res) ∧ ∀i. i < k ⇒ (state' i = state i))
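The statement can also be exercised as an executable property. The sketch below generates random expressions and checks both conclusions for each. This is testing, not proof: it is exactly the kind of evidence the lecture argues is insufficient, and is included only to make the statement concrete. All Python names, the fixed environment and the junk values above k are assumptions of this sketch.

```python
import random
from dataclasses import dataclass

@dataclass(frozen=True)
class Num:
    n: int

@dataclass(frozen=True)
class Var:
    name: int

@dataclass(frozen=True)
class Plus:
    left: object
    right: object

@dataclass(frozen=True)
class Const:
    dst: int
    n: int

@dataclass(frozen=True)
class Move:
    dst: int
    src: int

@dataclass(frozen=True)
class Add:
    dst: int
    src1: int
    src2: int

def evaluate(x, env):            # big-step source semantics
    if isinstance(x, Num):
        return x.n
    if isinstance(x, Var):
        return env[x.name]
    return evaluate(x.left, env) + evaluate(x.right, env)

def step(inst, state):           # small-step target semantics
    if isinstance(inst, Const):
        return {**state, inst.dst: inst.n}
    if isinstance(inst, Move):
        return {**state, inst.dst: state[inst.src]}
    return {**state, inst.dst: state[inst.src1] + state[inst.src2]}

def steps(prog, state):
    for inst in prog:
        state = step(inst, state)
    return state

def compile_exp(x, n):           # the compiler under test
    if isinstance(x, Num):
        return [Const(n, x.n)]
    if isinstance(x, Var):
        return [Move(n, x.name)]
    return compile_exp(x.left, n) + compile_exp(x.right, n + 1) + [Add(n, n, n + 1)]

def random_exp(rng, var_names, depth):
    if depth == 0 or rng.random() < 0.3:
        if var_names and rng.random() < 0.5:
            return Var(rng.choice(var_names))
        return Num(rng.randrange(100))
    return Plus(random_exp(rng, var_names, depth - 1),
                random_exp(rng, var_names, depth - 1))

def correctness_holds(x, env, state, k):
    # Premise: state agrees with env on every bound name, and all names are below k.
    if not all(i < k and state.get(i) == v for i, v in env.items()):
        return True              # the implication holds vacuously
    res = evaluate(x, env)
    final = steps(compile_exp(x, k), state)
    below = lambda s: {i: v for i, v in s.items() if i < k}
    # Conclusions: result stored at k, and everything below k untouched.
    return final[k] == res and below(final) == below(state)

def run_checks(trials, seed=0):
    rng = random.Random(seed)
    env = {0: 10, 1: 20, 2: 30}
    k = 3
    state = {**env, 3: 999, 4: 999}   # junk at k and above must not matter
    return sum(correctness_holds(random_exp(rng, sorted(env), 4), env, state, k)
               for _ in range(trials))

print(run_checks(200))
```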

A real language

Well, that example was simple enough…

But: Some people say: A programming language isn’t real until it has a self-hosting compiler

Bootstrapping for verified compilers? Yes!

Scaling up…

POPL 2014: “CakeML: A Verified Implementation of ML”
Ramana Kumar, Magnus O. Myreen, Michael Norrish, Scott Owens
(Computer Laboratory, University of Cambridge, UK; Canberra Research Lab, NICTA, Australia; School of Computing, University of Kent, UK)

[Image: first page of the paper, showing the abstract and the start of the introduction]

First bootstrapping of a formally verified compiler.

Dimensions of Compiler Verification

How far the compiler goes:
  source code → abstract syntax → intermediate language → bytecode → machine code

The thing that is verified:
  compiler algorithm → implementation in ML → implementation in machine code → machine code as part of a larger system

Our verification covers the full spectrum of both dimensions.

Idea behind in-logic bootstrapping

input: verified compiler function

Trustworthy code generation:
  functions in HOL (shallow embedding)
    ↓ proof-producing translation [ICFP’12, JFP’14]
  CakeML program (deep embedding)
    ↓ verified compilation of CakeML [POPL’14]
  x86-64 machine code (deep embedding)

output: verified implementation of compiler function

CakeML at a glance

The CakeML language = Standard ML without I/O or functors
(a strict impure functional language)

i.e. with almost everything else:
✓ higher-order functions
✓ mutual recursion and polymorphism
✓ datatypes and (nested) pattern matching
✓ references and (user-defined) exceptions
✓ modules, signatures, abstract types

The verified machine-code implementation (parsing, type inference, compilation, garbage collection, bignums etc.) implements a read-eval-print loop (see demo).

The CakeML compiler verification

How? Mostly standard verification techniques as presented in this lecture, but scaled up to large examples. (Four people, two years.)

Compiler:
  string → tokens → AST → IL → bytecode → x86

New optimising compiler:
  IL-1 → IL-2 → … → IL-N → ASM → ARM, x86-64, MIPS-64

… work in progress (want to join? [email protected])

Compiler verification summary

Ingredients:
• a formal logic for the proofs
• accurate models of
  • the source language
  • the target language
  • the compiler algorithm

Tools:
• a proof assistant (software)

Method:
• (interactively) prove a simulation relation

Questions? Interested?
