Fast startup and low latency: pick two

Denys Shabalin and Lukas Kellenberger

EPFL

Scala Native

• Announced a year ago with a first prototype at ScalaDays ’16 in New York.
• An ahead-of-time compiler for Scala built on top of the LLVM compiler infrastructure.
• Developed by EPFL and Scala Center.

Scala Native

As of today:

• 55 contributors
• 383 pull requests closed
• 246 issues fixed

Scala Native

Thanks to everyone who contributed!

Guillaume Massé, Martin Duhem, Hugo Kapp, Hiroyoshi Takahashi, Jonas Fonseca, Lukas Kellenberger, Francois Bertrand, AndreaTP, Eric K Richardson, Marius B. Kotsbak, Kota Mizushima, Łukasz Indykiewicz, Timothy Klim, Paweł Batko, Shunsuke Otani, Nick Pavlov, Cedric Viaccoz, Andrzej Sołtysik, Ankit Soni, Hanns Holger Rutz, Kamil Tomala, Philipp Dörfler, Simon Ochsenreither, Zack Powers, Musabilal, Hubert Plociniczak, Mike Samsonov, Greg Dorrell, Pablo Guerrero, Florian Duraffourg, Alex Dupre, Ragavendar Ramamurthi, Richard Whaling, Roman Zoller, Ruben Berenguel Montoro, Saleem Ansari, Sam Halliday, Felix Mulder, Ragnr, Stefan Ollinger, The Gitter Badger, Tim Nieradzik, Brad Rathke, Viacheslav Blinov, Vincent Munier, Remi, Alexey Kutepov, Adam Voss, Gregor-i, Gute-ist-tot, Kenji Yoshida, Joseph Price, Jentsch, Ignat Loskutov, Martin Mauch

Road towards 0.1 (March 14, 2017)

Goals for 0.1

• All Scala language features are supported
• sbt integration is sufficient to build and publish existing cross-platform projects
• Enough core libraries to cover basic standard library usage

Improving the standard library story in 0.2 (April 26, 2017)

Goals for 0.2

• Support for file I/O from java.io.*
• Support for regex from java.util.regex.*
• Event-loop-based Futures

Laying down the foundation for better garbage collection in 0.3 (June 6, 2017)

Goal for 0.3

nativeGC := "boehm"

Boehm GC

• Conservative garbage collector
• Originally designed for C/C++ environments
• Good starting point for a new language implementation due to its simple GC interface

Conservative GC

• Conservative roots: GC doesn’t precisely know which values on the stack are heap references, but object layout is known.
• Fully conservative: GC doesn’t precisely know which values on the stack or heap are references.
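To make the idea of conservative roots concrete, here is a minimal sketch in Scala. It is a toy model, not how Boehm or Scala Native scan the stack: addresses are plain `Long` values, `HeapObject` and `ConservativeRoots` are hypothetical names, and the sketch assumes that any word pointing anywhere inside an object counts as a potential reference.

```scala
// Hypothetical model: addresses are plain Longs, and each allocated object
// occupies the address range [start, start + size).
final case class HeapObject(start: Long, size: Long) {
  def contains(word: Long): Boolean = word >= start && word < start + size
}

object ConservativeRoots {
  // A conservative collector cannot tell an integer from a pointer on the
  // stack, so every stack word that falls inside an object is treated as a
  // potential reference and that object is kept alive.
  def liveFromStack(stack: Seq[Long], heap: Seq[HeapObject]): Seq[HeapObject] =
    heap.filter(obj => stack.exists(obj.contains))
}

object ConservativeRootsDemo {
  val heap  = Seq(HeapObject(0x1000L, 32L), HeapObject(0x2000L, 64L))
  // 0x2010 may be a real pointer or just a number that looks like one.
  val stack = Seq(42L, 0x2010L)
  val live  = ConservativeRoots.liveFromStack(stack, heap)
}
```

In this model the object at `0x2000` survives even if `0x2010` was only an integer, which is the price of not knowing the stack layout precisely.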

Boehm GC

• How fast/slow are we exactly?
• github.com/smarr/are-we-fast-yet

Boehm GC

[Figure: bar chart for Boehm GC across the benchmarks bounce, brainfuck, cd, deltablue, gcbench, havlak, json, list, listperm, mandelbrot, nbody, permute, queens, richards, sieve, storage, sudoku, towers, tracer; vertical axis from 0 to 5.]

nativeGC := "none"

Running without GC

• The simplest form of garbage collection: allocate and never free allocated memory
• Practical for short-lived command-line tools
• Sometimes used in applications with extreme requirements for predictability
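"Allocate and never free" can be sketched as a bump pointer over a fixed region. The following is an illustrative model only: `NeverFreeAllocator` is a hypothetical name, memory is represented by offsets rather than real addresses, and nothing here is taken from Scala Native's actual runtime.

```scala
// Hypothetical sketch: memory is modelled as offsets into a fixed-size region.
final class NeverFreeAllocator(capacity: Long) {
  private var cursor: Long = 0L

  // Returns the offset of the new allocation, or None when the region is
  // exhausted. There is deliberately no free(): memory is only reclaimed
  // when the process exits.
  def alloc(size: Long): Option[Long] = {
    require(size > 0, "allocation size must be positive")
    if (size > capacity - cursor) None
    else {
      val offset = cursor
      cursor += size
      Some(offset)
    }
  }

  def used: Long = cursor
}

object NeverFreeDemo {
  val arena  = new NeverFreeAllocator(capacity = 64L)
  val first  = arena.alloc(16L) // Some(0)
  val second = arena.alloc(40L) // Some(16)
  val third  = arena.alloc(16L) // None: only 8 units are left
}
```

Allocation is a bounds check and an addition, which is why this strategy is attractive when the total amount of allocation is known to be small.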

From: Kent Mitchell
Subject: Re: Does memory leak?
Date: 1995/03/31
newsgroups: comp.lang.ada

This sparked an interesting memory for me. I was once working with a customer who was producing on-board software for a missile. In my analysis of the code, I pointed out that they had a number of problems with storage leaks. Imagine my surprise when the customers chief software engineer said "Of course it leaks". He went on to point out that they had calculated the amount of memory the application would leak in the total possible flight time for the missile and then doubled that number. They added this much additional memory to the hardware to "support" the leaks. Since the missile will explode when it hits it's target or at the end of it's flight, the ultimate in garbage collection is performed without programmer intervention.

-Kent Mitchell
Technical Consultant
Rational Software Corporation

| One possible reason that things aren't
| going according to plan is .....
| that there never *was* a plan!

https://groups.google.com/forum/message/raw?msg=comp.lang.ada/E9bNCvDQ12k/1tezW24ZxdAJ

Evaluating cost of GC

[Figure: bar chart comparing None and Boehm across the benchmarks bounce, brainfuck, cd, deltablue, gcbench, havlak, json, list, listperm, mandelbrot, nbody, permute, queens, richards, sieve, storage, sudoku, towers, tracer; vertical axis from 0 to 5.]

nativeGC := "?"

GCs in the wild

[Diagram: GC design space with two axes. Movability: Non Moving, Mostly Moving, Always Moving. Conservatism: Fully Precise, Conservative Roots, Fully Conservative.]

• Non Moving: Reference Counting (RC), Mark And Sweep (MS)
• Always Moving: Semi Space (SS), Mark Compact (MC), SS + MS/MC

Always Moving GCs

Semi Space (SS), Mark Compact (MC), SS + MS/MC

Standard choice in most JIT-ed VMs:

• GC is free to compact the memory, optimizing it for best locality
• Bump allocation is as fast as it gets in terms of allocation performance

Always Moving GCs

But:

• Compiler must know about GC
• Compiler must never break GC invariants

Non-Moving GCs

Reference Counting (RC), Mark And Sweep (MS)

Standard pick for AOT compilers:

• GC never moves objects in memory
• Compiler may or may not know about GC
• Good interoperability with unmanaged code that cannot easily handle a moving GC

Reference Counting

Simple idea: every object maintains a count of the references to it, which is updated automatically behind the scenes

Reference Counting

But:

• Naive version can’t handle cycles; it needs a cycle collector to be practical for Scala
• Prone to the high constant overhead that’s necessary to maintain refcounts
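Both the basic idea of reference counting and its problem with cycles fit in a short sketch. This is a toy model with hypothetical names (`RcObject`, `RefCounting`), where each object has a single reference field; a real implementation would insert the count updates into compiled code.

```scala
// Hypothetical model of reference-counted objects; names are illustrative.
final class RcObject(val name: String) {
  var refCount: Int = 0
  var field: Option[RcObject] = None
  var freed: Boolean = false
}

object RefCounting {
  def retain(obj: RcObject): Unit = obj.refCount += 1

  // Dropping the last reference frees the object and releases what it points to.
  def release(obj: RcObject): Unit = {
    obj.refCount -= 1
    if (obj.refCount == 0) {
      obj.freed = true
      obj.field.foreach(release)
      obj.field = None
    }
  }

  // Stores `target` into `owner.field`, adjusting both counts. This bookkeeping
  // on every reference write is the constant overhead of the scheme.
  def setField(owner: RcObject, target: RcObject): Unit = {
    retain(target)
    owner.field.foreach(release)
    owner.field = Some(target)
  }
}

object RefCountingDemo {
  import RefCounting._

  // Acyclic case: a -> b. Releasing the root reference to `a` frees both.
  val a = new RcObject("a")
  val b = new RcObject("b")
  retain(a)      // root reference to a
  setField(a, b) // a -> b
  release(a)     // a is freed, which in turn frees b

  // Cyclic case: c -> d -> c. Dropping the root reference frees neither.
  val c = new RcObject("c")
  val d = new RcObject("d")
  retain(c)      // root reference to c
  setField(c, d)
  setField(d, c)
  release(c)     // c.refCount goes from 2 to 1 and never reaches 0
}
```

After the demo runs, `a` and `b` are freed, while `c` and `d` keep each other alive with a count of 1 each. A cycle collector exists to find exactly this kind of garbage.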

Mark And Sweep

Simple idea: traverse the heap from the roots on garbage collection, mark all visited objects, then sweep unmarked objects to free lists
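The two phases of mark and sweep can be sketched in a few lines of Scala. The object model (`Obj` with a `refs` list and a `marked` flag) is hypothetical and chosen for brevity; a real collector works on raw memory and object headers.

```scala
// Hypothetical object model: each object knows its outgoing references.
final class Obj(val id: Int) {
  var refs: List[Obj] = Nil
  var marked: Boolean = false
}

object MarkAndSweep {
  // Marking phase: visit everything reachable from the roots.
  def mark(roots: Seq[Obj]): Unit = {
    var worklist: List[Obj] = roots.toList
    while (worklist.nonEmpty) {
      val obj = worklist.head
      worklist = worklist.tail
      if (!obj.marked) {
        obj.marked = true
        worklist = obj.refs ::: worklist
      }
    }
  }

  // Sweeping phase: unmarked objects are handed back (here: returned as
  // `dead`), survivors are unmarked so the next collection starts clean.
  def sweep(heap: Seq[Obj]): (Seq[Obj], Seq[Obj]) = {
    val (live, dead) = heap.partition(_.marked)
    live.foreach(obj => obj.marked = false)
    (live, dead)
  }
}

object MarkAndSweepDemo {
  val o1 = new Obj(1)
  val o2 = new Obj(2)
  val o3 = new Obj(3)
  val o4 = new Obj(4)
  o1.refs = List(o2)
  o3.refs = List(o4)
  o4.refs = List(o3) // unreachable cycle
  val heap = Seq(o1, o2, o3, o4)
  MarkAndSweep.mark(roots = Seq(o1))
  val (live, dead) = MarkAndSweep.sweep(heap)
}
```

Unlike naive reference counting, the unreachable cycle between objects 3 and 4 is reclaimed here because reachability is computed from the roots.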

Non-Moving GCs

Reference Counting (RC), Mark And Sweep (MS)

Both suffer from:

• Fragmentation and memory locality issues due to the non-moving nature of the collectors
• Typically backed by a free list-based allocator, which is not competitive in terms of allocation performance
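A small first-fit free-list allocator shows both problems at once. This is an illustrative sketch with hypothetical names; it does not coalesce neighbouring free blocks and is not one of the allocator strategies evaluated in the talk.

```scala
// Hypothetical sketch of a first-fit free-list allocator over a region of
// `capacity` units. Free blocks are (offset, size) pairs sorted by offset.
final class FreeListAllocator(capacity: Int) {
  private var freeBlocks: List[(Int, Int)] = List((0, capacity))

  // Walks the list until a large enough block is found, so allocation cost
  // grows with the length of the free list.
  def alloc(size: Int): Option[Int] =
    freeBlocks.find(_._2 >= size).map { case block @ (offset, blockSize) =>
      val rest =
        if (blockSize > size) List((offset + size, blockSize - size)) else Nil
      freeBlocks = freeBlocks.flatMap(b => if (b == block) rest else List(b))
      offset
    }

  // Returns a block to the list without coalescing neighbours (kept simple).
  def free(offset: Int, size: Int): Unit =
    freeBlocks = ((offset, size) :: freeBlocks).sortBy(_._1)

  def totalFree: Int = freeBlocks.map(_._2).sum
  def largestFree: Int = if (freeBlocks.isEmpty) 0 else freeBlocks.map(_._2).max
}

object FreeListDemo {
  val allocator = new FreeListAllocator(capacity = 8)
  // Four objects of 2 units each fill the region: offsets 0, 2, 4, 6.
  val offsets = List.fill(4)(allocator.alloc(2))
  // Free every other object; the survivors cannot be moved together.
  allocator.free(0, 2)
  allocator.free(4, 2)
  val big = allocator.alloc(4) // None: 4 units are free, but not contiguously
}
```

Half of the region is free after the two `free` calls, yet a 4-unit request fails because a non-moving collector cannot slide the survivors together.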

Status quo revisited

[Diagram: Boehm placed among the Non Moving collectors in the GC design space (movability vs. conservatism).]

Initial experiments

[Diagram: MS added to the GC design space alongside Boehm.]

MS

[Animation: MS running over a heap, with Stack and Modules shown next to it: Marking Phase, then Sweeping Phase.]

MS Performance

[Figure: bar chart comparing None, Boehm and MS across the benchmarks bounce, brainfuck, cd, deltablue, gcbench, havlak, json, list, listperm, mandelbrot, nbody, permute, queens, richards, sieve, storage, sudoku, towers, tracer; vertical axis from 0 to 5.]

Initial experiments

Initial results: none of the free list-backed allocation strategies we’ve tried scales to allocation-heavy workloads the way bump allocation does.

Mostly-Moving Sweet Spot

Mostly-Moving: GC can move objects around as long as they are not referenced from the roots; objects referenced from the roots are pinned in place.

Mostly Moving collectors shown: Bartlett, Immix

Bartlett

Copying GC that’s able to pin objects which are referred to from the roots. Used in Safari’s WebKit engine.

“Compacting Garbage Collection with Ambiguous Roots”, Joel F. Bartlett

Immix

Mark-region garbage collector with opportunistic one-pass defragmentation.

“Immix: a mark-region garbage collector with space efficiency, fast collection, and mutator performance”, Stephen M. Blackburn, Kathryn S. McKinley

Immix Heap

Immix Block

Immix Line
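The heap, block and line hierarchy can be illustrated with a scaled-down model of a single block. The sketch below is an assumption-laden toy: `Block`, `lineMarked` and `holes` are hypothetical names, the block has only 8 lines, and details of the real algorithm are left out. It only shows how per-line marks turn into holes that a bump allocator can reuse.

```scala
// Hypothetical, scaled-down model of one Immix block. Real implementations use
// much larger blocks and lines; the sizes here are chosen for readability.
final class Block(val lineCount: Int) {
  // One mark per line, set during marking when a live object touches the line.
  val lineMarked: Array[Boolean] = Array.fill(lineCount)(false)

  // Holes are maximal runs of unmarked lines, as (firstLine, lineCount) pairs.
  // The allocator bump-allocates inside a hole and then moves to the next one.
  def holes: List[(Int, Int)] = {
    val result = List.newBuilder[(Int, Int)]
    var line = 0
    while (line < lineCount) {
      if (lineMarked(line)) line += 1
      else {
        val start = line
        while (line < lineCount && !lineMarked(line)) line += 1
        result += ((start, line - start))
      }
    }
    result.result()
  }
}

object ImmixDemo {
  val block = new Block(lineCount = 8)
  // Pretend marking found live objects on lines 2, 3 and 6.
  Seq(2, 3, 6).foreach(i => block.lineMarked(i) = true)
  val holes = block.holes // List((0, 2), (4, 2), (7, 1))
}
```

Because free space is tracked at line granularity rather than per object, reclaiming memory is a scan over line marks and allocation stays a pointer bump within each hole.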

Immix

[Animation: Allocation.]

Immix

[Animation: Immix running over a heap, with Stack and Modules shown next to it: Marking Phase, then Sweeping Phase.]

Immix Performance

[Figure: bar chart comparing None, Boehm and Immix across the benchmarks bounce, brainfuck, cd, deltablue, gcbench, havlak, json, list, listperm, mandelbrot, nbody, permute, queens, richards, sieve, storage, sudoku, towers, tracer; vertical axis from 0 to 5.]

Immix

• First prototype is coming in Scala Native 0.3
• Opt-in via nativeGC := "immix"
• We’re going to support Boehm until the Immix implementation matures and becomes the default
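As a rough illustration of the opt-in, a `build.sbt` fragment could look like the one below. It assumes the Scala Native sbt plugin has already been added to the build (for example in `project/plugins.sbt`); only the `nativeGC` setting and its three values come from this talk.

```scala
// build.sbt (fragment); assumes the Scala Native sbt plugin is on the build.
enablePlugins(ScalaNativePlugin)

// One of "boehm", "none" or "immix", the values named in this talk.
nativeGC := "immix"
```

Switching the string back to "boehm" would restore the previous collector without any other change to the build.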

Bonus features in 0.3

• sbt testing framework integration
• Initial support for file I/O from java.nio.*
• Initial support for zip/jar from java.util.*
• Smaller binaries

Questions?

