LaDeDa: Languages for Debuggable Distributed Algorithms

Mark S. Miller
Google, Inc.

Tom Van Cutsem∗
Vrije Universiteit Brussel

1 Language as Notation for Incorrect Programs

When programming language designs are presented, the examples are almost exclusively of correct programs. Most of the attention of programming language designers is indeed on the beauty and elegance of correct programs. For incorrect programs, great design attention is paid to catching errors early (for example, with fancy static type systems) so that many incorrect programs are never run. Due to the success of these efforts, many programs are either correct or inadmissible, which reduces the need for programmer attention. As a result, most of the attention working programmers spend looking at code is spent debugging incorrect running code. Often this is code written by others and only partially understood. What properties should such code have? How can programming language design encourage incorrect programs to have those properties that facilitate debugging? Distributed programs introduce additional difficult bugs of a different character. How should distributed language design facilitate the debugging of distributed programs? We explain how these considerations have affected four distributed language designs (E [1], AmbientTalk [5], Joe-E/Waterken [4], Dr. SES [2]) and one distributed debugging tool (Causeway [3]).

2 Bounding Boxes for Answering Questions

When debugging, you’re doing detective work. You do not need to understand the program as a whole, and often you cannot afford to. Rather, you’re trying to track down a particular anomaly: Why did this bad thing happen? How much of the program is relevant? How much of its execution trace? Ray tracing algorithms raise an analogous question: Of all the complex shapes in the scene, which of them intersect the ray? Their elegant solution is a system of simple bounding boxes so most of the scene can be cheaply disqualified, so that we can afford the complex calculations needed for the rest. Likewise, we need to disqualify most of the program from relevance to answering questions relevant to debugging. Possible causal influence is the most debugging-relevant question. The notations in which we express programs form the data structure we are searching.

• Mostly-functional programming bounds our worries about side effects to those parts of the program that need side effects for their expression.
• Strict lexical scoping combined with call-by-value argument passing bounds our worries about what code may have assigned to a given location.
• Encapsulation bounds our worries about what code may have directly violated a local invariant.
• Object-capability rules and style (Defensive Consistency and the Principle of Least Authority) bound worries about indirect invariant violations.
• Conventional sequential control flow bounds our worries about plan interference to those intervals when invariants are suspended.
• Pure message-passing concurrency bounds our worries about possible nonsequential interleavings to arrival order non-determinacy.
• Monotonic order-independent state transitions further bound indeterminacy. (Example: single-assignment of promises or logic variables.)
• Pure communicating event-loop concurrency, by avoiding blocking receives, bounds our worries about distributed invariants to non-stack state.
• Broken promise contagion bounds asynchronous failure handling to data dependencies.
• Causality tracing bounds worries about prior corruption to happened before.
• In a sequential debugger, visually emphasizing stack order over process order helps direct our suspicions to the more likely suspects first.
• When visualizing causality traces, emphasizing message-order over process order helps direct our suspicions to the more likely suspects first.

∗ Tom Van Cutsem is a post-doctoral Fellow of the Research Foundation, Flanders (FWO)
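Two of the bounds listed above, single-assignment promises and broken promise contagion, can be illustrated with a small sketch. The code is Python rather than E, and the `Promise` class with its `resolve`, `break_with`, and `then` methods is a hypothetical toy written for this illustration, not the API of E, AmbientTalk, or any other system discussed here. For brevity it runs reactions synchronously; an event-loop language would instead deliver them in later turns.

```python
_PENDING = object()  # sentinel meaning "no value yet"


class Promise:
    """Toy single-assignment promise (hypothetical, for illustration only)."""

    def __init__(self):
        self.value = _PENDING
        self.problem = None      # exception describing why the promise broke
        self._waiters = []

    def is_settled(self):
        return self.value is not _PENDING or self.problem is not None

    def resolve(self, value):
        return self._settle(value, None)

    def break_with(self, problem):
        return self._settle(_PENDING, problem)

    def _settle(self, value, problem):
        if self.is_settled():
            return False         # monotonic: a settled promise never changes
        self.value, self.problem = value, problem
        waiters, self._waiters = self._waiters, []
        for waiter in waiters:
            waiter()
        return True

    def then(self, fn):
        """Return a promise for fn(value); breakage follows the data dependency."""
        derived = Promise()

        def react():
            if self.problem is not None:
                derived.break_with(self.problem)   # contagion: fn is skipped
            else:
                try:
                    derived.resolve(fn(self.value))
                except Exception as exc:
                    derived.break_with(exc)

        if self.is_settled():
            react()
        else:
            self._waiters.append(react)
        return derived


def demo_promises():
    calls = []
    price = Promise()
    taxed = price.then(lambda p: calls.append("tax") or p + 20)
    label = taxed.then(lambda t: calls.append("label") or f"total {t}")

    first = price.break_with(ConnectionError("partition"))
    second = price.resolve(100)   # ignored: price is already settled
    return first, second, calls, label.problem
```

In `demo_promises`, the later `resolve` returns `False` and changes nothing, neither callback runs, and `label.problem` is the same `ConnectionError` that broke `price`. When hunting for the cause of a broken result, only the chain of promises it was derived from needs to be inspected.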

3 Case study: message passing

E, a pure event-loop-concurrent distributed object-capability language, has two message passing constructs, the immediate call (written “.”) and the eventual send (written “←”). Each provides strong side effect guarantees, with opposite strengths and weaknesses. The familiar “b.foo(c)” immediately transfers control to b, which is necessarily local, suspending the caller until b returns. By contrast “bP ← foo(c)” queues, in the event loop hosting b, the need to deliver the foo message to b. bP denotes a promise to b, indicating that b may be remote. Whether or not b is remote, this delivery only happens in a separate turn of the event loop, starting from an empty stack. Table 1 summarizes the advantages and disadvantages of these two message passing constructs. Both provide a strong set of complementary guarantees.
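The difference in delivery order can be sketched with a toy event loop. This is Python, not E, and the `Vat` and `Logger` classes are hypothetical names chosen for the illustration; the sketch assumes a single local event loop and ignores remote references and promises.

```python
from collections import deque


class Vat:
    """Toy single-threaded event loop hosting objects (hypothetical)."""

    def __init__(self):
        self.queue = deque()   # pending deliveries, in arrival order

    def send(self, target, verb, *args):
        """Eventual send: queue the delivery and return at once."""
        self.queue.append((target, verb, args))

    def run(self):
        """Deliver queued messages, one turn each, until none remain."""
        while self.queue:
            target, verb, args = self.queue.popleft()
            # Each delivery starts here, with no caller frames from the sender.
            getattr(target, verb)(*args)


class Logger:
    def __init__(self, log):
        self.log = log

    def foo(self, c):
        self.log.append(f"foo({c})")


def demo_sends():
    log = []
    vat = Vat()
    b = Logger(log)

    b.foo("immediate")                # immediate call: b runs now, caller waits
    log.append("after call")

    vat.send(b, "foo", "eventual")    # eventual send: only queued
    log.append("after send")

    vat.run()                         # a later turn delivers the message
    return log
```

`demo_sends()` returns `["foo(immediate)", "after call", "after send", "foo(eventual)"]`: the immediate call completes before the caller continues, while the eventually sent message is delivered only after the sender has finished its own turn.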


a performs: Immediate call b.foo(c)
    Virtue: No interleaving occurs between a calling foo and foo being called on b.
    Hazard: b gets control while a is suspended, introducing potential plan interference if b violates a’s invariants.

a performs: Eventual send b<-foo(c)
    Virtue: a proceeds and can safely repair suspended invariants before foo can ever affect its heap. Likewise, b starts processing foo from an empty stack, so there is no need to consider suspended invariants on the stack. b can assume all invariants have already been restored.
    Hazard: Arbitrary code may have run between a sending foo and b executing foo. Therefore b must recheck all stateful assumptions on entry, other than restored invariants.

Table 1: Immediate call versus Eventual send
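Both hazards in Table 1 can be reproduced in a few lines. The sketch below is Python, not E; the `Account` class, its invariant (`total == sum(entries)`), and the listener `b` are hypothetical and exist only to make the two hazards observable. A plain queue of thunks stands in for the event loop.

```python
from collections import deque


class Account:
    """Hypothetical object 'a' with the invariant total == sum(entries)."""

    def __init__(self, queue):
        self.queue = queue
        self.entries = []
        self.total = 0

    def invariant_holds(self):
        return self.total == sum(self.entries)

    def deposit_immediate(self, amount, listener):
        self.entries.append(amount)    # invariant suspended
        listener(self)                 # immediate call: b runs mid-update
        self.total += amount           # invariant restored

    def deposit_eventual(self, amount, listener):
        self.entries.append(amount)    # invariant suspended
        self.queue.append(lambda: listener(self))  # eventual send: queued only
        self.total += amount           # invariant restored before b runs


def run(queue):
    """Deliver queued messages one at a time, each in its own turn."""
    while queue:
        queue.popleft()()


def demo_interference():
    queue = deque()
    a = Account(queue)
    seen = []

    def b(account):
        # b calls back into a and records what it observes.
        seen.append((account.invariant_holds(), account.total))

    a.deposit_immediate(10, b)   # b observes a with its invariant suspended
    a.deposit_eventual(5, b)     # b is queued; a completes its update first
    a.deposit_eventual(1, b)     # more code runs before the first delivery
    run(queue)
    return seen
```

`demo_interference()` returns `[(False, 0), (True, 16), (True, 16)]`. The immediate call lets `b` see a violated invariant. The eventual sends never do, but the notification queued when the total was 15 is delivered when it is already 16, which is why `b` must recheck stateful assumptions on entry.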

4 Summary

This position paper makes the case for debuggable distributed programming languages. Based on our prior experience in building distributed languages and debugging tools, we put forward a number of language properties that aid the programmer in reasoning about possibly faulty (distributed) code.

References

[1] M. Miller, E. D. Tribble, and J. Shapiro. Concurrency among strangers: Programming in E as plan coordination. In Symposium on Trustworthy Global Computing, volume 3705 of LNCS, pages 195–229, April 2005.
[2] M. S. Miller. Dr. SES: Distributed resilient secure ECMAScript, April 2010. http://es-lab.googlecode.com/files/dr-ses.pdf.
[3] T. Stanley, T. Close, and M. S. Miller. Causeway: A message-oriented distributed debugger. Technical Report HPL-2009-78, HP Labs, April 2009.
[4] M. Stiegler and J. Tie. Introduction to Waterken programming. Technical Report HPL-2010-89, HP Labs, August 2010.
[5] T. Van Cutsem, S. Mostinckx, E. Gonzalez Boix, J. Dedecker, and W. De Meuter. AmbientTalk: object-oriented event-driven programming in mobile ad hoc networks. In Inter. Conf. of the Chilean Computer Science Society (SCCC), pages 3–12. IEEE Computer Society, 2007.
