LaDeDa: Languages for Debuggable Distributed Algorithms

Mark S. Miller
Google, Inc.

Tom Van Cutsem∗
Vrije Universiteit Brussel

1 Language as Notation for Incorrect Programs

When programming language designs are presented, the examples are almost exclusively of correct programs. Most attention of programming language designers is indeed on the beauty and elegance of correct programs. For incorrect programs, great design attention is paid to catching errors early, for example with fancy static type systems, so that many incorrect programs are never run. Because these efforts succeed, many programs are either correct or inadmissible, and so demand little programmer attention. As a result, most of the attention working programmers give to code is spent debugging incorrect code that is running. Often this code was written by others and is only partially understood. What properties should such code have? How can programming language design encourage incorrect programs to have the properties that facilitate debugging?

Distributed programs introduce additional difficult bugs of a different character. How should distributed language design facilitate the debugging of distributed programs? We explain how these considerations have affected four distributed language designs (E [1], AmbientTalk [5], Joe-E/Waterken [4], Dr. SES [2]) and one distributed debugging tool (Causeway [3]).

2 Bounding Boxes for Answering Questions

When debugging, you’re doing detective work. You do not need to understand the program as a whole, and often you cannot afford to. Rather, you’re trying to track down a particular anomaly: Why did this bad thing happen? How much of the program is relevant? How much of its execution trace? Ray tracing algorithms raise an analogous question: Of all the complex shapes in the scene, which of them intersect the ray? Their elegant solution is a system of simple bounding boxes so most of the scene can be cheaply disqualified, so that we can afford the complex calculations needed for the rest.

Likewise, we need to disqualify most of the program from relevance to answering questions relevant to debugging. Possible causal influence is the most debugging-relevant question. The notations in which we express programs form the data structure we are searching.

• Mostly-functional programming bounds our worries about side effects to those parts of the program that need side effects for their expression.
• Strict lexical scoping combined with call-by-value argument passing bounds our worries about what code may have assigned to a given location.
• Encapsulation bounds our worries about what code may have directly violated a local invariant.
• Object-capability rules and style (Defensive Consistency and the Principle of Least Authority) bound worries about indirect invariant violations.
• Conventional sequential control flow bounds our worries about plan interference to those intervals when invariants are suspended.
• Pure message-passing concurrency bounds our worries about possible non-sequential interleavings to arrival order non-determinacy.
• Monotonic order-independent state transitions further bound indeterminacy. (Example: single-assignment of promises or logic variables.)
• Pure communicating event-loop concurrency, by avoiding blocking receives, bounds our worries about distributed invariants to non-stack state.
• Broken promise contagion bounds asynchronous failure handling to data dependencies.
• Causality tracing bounds worries about prior corruption to happened-before.
• In a sequential debugger, visually emphasizing stack order over process order helps direct our suspicions to the more likely suspects first.
• When visualizing causality traces, emphasizing message-order over process order helps direct our suspicions to the more likely suspects first.

∗ Tom Van Cutsem is a post-doctoral Fellow of the Research Foundation, Flanders (FWO).
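Two of the bullets above, single-assignment promises and broken promise contagion, can be made concrete with a small sketch. This is a toy written in Python, not E's or any other language's actual promise implementation. The class `Promise` and its methods `fulfill`, `smash`, and `then` are hypothetical names chosen for illustration. As a simplifying assumption, dependents are notified synchronously, whereas an event-loop language would notify them in a later turn.

```python
class Promise:
    """Toy single-assignment promise: pending -> fulfilled or broken, at most once."""

    def __init__(self):
        self.state = "pending"
        self.value = None
        self._dependents = []

    def _settle(self, state, value):
        if self.state != "pending":
            return  # monotonic: a settled promise never changes again
        self.state, self.value = state, value
        for notify in self._dependents:
            notify()
        self._dependents = []

    def fulfill(self, value):
        self._settle("fulfilled", value)

    def smash(self, reason):
        self._settle("broken", reason)

    def then(self, fn):
        """Return a promise for fn(value); brokenness flows to it unchanged."""
        result = Promise()

        def notify():
            if self.state == "broken":
                # Contagion: failure travels along the data dependency only.
                result.smash(self.value)
            else:
                try:
                    result.fulfill(fn(self.value))
                except Exception as exc:
                    result.smash(exc)

        if self.state == "pending":
            self._dependents.append(notify)
        else:
            notify()
        return result


price = Promise()
quantity = Promise()
total = price.then(lambda p: p * 2)             # depends on price only
label = quantity.then(lambda q: f"{q} items")   # depends on quantity only

price.smash("price server unreachable")
quantity.fulfill(3)
price.fulfill(10)   # ignored: price is already settled
```

In this sketch only `total`, which depends on `price`, becomes broken. `label` is unaffected, so when hunting for the cause of a broken value the search is bounded to the promises it was derived from.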

3 Case study: message passing

E, a pure event-loop-concurrent distributed object-capability language, has two message passing constructs, the immediate call (written “.”) and the eventual send (written “←”). Each provides strong side effect guarantees, with opposite strengths and weaknesses. The familiar “b.foo(c)” immediately transfers control to b, which is necessarily local, suspending the caller until b returns. By contrast “bP ← foo(c)” queues, in the event loop hosting b, the need to deliver the foo message to b. bP denotes a promise for b, indicating that b may be remote. Whether or not b is remote, this delivery only happens in a separate turn of the event loop, starting from an empty stack. Table 1 summarizes the advantages and disadvantages of these two message passing constructs. Both provide a strong set of complementary guarantees.
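The difference between the two constructs can be sketched with a toy event loop. The following is written in Python rather than E and models only a single local event loop. The names `EventLoop`, `Ledger`, and `Auditor` are hypothetical, and `send` stands in for the eventual send while an ordinary method call stands in for the immediate call.

```python
from collections import deque


class EventLoop:
    """Toy event loop: a FIFO queue of pending message deliveries."""

    def __init__(self):
        self.queue = deque()

    def send(self, target, method, *args):
        """Eventual send: queue the delivery for a later turn."""
        self.queue.append((target, method, args))

    def run(self):
        """Run turns until the queue is empty; each turn starts from an empty stack."""
        while self.queue:
            target, method, args = self.queue.popleft()
            getattr(target, method)(*args)


class Auditor:
    """Plays the role of b: records whether the caller's invariant held."""

    def __init__(self):
        self.observed = []

    def check(self, ledger):
        self.observed.append(ledger.total() == ledger.expected_total)


class Ledger:
    """Plays the role of a. Invariant: the account balances sum to expected_total."""

    def __init__(self, loop, auditor):
        self.loop = loop
        self.auditor = auditor
        self.accounts = {"x": 100, "y": 0}
        self.expected_total = 100

    def total(self):
        return sum(self.accounts.values())

    def transfer_immediate(self, amount):
        self.accounts["x"] -= amount      # invariant suspended
        self.auditor.check(self)          # immediate call: b runs right now
        self.accounts["y"] += amount      # invariant restored

    def transfer_eventual(self, amount):
        self.accounts["x"] -= amount      # invariant suspended
        self.loop.send(self.auditor, "check", self)  # queued, not yet run
        self.accounts["y"] += amount      # invariant restored before b runs


loop = EventLoop()
auditor = Auditor()
ledger = Ledger(loop, auditor)

ledger.transfer_immediate(10)   # auditor runs while the invariant is suspended
ledger.transfer_eventual(10)    # auditor is only queued
loop.run()                      # auditor now runs in its own turn
```

Under these assumptions the auditor sees a violated invariant after the immediate call and a restored one after the eventual send, because the queued delivery cannot run until the sender's turn has finished.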


a performs: Immediate call b.foo(c)
Virtue: No interleaving occurs between a calling foo and foo being called on b.
Hazard: b gets control while a is suspended, introducing potential plan interference if b violates a’s invariants.

a performs: Eventual send b<-foo(c)
Virtue: a proceeds and can safely repair suspended invariants before foo can ever affect its heap. Likewise, b starts processing foo from an empty stack, so there is no need to consider suspended invariants on the stack. b can assume all invariants have already been restored.
Hazard: Arbitrary code may have run between a sending foo and b executing foo. Therefore b must recheck all stateful assumptions on entry, other than restored invariants.

Table 1: Immediate call versus Eventual send
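The hazard of the eventual send in Table 1 can be illustrated with a minimal sketch, again in Python rather than E. `EventLoop` and `Account` are hypothetical names. The sketch assumes two senders that each check the balance at send time, so the receiver cannot rely on that check still holding when the message is delivered.

```python
from collections import deque


class EventLoop:
    """Toy event loop: a FIFO queue of pending message deliveries."""

    def __init__(self):
        self.queue = deque()

    def send(self, target, method, *args):
        """Eventual send: queue the delivery for a later turn."""
        self.queue.append((target, method, args))

    def run(self):
        while self.queue:
            target, method, args = self.queue.popleft()
            getattr(target, method)(*args)


class Account:
    """Plays the role of b: rechecks its stateful assumptions on entry."""

    def __init__(self, balance):
        self.balance = balance
        self.log = []

    def withdraw(self, amount):
        # Other turns may have run since the message was sent,
        # so the sender's earlier balance check proves nothing here.
        if amount > self.balance:
            self.log.append(("refused", amount))
            return
        self.balance -= amount
        self.log.append(("ok", amount))


loop = EventLoop()
account = Account(balance=100)

# Two senders each observe a sufficient balance at send time.
for _ in range(2):
    if account.balance >= 80:
        loop.send(account, "withdraw", 80)

loop.run()
```

Both sends pass the sender-side check, but only the first withdrawal succeeds. The second is refused by the recheck inside `withdraw`, which is the kind of entry check the table says b must perform.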

4 Summary

This position paper makes the case for debuggable distributed programming languages. Based on our prior experience in building distributed languages and debugging tools, we put forward a number of language properties that aid the programmer in reasoning about possibly faulty (distributed) code.

References

[1] M. Miller, E. D. Tribble, and J. Shapiro. Concurrency among strangers: Programming in E as plan coordination. In Symposium on Trustworthy Global Computing, volume 3705 of LNCS, pages 195–229, April 2005.
[2] M. S. Miller. Dr. SES: Distributed resilient secure ECMAScript, April 2010. http://es-lab.googlecode.com/files/dr-ses.pdf.
[3] T. Stanley, T. Close, and M. S. Miller. Causeway: A message-oriented distributed debugger. Technical Report HPL-2009-78, HP Labs, April 2009.
[4] M. Stiegler and J. Tie. Introduction to Waterken programming. Technical Report HPL-2010-89, HP Labs, August 2010.
[5] T. Van Cutsem, S. Mostinckx, E. Gonzalez Boix, J. Dedecker, and W. De Meuter. AmbientTalk: Object-oriented event-driven programming in mobile ad hoc networks. In Inter. Conf. of the Chilean Computer Science Society (SCCC), pages 3–12. IEEE Computer Society, 2007.
