Disorderly Distributed Programming with Bloom

Emily Andrews, Peter Alvaro, Peter Bailis, Neil Conway, Joseph M. Hellerstein, William R. Marczak
UC Berkeley

David Maier
Portland State University

Conventional Programming

Input x

Output f(x)

Distributed Programming

Input x → Asynchronous Network → Output? f(x) at each of several nodes

Problem: Different nodes might perceive different event orders

Taking Order For Granted

Data: (ordered) array of bytes

Compute: (ordered) sequence of instructions

Writing order-sensitive programs is too easy!

Alternative #1: Enforce consistent event order at all nodes

Extensive literature:
•  Replicated state machines
•  Consensus, coordination
•  Group communication
•  “Strong Consistency”

Problems:
1.  Latency
2.  Availability

Alternative #2: Analyze all event orders to ensure correct behavior

Problem: That is really, really hard

Alternative #3: Write order-independent (“disorderly”) programs

Questions:
•  How to write such programs?
•  What can we express in a disorderly way?

Disorderly Programming

1.  Program analysis: CALM
    –  Where is order needed? And why?
2.  Language design: Bloom
    –  Order-independent by default
    –  Ordering is possible but explicit
3.  Mixing order and disorder: Blazes
    –  Order synthesis and optimization
4.  Algebraic programming

CALM: Consistency As Logical Monotonicity

History
•  Roots in UCB database research (~2005)
   –  High-level, declarative languages for network protocols & distributed systems
   –  “Small programs for large clusters” (BOOM)
•  Distributed programming with logic
   –  State: sets (relations)
   –  Computation: deductive rules over sets
      •  SQL, Datalog, etc.

Observation: Much of Datalog is order-independent.

Monotonic Logic
•  As input set grows, output set does not shrink
   –  “Mistake-free”
•  Order independent
•  e.g., map, filter, join, union, intersection

Non-Monotonic Logic
•  New inputs might invalidate previous outputs
•  Order sensitive
•  e.g., aggregation, negation
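The contrast between the two kinds of logic can be sketched in plain Ruby. This is only an illustration, not Bloom code: the helper names `even_members` and `not_yet_seen` and the sample data are assumptions made for the example. A filter only ever adds outputs as its input grows, while a negation can retract outputs it produced earlier.

```ruby
require 'set'

# Monotone query: a filter. Adding inputs can only add outputs.
def even_members(input)
  input.select(&:even?).to_set
end

# Non-monotone query: negation. Adding inputs can remove earlier outputs.
def not_yet_seen(universe, input)
  universe - input
end

UNIVERSE = Set[1, 2, 3, 4]   # hypothetical domain of possible values
EARLY    = Set[4]            # what a node has seen so far
LATER    = Set[4, 1, 2]      # the same node after more messages arrive

# Did every earlier output survive once more input arrived?
filter_only_grew   = even_members(EARLY).subset?(even_members(LATER))
negation_only_grew = not_yet_seen(UNIVERSE, EARLY).subset?(not_yet_seen(UNIVERSE, LATER))

puts "filter output only grew:   #{filter_only_grew}"    # true
puts "negation output only grew: #{negation_only_grew}"  # false
```

The negation had to withdraw 1 and 2 from its earlier answer, which is why its result depends on when it is evaluated.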

Agents learn strictly more knowledge over time

Different learning order, same final outcome

Deterministic outcome, despite network non-determinism (“Confluent”)
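A small way to see confluence, assuming an agent whose knowledge is a set merged by union: deliver the same messages in every possible order and compare the final states. The helper `learn_all` and the message contents are hypothetical, chosen only for this sketch.

```ruby
require 'set'

# Hypothetical agent state: a set of facts that only grows, merged by union.
def learn_all(messages)
  messages.reduce(Set.new) { |known, msg| known | msg }
end

MESSAGES = [Set[:a], Set[:b, :c], Set[:a, :c]].freeze

# Try every delivery order the network could choose.
final_states = MESSAGES.permutation.map { |order| learn_all(order) }

puts final_states.all? { |state| state == Set[:a, :b, :c] }  # true
```

Redelivering a message changes nothing either, since union is idempotent.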

Confluent Programming

Input x → Asynchronous Network → Output f(x)

Consistency As Logical Monotonicity

CALM Analysis (CIDR’11)

1.  Monotone programs are deterministic
2.  Simple syntactic test for monotonicity

Result: Whole-program static analysis for eventual consistency

CALM: Beyond Sets
•  Monotonic logic: growing sets
   –  Partial order: set containment
•  Expressive but sometimes awkward
   –  Timestamps, version numbers, threshold tests, directories, sequences, …

Challenge: Extend monotone logic to support other flavors of “growth over time”

$\langle S, \sqcup, \bot \rangle$ is a bounded join semilattice iff:
–  $S$ is a set
–  $\sqcup$ is a binary operator (“least upper bound”)
   •  Induces a partial order on $S$: $x \le_S y$ if $x \sqcup y = y$
   •  Associative, Commutative, and Idempotent
      –  “ACID 2.0”
   •  Informally, LUB is “merge function” for $S$
–  $\bot$ is the “least” element in $S$
   •  $\forall x \in S: \bot \sqcup x = x$
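The definition can be checked mechanically on small samples. The sketch below is plain Ruby, not Bud's lattice API: the `Lattice` struct, the three constants, and `laws_hold?` are names invented for this example, and using negative infinity as the bottom of the max lattice is an assumption. It models a lattice as a bottom element plus a merge function and tests associativity, commutativity, idempotence, and the bottom law.

```ruby
require 'set'

# A lattice modelled as a bottom element plus a merge (least upper bound) lambda.
Lattice = Struct.new(:bottom, :merge) do
  # x <= y in the induced partial order iff merge(x, y) == y
  def leq?(x, y)
    merge.call(x, y) == y
  end
end

SET_LATTICE  = Lattice.new(Set.new, ->(x, y) { x | y })                 # merge = union
MAX_LATTICE  = Lattice.new(-Float::INFINITY, ->(x, y) { [x, y].max })   # merge = max
BOOL_LATTICE = Lattice.new(false, ->(x, y) { x || y })                  # merge = or

# Check the semilattice laws on every triple drawn from the samples.
def laws_hold?(lattice, samples)
  m = lattice.merge
  samples.product(samples, samples).all? do |x, y, z|
    m.call(x, m.call(y, z)) == m.call(m.call(x, y), z) &&  # associative
      m.call(x, y) == m.call(y, x) &&                       # commutative
      m.call(x, x) == x &&                                  # idempotent
      m.call(lattice.bottom, x) == x                        # bottom is least
  end
end

puts laws_hold?(SET_LATTICE, [Set[], Set[:a], Set[:a, :b], Set[:c]])
puts laws_hold?(MAX_LATTICE, [3, 5, 7])
puts laws_hold?(BOOL_LATTICE, [false, true])
```

Passing on samples is evidence, not a proof. The deck lists verification of lattice type implementations as ongoing work.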

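[Figure: three example lattices, with values growing over time]
•  Set ($\sqcup$ = Union): {a}, {b}, {c} grow to {a,b}, {b,c}, {a,c}, then to {a,b,c}
•  Increasing Int ($\sqcup$ = Max): 3, 5, 7 merge upward to 7
•  Boolean ($\sqcup$ = Or): false grows to true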

$f : S \to T$ is a monotone function iff: $\forall a, b \in S : a \le_S b \Rightarrow f(a) \le_T f(b)$

[Figure: a chain of monotone functions over time]
Set ($\sqcup$ = Union) → size() → Increasing Int ($\sqcup$ = Max) → >= 3 → Boolean ($\sqcup$ = Or)
•  Monotone function: set → increasing-int (size())
•  Monotone function: increasing-int → boolean (>= 3)
•  {c} → 1 → false
•  {a,c} → 2 → false
•  {a,b,c} → 3 → true
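The chain in the figure can be written out directly. This is a plain-Ruby sketch, with the method names `size_of`, `at_least_three?`, and `threshold_reached?` invented for the example. Each stage maps a growing input to a growing output, so the composition is monotone as well.

```ruby
require 'set'

# set -> increasing int: the size never decreases as the set grows
def size_of(set)
  set.size
end

# increasing int -> boolean: once the threshold is crossed, it stays crossed
def at_least_three?(count)
  count >= 3
end

# Composition of the two monotone functions
def threshold_reached?(set)
  at_least_three?(size_of(set))
end

# A chain of sets that grows under set containment, as in the figure
CHAIN = [Set[:c], Set[:a, :c], Set[:a, :b, :c]].freeze

CHAIN.each do |s|
  puts "#{s.to_a.inspect} -> #{size_of(s)} -> #{threshold_reached?(s)}"
end
```

Along the chain the sizes are 1, 2, 3 and the booleans are false, false, true, so no output ever moves backwards in its own lattice order.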

The Bloom Programming Language

Bloom Basics
•  Communication: Message passing between agents
•  State: Lattices
•  Computation: Functions over lattices
•  “Disorderly” Computation: Monotone functions

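Bloom Operational Model

[Figure: one timestep (“Now”). Clock events and inbound network messages, together with the state update carried over from the previous timestep, feed the Bloom rules (atomic, local, deterministic). The rules produce the next state update and outbound network messages.]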

Quorum Vote

    require 'bud'

    QUORUM_SIZE = 5
    RESULT_ADDR = "example.org"

    # Annotated Ruby class
    class QuorumVote
      include Bud

      # Program state
      state do
        # Communication interfaces: non-deterministic delivery order!
        channel :vote_chn, [:@addr, :voter_id]
        channel :result_chn, [:@addr]
        # Lattice state declarations
        lset  :votes
        lmax  :vote_cnt
        lbool :got_quorum
      end

      # Program logic
      bloom do
        # Accumulate votes into set: merge new votes together with stored votes (set LUB)
        votes      <= vote_chn {|v| v.voter_id}
        # Monotone function: set → max; merge using lmax LUB
        vote_cnt   <= votes.size
        # Monotone function: max → bool; threshold test on bool (monotone)
        got_quorum <= vote_cnt.gt_eq(QUORUM_SIZE)
        # Merge at non-deterministic future time
        result_chn <~ got_quorum.when_true { [RESULT_ADDR] }
      end
    end
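To see why the quorum program gives the same answer for any delivery order, it may help to imitate its three lattices in plain Ruby. This sketch does not use Bud and does no networking: the class `QuorumTally` and the helper `tally` are hypothetical stand-ins, and only `QUORUM_SIZE` is taken from the slide.

```ruby
require 'set'

QUORUM_SIZE = 5  # same threshold as the slide

# Plain-Ruby stand-in for the three lattices in the Bloom program (not Bud).
class QuorumTally
  attr_reader :votes, :vote_cnt, :got_quorum

  def initialize
    @votes = Set.new      # like lset: merge = union
    @vote_cnt = 0         # like lmax: merge = max
    @got_quorum = false   # like lbool: merge = or
  end

  # Handle one inbound vote. Every step merges into state; nothing is overwritten.
  def receive(voter_id)
    @votes |= Set[voter_id]
    @vote_cnt = [@vote_cnt, @votes.size].max
    @got_quorum ||= (@vote_cnt >= QUORUM_SIZE)
    self
  end
end

# Feed a whole sequence of votes, in the given order, to a fresh tally.
def tally(voter_ids)
  voter_ids.reduce(QuorumTally.new) { |t, id| t.receive(id) }
end

arrivals = [3, 1, 2, 5, 4, 3]  # voter 3 is delivered twice
puts tally(arrivals).got_quorum          # true
puts tally(arrivals.reverse).got_quorum  # true
puts tally([1, 1, 1, 1, 1]).got_quorum   # false
```

Duplicate votes do not inflate the count because the set merge is idempotent, and reordering cannot change the final state because each merge is commutative.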

Some Features of Bloom
•  Library of built-in lattice types
   –  Booleans, increasing/decreasing integers, sets, multisets, maps, sequences, …
   –  API for defining custom lattice types
•  Support both relational-style rules and functions over lattices
•  Model-theoretic semantics (“Dedalus”)
   –  Logic + state update + async messaging

CRDTs vs. Bloom

Similarities
–  Focus on commutative operations
–  Formalized via join semilattices
   •  Monotone functions → composition of CRDTs
–  Similar design patterns (e.g., need for GC)

Differences
–  Approach: language design vs. ADTs
–  Correctness: confluence vs. convergence
   •  Confluence is strictly stronger
   •  CRDT “query” is not necessarily monotone
   •  CRDTs more expressive?

Ongoing Work

Runtime
–  Current implementation: Ruby DSL
–  Next generation: JavaScript, code generation
   •  Also target JVM, CLR, MapReduce

Tools
–  BloomUnit: Distributed testing / debugging
–  Verification of lattice type implementations

Software stack
–  Built: Paxos, HDFS, 2PC, lock manager, causal delivery, distributed ML, shopping carts, routing, task scheduling, etc.
–  Working on:
   •  Concurrent editing, version control
   •  Geo-replicated consistency control

Blazes: Intelligent Coordination Synthesis

Mixing Order and Disorder
•  Can these ideas scale to large systems?
   –  Ordering can rarely be avoided entirely
•  Make order part of the design process
   –  Annotate modules with ordering semantics
   –  If needed, coordinate at module boundaries
•  Philosophy
   –  Start with what we’re given (disorder)
   –  Create only what we need (order)

Tool Support
1.  Path analysis
    –  How does disorder flow through a program?
    –  Persistent vs. transient divergence
2.  Coordination synthesis
    –  Add “missing” coordination logic automatically

Coordination Synthesis
•  Coordination is costly
   –  Help programmers use it wisely!
•  Automatic synthesis of coordination logic
•  Customize coordination code to match:
   1.  Application semantics (logical)
   2.  Network topology (physical)

Application Semantics
•  Common pattern: “sessions”
   –  (Mostly) independent, finite duration
•  During a session:
   –  Only coordinate among participants
•  After a session:
   –  Session contents are sealed (immutable)
   –  Coordination is unnecessary!

Sealing
•  Non-monotonicity → arbitrary change
   –  Very conservative!
•  Common pattern in practice:
   1.  Mutable for a short period
   2.  Immutable forever after
•  Example: bank accounts at end-of-day
•  Example: distributed GC
   –  Once (global) refcount = 0, remains 0
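The sealing pattern can be illustrated with a toy end-of-day ledger. This is a single-process sketch under assumptions, not how Blazes implements sealing: the class `SealableLedger` and its methods are hypothetical. The point is that a sum is non-monotone over a growing input, so it is only exposed once the input can no longer change.

```ruby
# Hypothetical ledger: mutable for a short period, immutable forever after.
class SealableLedger
  def initialize
    @entries = []
    @sealed = false
  end

  # Accept a new entry while the session is still open.
  def add(amount)
    raise "ledger is sealed" if @sealed
    @entries << amount
    self
  end

  # Close the session. From here on the contents never change.
  def seal!
    @sealed = true
    @entries.freeze
    self
  end

  def sealed?
    @sealed
  end

  # The sum could be invalidated by a later entry, so it is final only after sealing.
  def balance
    raise "balance is not final until the ledger is sealed" unless @sealed
    @entries.sum
  end
end

ledger = SealableLedger.new.add(10).add(-3).seal!
puts ledger.balance  # 7
```

Once sealed, any replica holding the same entries computes the same balance without further coordination.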

Affinity
•  Network structure affects coordination cost
•  Example:
   –  m clients, n storage servers
   –  1 client request → many storage messages
   –  Possible strategies:
      •  Coordinate among (slow?) clients
      •  Coordinate among (fast?) servers
•  Related: geo-replication, intra- vs. inter-DC coordination, “sticky” sessions

Algebraic Programming

Adversarial Programming
•  A confluent program must behave correctly for any network schedule
   –  Network as “adversary”
•  What if we could control the network?
   –  Schedule only influences performance, not correctness
   –  Sounds like an optimizer!

Algebra vs. Ordering

The developer writes two programs:
1.  Algebra defines program behavior
    –  Guaranteed to be order independent
    –  Language: high-level, declarative
2.  Ordering Spec controls input order
    –  Ordering, batching, timing
    –  Language: arbitrary (e.g., imperative)
       •  Need not be deterministic

Benefits
•  Separate correctness and performance
   –  Might be developed independently!
•  Wide range of freedom for optimization
   –  No risk of harming correctness
      •  Randomness, batching, parallelism, CPU affinity, data locality, …
   –  Auto-tuning/synthesis of ordering spec?

Examples

Quicksort
Algebra:
–  Input: values to sort, pivot elements
–  Output: sorted list
Ordering Spec:
–  Ordering of pivots

Matrix Multiplication
Algebra:
–  Input: sub-matrices
–  Output: result of matrix multiply
Ordering Spec:
–  Tiling
   •  i.e., division of input matrices into pieces
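The quicksort example can be sketched with the pivot choice passed in as a separate piece of code. This is one possible reading of the split, not the authors' implementation: `quicksort`, `FIRST_PIVOT`, and `RANDOM_PIVOT` are names made up for the illustration. The sorting logic plays the role of the algebra, and the pivot chooser plays the role of the ordering spec.

```ruby
# Algebra: defines the result. The pivot chooser is a parameter and cannot change the output.
def quicksort(values, &choose_pivot)
  return values if values.length <= 1

  pivot = choose_pivot.call(values)
  less, rest = values.partition { |v| v < pivot }
  equal, greater = rest.partition { |v| v == pivot }
  quicksort(less, &choose_pivot) + equal + quicksort(greater, &choose_pivot)
end

# Ordering specs: decide how the work is scheduled. They need not be deterministic.
FIRST_PIVOT  = ->(vals) { vals.first }
RANDOM_PIVOT = ->(vals) { vals.sample }

input = [3, 1, 2, 3, 0]
puts quicksort(input, &FIRST_PIVOT).inspect   # [0, 1, 2, 3, 3]
puts quicksort(input, &RANDOM_PIVOT).inspect  # [0, 1, 2, 3, 3]
```

Swapping the pivot chooser changes how much work each recursion does but never the sorted result, which is the freedom an optimizer could exploit.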

In a distributed system, Order is precious! Let’s stop taking it for granted.

Recap
1.  The network is disorderly – embrace it!
2.  How can we write disorderly programs?
    –  State: join semilattices
    –  Computation: monotone functions
3.  When order is necessary, use it wisely
    –  A program’s ordering requirements should be a first-class concern!

Thank You! Questions Welcome
