Sharing Equality is Linear


Beniamino Accattoli
LIX, Inria & École Polytechnique, France

Andrea Condoluci
Department of Computer Science and Engineering, University of Bologna, Italy

Claudio Sacerdoti Coen
Department of Computer Science and Engineering, University of Bologna, Italy


In many cases sharing is the cure, because size explosion is based on unnecessary duplications of subterms, which can be avoided if such subterms are instead shared and evaluation is modified accordingly. The idea is to introduce an intermediate setting λ_shX where λ_X is refined with sharing (we are vague about sharing on purpose) and evaluation in λ_X is simulated by some refinement →_shX of →_X. A term with sharing t represents the ordinary term t↓ obtained by unfolding the sharing in t; the key point is that t can be exponentially smaller than t↓. Evaluation in λ_shX produces a shared normal form nf_shX(t) that is a compact representation of the ordinary result, that is, such that nf_shX(t)↓ = nf_X(t). The situation can then be refined as in the following diagram:

[Diagram not reproduced: nodes λ_X, λ_shX, and RAM, connected by arrows labelled "polynomial".]

The λ-calculus is a handy formalism to specify the evaluation of higher-order programs. It is not very handy, however, when one interprets the specification as an execution mechanism, because terms can grow exponentially with the number of β-steps. This is why implementations of functional languages and proof assistants always rely on some form of sharing of subterms. These frameworks, however, do not only evaluate λ-terms; they also have to compare them for equality. In the presence of sharing, one is actually interested in the equality (or, more precisely, α-conversion) of the underlying unshared λ-terms. The literature contains algorithms for such a sharing equality that are polynomial in the sizes of the shared terms. This paper improves the bounds in the literature by presenting the first linear-time algorithm. As others before us, we are inspired by Paterson and Wegman’s algorithm for first-order unification, itself based on representing terms with sharing as DAGs, and sharing equality as bisimulation of DAGs. Beyond the improved complexity, a distinguishing point of our work is a dissection of the involved concepts. In particular, we show that the algorithm computes the smallest bisimulation between the given DAGs, if any.


Origin and Downfall of the Problem


Strange as it may sound, the λ-calculus is not a good setting for evaluating and representing higher-order programs. It is an excellent specification framework, but as a matter of fact no tool based on the λ-calculus implements it as it is.


1 Introduction


Reasonable evaluation and sharing. Fix a dialect λ_X of the λ-calculus with a deterministic evaluation strategy →_X, and write nf_X(t) for the normal form of t with respect to →_X. If the λ-calculus were a reasonable execution model, then one would at least expect that mechanizing an evaluation sequence t →_X^n nf_X(t) on random access machines (RAM) would have a cost polynomial in the size of t and in the number n of β-steps. In this way a program of λ_X evaluating in a polynomial number of steps could indeed be considered as having polynomial cost. Unfortunately, this is not the case, at least not literally. The problem is called size explosion: there are families of terms whose size grows exponentially with the number of evaluation steps, obtained by nesting duplications one inside the other. Simply writing down the result nf_X(t) may then require cost exponential in n.

Reasonable conversion and sharing. Some higher-order settings need more than the evaluation of a single term. They often also have to check whether two terms t and s are →_X-convertible, for instance to implement the equality predicate, as in OCaml, or for type checking in settings using dependent types, typically in Coq. These settings usually rely on a set of folklore and ad-hoc heuristics for conversion that quickly solve many frequent special cases. In the general case, however, the only known algorithm is to first evaluate t and s to their normal forms nf_X(t) and nf_X(s) and then check nf_X(t) and nf_X(s) for equality (actually, for α-equivalence, because terms in the λ-calculus are identified up to α). One can then say that conversion in λ_X is reasonable if checking nf_X(t) =α nf_X(s) can be done in time polynomial in the sizes of t and s and in the number of β-steps needed to evaluate them. Sharing is the cure for size explosion during evaluation... but what about conversion? Size explosion forces reasonable evaluations to produce shared results. Equality in λ_X unfortunately does


Let us explain it. One says that λ_X is reasonably implementable if both the simulation of λ_X in λ_shX up to sharing and the mechanization of λ_shX can be done in time polynomial in the size of the initial term t and in the number n of β-steps. If λ_X is reasonably implementable, then it is possible to reason about it as if it did not suffer from size explosion. The main consequence of such a schema is that the number of β-steps in λ_X then becomes a reasonable complexity measure: essentially, the complexity class P defined in λ_X coincides with the one defined by RAM or Turing machines. The first result in this area appeared only in the nineties and for a special case: Blelloch and Greiner showed that weak (that is, not under abstraction) call-by-value evaluation is reasonably implementable [5]. The strong case, where reduction is allowed everywhere, received a positive answer only in 2014, when Accattoli and Dal Lago showed that leftmost-outermost evaluation is reasonably implementable [4].


Keywords lambda-calculus, sharing, alpha-equivalence, bisimulation


Abstract


PL’17, January 01–03, 2017, New York, NY, USA 2017. ACM ISBN . . . $15.00 https://doi.org/


not trivially reduce to equality in λ_shX, because a single term admits many different shared representations in general. Therefore, one needs to be able to test sharing equality, that is, to decide whether t↓ =α s↓ given two shared terms t and s. For conversion to be reasonable, sharing equality has to be testable in time polynomial in the sizes of t and s. The obvious algorithm that extracts the unfoldings t↓ and s↓ and then checks α-equivalence is of course too naïve, because computing the unfolding is exponential. The tricky point therefore is that sharing equality has to be checked without unfolding the sharing. In these terms, the question was first addressed by Accattoli and Dal Lago in [2], where they provide a quadratic algorithm for sharing equality. Consequently, conversion is reasonable.

Essentially, two DAGs represent the same unfolded λ-term if they have the same structural paths, just arranged differently. To be precise, sharing equality is based on what we call sharing equivalences, which are bisimulations plus some additional requirements about names (for α-equivalence) and the requirement that they are equivalence relations.

Binders, cycles, and domination. A key point of our problem is the presence of binders, i.e. abstractions, and the fact that equality on λ-terms is α-equivalence. Graphically, it is standard to see abstractions as getting a backward edge from the variable they bind; this approach is also supported by the strong relationship between λ-calculus and linear logic proof nets. Therefore, binders introduce a form of cycle in DAGs. Technically speaking these are only half-cycles: the cycle can be easily avoided by reversing the backward edge (and we shall do so), but its essence does not disappear. While two free variables are bisimilar only if they coincide, two bound variables are bisimilar only when their binders are also bisimilar, suggesting that λ-terms with sharing are, as directed graphs, structurally closer to deterministic finite automata (DFA), which may have cycles, than to DAGs. The problem with cycles is that in general bisimilarity is not linear: Hopcroft and Karp’s algorithm [11], the best one, is only pseudo-linear, that is, with an inverse Ackermann factor. At the same time, these half-cycles induced by binders are of a very special form, being a graphical representation of scopes. They are indeed characterized by a structural property called domination: exploring the DAG from the root, one necessarily visits the binder before the bound variable. Domination turns out to be the key ingredient for a linear algorithm in the presence of binders.


A closer look at the costs. Once it is established that strong evaluation and conversion are both reasonable, it is natural to wonder how efficiently they can be implemented. Accattoli and Sacerdoti Coen in [1] essentially show that strong evaluation can be implemented within a bilinear overhead, i.e. with overhead linear in the size of the initial term and in the number of β-steps. Their technique has then been simplified by Accattoli and Guerrieri in [3]. Both works actually address open evaluation, which is a bit simpler than strong evaluation; the moral, however, is that evaluation is bilinear. Consequently, the size of the computed result is bilinear. The bottleneck for conversion then seemed to be Accattoli and Dal Lago’s quadratic algorithm for sharing equality. The literature actually contains also other algorithms, studied with different motivations or for slightly different problems (discussed below). None of these algorithms, however, matches the complexity of evaluation. In this paper we provide the first algorithm for sharing equality that is linear in the size of the shared terms, improving over the literature. Therefore, the complexity of sharing equality matches that of evaluation, providing a combined bilinear algorithm for conversion, which is the real motivation behind this work.

Related problems. There are various problems that are closely related to sharing equality, and that are also treated with bisimilarity-based algorithms. Let us list similarities and differences:
• First-order unification. On the one hand the problem is more general, because unification roughly allows one to substitute variables with terms not present in the original DAGs, while in sharing equality this is not possible. On the other hand, the problem is less general, because it does not allow binders and does not test α-equivalence. There are basically two linear algorithms for first-order unification, Paterson and Wegman’s (shortened PW) [15] and Martelli and Montanari’s (MM) [14]. Both rely on sharing to be linear. PW even takes terms with sharing as inputs, while MM deals with sharing in a less direct way, except in its lesser-known variant [13] that takes as input terms shared using the Boyer-Moore technique [6].
• Nominal unification. This is unification up to α-equivalence (but not up to β or η equivalence) of λ-calculi extended with name swapping, in the nominal tradition. It has been studied by two groups, Calvès & Fernández and Levy & Villaret, adapting PW and MM from first-order unification. It is very close to sharing equality, but the best known algorithms [8, 12] are only quadratic. (See [7] for a unifying presentation.)
• Pattern unification. Miller’s pattern unification can also be stripped down to test sharing equality. Qian presents a PW-inspired algorithm, claiming linear complexity [16], that seems to work only on unshared terms. We say claiming because the algorithm is very involved and the proofs are far from being clear. Moreover, according to Levy and Villaret in [12]: it is really difficult to obtain a practical algorithm from


Computing Sharing Equality


Sharing as DAGs. Sharing can be added to λ-terms in different forms. In this paper we adopt a graphical approach. Roughly, a λ-term can be seen as a (sort of) directed tree whose root is the topmost constructor and whose leaves are the (free) variables. A λ-term with sharing is more generally a DAG. Sharing of a subterm t is then the fact that the root node r of t is the child of more than one node. This is essentially the same sharing as in calculi with explicit substitutions, environment-based abstract machines, or linear logic: the details are different, but all these approaches provide different incarnations of the same notion of sharing. It is instead different from so-called sharing graphs, which are graphs implementing Lévy’s optimal evaluation and providing a deeper form of sharing than our DAGs. To our knowledge, sharing equality for sharing graphs has never been studied; it is not even known whether it is reasonable.


Sharing equality as bisimilarity. When λ-terms with sharing are represented as DAGs, a natural way of checking sharing equality is to test DAGs for bisimilarity. Careful here: the transition system under study is the one given by the directed edges of the DAG, and not the one given by β-reduction steps, as in applicative bisimilarity—our DAGs may have β-redexes but we do not reduce them in this paper, that is an orthogonal issue (namely, evaluation).


the proof described in [16]. We believe that it is fair to say that Qian’s work is hermetic (please try to read it!).
• Nominal matching. Calvès & Fernández in [9] present an algorithm for nominal matching (a special case of unification) that is linear, but only on unshared input terms.
• Equivalence of DFA. Automata do not have binders, and yet they are structurally more general than λ-terms with sharing, since they allow arbitrary directed cycles, not necessarily dominated. As already pointed out, the best equivalence algorithm is only pseudo-linear [11].

• The role of binders: the fact that binders can be treated straightforwardly is, we believe, an insight and not a weakness of our work. Essentially, domination allows one to reduce sharing equality in the presence of binders to the blind sharing check, under mild but key assumptions on the context in which terms are tested (see well-scoped queries in Sect. 3).
• Minimality. The set of shared representations of an ordinary λ-term t is a lattice: the bottom element is t itself, the top element is the (always existing) maximal sharing of t, and for any two terms with sharing there exist inf and sup. Essentially, Accattoli & Dal Lago and Grabmayer & Rochel address sharing equality by computing the top elements of the lattices of the two λ-terms with sharing, and then comparing them for α-equivalence. We show that our blind sharing check (and morally every PW-based algorithm) computes the sup of t and s, that is, the term having all and only the sharing in t or s, which is the smallest sharing equivalence between the two DAGs. This insight, first pointed out in PW’s original paper to characterize most general unifiers, is a prominent concept in our theory of sharing equality as well.
• Proofs, invariants, and detailed development. We provide detailed correctness, completeness, and linearity proofs, based on finely tuned invariants of the algorithm, to a level of precision that is unmatched in the literature. We also provide a detailed treatment of the relationship between α-equivalence on terms and sharing equivalences on DAGs. Our work is therefore self-contained, but for the fact that most details are in the Appendix.
• Concrete implementation. We implemented our algorithm and verified its linear complexity. The code is available on the third author’s webpage.


Previous work. As far as sharing equality itself is concerned, the literature contains only two algorithms explicitly addressing it. First, the already cited quadratic one by Accattoli and Dal Lago. Second, an O(n log n) algorithm by Grabmayer and Rochel [10] (where n is the sum of the sizes of the shared terms to compare, and the input of the algorithm is a graph), obtained by a reduction to equivalence of DFAs and treating the more general case of λ-terms with letrec.

Contributions: two parts, and a two-level linear algorithm. This paper is divided into two parts. The first part develops a re-usable, self-contained, and clean theory of sharing equality, independent of the algorithm that computes it. Some of its concepts are implicitly used by other authors, but never emerged from the collective unconscious before (propagated queries in particular); others instead are new. The theory culminates with the sharing equality theorem that connects α-equivalence on terms with sharing equivalences for DAG-based sharing of λ-terms, under suitable conditions. The second part studies a linear algorithm for sharing equality by adapting PW’s linear algorithm for first-order unification to λ-terms with sharing. Our algorithm actually follows a two-level, modular approach (pushing further the modularity suggested, but not implemented, by Calvès & Fernández in [8]):


• Blind sharing check: a reformulation of PW from which we removed the management of meta-variables for unification. It is used as a first-order test on λ-terms with sharing, to check that the unfolded terms have the same skeleton, ignoring variable names.
• Name check: a straightforward algorithm executed after the previous one, testing α-equivalence by checking that bisimilar bound variables have bisimilar binders and that two different free variables are never shared.


2 Preliminaries


λ-terms and α-equivalence. Ordinary λ-terms are defined by the following syntax:

Terms    t, s, u, r ::= x | λx.t | t s

As is standard, the notion of equality on λ-terms is α-conversion, which is defined as follows (basic definitions about free and bound variables and meta-level substitutions are in Appendix A):

The decomposition plus the correctness and the completeness of the checks crucially rely on the theory developed in the first part.


(No) Proofs. For lack of space, all proofs have been moved to the Appendix. If accepted, this long version will be uploaded to arXiv (and more examples will be added).


The value of the paper. It is delicate to explain the value of our work. Three features are obvious: 1) the improved complexity of the problem, 2) the consequent impact on the complexity of β-conversion, and 3) the isolation of a theory of sharing equality. At the same time, however, our algorithm looks like an easy adaptation of PW, and binders do not seem to play much of a role. Let us then draw attention to the following points:

Definition 2.1 (α-conversion). α-conversion, also called α-equivalence, is the relation =α defined by:
1. Same variables: x =α x;
2. Application: t s =α u r if t =α u and s =α r;
3. Same abstracted variable: λx.t =α λx.s if t =α s;
4. Different abstracted variables: λx.t =α λy.s{x←y} if t =α s and y ∉ fv(s).
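As an illustration of Definition 2.1 on ordinary (unshared) terms, here is a minimal Python sketch. It is not the definition taken literally: instead of performing the renaming s{x←y}, it compares bound variables by the depth at which their binder was crossed, a standard equivalent formulation. The names `Var`, `Lam`, `App`, and `alpha_eq` are chosen for this example only.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:
    var: str
    body: object

@dataclass(frozen=True)
class App:
    fun: object
    arg: object

def alpha_eq(t, s, env_t=None, env_s=None, depth=0):
    """Decide t =α s; env_* map bound names to the depth of their binder."""
    env_t = env_t or {}
    env_s = env_s or {}
    if isinstance(t, Var) and isinstance(s, Var):
        bt, bs = env_t.get(t.name), env_s.get(s.name)
        if bt is None and bs is None:
            return t.name == s.name      # two free variables must coincide
        return bt == bs                  # bound: binders at the same depth
    if isinstance(t, App) and isinstance(s, App):
        return (alpha_eq(t.fun, s.fun, env_t, env_s, depth)
                and alpha_eq(t.arg, s.arg, env_t, env_s, depth))
    if isinstance(t, Lam) and isinstance(s, Lam):
        # Crossing a binder: record its depth on both sides (handles shadowing).
        return alpha_eq(t.body, s.body,
                        {**env_t, t.var: depth},
                        {**env_s, s.var: depth},
                        depth + 1)
    return False                         # different constructors

# λx.x =α λy.y, while λx.y and λy.y differ (y is free on the left only).
identity_x = Lam("x", Var("x"))
identity_y = Lam("y", Var("y"))
const_y = Lam("x", Var("y"))
```

This check runs on the unfolded terms, so it is exactly what sharing equality must avoid doing on large inputs; it is only meant to make the four clauses concrete.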


• Identification of the problem: the literature presents similar studies and techniques, and yet we are the first to formulate and study the problem per se (unification is different, and it is usually not formulated on terms with sharing), directly (i.e. without reducing it to DFAs, as in Grabmayer and Rochel), and with a fine-grained look at the complexity (Accattoli and Dal Lago only tried not to be exponential).

Terms as graphs, informally. Graphically, λ-terms can be seen as syntax trees, with two tweaks relative to variables (please have a look at the example in Fig. 1.a):


• Variable merging: all the nodes corresponding to the occurrences of a same variable are merged together, like the three occurrences of w in the example;


Figure 1. a) λ-term as a DAG, without sharing; b) DAG with sharing (same term as in a); c) DAG breaking domination. [Figure not reproduced: panels a) and b) depict the term (λx.x (λy.w)) ((λy.w) w); panel c) is labelled (λx.x x) (x x).]


• Binding edges: abstraction nodes have a special binding edge towards the variable node corresponding to the variable that they abstract, which is always depicted as the left child.

Sharing is realized by allowing abstraction and application nodes to have more than one parent, as for instance the abstraction on y in Fig. 1.b. Note that sharing can happen inside abstractions, e.g. λy.w is shared under the abstraction on x.

Domination. Not every DAG built this way represents a term. For instance, the DAG in Fig. 1.c does not, because the bound variable x is visible outside the scope of its abstraction, since there is a path to x from the application above the abstraction that does not pass through the abstraction itself. One would say that such a DAG represents (λx.x x)(x x), but since terms are identified up to α, the variable x in x x and the one in λx.x x =α λy.y y are not the same. It is well-known that scopes corresponding to terms are characterized by domination.

Definition 2.2 (Domination). Let G be a DAG and n and m two nodes of G. Then n dominates m when every path from a root of G to m passes through n.

Term forests. Since we are interested in comparing two (or more) terms, we actually rather consider a forest. The following is our precise definition of λ-terms with sharing.

Definition 2.3 (Term forest). A term forest is a directed graph such that:
1. Labels: there are three kinds of nodes, application, abstraction, and variable nodes, distinguished by a label that is respectively App, Lam, or Var.
2. Children and binders:
  • Applications: an App node has exactly two children, called left and right. We write App(l, r) for a node labelled by App whose left child is l and whose right child is r;
  • Abstractions: a Lam node has exactly two children, called left and right, or variable and body. We write Lam(l, r) for a node labelled by Lam whose left child is l and whose right child is r. The left child must be labelled by Var and must have Lam(l, r) as its binder (see below).
  • Variables: a node n labelled by Var has no children. Every Var node has a binder attribute that is either undefined or a Lam node of which it is the left child. If n has a binder then it is bound, otherwise it is free. Every Var node also has an identifier, i.e. an additional label i ∈ ℕ that is distinct from that of every other Var node (we sometimes write Var(i)), which is used to ease the read back of a term graph as a λ-term (to ease the reading, more often than not we rather use x, y, z, ...). At various points we shall ask two nodes to have the same label, and in that case the identifier does not count as part of the label: the requirement simply asks the two nodes to both be Var nodes.
3. Structural properties:
  • Acyclicity: the graph is a Directed Acyclic Graph (DAG).
  • Domination: every Lam node dominates its left child.

Alternative representation and garbage collection. Sharing is sometimes represented using variables as the only nodes on which sharing happens and allowing DAGs to grow below them; a variable may then have at most one child. Our representation is obtained by collapsing these variables-as-sharing on their child if they are the child of some other node. Our results can be adapted to the other approach, at the price of more technical definitions. Our formalism is also garbage-free: this is not a way to cheat, complexity-wise, because garbage collection requires linear time; it simply removes irrelevant noise.
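The domination requirement of Definition 2.2 can be tested naively as follows. This is only a sketch for illustration: the graph encoding (a dictionary from node names to lists of children), the function names, and the two example graphs are assumptions of this example, and the check is quadratic at best, not a linear-time dominator computation. The graph `bad` is written in the spirit of Fig. 1.c, where the bound variable is also reachable from outside its abstraction.

```python
def roots(graph):
    """Nodes that are nobody's child."""
    children = {c for cs in graph.values() for c in cs}
    return [n for n in graph if n not in children]

def reachable_avoiding(graph, starts, banned):
    """All nodes reachable from `starts` along paths that never visit `banned`."""
    seen = set()
    stack = [n for n in starts if n != banned]
    while stack:
        n = stack.pop()
        if n in seen:
            continue
        seen.add(n)
        stack.extend(c for c in graph[n] if c != banned)
    return seen

def dominates(graph, n, m):
    """n dominates m iff every path from a root to m passes through n."""
    if n == m:
        return True
    return m not in reachable_avoiding(graph, roots(graph), n)

# (λx.x x)(x x) with a single shared node for x: x escapes its binder "lam".
bad = {"root": ["lam", "arg"], "lam": ["x", "body"],
       "body": ["x", "x"], "arg": ["x", "x"], "x": []}
# Same shape, but the argument uses a distinct free variable y.
good = {"root": ["lam", "arg"], "lam": ["x", "body"],
        "body": ["x", "x"], "arg": ["y", "y"], "x": [], "y": []}
```

In `bad`, the path root → arg → x bypasses the abstraction, so the Lam node does not dominate its left child and the graph is not a term forest; in `good` it does.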

Notations for paths. We write n →1 l and n →2 r if n = App(l, r) or n = Lam(l, r), and n → m if n →i m for some i ∈ {1, 2}. Then we extend → to paths between nodes as follows:
• n →ε n for every node n;
• n →π·i m_i if n →π m and m →i m_i in G.

Read back. The sharing in a term forest can be unfolded by duplicating shared sub-graphs. We prefer however to adopt another approach. We define a read-back procedure associating an ordinary λ-term ⟦n⟧ (without sharing) to each node n of the forest, in such a way that shared sub-graphs simply appear multiple times.


Root nodes. Term forests, as expected, may have various root nodes. What is maybe less expected is that these roots may share some parts of the forest. Consider Fig. 1.b, and imagine removing the root and its edges: the outcome is still a perfectly legal term forest. We admit these configurations because they actually arise naturally in implementations, especially of proof assistants.


Definition 2.4 (Read back). The read back ⟦·⟧ from nodes of a term forest to λ-terms is defined by:
• Variable: ⟦Var(i)⟧ := x_i;
• Application: ⟦App(l, r)⟧ := ⟦l⟧ ⟦r⟧;
• Abstraction: ⟦Lam(l, r)⟧ := λ⟦l⟧.⟦r⟧.

Figure 2. Examples of sharing equivalences and queries. [Figure not reproduced: panels a) to e).]
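The read back of Definition 2.4 is easy to sketch in Python, and doing so makes the cost of unfolding visible. The `Node` class, the string output format, and the `tower` family below are assumptions of this example (the binder attribute of Var nodes is omitted because the read back does not use it).

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(eq=False)          # eq=False: nodes are compared by identity
class Node:
    label: str                # "App", "Lam" or "Var"
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    ident: Optional[int] = None   # identifier i of a Var(i) node

def read_back(n):
    """Unfold the sharing below n into an ordinary term, printed as a string."""
    if n.label == "Var":
        return f"x{n.ident}"
    if n.label == "App":
        return f"({read_back(n.left)} {read_back(n.right)})"
    return f"(λ{read_back(n.left)}.{read_back(n.right)})"

def count_nodes(n, seen=None):
    """Number of distinct nodes reachable from n, i.e. the size with sharing."""
    seen = set() if seen is None else seen
    if id(n) in seen:
        return 0
    seen.add(id(n))
    if n.label == "Var":
        return 1
    return 1 + count_nodes(n.left, seen) + count_nodes(n.right, seen)

def tower(k):
    """u_0 = x0 and u_{i+1} = App(u_i, u_i), with both children shared."""
    n = Node("Var", ident=0)
    for _ in range(k):
        n = Node("App", n, n)
    return n
```

For instance, `tower(3)` has 4 nodes, while its read back contains 8 occurrences of `x0`: the shared graph grows linearly in k and the unfolded term exponentially. This is why sharing equality has to be decided on the graphs themselves rather than on their read backs.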


3 The Theory of Sharing Equality

Sharing equivalence. To formalize the idea that two different DAGs unfold to the same term, we introduce a general notion of equivalence between nodes whose intended meaning is that two related nodes have α-equivalent read backs.

Definition 3.1 ((Blind) sharing equivalences). Let ≡ be a relation over the nodes of a term forest G. Then ≡ is a blind sharing equivalence if:
• Equivalence: ≡ is an equivalence relation;
• Bisimulation: if n ≡ m and n →i n_i, then m →i m_i and n_i ≡ m_i;
• Labels: if n ≡ m then n and m have the same label.
Additionally, ≡ is a sharing equivalence if it is a blind sharing equivalence and it also satisfies the following name conditions on Var-nodes: for all v, w, if v ≡ w then
• Free: if v has no binder then v = w;
• Bound: if v has binder b_v then w has binder b_w and b_v ≡ b_w.

Example. Consider Fig. 2.a. The green waves are an economical representation of a sharing equivalence: nodes in the same class are connected by a green path, and reflexive waves are omitted.

Remark 3.2. (Blind) sharing equivalences are closed under intersection, so that if there exists a (blind) sharing equivalence on a term forest then there is a smallest one.

The requirements for a sharing equivalence ≡ on a term forest G essentially ensure that G quotiented by ≡ has itself the structure of a term forest (details about G/≡ are in the Appendix). Note that blind sharing equivalences are not enough, because without the bound names condition binders are not unique up to ≡. It is nonetheless possible to prove that paths up to ≡ are acyclic, which is going to be one of the key properties to prove the completeness of the blind sharing check.

Theorem 3.3. Let ≡ be a blind sharing equivalence on a term forest G. Then:
1. Acyclicity up to ≡: the relation ≡→≡ is acyclic.
2. Sharing equivalences as term forests: if ≡ also satisfies the name conditions then G/≡ is a term forest.

For instance, Fig. 2.b shows the term forest corresponding to the quotient of the one of Fig. 2.a by the sharing equivalence induced by the green waves. Sharing equivalences do capture α-equivalence on read backs, as we shall show, in the following sense (this is a sketch given to guide the reader towards the proper relationship, formalized by Theorem 3.9 at the end of this section):
• sharing to α: if n ≡ m then ⟦n⟧ =α ⟦m⟧;
• α to sharing: if ⟦n⟧ =α ⟦m⟧ then there exists a sharing equivalence ≡ such that n ≡ m.

(Propagated) queries. According to the sketch we just provided, to check the sharing equality of two terms with sharing, i.e. a term forest composed of two DAGs of roots n and m, it is enough to compute the smallest sharing equivalence ≡ such that n ≡ m, if it exists, and to fail otherwise. This is what our algorithm does. At the same time, however, it is slightly more general: it may test more than two nodes, and the nodes to test are not required to be roots of the term forest (which is also not required to have only two roots). More generally, it tests all the pairs of nodes contained in a query.

Definition 3.4 (Query ∼). A query ∼ for a term forest G is a symmetric relation on the nodes of G.

The simplest case is when there are only two roots n and m and the query contains only n ∼ m (depicted as a blue wave in Fig. 2.a). From now on, however, we work with a generic query ∼, and our focus is on the smallest sharing equivalence containing ∼. Let us be more precise. Every query ∼ induces a number of other equality requests obtained by closing ∼ with respect to the equivalence and bisimulation clauses that every sharing equivalence has to satisfy. In other words, every query induces a propagated query.

Definition 3.5 (Propagated query ≈). Let ∼ be a query on a term forest G. The propagated query ≈ induced by ∼ is the relation on the nodes of G inductively defined by the following inference rules:
• (≈ax): from n ∼ m infer n ≈ m;
• (≈ref): n ≈ n;
• (≈sim): from n ≈ m, n →i n_i, and m →i m_i infer n_i ≈ m_i;
• (≈tr): from n ≈ m and m ≈ p infer n ≈ p.

Note that the propagated query ≈ is defined without knowing whether there exists a (blind) sharing equivalence containing the query ∼; there might very well be none (if the nodes are not sharing equivalent). This is why rule ≈sim assumes both n →i n_i and m →i m_i: at this point it may be that, when n ≈ m, the two nodes n and m are of different kinds, say, a Var-node and a Lam-node, so that ≈ cannot be propagated.

Example. The propagation ≈ of the (blue) query ∼ in Fig. 2.a is the reflexive and transitive closure of the green waves.
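The propagated query and the conditions of Definition 3.1 can be prototyped with a union-find structure. The sketch below is deliberately naive and is not the linear-time algorithm of this paper: union-find carries an inverse Ackermann factor, the propagation stops at the first label clash instead of building all of ≈, and the name conditions are only meaningful under the assumption that the query is well-scoped (for instance, a single pair of distinct roots). All names (`Node`, `var`, `lam`, `app`, `propagate`, `names_ok`, `sharing_equal`) are hypothetical.

```python
class Node:
    """A term-forest node; `binder` is only meaningful for Var nodes."""
    def __init__(self, label, left=None, right=None):
        self.label, self.left, self.right = label, left, right
        self.binder = None

def var():
    return Node("Var")

def lam(v, body):
    n = Node("Lam", v, body)
    v.binder = n                      # the left child is bound by this Lam
    return n

def app(l, r):
    return Node("App", l, r)

def propagate(query):
    """Close the query under the rules for ≈; None signals a label clash."""
    parent = {}

    def find(n):
        parent.setdefault(n, n)
        while parent[n] is not n:
            parent[n] = parent[parent[n]]   # path halving
            n = parent[n]
        return n

    todo = list(query)
    while todo:
        n, m = todo.pop()
        rn, rm = find(n), find(m)
        if rn is rm:
            continue                  # already related
        if n.label != m.label:
            return None               # no blind sharing equivalence exists
        parent[rn] = rm               # equivalence closure (≈tr, ≈ref)
        if n.label != "Var":          # rule ≈sim on both children
            todo.append((n.left, m.left))
            todo.append((n.right, m.right))
    return find

def names_ok(find, var_nodes):
    """Name conditions: free variables alone in their class, binders related."""
    classes = {}
    for v in var_nodes:
        classes.setdefault(find(v), []).append(v)
    for members in classes.values():
        if any(v.binder is None for v in members):
            if len(members) > 1:
                return False
        elif len({id(find(v.binder)) for v in members}) > 1:
            return False
    return True

def variables_below(*roots):
    seen, out, stack = set(), [], list(roots)
    while stack:
        n = stack.pop()
        if id(n) in seen:
            continue
        seen.add(id(n))
        if n.label == "Var":
            out.append(n)
        else:
            stack += [n.left, n.right]
    return out

def sharing_equal(n, m):
    find = propagate([(n, m)])
    return find is not None and names_ok(find, variables_below(n, m))
```

As a usage example, a shared `app(s, s)` with `s = lam(x, x)` is sharing equal to an unshared `app(lam(y, y), lam(z, z))`, whereas `app(w1, w1)` and `app(w2, w2)` for two distinct free variables are not, since the Free condition fails.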


Blind universality of ≈. It turns out that the propagated query ≈ is itself a blind sharing equivalence, whenever there exists a blind sharing equivalence containing the query ∼. In that case, unsurprisingly, ≈ is also the smallest blind sharing equivalence containing the query ∼.

condition, still not necessary, is the following one, asking that every two queried nodes are under exactly the same abstractions.

Proposition 3.6 (Blind universality of ≈). Let ∼ be a query. If there exists a blind sharing equivalence ≡ containing ∼ then:

Definition 3.7 (Well-scoped query). A query ∼ is well-scoped when for every queried pair of nodes n ∼ m and every Lam-node b (v), if b (v) →+ n →∗ v then b (v) →+ m, and viceversa (remember, ∼ is symmetric).

615 616 617

1. The propagated query ≈ is contained in ≡, i.e. ≈ ⊆ ≡. 2. ≈ is the smallest blind sharing equivalence containing ∼.

618 619 620 621 622 623 624 625 626 627 628 629

Proposition 3.8 (Universality of ≈). Let ∼ be a well-scoped query. If there exists a sharing equivalence ≡ containing ∼ then the propagated query ≈ is the smallest sharing equivalence containing ∼.

Cycles up to ≈. Let us apply Theorem 3.3.1 to ≈, and take the contrapositive statement: if paths up to ≈ are cyclic then ≈ is not a blind sharing equivalence. The blind sharing check in Sect. 5 indeed fails as soon as it finds a cycle up to ≈. Note, now, that ≈ satisfies the equivalence and bisimulation requirements for a blind sharing equivalence by definition. The only way in which it might not be such an equivalence then, is if the labels requirement fails. Said differently, there is in principle no need to check for cycles, it is enough to test for labels. We are going to do it anyway, because cycles provide earlier failures—there are also other practical reasons to do so, to be discussed in Sect. 5.

The proof of this proposition is in Appendix E, page 19, where it is obtained as a corollary of other results connecting α-equivalence and sharing equivalences, that also rely crucially on the notion of well-scoped query. It can also be proved directly, but it requires a very similar reasoning, which is why we rather prove it indirectly.

632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669

675 676 677 678 679 680 681 682 683 684 685 686 687 688

The sharing equality theorem. We have now introduced all the needed concepts to state the precise connection between α-equivalence, queries, and sharing equivalences, which is the main result of our abstract study of sharing equality.

630 631

673 674

613 614

672

Universality of the propagated query ≈. Here it lies the key conceptual point in extending the linearity of Paterson and Wegman’s algorithm to binders and working up to α-equivalence of bound variables. In general the propagated query ≈ is not a sharing equivalence. Consider for instance the query in Fig. 2.d: it coincides with the propagated query ≈ (up to reflexivity), which is not a sharing equivalence because it does not include the Lam-nodes above the original query—note that propagation only happens downwards. To obtain a sharing equivalence one has to also include the Lam-nodes in the propagated query. The example does not show it, but in general then one has to start over propagating the new relation (eventually having to add other Lam-nodes found in the process, and so on). These iterations are obviously problematic in order to be linear—a key point of Paterson and Wegman’s algorithm is that every node is processed only once. What makes possible to extend their algorithm to binders is that if the query is context-free, that is, if it involves only pairs of nodes that are out of all abstractions, as in Fig. 2.c, then—remarkably— there is no need to iterate the propagation of the query. Said differently, if the query ∼ is context-free then ≈ is λ-universal. The structural property of term forests guaranteeing the absence of iterations for context-free queries is domination. Domination asks that to reach a bound variable from outside its scope one necessarily needs to first pass through its binder. The intuition is that if one starts with a context-free query then there is no need to iterate because binders are necessarily visited before the variables while propagating the query downwards. Let us stress, however, that it is not evident—or at least it was not evident to us—that domination is enough. Note that domination is about one bound variable and its only binder. 
For sharing equivalence instead one deals with a class of equivalent variables and a class of binders—said differently, domination is given in a setting without queries, and is not obvious that it gets along well with them. The fact that domination on single binders is enough for propagated well-scoped queries to be λ-universal requires indeed a non-trivial proof and it is a somewhat surprising fact. Now, being context-free is a sufficient condition for being λuniversal, but it is not a necessary condition. A relaxed sufficient

Theorem 3.9 (Sharing equality). Let ∼ be a query on a term forest G. Then JnK =α JmK for every n ∼ m if and only if ∼ is well-scoped and ≈ is a sharing equivalence. Despite the—we hope—quite intuitive nature of the theorem, its proof is delicate and requires a number of further concepts and lemmas, developed in Appendices C–E. The key point is finding an invariant expressing how being well-scoped propagates under abstractions to then become the name conditions for a sharing equivalence, and viceversa. Let us conclude the section by stressing a subtlety of Theorem 3.9. Consider Fig. 2.c—with that query the statement is satisfied. Consider Fig. 2.d—with that query the statement fails because the read back of the two queried nodes are not α-equivalent. Consider Fig. 2.e—now ∼ and ≈ coincide (up to reflexivity) and ≈ is a sharing equivalence, but the theorem (correctly) fails, because not all queried pairs of nodes are α-equivalent, as in Fig. 2.d.

4 Algorithms for Sharing Equality

From now on, we focus on the algorithmic side of sharing equality. By λ-universality of propagated well-scoped queries ≈ (Proposition 3.8), checking the satisfiability of a query ∼ boils down to computing ≈ and checking that it is a sharing equivalence. It turns out that the name conditions are modular with respect to the blind sharing requirement. Indeed it is possible to check sharing equality in two phases:

1. Blind sharing check: building ≈ and at the same time checking that it is a blind sharing equivalence;
2. Name check: verifying that ≈ is a sharing equivalence by checking the free and bound name conditions.

Of course, the difficulty is doing it in linear time, and it essentially lies in the blind sharing check. The rest of this part presents two algorithms, the blind sharing check and the name check, with proofs of correctness and completeness, and complexity analyses. The second one is actually straightforward. Be careful, however: the algorithm for the name check is trivial only because the subtleties of this part have been isolated in the previous section.

5 The Blind Sharing Check

In this section we introduce the basic concepts for the blind sharing check, plus the algorithm itself. Our algorithm is a simple adaptation of Paterson and Wegman's, and it relies on the same key ideas in order to be linear. Our contribution in this part is a formal proof of correctness and completeness, obtained via the isolation of the algorithm invariants.

Intuitions for the blind sharing check. Paterson and Wegman's algorithm is based on a tricky, linear time visit of the term forest. It addresses two main efficiency issues:

1. The propagated query is quadratic: the number of pairs in the propagated query ≈ can be quadratic in the size of the term forest. An equivalence class of cardinality n has indeed Ω(n²) pairs for the relation (this is true for every equivalence relation). This point is addressed by computing instead a linear relation ∼c generating ≈, based on keeping a canonical element for every blind sharing equivalence class.
2. Merging equivalence classes: merging equivalence classes is an operation that, however efficient it may be, is not a constant time operation. The trickiness of the visit of the term forest is indeed meant to guarantee that, if the query is satisfiable, one never needs to merge two equivalence classes, but only to add single elements to classes.

The ideas behind the algorithm (Algorithm 1) are:

• Dead and alive nodes: nodes are either alive, i.e. still to be visited or under analysis, or dead, that is, they have already been processed and they shall not be processed again. The visit is then implemented by a procedure called Kill, that turns alive nodes into dead ones.
• Top-down recursive exploration: whenever the algorithm processes a node n it first recursively calls itself on the alive parents of n. This is done to avoid the risk of reprocessing n because of some new equality requests on n coming from a parent processed after n.
• Query edges: the query is represented through additional undirected query edges between nodes, and it is propagated on alive nodes by adding further query edges. The query is propagated carefully, on-demand. The fully propagated query is never computed, because, as explained, in general its size is quadratic in the number of nodes.
• Canonic edges: after a node has been processed it is assigned a canonic node in its blind sharing equivalence class. This is represented via a directed canonic edge, which is implemented as a pointer.
• Failures and cycles: the algorithm fails in three cases. First, when it finds two nodes with different labels that are supposed to be in the same class (line 10), because then the approximation of ≈ that it is computing cannot be a blind sharing equivalence. In the two other cases (before line 2, and at line 12) the algorithm uses the fact that the canonic edge is already present (on an alive node) to infer that it found a cycle up to ≈, and so, again, ≈ cannot be a blind sharing equivalence (please read again the paragraph after Proposition 3.6).
• Killing a node n: processing a node n boils down to
  1. collecting without duplicates all the nodes in the intended blind sharing equivalence class of n, that is, the nodes related to n by a sequence of query edges. This is done by the while loops at lines 3 and 5, that first collect the nodes queried with n and then iterate on the nodes queried with them. These nodes are inserted in a queue;
  2. removing all the query edges in the class;
  3. setting n as the canonical element of its class, by setting the canonical edge of every node in the class (including n) to n;
  4. propagating the query on the children (in case n is a Lam or an App node), by adding query edges between the left (resp. right) child of the canonic and the left (resp. right) child of every node in the class.
  Pushing a node in the queue, setting its canonic, and propagating the query on the children is done by the procedure PushSetAndPropagate.
• Linearity: let us now come back to the two efficiency issues we mentioned before:
  – Merging classes: the top-down recursive calls are done in order to guarantee that when a node is processed all the query edges for its sharing class are already available, so that the class shall not be extended nor merged with other classes later on during the visit of the term forest.
  – Propagating the query: the query is propagated only after having removed the query edges and having set the canonics of the current blind sharing equivalence class. To explain, consider a class of k nodes, which in general can be defined by Ω(k²) query edges. Note that after canonization, the class is represented using only k − 1 canonic edges, and thus the algorithm propagates only O(k) query edges. This is why the number of query edges is kept linear in the number of nodes (assuming that the original query itself was linear). If instead one propagated query edges before canonizing the class, then the number of query edges might grow quadratically.

Algorithm 1: Blind sharing check
Data: an initial state
Result: either fail or a final state

      Procedure BlindSharingCheck()
 1      while there is any alive node n do Kill(n);

      Procedure Kill(d)
        queue := ∅;
        if canonic(d) is undefined then canonic(d) ← d else fail;
 2      while d has some alive parent n do Kill(n);
 3      while there is an undirected query edge (d, n) do
          PushSetAndPropagate(queue, d, n);
          delete undirected query edge (d, n);
        end
 5      while not queue.empty() do
          h := queue.head();
          while h has some alive parent n do Kill(n);
          while there is an undirected query edge (h, n) do
            PushSetAndPropagate(queue, d, n);
            delete undirected query edge (h, n);
          end
          queue.pop();
          mark h dead;
        end
        mark d dead;

      Procedure PushSetAndPropagate(queue, d, n)
        if canonic(n) is undefined then
          canonic(n) := d;
          queue.push(n);
10        if d, n have different labels then fail;
          if d and n have children resp. d1, d2 and n1, n2 then
            create undirected query edges (d1, n1) and (d2, n2);
          end
12      else if canonic(n) ≠ d then fail;

(Only the line numbers referred to in the text are shown.)
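The procedures of Algorithm 1 can be transcribed almost literally into Python. The following is a hedged sketch rather than the authors' implementation: the `Node` class, the `Fail` exception, and the helper names are assumptions introduced here, labels are plain strings, and query edges are stored as adjacency lists, so deleting an edge is not a constant time operation and the sketch makes no claim of linearity. It also uses plain recursion, so very deep forests would hit Python's recursion limit.

```python
from collections import deque


class Fail(Exception):
    """Raised when the propagated query cannot be a blind sharing equivalence."""


class Node:
    def __init__(self, label, *children):
        self.label, self.children = label, list(children)
        self.parents, self.queries = [], []   # queries: undirected query edges
        self.alive, self.canonic = True, None
        for child in children:
            child.parents.append(self)


def add_query(n, m):
    n.queries.append(m)
    m.queries.append(n)


def push_set_and_propagate(queue, d, n):
    if n.canonic is None:
        n.canonic = d
        queue.append(n)
        if d.label != n.label:
            raise Fail("different labels in the same class")
        for dc, nc in zip(d.children, n.children):
            add_query(dc, nc)                 # propagate to the children
    elif n.canonic is not d:
        raise Fail("cycle up to the propagated query")


def kill_alive_parents(n):
    for p in n.parents:
        if p.alive:
            kill(p)


def drain_queries(queue, d, h):
    while h.queries:
        n = h.queries.pop()                   # delete the edge (h, n) ...
        n.queries.remove(h)                   # ... in both directions
        push_set_and_propagate(queue, d, n)


def kill(d):
    if d.canonic is not None:
        raise Fail("cycle up to the propagated query")
    d.canonic = d
    queue = deque()
    kill_alive_parents(d)
    drain_queries(queue, d, d)
    while queue:
        h = queue[0]
        kill_alive_parents(h)
        drain_queries(queue, d, h)
        queue.popleft()
        h.alive = False
    d.alive = False


def blind_sharing_check(nodes):
    """Return True on success; canonic pointers then represent the classes."""
    try:
        for n in nodes:
            if n.alive:
                kill(n)
    except Fail:
        return False
    return True
```

As a usage example, querying an application node `Node("app", x, x)` against `Node("app", y, z)` succeeds, because the blind check ignores variable names. Rejecting that query is the job of the name check.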

States. As explained, the algorithm needs to enrich term forests with a few additional concepts, namely alive nodes, query edges, and canonic edges, grouped under the notion of state.

Definition 5.1 (State). A state S of the algorithm is either fail or a quadruple (G, alive, undirquery, canonic) where G is a term forest and alive, undirquery, and canonic are data structures with the following properties:

• Dead & alive nodes: every node is marked either dead (aka already processed) or alive (still to be processed, or being processed). The set of alive nodes is alive (shortened a), and the set of dead nodes is its complement dead := nodes \ a;
• Undirected query edges: undirquery (shortened q) is a multiset of additional undirected query edges, pairing nodes that are expected to be placed by the algorithm in the same blind sharing equivalence class. Undirected loops are admitted and there may be multiple occurrences of an undirected edge between two nodes. More precisely, for every undirected edge between n and m with multiplicity k in the state, both (n, m) and (m, n) belong with multiplicity k to undirquery.
• Canonic edges: nodes may have one additional canonic directed edge pointing to the computed canonical representative of that node. The partial function mapping each node to its canonical representative, if defined, is noted c. We then write c(n) = m if the canonical of n is m, and c(n) = undefined otherwise.

Moreover, a state is:
• initial: if every node n is alive and the canonic function c is undefined on n; the multiset q0 of undirected query edges of the initial state is a concrete representation of the query of the blind sharing equivalence problem on which the algorithm is executed.
• final: if every node is dead.

We shall prove that in a final state the canonic function c is defined on every node and that there are no undirected query edges. Details about how these additional structures are implemented are given in Sect. 7, where the complexity of the algorithm is analysed. Let us point out that the code is optimized in order to satisfy simpler invariants in the next section, not for being the shortest possible one. Typically, the while loops at lines 3 and 5 can be merged, since they have essentially the same body, by putting the node d itself in the queue. Unfortunately this change breaks our formulation of the invariants, and makes them more involved.

6 Correctness and Completeness

Here we prove that the blind algorithm correctly and completely solves the blind sharing equality problem, that is, it checks whether the propagated query is a blind sharing equivalence.

The representation of the partially propagated query. According to the explanation of the previous section, every state of the algorithm contains an approximation of ≈, obtained by composing two notions, canonic edges and undirected query edges. Let us fix some notions.

Definition 6.1. Let (G, a, q, c) be a state. We define the following relations on the nodes of G:
• Canonic equivalence: n ∼c m if c(n) and c(m) are both defined and coincide.
• Undirected (query) relation: ∼q is the relation obtained from the multi-relation q by dropping the multiplicity of its elements.
• Extended canonic equivalence: ∼qc := (∼q ∪ ∼c)∗.
• Directed edges up to ∼qc: ⇝ := (∼qc ◦ → ◦ ∼qc).

The extended canonic relation ∼qc of a state is then the approximation of ≈ computed so far by the algorithm. Directed edges ⇝ up to ∼qc are instead going to be used for cycles and failures, in the proof of completeness.

Refined states. To analyse the behavior of the algorithm it is useful to refine the state of the algorithm with the stack of its recursive calls. We shall show that the added information is actually already contained in the notion of state. For reasoning about the algorithm it is however handy to make it explicit.

Definition 6.2 (Refined state). A refined state S of the algorithm is either fail or a tuple (G, a, q, c, calls) where (G, a, q, c) is a state and calls is an abstraction of the implicit call stack of the Kill procedure where only (part of) the activation frames for Kill(r) are represented: calls is a list of pairs

$$[(d_1, \mathit{queue}_{d_1}), \ldots, (d_k, \mathit{queue}_{d_k})]$$

where every pair corresponds to a congruence class which is being computed, such that:
• canonical node: d_i is the node on which Kill has been called, corresponding to the canonical node of its class;
• nodes to process: queue_{d_i} contains the nodes of the class that are going to be processed next.

For a refined state we also consider the following two derived notions, the multi-set of dying nodes and the heads of their associated queues, when they are defined:

$$\mathit{dying} := [d_1, \ldots, d_k]
\qquad
\mathit{hd}_i := \begin{cases} \mathit{queue}_{d_i}.\mathrm{head}() & \text{if } \mathit{queue}_{d_i} \text{ is non-empty} \\ \text{undefined} & \text{otherwise} \end{cases}$$

Beware: from now on, we shall only consider refined states, and simply call them states.

Invariants and good states. The proofs of correctness and completeness of the algorithm rely on a number of invariants, grouped under the notion of good state.

Definition 6.3 (Good state). A non-fail state S is good if:
1. Propagated query:
   a. Label: if n ∼c m, then n and m have the same label;
   b. Simulation: if n ∼c m and n →i ni, then m →i mi and ni ∼qc mi for i ∈ {1, 2};
   c. Approximation: ∼q0 ⊆ ∼qc ⊆ ≈;
2. Canonics: if c(n) = m then
   a. Alive canonics are dying: if m is alive then m is dying;
   b. Idempotency: c(m) = m;
   c. Canonics die last: if m is dead then n is dead;
   d. Alive with canonic are queued: if n is alive then n ≠ m ⇔ n ∈ queue_m;
3. Alive Nodes:
   a. Alive nodes are downward closed: if n is alive and n → m then m is alive;
   b. Candidates are alive: if n ∼q m then n and m are alive;
   c. Dying are still alive: if n is dying then n is alive;
   d. Queues are alive: nodes in queue_i are alive;
   e. Dead have representatives: if c(n) = undefined then n is alive;
4. Queues: for 1 ≤ i ≤ k:
   a. Queued nodes have right canonic: if n ∈ queue_i then c(n) = d_i;
   b. Queues are sets: queue_i contains no duplicates.
5. Dying Nodes:
   a. Auto-canonic: c(d_i) = d_i;
   b. Calls are on different nodes: dying has no duplicates;
   c. Dying order: d_k ⇝ d_{k−1} ⇝ · · · ⇝ d_1.

Some invariants are going to be used for correctness (the propagated query group), others for completeness (Dying order), others for the proof of linearity (Calls are on different nodes and Queues are sets). The remaining ones are used for minor points, or simply to prove other invariants. Let us delay for a moment the preservation of the invariants.

Coming back to the definition of refined state, let us show that calls is actually retrievable from the information in a (good) state, justifying our identification of the two concepts. First, dying nodes are those alive nodes that are their own canonic (by Auto-canonic). Second, all other alive nodes that are not dying and that have the canonic defined are in the queue associated to their canonic (by Alive with canonic are queued). Third, the order between dying nodes is reflected by directed edges modulo ⇝ (by Dying order).

Correctness. The invariants of the propagated query group state that ∼c is a blind sharing equivalence up to query edges, and that ∼qc can indeed be seen as an approximation of the propagated query ≈. At the end of the algorithm then ∼qc and ∼c coincide (because there are no query edges left by Candidates are alive and all nodes have a canonic by Dead have representatives), and ∼c is then exactly the propagated query, as the next proposition shows.

Proposition 6.4 (Correctness). Let S be a good final state reachable from an initial state of query ∼. Then in S:
1. Every node has a canonic and there are no query edges.
2. ∼c is a blind sharing equivalence and coincides with the propagation ≈ of the initial query ∼.

Completeness and state transitions. Completeness is the fact that whenever the algorithm fails then there are no blind sharing equivalences satisfying the initial query. We prove this fact while proving the preservation of the invariants, because both proofs need to look at the last step performed by the algorithm.

To be formal, we should then introduce transitions between states of the algorithm. For the sake of readability, however, we avoid such a technical definition. Roughly, a transition is the execution of the algorithm from a line to the next, as they appear numbered in the algorithm itself. When the line is a while loop a transition is an iteration of the body. Moreover, PushSetAndPropagate is executed as a single transition, its line numbers being used only to help the reader in the non-trivial proof of the following theorem, where transitions are spelled out carefully.

Theorem 6.5. Let S be a good state. If S → S′ then:
• Completeness: if S′ = FAIL, then ≈ is not a blind sharing equivalence;
• Preservation of good states: otherwise S′ is good.

7 Linearity

In this section we show that the algorithm for the blind sharing check always terminates, and it does so in time linear in the size of the term forest and the query.

Low-level assumptions. In order to analyse the complexity of the check we have to spell out some details about a hypothetical implementation on a RAM of the data structures used by the blind sharing check. Some of the data structures could be avoided (and our concrete implementation avoids them) but the approach described here is easier to analyse, complexity-wise.

• Term forest directed edges: these edges, despite being directed, have to be traversed in both directions, typically to recurse over the alive parents. We then assume that every node has an array of pointers to its parents.
• Dead & alive nodes: this distinction is done via a boolean on each node. We also maintain the set a as a doubly linked list of nodes (so every node also has two additional pointers to the previous and next node on the list), so that the while loop at line 1 simply iterates over this list. When a node is marked dead, it is also removed from the list of alive nodes. In the concrete implementation there is no data structure for a.
• Undirected query edges: query edges are undirected, and are dynamically added and removed. To do it in constant time, every node maintains a doubly linked list of alive query edges. Each query edge is then represented by a data structure having two pointers to nodes, and two pointers to the previous and next query edge in the list. In the concrete implementation a queue implemented as a singly linked list is sufficient.
• Canonical assignment: it is obtained by a pointer to a node (possibly undefined) in the data structure for nodes.

Let us call atomic the following operations performed by the check: finding an alive node, marking a node as dead, finding a parent, checking and setting canonics, getting the next query edge on a given node, traversing a query edge, deleting a query edge given a pointer to it, adding a query edge between two nodes, pushing, popping, and looking up the head element of a queue.

Lemma 7.1 (Atomic operations are constant). The atomic operations of the blind sharing check are all implementable in constant time on a RAM.

Termination measure. Termination and linearity of the check are proved via a measure of states. The definition abuses a bit

1095

1055 1056

1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091

1093 1094

1096 1097

1036

9 1037

1098

PL’17, January 01–03, 2017, New York, NY, USA 1099 1100

Beniamino Accattoli, Andrea Condoluci, and Claudio Sacerdoti Coen Moreover, the name check terminates in time linear in the size of G.

the notation | · |, used for the number of elements in a set (a and nodes(G)), in a multi-set (q), and in the domain of a function (c).

1161

Composing Theorem 8.1 with Corollary 7.5, we obtain the second main result of the paper.

1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120

Definition 7.2 (State measure). We define the measure |S | of a (refined) state S as follows: • |FAIL| B 0; • |(G, a, q, c, calls)| B |a| + |q| − 2 × |c| + 2 × |nodes(G)|.

Theorem 8.2 (Sharing equality is linear). Let ∼ be a well-scoped query on a term forest G. There is an algorithm that succeeds if and only if there exists a sharing equivalence containing ∼, which is linear in the sizes of G and ∼. Moreover, if it succeeds, it outputs a concrete (and linear) representation of the smallest such equivalence.

Remark 7.3. • The size of q is the number of query edges it contains, where each edge and its symmetric both count one. • The state measure is non-negative: |c| ≤ |nodes(G)| always holds, because c is a (partial) function of domain nodes(G). • The state measure is linear in the size of the state. In particular, for an initial state S it is linear in the size of the term forest G and in the size of the initial query ∼. Namely, |S | = |a| + |q| + 2 × |nodes(G)| = |q| + 3 × |nodes(G)|. Lemma 7.4. Let S be a good state of the blind sharing check. If S → S ′ then |S ′ | < |S|.

1125 1126 1127 1128 1129 1130 1131 1132

The Name Check

Our second algorithm takes in input the output of the blind sharing check, that is, a blind sharing equivalence on a term forest G represented via canonic edges, and checks whether the Var-nodes of G satisfy the name conditions for a sharing equivalence—free variables at line 2, and bound ones at line 3. The name check is based on the fact that to compare a node with all those in its class it is enough to compare it with the canonical representant of the class—note that this fact is used twice, for the Var-nodes and for their binders. The check fails in two cases, corresponding to whether the free or the bound condition fails.

1133 1134 1135 1136 1137

Algorithm 2: Name check Data: canonic(·) representation of a sharing equivalence ≈ Result: is ≈ a sharing equivalence?

1138 1139 1 1140 1141 1142 1143

2

1144 1145 1146 1147 1148 1149

3

Procedure NameCheck() foreach Var-node v do w B canonic(v); if v , w then if binder(v) is undefined or binder(w ) is undefined then fail; else if canonic(binder(v)) , canonic(binder(w )) then fail; end end

1154 1155 1156 1157

1166 1167 1168

1170 1171 1172

This work has been partially funded by the ANR JCJC grant COCA HOLA (ANR-16-CE40-004-01).

1175

1174

1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201

1203 1204 1205 1206 1207 1208 1209 1210 1211 1212

1151

1153

1165

1202

1150

1152

1164

1173

[1] Beniamino Accattoli and Claudio Sacerdoti Coen. 2015. On the Relative Usefulness of Fireballs. In LICS 2015. 141–155. http://dx.doi.org/10.1109/LICS.2015.23 [2] Beniamino Accattoli and Ugo Dal Lago. 2012. On the Invariance of the Unitary Cost Model for Head Reduction. In RTA. 22–37. [3] Beniamino Accattoli and Giulio Guerrieri. 2017. Implementing Open Call-byValue. In FSEN 2017, Tehran, Iran, April 26-28, 2017, Revised Selected Papers. 1–19. [4] Beniamino Accattoli and Ugo Dal Lago. 2014. Beta reduction is invariant, indeed. In CSL-LICS ’14. 8:1–8:10. http://doi.acm.org/10.1145/2603088.2603105 [5] Guy E. Blelloch and John Greiner. 1995. Parallelism in Sequential Functional Languages. In FPCA. 226–237. [6] Robert S Boyer and Jay S Moore. 1972. The sharing of structure in theoremproving programs. Machine intelligence 7 (1972), 101–116. [7] Christophe Calvès. 2013. Unifying Nominal Unification. In RTA 2013, Vol. 21. 143–157. https://doi.org/10.4230/LIPIcs.RTA.2013.143 [8] Christophe Calvès and Maribel Fernández. 2011. The First-order Nominal Link (LOPSTR’10). 234–248. http://dl.acm.org/citation.cfm?id=2008282.2008297 [9] Christophe Calvès and Maribel Fernández. 2010. Matching and alpha-equivalence check for nominal terms. J. Comput. System Sci. 76, 5 (2010), 283 – 301. [10] Clemens Grabmayer and Jan Rochel. 2014. Maximal sharing in the Lambda calculus with letrec. In ICFP 2014. 67–80. https://doi.org/10.1145/2628136.2628148 [11] J. Hopcroft and R. Karp. 1971. A Linear Algorithm for Testing Equivalence of Finite Automata. Technical Report 0. Dept. of Computer Science, Cornell U. [12] Jordi Levy and Mateu Villaret. 2010. An Efficient Nominal Unification Algorithm. In RTA 2010. Edinburgh, Scottland, UK, 209–226. [13] Alberto Martelli and Ugo Montanari. 1977. Theorem Proving with Structure Sharing and Efficient Unification (IJCAI’77). 543–543. [14] Alberto Martelli and Ugo Montanari. 1982. An Efficient Unification Algorithm. ACM Trans. Program. Lang. Syst. 
4, 2 (April 1982), 258–282. [15] M.S. Paterson and M.N. Wegman. 1978. Linear unification. J. Comput. System Sci. 16, 2 (1978), 158 – 167. https://doi.org/10.1016/0022-0000(78)90043-0 [16] Zhenyu Qian. 1993. Linear unification of higher-order patterns. In TAPSOFT’93: Theory and Practice of Software Development. 391–405.

1123 1124

1163

Acknowledgments

References

Corollary 7.5 (Linear termination). Let S be an initial state of term forest G and query (edges) q. Then the blind sharing check on S terminates in a number of transitions linear in |nodes(G)| and |q|.

8


Finally, composing with the sharing equality theorem (Theorem 3.9) one obtains that the algorithm indeed tests α-equivalence of the read backs of the query, as expected.


Theorem 8.1 (Soundness & completeness of the name check). Let ∼ be a well-scoped query on a term forest G passing the blind check, and let c be the canonic assignment produced by that check.
• If the name check fails then there are no sharing equivalences containing ∼;
• otherwise ∼c is the smallest sharing equivalence containing ∼.


PL’17, January 01–03, 2017, New York, NY, USA

A Variables, Meta-Level Substitution, and α-conversion

B B.1

We need a notion of path, independently of nodes.

Definition B.1 (Paths). Paths on graphs are defined inductively as follows:
• ϵ is the empty path;
• if π is a path, then π · 1 and π · 2 are paths.

Definition A.2 (Bound Variables).
1. bv(x) := ∅;
2. bv(t s) := bv(t) ∪ bv(s);
3. bv(λx.t) := bv(t) ∪ {x}.

We overload · to denote concatenation of paths:
• π · ϵ := π;
• π · (π′ · i) := (π · π′) · i for i ∈ {1, 2}.

Definition A.3 (Variables).
1. v(x) := {x};
2. v(t s) := v(t) ∪ v(s);
3. v(λx.t) := v(t) ∪ {x}.


In the paper, domination is defined before the formal definitions. Let us restate it according to our new notation.

Proof. By structural induction on t:
• Variable (t = x): v(x) = {x} = {x} ∪ ∅ = fv(x) ∪ bv(x);
• Application (t = t1 t2): v(t1 t2) = v(t1) ∪ v(t2) = (fv(t1) ∪ bv(t1)) ∪ (fv(t2) ∪ bv(t2)) = (fv(t1) ∪ fv(t2)) ∪ (bv(t1) ∪ bv(t2)) = fv(t) ∪ bv(t);
• Abstraction (t = λx.s): v(λx.s) = v(s) ∪ {x} = fv(s) ∪ bv(s) ∪ {x} = (fv(s) \ {x}) ∪ (bv(s) ∪ {x}) = fv(λx.s) ∪ bv(λx.s). □
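The three variable sets of Definitions A.1 to A.3 can be computed by direct structural recursion. The sketch below is only an illustration: the tuple encoding of terms and the function names are assumptions made here, not notation from the paper. It checks the identity of Lemma A.4 on one sample term.

```python
# Hypothetical term encoding (an assumption of this sketch, not the paper's):
#   ("var", x) | ("app", t, s) | ("lam", x, t)

def fv(t):
    """Free variables, following Definition A.1."""
    if t[0] == "var":
        return {t[1]}
    if t[0] == "app":
        return fv(t[1]) | fv(t[2])
    return fv(t[2]) - {t[1]}  # abstraction: remove the bound name


def bv(t):
    """Bound variables, following Definition A.2."""
    if t[0] == "var":
        return set()
    if t[0] == "app":
        return bv(t[1]) | bv(t[2])
    return bv(t[2]) | {t[1]}  # abstraction: add the bound name


def v(t):
    """All variables, following Definition A.3."""
    if t[0] == "var":
        return {t[1]}
    if t[0] == "app":
        return v(t[1]) | v(t[2])
    return v(t[2]) | {t[1]}


# (λx. x y) x : x occurs both bound and free, y occurs only free.
example = ("app", ("lam", "x", ("app", ("var", "x"), ("var", "y"))), ("var", "x"))
assert v(example) == fv(example) | bv(example)  # Lemma A.4 on this term
```

Note that fv and bv need not be disjoint, as the sample term shows; Lemma A.4 only states that together they cover v.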

Definition B.3 (Domination). For every root node r and every Lam-node b(v) binding the Var-node v, if r →π v then π = π1 · π2 and r →π1 b(v).

Lemma B.4 (Nesting property). For every Lam-node b(v) binding the Var-node v, and every n: if n →∗ v then b(v) →∗ n or n →∗ b(v).


B.3

Height-preserving equivalences


• h(n) := 0 if n has no children;
• h(n) := max{h(m) | m child of n} + 1 otherwise.

Definition B.6 (Height-preserving). A relation ≡ is height-preserving if, whenever n ≡ m, then h(n) = h(m).

Lemma B.7. Blind sharing equivalences are height-preserving.
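As a small illustration of Definitions B.5 and B.6, heights can be computed bottom-up with memoization so that shared nodes are visited only once. The dictionary representation of the DAG and the function names below are assumptions of this sketch, not the paper's data structures.

```python
def heights(children):
    """Map every node of a DAG to its height (Definition B.5).

    `children` maps each node to the list of its children
    (a hypothetical representation chosen for this sketch).
    """
    memo = {}

    def h(n):
        if n not in memo:
            kids = children[n]
            # Height 0 for leaves, otherwise 1 + the maximum child height.
            memo[n] = 0 if not kids else 1 + max(h(m) for m in kids)
        return memo[n]

    return {n: h(n) for n in children}


def is_height_preserving(pairs, height):
    """Check Definition B.6 on a relation given as a collection of pairs."""
    return all(height[n] == height[m] for n, m in pairs)


# Hypothetical DAG: two App-like nodes a1, a2 sharing the same leaf v.
dag = {"root": ["a1", "a2"], "a1": ["v", "v"], "a2": ["v", "v"], "v": []}
height = heights(dag)
```

On this DAG the pair (a1, a2) is height-preserving while (a1, v) is not, which matches the intuition behind Lemma B.7: related nodes must sit at the same height.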

t{x←y}{y←x} = x{x←y}{y←x} = y{y←x} = x = t, or t = z and then t{x←y}{y←x} = z{x←y}{y←x} = z = t.

3. Abstraction: then t = λz.s. Note that by hypothesis x ≠ z ≠ y. Then t{x←y}{y←x} = λz.s{x←y}{y←x} =i.h. λz.s = t. □

Quotients of term forests

Definition B.5 (Height of a Node). The height h(n) of a node n in a dag is the natural number defined by:

Proof. By induction on t. Cases: 1. Variable: t cannot be y by hypothesis. So either t = x and then

t{x←y}{y←x} = s{x←y}{y←x} (u{x←y}{y←x}) =i.h. s u = t

Lemma A.6. If x ∉ bv(t) and y ∉ v(t) then t = t{x←y}{y←x}.

B.2

Lemma A.6 is a technical lemma to prove Lemma C.5 and Lemma C.6.

2. Application: then t = su and

Proof. Let r be a root of the DAG such that r →π1 n and let n →π2 v. Because b(v) dominates v, there exist π3, π4 such that π1 · π2 = π3 · π4 and r →π3 b(v). Then b(v) →∗ n iff π3 is a prefix of π1, and n →∗ b(v) iff π1 is a prefix of π3. □

Definition A.5 ((Capture-Avoiding) Substitution).
1. x{x←s} := s;
2. y{x←s} := y;
3. (t s){x←u} := t{x←u} s{x←u};
4. (λx.t){x←s} := λx.t;
5. (λy.t){x←s} := λy.t{x←s} when y ∉ {x} ∪ fv(s);
6. (λy.t){x←s} := λz.t{y←z}{x←s} with z ∉ v(t) ∪ {x} ∪ fv(s).
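The six clauses of Definition A.5 translate almost literally into a recursive function. The sketch below is a possible rendering under assumptions made here: terms are encoded as tuples, and fresh names are drawn from a hypothetical supply z0, z1, and so on. It is not the implementation used by the authors.

```python
from itertools import count

# Hypothetical term encoding: ("var", x) | ("app", t, s) | ("lam", x, t)

def fv(t):
    """Free variables (Definition A.1)."""
    if t[0] == "var":
        return {t[1]}
    if t[0] == "app":
        return fv(t[1]) | fv(t[2])
    return fv(t[2]) - {t[1]}


def v(t):
    """All variables (Definition A.3)."""
    if t[0] == "var":
        return {t[1]}
    if t[0] == "app":
        return v(t[1]) | v(t[2])
    return v(t[2]) | {t[1]}


def fresh(avoid):
    """Return the first name z0, z1, ... not in `avoid` (assumed name supply)."""
    for i in count():
        name = f"z{i}"
        if name not in avoid:
            return name


def subst(t, x, s):
    """Compute t{x←s} following the clauses of Definition A.5."""
    if t[0] == "var":
        return s if t[1] == x else t                     # clauses 1 and 2
    if t[0] == "app":
        return ("app", subst(t[1], x, s), subst(t[2], x, s))  # clause 3
    y, body = t[1], t[2]
    if y == x:
        return t                                         # clause 4
    if y not in fv(s):
        return ("lam", y, subst(body, x, s))             # clause 5
    z = fresh(v(body) | {x} | fv(s))                     # clause 6: rename y
    return ("lam", z, subst(subst(body, y, ("var", z)), x, s))
```

For instance, substituting y for x in λy.x triggers clause 6 and renames the binder, so the free y is not captured. The round trip t{x←y}{y←x} on λz.x z returns the original term, in line with Lemma A.6.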


Lemma A.4. For all t, v(t ) = fv(t ) ∪ bv(t ).

Notation B.2 (R∗, R+). Let R be any binary relation over the nodes of a term forest. We write R∗ for its reflexive and transitive closure on the whole term forest. We write R+ for R ◦ R∗, where “◦” is the composition of two binary relations.


Preliminary definitions

Definition A.1 (Free Variables).
1. fv(x) := {x};
2. fv(t s) := fv(t) ∪ fv(s);
3. fv(λx.t) := fv(t) \ {x}.

Graphs

Lemma A.7 (Basic properties of α-conversion). If t =α s then 1. Free variables: fv(t ) = fv(s); 2. Size: |t | = |s |. Proof. By induction on t =α s.

Proof. By contradiction, let ≡ be a blind sharing equivalence, and suppose there are nodes n ≡ m such that h(n) ≠ h(m): we construct an infinite descending sequence of nodes in the term forest, against the hypothesis that it is a DAG. Note that n and m have the same label by the Labels requirement. The nodes cannot be Var-nodes, because otherwise by definition h(n) = h(m) = 0. Therefore n and m have children, respectively n1, n2 and m1, m2. Now, h(n) ≠ h(m) implies max{h(n1), h(n2)} ≠ max{h(m1), h(m2)} by definition of h(·). Necessarily either h(n1) ≠ h(m1) or h(n2) ≠ h(m2). By the requirement Bisimulation, n ≡ m implies ni ≡ mi. Therefore either n1 ≡ m1 and h(n1) ≠ h(m1), or n2 ≡ m2 and h(n2) ≠ h(m2). One can iterate the procedure, continuing with one of the two pairs n1, m1 or n2, m2. Absurd. □

Proof of (Theorem 3.3). Let ≡ be a blind sharing equivalence on a term forest G. Then:


1. Acyclicity up to ≡: the relation ≡→≡ is acyclic.

□


2. Sharing equivalences as term forests: if ≡ also satisfies the name conditions then G/≡ is a term forest. Proof.

• Equivalence: the only property of an equivalence relation that ≈ does not evidently satisfy is symmetry, which follows from the symmetry of ∼ and can be proved by structural induction on the definition of ≈.
• Labels: if n ≈ m then by Point 1 n ≡ m, and so n and m have the same label, because ≡ is a blind sharing equivalence.
• Bisimulation: if n ≈ m then by the previous property n and m have the same label, and so if n →i ni then m →i mi, and so ni ≈ mi by rule ≈sim. □

1. First we show that if n (≡→≡) m, then h(m) < h(n). Assume n ≡ n′ → m′ ≡ m for some n′, m′. Since ≡ is height-preserving by Lemma B.7, h(n) = h(n′) and h(m) = h(m′); together with h(m′) < h(n′), this entails h(m) < h(n). By iterating this argument, n (≡→≡)+ m implies that h(m) < h(n). Lastly, n (≡→≡)+ n implies that h(n) < h(n), absurd.
2. As usual, let [n] = {m | n ≡ m} be the equivalence class of n w.r.t. ≡, which is an equivalence relation by the property Equivalence of the blind sharing equivalence ≡. We define the term forest G/≡ as follows:
• The nodes of the term forest are the equivalence classes of the nodes of G.
• The label of a node [n] (of G/≡) is the label of each node (of G) in [n]. They are all the same by property Labels of the blind sharing equivalence ≡.
• The binder of a variable node [v] (of G/≡) is [b(v)]. The definition is well-posed because, by the name conditions, nodes in the same equivalence class have binders in the same equivalence class or, alternatively, they do not have any binder and they are all equal (the equivalence class is a singleton).
• The i-th child of a node [n] (of G/≡) is [ni] where ni is the i-th child of n. The definition is well-posed by property Bisimulation of the blind sharing equivalence ≡, which implies that children in corresponding positions of nodes in the same equivalence class are in the same equivalence class.
We now verify that G/≡ is indeed a term forest:
• Labels, Children, Binders: they hold by definition.
• Acyclicity: by Point 1.
• Domination: let [r] →τ [v] in G/≡ where [r] is a root of G/≡ and [v] is a variable node. By construction r →τ v′ in G. Moreover, r is a root in G: by contradiction, if r is not a root, there is a node m → r and therefore, by construction, [m] → [r], which would contradict the hypothesis that [r] is a root. Thus, by Domination (Definition B.3) of G, r →τ1 b(v′) →τ2 v′ where τ = τ1 · τ2 and b(v′) is the binder of v′. Again by construction [r] →τ1 [b(v′)] →τ2 [v′] = [v]. Finally, by construction again, [b(v′)] is the binder of [v′] = [v] and therefore the binder of [v] dominates [v].
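The construction of G/≡ in Point 2 can be sketched in a few lines. The sketch below makes several simplifying assumptions of its own: the forest is given as two dictionaries (labels and ordered children), the equivalence is given as a map from each node to a chosen class representative, and binders are omitted. The well-posedness arguments of the proof appear as run-time checks.

```python
def quotient(labels, children, rep):
    """Build the labels and children of G/≡ (binders are left out of this sketch).

    labels:   node -> label
    children: node -> tuple of children, in order
    rep:      node -> chosen representative of its ≡-class (hypothetical input)
    Raises ValueError if the classes violate Labels or Bisimulation.
    """
    q_labels, q_children = {}, {}
    for n, label in labels.items():
        cls = rep[n]
        kids = tuple(rep[m] for m in children[n])
        # Labels: all nodes of a class must carry the same label.
        if q_labels.setdefault(cls, label) != label:
            raise ValueError(f"Labels violated in the class of {cls}")
        # Bisimulation: i-th children of related nodes must be related.
        if q_children.setdefault(cls, kids) != kids:
            raise ValueError(f"Bisimulation violated in the class of {cls}")
    return q_labels, q_children


# Hypothetical forest: two App-nodes a1 ≡ a2 over the same free Var-node v.
labels = {"r": "App", "a1": "App", "a2": "App", "v": "Var"}
children = {"r": ("a1", "a2"), "a1": ("v", "v"), "a2": ("v", "v"), "v": ()}
rep = {"r": "r", "a1": "a1", "a2": "a1", "v": "v"}
q_labels, q_children = quotient(labels, children, rep)
```

In the example the two App-nodes collapse into one class, so the root of the quotient has the same class as both children. Acyclicity and Domination of the result are not checked here; in the proof they follow from Point 1 and from Domination in G.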


As expected, the read backs of any two nodes are commonly shareable, which subsumes the fact that the read back of a single node is shareable.

Lemma C.2. Let G be a term forest, n and m two of its nodes. Then ⟦n⟧ and ⟦m⟧ are commonly shareable.

Sharing universality of ≈

Proof of (Proposition 3.6). Let ∼ be a query. If there exists a blind sharing equivalence ≡ containing ∼ then:

Parametric α-equivalence. The next step is to define a special notion of α-equivalence on shareable terms, written t =α^Γ s, that is halfway between α-equivalence and sharing equivalences on term forests. It is parametric in a renaming Γ, that is, morally, a set of identifications of variable names respecting some constraints. The reason for the parameter Γ comes from the need to establish the relationship with sharing equivalences. Consider two nodes n and m that are sharing equivalent, i.e. n ≡ m, and that appear in the scope of two sharing equivalent Lam-nodes of variables v and w. The read back ⟦n⟧ of n in general contains free occurrences of ⟦v⟧, and similarly for ⟦m⟧ and ⟦w⟧. Then ⟦n⟧ and ⟦m⟧ are not literally α-equivalent: they are α-equivalent only up to the identification of

1. The propagated query ≈ is contained in ≡, i.e. ≈ ⊆ ≡. 2. ≈ is the smallest blind sharing equivalence containing ∼.


Shareable terms. The λ-terms obtained by reading back a node have a particular structure of names, induced by the structural properties of term forests and the unique identifiers of Var-nodes. Essentially, abstractions with the same name must have the same body, because they arise from shared Lam-nodes in the term forest. This is expressed by the following notion on terms. Note that in the definition we use C⟨·⟩ and D⟨·⟩ for contexts, defined in the usual way. The notation C⟨t⟩ denotes the term obtained by filling the hole of the context C⟨·⟩ with the term t.

Definition C.1 (Shareable Terms).
• A term t is shareable if whenever t = C⟨λx.s⟩ and t = D⟨λx.u⟩ then s = u.
• Two terms t and s are commonly shareable if they are shareable and moreover whenever t = C⟨λx.u⟩ and s = D⟨λx.r⟩ then u = r.
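Definition C.1 can be checked mechanically by collecting every abstraction occurring in a term and comparing the bodies of abstractions that bind the same name. The following sketch assumes a tuple encoding of terms and uses function names chosen here for illustration.

```python
from itertools import chain

# Hypothetical term encoding: ("var", x) | ("app", t, s) | ("lam", x, t)

def abstractions(t):
    """Yield (x, body) for every subterm of t of the form λx.body."""
    if t[0] == "app":
        yield from abstractions(t[1])
        yield from abstractions(t[2])
    elif t[0] == "lam":
        yield t[1], t[2]
        yield from abstractions(t[2])


def commonly_shareable(t, s):
    """Definition C.1: equal binder names imply equal bodies, within and across t and s."""
    bodies = {}
    for x, body in chain(abstractions(t), abstractions(s)):
        # Remember the first body seen for x; any later one must coincide.
        if bodies.setdefault(x, body) != body:
            return False
    return True


def shareable(t):
    """A term is shareable when it is commonly shareable with itself."""
    return commonly_shareable(t, t)


# λx.λy.x y and λy.λx.y x: each is shareable, but they are not commonly shareable.
t = ("lam", "x", ("lam", "y", ("app", ("var", "x"), ("var", "y"))))
s = ("lam", "y", ("lam", "x", ("app", ("var", "y"), ("var", "x"))))
```

Checking all abstractions of t and s together covers the three requirements at once: t shareable, s shareable, and the cross condition between them.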

□ B.4

Proving the Sharing Equality Theorem

In this section we explain the concepts and the statements needed to prove the sharing equality theorem (Theorem 3.9), which connects α-equivalence, (propagated) queries, and sharing equivalences. Auxiliary theorems and the full proofs are in Appendix D and Appendix E. For the proof, we introduce a third, intermediate notion of equivalence on a restricted class of terms. The idea is that 1) these terms, deemed shareable, are the special α-representatives coming from the read back of nodes, and 2) the new equivalence can be seen both as α-equivalence restricted to shareable terms and as sharing equivalence reformulated on them. The full proof of Theorem 3.9, connecting the intermediate results presented here, is on page 19.


C


Proof.
1. By induction on the definition of ≈. The base cases (∼ and reflexivity) hold because ≡ contains ∼ and is reflexive. The inductive cases follow from the i.h. and the fact that ≡ is transitive and closed by simulation.
2. We show each property of blind sharing equivalences; minimality is then given by Point 1:

⟦v⟧ and ⟦w⟧. Such an identification is expressed by ≡, because the two binders of v and w are equivalent, but it happens above n and m. Therefore, we need a notion of α-equivalence up to identifications of variables (coming from a context, that is, above the two nodes).

Note that the definition only asks ≡ to be a relation, and that under this hypothesis v and w are not necessarily ≡-related. This is because induced name pairings are also needed in the proof of the other direction (that is, in the forthcoming paragraph Parametric α ⇒ sharing), where there are no hypotheses on ≡. When the relation is a blind sharing equivalence, v ≡ w holds and the induced name pairing is a proper renaming.

Definition C.3 (Renaming). A renaming Γ is a set of pairs of names such that

• Distinct names: if (x, y), (z, w) ∈ Γ then x, y, z, and w are all distinct names.

A renaming Γ for a pair of terms (t, s) is a renaming such that
• Crossed independence: if (x, y) ∈ Γ then x ∉ bv(t) ∪ v(s) and y ∉ bv(s) ∪ v(t).

For simplicity, we now define =α^Γ without restricting to shareable terms, but it is only on them, and when Γ is empty, that it shall coincide with =α.

Then, by induction on the height of nodes (Definition B.5), one obtains that if two nodes are sharing equivalent then their interpretations are α-equivalent up to the induced renaming.

Definition C.4 (α-conversion up-to renaming). Let Γ be a renaming. The relation α-equivalence up to the renaming Γ is defined inductively by the following rules:

Theorem C.9 (Sharing ⇒ parametric α). Let ≡ be a sharing equivalence on a term forest G, and n and m two nodes of G such that n ≡ m. Then ⟦n⟧ =α^Γ ⟦m⟧ where Γ is the renaming induced by ≡ on n and m (i.e. Γ := Γ^≡_{n,m}).

1. Same variables: x =α^Γ x;
2. Different variables: x =α^Γ y if (x, y) ∈ Γ;
3. Application: t u =α^Γ s r if t =α^Γ s and u =α^Γ r;
4. Same abstracted variable: λx.t =α^Γ λx.s if t =α^Γ s;
5. Different abstracted variables: λx.t =α^Γ λy.s if Γ ∪ {(x, y)} is a renaming for the pair (t, s) and t =α^{Γ∪{(x,y)}} s.
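The five rules of Definition C.4 are syntax-directed, so they can be read as a recursive decision procedure. The sketch below is one possible reading, under assumptions made here: terms are tuples, Γ is a frozenset of name pairs, and the side condition of rule 5 is checked through the Distinct names and Crossed independence requirements of Definition C.3.

```python
# Hypothetical term encoding: ("var", x) | ("app", t, s) | ("lam", x, t)

def bv(t):
    """Bound variables (Definition A.2)."""
    if t[0] == "var":
        return set()
    if t[0] == "app":
        return bv(t[1]) | bv(t[2])
    return bv(t[2]) | {t[1]}


def v(t):
    """All variables (Definition A.3)."""
    if t[0] == "var":
        return {t[1]}
    if t[0] == "app":
        return v(t[1]) | v(t[2])
    return v(t[2]) | {t[1]}


def is_renaming_for(gamma, t, s):
    """Distinct names plus Crossed independence for the pair (t, s)."""
    names = [name for pair in gamma for name in pair]
    if len(names) != len(set(names)):
        return False
    return all(x not in bv(t) | v(s) and y not in bv(s) | v(t) for x, y in gamma)


def alpha_up_to(gamma, t, s):
    """Decide t =α^Γ s by following the rules of Definition C.4."""
    if t[0] != s[0]:
        return False
    if t[0] == "var":                                    # rules 1 and 2
        return t[1] == s[1] or (t[1], s[1]) in gamma
    if t[0] == "app":                                    # rule 3
        return alpha_up_to(gamma, t[1], s[1]) and alpha_up_to(gamma, t[2], s[2])
    x, y = t[1], s[1]
    if x == y:                                           # rule 4
        return alpha_up_to(gamma, t[2], s[2])
    extended = gamma | {(x, y)}                          # rule 5
    return is_renaming_for(extended, t[2], s[2]) and alpha_up_to(extended, t[2], s[2])


empty = frozenset()
id_x = ("lam", "x", ("var", "x"))
id_y = ("lam", "y", ("var", "y"))
t = ("lam", "x", ("lam", "y", ("app", ("var", "x"), ("var", "y"))))
s = ("lam", "y", ("lam", "x", ("app", ("var", "y"), ("var", "x"))))
```

With the empty renaming, λx.x and λy.y are related, while λx.λy.x y and λy.λx.y x are not: for the latter pair, {(x, y)} fails Crossed independence on the bodies because x occurs in the right-hand body.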

We now proceed to show that α-equivalence on shareable terms coincides with α-equivalence up to renamings, which in turn coincides with the existence of a sharing equivalence. The first part is simple; the second one requires treating the two directions of the equivalence separately.

Lemma C.10. Let ∼ be a query on a term forest G. If ⟦n⟧ =α^∅ ⟦m⟧ for every two queried nodes n ∼ m then

Parametric α = α + shareable. Parametric α-equivalence =α^Γ is easily seen to be a special case of α-equivalence, when the pairs in the renaming are interpreted as substitutions on one of the two terms (this is formalized by Lemma D.5). Then:

Lemma C.5 (Parametric α ⇒ α). If t =α^∅ s then t =α s.

The other direction does not hold in general. For instance, λx.λy.xy =α λy.λx.yx but λx.λy.xy ≠α^∅ λy.λx.yx, because the two terms are not commonly shareable. Otherwise,

Lemma C.6 (α + shareable ⇒ parametric α). Let t and s be commonly shareable. Then t =α s implies t =α^∅ s.


Sharing ⇒ parametric α . The key point in establishing this direction is to understand how to extract a renaming Γ from a sharing equivalence. First, we show how to extract a set of pairs of names.



1. The induced name pairing is empty: Γ^≈_{n,m} = ∅ for every two queried nodes n ∼ m.
2. Parametric α propagates, with the induced renaming: ⟦p⟧ =α^{Γ′} ⟦q⟧ for every two propagatedly queried nodes p ≈ q, with respect to the renaming Γ′ := Γ^≈_{n,m} induced by the propagated query.


Last, the definition of the renaming induced by the propagated query contains exactly what is needed to prove the λ requirements for a blind sharing equivalence.

Definition C.7 (Induced name pairing). Let ≡ be a relation on a term forest G, and n, m nodes of G. The name pairing Γ^≡_{n,m} induced by ≡ on n and m is defined by:

Γ^≡_{n,m} := { (⟦v⟧, ⟦w⟧) | b(v) ≡ b(w), b(v) →+ n →∗ v, b(w) →+ m →∗ w, v ≠ w }

Lemma C.11 (Parametric α propagates + renamings). Let ∼ be a query on a term forest G. If ⟦n⟧ =α^∅ ⟦m⟧ for every two queried nodes n ∼ m then


Second, we show that the renamings induced by parametric α-equivalence on the subterms are exactly the name pairings induced by the propagated query.

1. The query is well-scoped: ∼ is well-scoped; 2. The propagated query is sharing: ≈ is a blind sharing equivalence.

Parametric α ⇒ sharing. Here the aim is to show that if ⟦n⟧ =α^∅ ⟦m⟧ for every two queried nodes n ∼ m then the query is well-scoped and the propagated query is a sharing equivalence. The name pairing induced by the propagated query is a key tool here as well. First, we show that the propagated query is a blind sharing equivalence and that the query is well-scoped.

Lemma C.8 (Sharing ⇒ induced name pairing = renaming). Let ≡ be a blind sharing equivalence on a term forest G, and n and m two nodes of G. Then the induced name pairing Γ^≡_{n,m} is a renaming for (⟦n⟧, ⟦m⟧).

Lemma C.12. Let ∼ be a query on a term forest G. If ⟦n⟧ =α^Γ ⟦m⟧ with Γ := Γ^≈_{n,m} for every n ≈ m then ≈ is a sharing equivalence.

      

Putting it all together,

Theorem C.13 (Parametric α ⇒ sharing). Let ∼ be a query on a term forest G. If ⟦n⟧ =α^∅ ⟦m⟧ for every two queried nodes n ∼ m then ∼ is well-scoped and ≈ is a sharing equivalence.

that is, it is the set of names of distinct bound Var-nodes reachable from n and m and such that their binders are ≡-related and above n and m.


D Sharing vs. α Equivalences

To show that JnK and JmK are commonly shareable the reasoning is exactly the same. □


The following lemma translates properties of variables in terms to properties of Var-nodes in the term forest.

Lemma D.1. Let n be any node, and v a Var-node:

1. By course-of-values induction on h(n):
• h(n) = 0: n is a variable, v(⟦n⟧) = {⟦n⟧}, necessarily ⟦v⟧ = ⟦n⟧, and v = n by the definition of ⟦·⟧ and the definition of term forest. Therefore n →∗ v.
• h(n) > 0: let n have children n1 and n2. ⟦v⟧ ∈ v(⟦n⟧) = v(⟦n1⟧) ∪ v(⟦n2⟧), say ⟦v⟧ ∈ v(⟦ni⟧). By i.h., ni →∗ v. Therefore n →∗ v.
2. By course-of-values induction on h(n):
• h(n) = 0 (Variable): bv(⟦n⟧) = ∅, so this case is not possible.
• h(n) > 0 (Application, n = App(n1, n2)): ⟦v⟧ ∈ bv(⟦n1⟧ ⟦n2⟧) = bv(⟦n1⟧) ∪ bv(⟦n2⟧), say ⟦v⟧ ∈ bv(⟦ni⟧). By i.h., v has binder b(v) and ni →∗ b(v). Therefore n →∗ b(v).
• h(n) > 0 (Abstraction, n = Lam(w, m)): ⟦v⟧ ∈ bv(λ⟦w⟧.⟦m⟧) = bv(⟦m⟧) ∪ {⟦w⟧}. Two cases: if ⟦v⟧ = ⟦w⟧ (i.e. v = w by the definition of ⟦·⟧ and the definition of term forest), then n = b(v). If otherwise ⟦v⟧ ∈ bv(⟦m⟧), by i.h. m →∗ b(v), and therefore n →∗ b(v).
3. By course-of-values induction on h(n):
• h(n) = 0 (Variable): fv(⟦n⟧) = {⟦n⟧}, necessarily ⟦v⟧ = ⟦n⟧, and v = n by the definition of ⟦·⟧ and the definition of term forest. Clearly b(v) →+ v.
• h(n) > 0 (Application, n = App(n1, n2)): ⟦v⟧ ∈ fv(⟦n1⟧ ⟦n2⟧) = fv(⟦n1⟧) ∪ fv(⟦n2⟧), say ⟦v⟧ ∈ fv(⟦ni⟧). By i.h., b(v) →+ ni. By Domination (Definition B.3), b(v) →∗ n. Clearly n ≠ b(v) because n is an App-node. Therefore b(v) →+ n.
• h(n) > 0 (Abstraction, n = Lam(w, m)): ⟦v⟧ ∈ fv(λ⟦w⟧.⟦m⟧) = fv(⟦m⟧) \ {⟦w⟧}. Then ⟦v⟧ ∈ fv(⟦m⟧) and ⟦v⟧ ≠ ⟦w⟧, i.e. v ≠ w by the definition of ⟦·⟧ and the definition of term forest. By i.h., b(v) →+ m. By Domination (Definition B.3), b(v) →∗ Lam(w, m). Since v ≠ w, b(v) ≠ b(w) = Lam(w, m), and therefore b(v) →+ n.


□


D.1

Properties of Parametric α -equivalence.

Renamings used in parametric α-equivalence have an additional property: they cover equivalent terms, as we will see in Lemma D.3.2.

1. if ⟦v⟧ ∈ v(⟦n⟧), then n →∗ v;
2. if ⟦v⟧ ∈ bv(⟦n⟧), then v has binder b(v) and n →∗ b(v);
3. if ⟦v⟧ ∈ fv(⟦n⟧) and v has binder b(v), then b(v) →+ n.

Proof.

Shareable terms

The notion of commonly shareable terms (Definition C.1) correctly captures the properties of names in the interpretation of nodes and the structure of subterms:

Proof of (Lemma C.2). Let G be a term forest, n and m two of its nodes. Then ⟦n⟧ and ⟦m⟧ are commonly shareable.

Proof. Essentially, the property follows from the fact that every Var-node has a distinguished identifier. Let us show that ⟦n⟧ is shareable. If ⟦n⟧ = C⟨λx.t⟩ = D⟨λx.s⟩ then by uniqueness of identifiers the two variable nodes whose interpretation is x coincide; let us call this node v. Then by uniqueness of binders both abstractions are obtained as the interpretation of the same Lam-node Lam(v, p). Finally, by definition of the interpretation of nodes as terms, both bodies t and s are obtained as interpretations of p, that is, t = ⟦p⟧ = s. Hence ⟦n⟧ is shareable.

D.2


Definition D.2 (Covering renaming). Γ covers (t, s) if it is a renaming for (t, s), and it renames all the free variables that are not in common between t and s:
• for every x ∈ fv(t) \ fv(s) there exists y ∈ fv(s) \ fv(t) such that (x, y) ∈ Γ;
• for every y ∈ fv(s) \ fv(t) there exists x ∈ fv(t) \ fv(s) such that (x, y) ∈ Γ.

Lemma D.3 (Properties of α-equivalence up-to). Let t, s, u be terms:
1. Reflexivity: t =α^∅ t.
2. Coverage: let Γ be a renaming for (t, s). If t =α^Γ s then Γ covers (t, s).
3. Monotonicity: let Γ ⊆ Γ′ be renamings for (t, s). If t =α^Γ s then t =α^{Γ′} s.
4. Sufficience: let Γ ⊆ Γ′ covering (t, s). If t =α^{Γ′} s then t =α^Γ s.

Proof.
1. Easy structural induction on t.
2. By induction on t =α^Γ s:
• Variables: suppose z =α^Γ w. Then fv(z) = {z} and fv(w) = {w}. If z = w there is nothing to prove because fv(z) \ fv(w) = fv(w) \ fv(z) = ∅. Otherwise (z, w) ∈ Γ, and we conclude easily.
• Application: let t = t1 t2 and s = s1 s2. By inversion of Application, ti =α^Γ si for i = 1, 2. By i.h., Γ covers both (ti, si); we need to show that it covers (t, s) as well. We prove one of the two directions, the other is symmetric. Let z ∈ fv(t) \ fv(s): by the definition of fv(·), z ∈ fv(ti) for some i. Moreover z ∉ fv(si) because by hypothesis z ∉ fv(s). Because Γ covers (ti, si), there exists w ∈ fv(si) \ fv(ti) such that (z, w) ∈ Γ. Then w ∉ fv(t) by Crossed independence for Γ as a renaming for (t, s). Therefore w ∈ fv(s) \ fv(t) and we conclude.
• Abstraction: let t = λx.t′ and s = λy.s′. By inversion of Same/Different abstracted variables, t′ =α^{Γ′} s′ where Γ′ may be Γ or Γ ∪ {(x, y)}. By i.h. Γ′ covers (t′, s′); we need to show that Γ covers (t, s). We prove one of the two directions, the other is symmetric. Let z ∈ fv(t) \ fv(s): by the definition of fv(·), z ∈ fv(t′) and z ≠ x. Because Γ′ covers (t′, s′), there exists w ∈ fv(s′) \ fv(t′) such that (z, w) ∈ Γ′. Because z ≠ x, (z, w) ∈ Γ. By Crossed independence for Γ, w ∉ bv(s) and thus w ≠ y. Therefore w ∈ fv(s) \ fv(t) and we conclude.
3. By induction on t =α^Γ s:
• Variables: immediate.
• Application: let t = t1 t2 and s = s1 s2. By inversion of Application, ti =α^Γ si for i = 1, 2. Since Γ′ is a renaming for (t, s), it is a renaming for (ti, si) as well. By i.h., ti =α^{Γ′} si. By Application, t =α^{Γ′} s.
• Abstraction: let t = λx.t′ and s = λy.s′. The case x = y is easy: by inversion of Same abstracted variable t′ =α^Γ s′, by i.h. t′ =α^{Γ′} s′, and we conclude by Same abstracted variable. If x ≠ y, by inversion of Different abstracted variables t′ =α^{Γ∪{(x,y)}} s′. Note that Γ′ ∪ {(x, y)} is a renaming for (t′, s′) because: Γ′ is a renaming for (t, s); Γ ∪ {(x, y)} is a renaming for (t′, s′); x does not occur in Γ′ because x ∈ bv(t); y does not occur in Γ′ because y ∈ bv(s). Also, Γ ∪ {(x, y)} ⊆ Γ′ ∪ {(x, y)}. By i.h., t′ =α^{Γ′∪{(x,y)}} s′, and by Different abstracted variables t =α^{Γ′} s.
4. By induction on t =α^{Γ′} s:
• Variables: immediate.
• Application: let t = t1 t2 and s = s1 s2. By inversion of Application, ti =α^{Γ′} si for i = 1, 2. Since Γ covers (t, s), it covers (ti, si) as well. By i.h., ti =α^Γ si. By Application, t =α^Γ s.
• Abstraction: let t = λx.t′ and s = λy.s′. The case x = y is easy: by inversion of Same abstracted variable t′ =α^{Γ′} s′; Γ covers (t′, s′) because it covers (t, s) and x = y ∈ fv(t′) = fv(s′); therefore by i.h. t′ =α^Γ s′ and we conclude by Same abstracted variable. If x ≠ y, by inversion of Different abstracted variables t′ =α^{Γ′∪{(x,y)}} s′. Γ ∪ {(x, y)} is a renaming for (t′, s′) because Γ ∪ {(x, y)} ⊆ Γ′ ∪ {(x, y)}. Γ ∪ {(x, y)} covers (t′, s′) because: Γ covers (t, s); fv(t′) = fv(t) ∪ {x}; fv(s′) = fv(s) ∪ {y}. By i.h., t′ =α^{Γ∪{(x,y)}} s′, and by Different abstracted variables t =α^Γ s.

3. Different variables 2: z =α^{Γ∪{(x,y)}} w because (z, w) ∈ Γ. Note that z ≠ x by the bijective property of renamings. Therefore, z{x←y} = z =α^Γ w.
– Application: t = u r =α^{Γ∪{(x,y)}} p q = s because u =α^{Γ∪{(x,y)}} p and r =α^{Γ∪{(x,y)}} q. Clearly the hypotheses on the renamings also hold with respect to the pair of left / right subterms of the applications. By i.h., u{x←y} =α^Γ p and r{x←y} =α^Γ q. Therefore, t{x←y} = (u r){x←y} = u{x←y} r{x←y} =α^Γ p q = s.
– Same abstracted variable: t = λz.u =α^{Γ∪{(x,y)}} λz.r = s because u =α^{Γ∪{(x,y)}} r. In order to apply the i.h. we only need to show that Γ ∪ {(x, y)} is a renaming for (u, r), but this is obvious (because (u, r) have fewer bound variables than (λz.u, λz.r)). Then by i.h. u{x←y} =α^Γ r, and so t{x←y} = λz.u{x←y} =α^Γ λz.r = s.
– Different abstracted variables: t = λz.u =α^{Γ∪{(x,y)}} λw.r = s because Γ ∪ {(x, y), (z, w)} is a valid renaming for the pair (u, r) and u =α^{Γ∪{(x,y),(z,w)}} r. Note that x ≠ z and y ≠ w by the crossed independence property of renamings for pairs. By i.h., u{x←y} =α^{Γ∪{(z,w)}} r and so t{x←y} = λz.u{x←y} =α^Γ λw.r = s.
• ⇐) By induction on t. Cases:
– Variable: t = y is impossible, because by the hypotheses on renamings y ∉ fv(t). For the case where the two variables are the same there are two subcases:
1. t = x and x{x←y} = y =α^Γ y. Then, x =α^{Γ∪{(x,y)}} y.
2. t = z and z{x←y} = z =α^Γ z. Then, z =α^{Γ∪{(x,y)}} z.
For the case of different variables there are two subcases:
1. The case x{x←y} = y =α^Γ z because (y, z) ∈ Γ is impossible, because by hypothesis Γ ∪ {(x, y)} is a renaming and so there cannot be another pair involving y in Γ.
2. z{x←y} = z =α^Γ w because (z, w) ∈ Γ. Then, z =α^{Γ∪{(x,y)}} w.
– Application: t{x←y} = (u r){x←y} = u{x←y} r{x←y} =α^Γ p q = s because u{x←y} =α^Γ p and r{x←y} =α^Γ q. Clearly the hypotheses on the renamings also hold with respect to the pair of left / right subterms of the applications. By i.h., u =α^{Γ∪{(x,y)}} p and r =α^{Γ∪{(x,y)}} q. Therefore, t = u r =α^{Γ∪{(x,y)}} p q = s.
– Same abstracted variable: t{x←y} = λz.u{x←y} =α^Γ λz.r = s because u{x←y} =α^Γ r. By i.h., u =α^{Γ∪{(x,y)}} r. Therefore, t = λz.u =α^{Γ∪{(x,y)}} λz.r = s.
– Different abstracted variables: t{x←y} = λz.u{x←y} =α^Γ λw.r = s because

□

D.3 Parametric α-equivalence vs α-equivalence

Lemma D.4 and Lemma D.5 show that each pair in a renaming is equivalent to a single capture-avoiding substitution. Iterating the lemmas, one can eventually reduce to the empty renaming. The lemmas will be used in the proofs of Lemma C.5 and Lemma C.6.

Lemma D.4. Let Γ ∪ {(x, y)} be a renaming for the pair (t, s). Then Γ is a renaming for the pair (t{x←y}, s).

Proof. Define Γ′ := Γ ∪ {(x, y)}. Then:
• Γ is a renaming: obvious, because Γ ∪ {(x, y)} is.
• Γ is a renaming for (t{x←y}, s): let (z, w) ∈ Γ. We need to show that z ∉ bv(t{x←y}) ∪ v(s) and w ∉ bv(s) ∪ v(t{x←y}). By Crossed independence for Γ′, z ∉ bv(t) ∪ v(s) = bv(t{x←y}) ∪ v(s), and so the requirements for z are satisfied, and w ∉ bv(s) ∪ v(t). Since v(t{x←y}) = v(t) ∪ {y} \ {x} (since y ∉ bv(t)), and w is distinct from y by the hypothesis that Γ′ is a renaming, we have w ∉ bv(s) ∪ v(t{x←y}). □

Lemma D.5. Let Γ ∪ {(x, y)} be a renaming for the pair (t, s). Then t =α^{Γ∪{(x,y)}} s if and only if t{x←y} =α^Γ s.

Γ ∪ {(z, w)} is a valid renaming for the pair (u{x←y}, r)

Proof.
• ⇒) By induction on t =α^{Γ∪{(x,y)}} s. Cases:
– Variable: by the crossed independence property of the renaming Γ ∪ {(x, y)} for the pair (t, s) we have that s cannot be x and t cannot be y, so the cases x =α^{Γ∪{(x,y)}} x, y =α^{Γ∪{(x,y)}} y, y =α^{Γ∪{(x,y)}} x, y =α^{Γ∪{(x,y)}} z, and z =α^{Γ∪{(x,y)}} x are impossible. There are three cases:
1. Same variable: if z =α^{Γ∪{(x,y)}} z then z{x←y} = z =α^Γ z.
2. Different variables 1: x =α^{Γ∪{(x,y)}} y. Then x{x←y} = y =α^Γ y.

(1)

and u{x←y} =α^{Γ∪{(z,w)}} r. First, we have to prove that Γ′ := Γ ∪ {(x, y), (z, w)} is a valid renaming for (u, r):
1. Γ′ is a renaming: by (1), z and w are distinct from every other name in Γ. By hypothesis, Γ ∪ {(x, y)} is a valid renaming for the pair (λz.u, λw.r), and by the crossed independence property of renamings for pairs, x ≠ z, x ≠ w, y ≠ z, and y ≠ w.
2. Γ′ is a renaming for (u, r):
a. Requirements for (z, w): we have to prove that z ∉ bv(u) ∪ v(r) and w ∉ bv(r) ∪ v(u). By (1) and crossed independence, z ∉ bv(u{x←y}) ∪ v(r) = bv(u) ∪ v(r), and so the requirements for z are satisfied, and w ∉ bv(r) ∪ v(u{x←y}) = bv(r) ∪ (v(u) ∪ {y} \ {x}). Since we already proved that y ≠ w and x ≠ w, we have that w ∉ bv(r) ∪ (v(u) ∪ {y} \ {x}) if and only if w ∉ bv(r) ∪ v(u), and so the requirements for w are also satisfied.
b. Requirements for (x, y): we have to prove that x ∉ bv(u) ∪ v(r) and y ∉ bv(r) ∪ v(u). By hypothesis, Γ ∪ {(x, y)} is a valid renaming for the pair (λz.u, λw.r), so that x ∉ bv(λz.u) ∪ v(λw.r) = (bv(u) ∪ {z}) ∪ (v(r) ∪ {w}) and y ∉ bv(λw.r) ∪ v(λz.u) = (bv(r) ∪ {w}) ∪ (v(u) ∪ {z}). The requirements are then satisfied because we already proved that x, y, z, and w are all distinct.
Second, we have to prove that t = λz.u =α^{Γ∪{(x,y)}} λw.r = s. Since Γ′ is a renaming for (u, r), we can apply the i.h. and obtain u =α^{Γ′} r, from which the thesis follows. □

We can now relate parametric α-equivalence with empty parameter to usual α-equivalence:

Proof of (Lemma C.5). If t $=_\alpha^{\emptyset}$ s then t =α s.

Proof. By induction on t $=_\alpha^{\emptyset}$ s. In all cases but t = λx.u $=_\alpha^{\emptyset}$ λy.r = s it is either evident (same free variable), or impossible (different free variable), or it follows immediately from the i.h. (application and abstraction on the same variable). We then consider the non-trivial case of different abstracted variables, where λx.u $=_\alpha^{\emptyset}$ λy.r with {(x, y)} renaming for the pair (u, r) and u $=_\alpha^{\{(x,y)\}}$ r. By Lemma D.5, u{x←y} $=_\alpha^{\emptyset}$ r, and by i.h. u{x←y} =α r. Then λx.u{x←y}{y←x} =α λy.r by definition of α-equivalence. Because {(x, y)} is a renaming for the pair (u, r), x ∉ bv(u) and y ∉ v(u). Therefore by Lemma A.6, u = u{x←y}{y←x}, hence λx.u =α λy.r. □

For the other direction one needs commonly shareable terms, because generic terms may require renamings violating the conditions in Definition C.3.

Proof of (Lemma C.6). Let t and s be commonly shareable. Then t =α s implies t $=_\alpha^{\emptyset}$ s.

Proof. By induction on t =α s. The non-trivial case is when t = λx.u =α λy.r{x←y} = s with u =α r and y ∉ fv(r). Note that:
1. y ∉ fv(u), because u =α r implies fv(u) = fv(r) by Lemma A.7.1;
2. y ∉ bv(u), by the hypothesis on commonly shareable terms, that otherwise would force s to be α-equivalent with a term having s as strict subterm, absurd;
3. y ∉ bv(r), by the hypothesis on commonly shareable terms, that otherwise would force s to have itself as a strict subterm;
4. bv(r) = bv(r{x←y}) because by the previous point the substitution {x←y} never renames bound names in r;
5. x ∉ bv(r{x←y}) by the hypothesis on commonly shareable terms;
6. x ∉ bv(r) by Point 4 and Point 5;
7. x ∉ fv(r{x←y});
8. x ∉ bv(u), by the hypothesis on commonly shareable terms, similarly as in Point 3;
9. Then {(x, y)} is a renaming for (u, r{x←y}).

Let us show that u and r are commonly shareable:
• u is shareable. Just because λx.u is shareable by hypothesis;
• r is shareable. Let r = C⟨λz.r₁⟩ = D⟨λz.r₂⟩. First of all, note that the hypothesis on t and s commonly shareable implies that x ≠ z ≠ y, for size reasons. Then we have s = λy.C⟨λz.r₁⟩{x←y} = λy.D⟨λz.r₂⟩{x←y}. Since both x and y are not in bv(r) (Point 3 and Point 6), we have C⟨λz.r₁⟩{x←y} = C{x←y}⟨λz.r₁{x←y}⟩ and D⟨λz.r₂⟩{x←y} = D{x←y}⟨λz.r₂{x←y}⟩. Then r₁{x←y} = r₂{x←y} by the fact that s is shareable. Since y ∉ v(r) ⊇ v(r₁) ∪ v(r₂), this implies r₁ = r₂, that is, r is shareable.
• u and r are commonly shareable. Let u = C⟨λz.u′⟩ and r = D⟨λz.r′⟩. As before, the hypothesis on t and s commonly shareable implies that x ≠ z ≠ y, for size reasons. Since both x and y are not in bv(r) (Point 3 and Point 6), it follows that r{x←y} = D⟨λz.r′⟩{x←y} = D{x←y}⟨λz.r′{x←y}⟩. Then r′{x←y} = u′ because t = λx.C⟨λz.u′⟩ and s = λy.D{x←y}⟨λz.r′{x←y}⟩ are commonly shareable. Note that Point 1 and Point 2 give y ∉ v(u), that implies y ∉ v(u′), in turn giving x ∉ fv(r′) (otherwise r′{x←y} = u′ cannot hold). Moreover, y ∉ bv(r) (Point 3) implies r′ = r′{x←y}, that is, r′ = u′. Therefore, u and r are commonly shareable.

Now, we are allowed to apply the i.h. to u =α r, obtaining u $=_\alpha^{\emptyset}$ r. By Lemma A.6, r = r{x←y}{y←x}, so that u $=_\alpha^{\emptyset}$ r{x←y}{y←x}. By Lemma D.5 applied to u $=_\alpha^{\emptyset}$ r{x←y}{y←x} and the fact that {(x, y)} is a renaming for (u, r{x←y}), it follows u $=_\alpha^{\{(x,y)\}}$ r{x←y}, and so t = λx.u $=_\alpha^{\emptyset}$ λy.r{x←y} = s by definition of $=_\alpha^{\Gamma}$. □

E  Parametric α-equivalence vs. Sharing equivalence

The name pairing induced by a blind sharing equivalence on queried nodes is empty:

Lemma E.1. Let ∼ be well-scoped, and ≡ a blind sharing equivalence containing ∼. Then $\Gamma^{\equiv}_{n,m}$ = ∅ for every n ∼ m.

Proof. By contradiction, let (⟦v⟧, ⟦w⟧) ∈ $\Gamma^{\equiv}_{n,m}$. By definition of $\Gamma^{\equiv}_{n,m}$, b(v) and b(w) are above respectively n and m. Because n ∼ m and ∼ is well-scoped, b(v) is above m as well. By Nesting property (Lemma B.4), b(v) →+ m →∗ w implies that b(v) →∗ b(w) or vice versa. Because b(v) ≡ b(w) and ≡ is height-preserving by Lemma B.7, then necessarily b(v) = b(w), and hence v = w. This contradicts the definition of $\Gamma^{\equiv}_{n,m}$. □

Corollary E.2. If ≡ is a blind sharing equivalence, $\Gamma^{\equiv}_{n,n}$ = ∅ for every node n.

Proof. Follows from Lemma E.1, by taking = as ∼, noting that = is trivially scoped and contained in every congruence. □


Name pairings induced by a blind sharing equivalence are renamings:

Proof of (Lemma C.8). Let ≡ be a blind sharing equivalence on a term forest G, and n and m two nodes of G. Then the induced name pairing $\Gamma^{\equiv}_{n,m}$ is a renaming for (⟦n⟧, ⟦m⟧).

Proof. We show that the requirements from Definition C.3 hold:
• Distinct names:
– Let (⟦v⟧, ⟦w⟧) ∈ $\Gamma^{\equiv}_{n,m}$: then ⟦v⟧ ≠ ⟦w⟧ because v ≠ w by the definition of $\Gamma^{\equiv}_{n,m}$.
– Let (⟦v⟧, ⟦w₁⟧), (⟦v⟧, ⟦w₂⟧) ∈ $\Gamma^{\equiv}_{n,m}$: we show that necessarily ⟦w₁⟧ = ⟦w₂⟧. From the definition of $\Gamma^{\equiv}_{n,m}$, b(v) ≡ b(w₁) and b(v) ≡ b(w₂). If w₁ ≠ w₂, then (⟦w₁⟧, ⟦w₂⟧) ∈ $\Gamma^{\equiv}_{m,m}$, impossible by Corollary E.2.
– Similarly, (⟦v₁⟧, ⟦w⟧), (⟦v₂⟧, ⟦w⟧) ∈ $\Gamma^{\equiv}_{n,m}$ implies ⟦v₁⟧ = ⟦v₂⟧.
• Crossed independence: let (⟦v⟧, ⟦w⟧) ∈ $\Gamma^{\equiv}_{n,m}$, and let's prove that ⟦v⟧ ∉ bv(⟦n⟧) ∪ v(⟦m⟧) (the other case for ⟦w⟧ is symmetric):
– ⟦v⟧ ∉ bv(⟦n⟧): assume ⟦v⟧ ∈ bv(⟦n⟧) and derive a contradiction. By Lemma D.1, ⟦v⟧ ∈ bv(⟦n⟧) implies that n →∗ b(v), hence generating the cycle b(v) →+ n →∗ b(v). This contradicts the hypothesis that G is a DAG.
– ⟦v⟧ ∉ fv(⟦m⟧): assume ⟦v⟧ ∈ fv(⟦m⟧) and derive a contradiction. By Lemma D.1 and since v has a binder, b(v) →+ m →∗ v. By definition of $\Gamma^{\equiv}_{n,m}$ also b(w) →+ m →∗ w, b(v) ≡ b(w), and v ≠ w. Therefore (⟦v⟧, ⟦w⟧) ∈ $\Gamma^{\equiv}_{m,m}$, absurd by Corollary E.2.
– ⟦v⟧ ∉ bv(⟦m⟧): assume ⟦v⟧ ∈ bv(⟦m⟧) and derive a contradiction. By Lemma D.1, ⟦v⟧ ∈ bv(⟦m⟧) implies that m →∗ b(v), hence generating the cycle b(v) →+ n ≡ m →∗ b(v). This contradicts the hypothesis that ≡ is height-preserving or that G is a DAG. □
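The two groups of requirements verified above (distinct names, crossed independence) can also be phrased as an executable check on plain terms. This is a hedged Python sketch: it assumes that Definition C.3, which is not reproduced in this section, asks exactly for the conditions used in the proof, and the tuple encoding of terms and the names `bv`, `v` and `is_renaming_for` are illustrative.

```python
# Terms: ("var", x) | ("app", t, s) | ("lam", x, body)  (assumed encoding)

def bv(t):
    """Names bound by an abstraction inside t."""
    if t[0] == "var":
        return set()
    if t[0] == "app":
        return bv(t[1]) | bv(t[2])
    return {t[1]} | bv(t[2])

def v(t):
    """All names occurring in t, free or bound."""
    if t[0] == "var":
        return {t[1]}
    if t[0] == "app":
        return v(t[1]) | v(t[2])
    return {t[1]} | v(t[2])

def is_renaming_for(gamma, t, s):
    """Check the conditions used in the proof for a set of pairs gamma.

    Assumption: these conditions are the content of Definition C.3.
    """
    lefts = [x for x, _ in gamma]
    rights = [y for _, y in gamma]
    # Distinct names: x != y in every pair, and no name is paired twice.
    distinct = (all(x != y for x, y in gamma)
                and len(set(lefts)) == len(lefts)
                and len(set(rights)) == len(rights))
    # Crossed independence: x not in bv(t) ∪ v(s), y not in bv(s) ∪ v(t).
    crossed = all(x not in bv(t) | v(s) and y not in bv(s) | v(t)
                  for x, y in gamma)
    return distinct and crossed

t = ("app", ("var", "x"), ("lam", "z", ("var", "z")))
s = ("app", ("var", "y"), ("lam", "w", ("var", "w")))
ok = is_renaming_for({("x", "y")}, t, s)          # accepted
bad_bound = is_renaming_for({("z", "w")}, t, s)   # z is bound in t
bad_twice = is_renaming_for({("x", "y"), ("x", "q")}, t, s)  # x paired twice
```

The pair (x, y) is accepted for these two terms, while pairing the bound name z, or pairing x with two different names, is rejected.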

First an auxiliary lemma that we are going to use during induction in the proofs of Theorem C.9 and Theorem E.7.

Lemma E.3. Let ≡ be a blind sharing equivalence such that n ≡ m, where n and m are distinct with children resp. n₁, n₂ and m₁, m₂. Let Γ = $\Gamma^{\equiv}_{n,m}$ and Γᵢ = $\Gamma^{\equiv}_{n_i,m_i}$ for i = 1, 2. Then:
1. if n, m are App-nodes: Γᵢ ⊆ Γ.
2. if n, m are Lam-nodes: Γᵢ ⊆ Γ ⊎ {(⟦n₁⟧, ⟦m₁⟧)}.

Proof.
1. Let (⟦v⟧, ⟦w⟧) ∈ Γᵢ: we show that the binders b(v) and b(w) are actually above resp. n and m, and therefore (⟦v⟧, ⟦w⟧) ∈ Γ. We discuss only b(v), the case for b(w) is similar. By hypothesis b(v) →+ nᵢ →∗ v. Since n →ᵢ nᵢ, by Domination (Definition B.3) b(v) →∗ n, and since n ≠ b(v) (because n is an App-node), b(v) →+ n. Similarly, b(w) is above m.
2. Clearly Γ₁ = {(⟦n₁⟧, ⟦m₁⟧)} because n ≡ m. (⟦n₁⟧, ⟦m₁⟧) ∉ Γ by the definition of Γ, because n₁ and m₁ are bound resp. in n and m. Let (⟦v⟧, ⟦w⟧) ∈ Γᵢ: b(v) →+ nᵢ →∗ v, and by Domination (Definition B.3), b(v) →∗ n →∗ v. Similarly b(w) →∗ m →∗ w. If v = n₁ then w = m₁ (because if w ≠ m₁ then (⟦w⟧, ⟦m₁⟧) ∈ $\Gamma^{\equiv}_{m_i,m_i}$, which is impossible by Corollary E.2); and vice versa. Otherwise v ≠ n₁ and w ≠ m₁, and therefore b(v) →+ n →∗ v and b(w) →+ m →∗ w. Moreover v ≠ w because (⟦v⟧, ⟦w⟧) ∈ Γᵢ. Thus (⟦v⟧, ⟦w⟧) ∈ Γ. □

One of the main dependencies of Proposition 3.8, stating that if there is a sharing equivalence relating two nodes, then their read-backs are parametrically α-equivalent.

Proof of (Theorem C.9). Let ≡ be a sharing equivalence on a term forest G, and n and m two nodes of G such that n ≡ m. Then ⟦n⟧ $=_\alpha^{\Gamma}$ ⟦m⟧ where Γ is the renaming induced by ≡ on n and m (i.e. Γ ≔ $\Gamma^{\equiv}_{n,m}$).

Proof. First note that Γ = $\Gamma^{\equiv}_{n,m}$ is a renaming for (⟦n⟧, ⟦m⟧) by Lemma C.8 (because ≡ is in particular a blind sharing equivalence). Let's prove that ⟦n⟧ $=_\alpha^{\Gamma}$ ⟦m⟧ by induction on h(n) = h(m) (equal by Lemma B.7). First note that if n = m, then Γ = ∅ by Corollary E.2, and ⟦n⟧ $=_\alpha^{\emptyset}$ ⟦m⟧ = ⟦n⟧ by reflexivity of α-conversion up-to ∅ (Lemma D.3.1). Let's assume then that n ≠ m, and consider three cases:
• h(n) = 0, i.e. n and m are both Var-nodes. By Free for ≡, n and m have both a binder because n ≠ m. By definition Γ = {(⟦n⟧, ⟦m⟧)} (because n ≡ m by hypothesis, and b(n) ≡ b(m) by Bound for ≡). Clearly ⟦n⟧ $=_\alpha^{\Gamma}$ ⟦m⟧ by the rule Different variables.
• h(n) > 0 and n and m are App-nodes resp. App(n₁, n₂) and App(m₁, m₂). By Bisimulation for ≡, n ≡ m implies that nᵢ ≡ mᵢ. By i.h. ⟦nᵢ⟧ $=_\alpha^{\Gamma_i}$ ⟦mᵢ⟧ for Γᵢ = $\Gamma^{\equiv}_{n_i,m_i}$. By Lemma E.3 Γᵢ ⊆ Γ, and by Lemma D.3.3 we obtain ⟦nᵢ⟧ $=_\alpha^{\Gamma}$ ⟦mᵢ⟧. By the rule Application of α-conversion up-to, ⟦n₁⟧ ⟦n₂⟧ $=_\alpha^{\Gamma}$ ⟦m₁⟧ ⟦m₂⟧, and conclude with ⟦App(n₁, n₂)⟧ $=_\alpha^{\Gamma}$ ⟦App(m₁, m₂)⟧.
• h(n) > 0 and n and m are Lam-nodes resp. Lam(v, n′) and Lam(w, m′). By Bisimulation for ≡, n ≡ m implies that v ≡ w and n′ ≡ m′. By the assumption n ≠ m it follows that v ≠ w. By i.h. ⟦v⟧ $=_\alpha^{\Gamma_1}$ ⟦w⟧ for Γ₁ = $\Gamma^{\equiv}_{v,w}$ and ⟦n′⟧ $=_\alpha^{\Gamma_2}$ ⟦m′⟧ for Γ₂ = $\Gamma^{\equiv}_{n',m'}$. Note that Γ₁ = {(⟦v⟧, ⟦w⟧)}. Two cases:
– (⟦v⟧, ⟦w⟧) ∈ Γ₂: by Different abstracted variables of α-conversion up-to, ⟦n⟧ $=_\alpha^{\Gamma_2 \setminus \Gamma_1}$ ⟦m⟧.
– (⟦v⟧, ⟦w⟧) ∉ Γ₂: then either v is not under n′, or w is not under m′. W.l.o.g., v is not under n′. We first prove that Γ₁ ∪ Γ₂ is a renaming for (⟦n′⟧, ⟦m′⟧). Since Γ₂ is a renaming for (⟦n′⟧, ⟦m′⟧), it suffices to discuss the new pair (⟦v⟧, ⟦w⟧) ∈ Γ₁:
1. ⟦v⟧ ∉ v(⟦n′⟧) by Lemma D.1 because v is not under n′ by the hypothesis, and its binder is above n′.
2. ⟦v⟧ ∉ fv(⟦m′⟧): if by contradiction ⟦v⟧ ∈ fv(⟦m′⟧), then b(v) →+ m′. By Domination (Definition B.3) b(v) = n →∗ m, which together with n ≡ m yields by Lemma B.7 the contradicting n = m.
3. ⟦v⟧ ∉ bv(⟦m′⟧): if by contradiction ⟦v⟧ ∈ bv(⟦m′⟧), then m′ →∗ b(v), yielding m →+ b(v) = n, which contradicts the fact that ≡ is height-preserving.
4. ⟦w⟧ ∉ fv(⟦m′⟧): assume that ⟦w⟧ ∈ fv(⟦m′⟧) and derive a contradiction. By Lemma D.3.2, Γ₂ covers (n′, m′): therefore either w is under n′ (which is impossible by Nesting property (Lemma B.4) and because ≡ is height-preserving) or there exists v′ under n′ such that (⟦v′⟧, ⟦w⟧) ∈ Γ₂. But then (⟦v⟧, ⟦v′⟧) ∈ $\Gamma^{\equiv}_{n',n'}$, absurd by Corollary E.2.
5. the other cases for ⟦w⟧ are similar to the ones above for ⟦v⟧.
Since obviously Γ₂ ⊆ Γ₁ ∪ Γ₂, by Lemma D.3.3 ⟦n′⟧ $=_\alpha^{\Gamma_1 \cup \Gamma_2}$ ⟦m′⟧. Again by Different abstracted variables of α-conversion up-to, ⟦n⟧ $=_\alpha^{\Gamma_2 \setminus \Gamma_1}$ ⟦m⟧.
By Lemma E.3 Γ₂ \ Γ₁ ⊆ Γ, and by Lemma D.3.3 ⟦n⟧ $=_\alpha^{\Gamma_2 \setminus \Gamma_1}$ ⟦m⟧ implies ⟦n⟧ $=_\alpha^{\Gamma}$ ⟦m⟧. □

Lemma E.4 is a technical lemma to prove Corollary E.5 below.


Lemma E.4. If for every n ∼ m, ⟦n⟧ $=_\alpha^{\Gamma}$ ⟦m⟧ for some Γ, then for every n ≈ m there are nodes $n = p^1, \dots, p^k = m$ and renamings $\Gamma^1, \dots, \Gamma^k$ such that $\llbracket p^j \rrbracket =_\alpha^{\Gamma^j} \llbracket p^{j+1} \rrbracket$ for 1 ≤ j < k.

Proof. Let's proceed by induction on the rules generating ≈:
• (≈ax) by the hypothesis.
• (≈ref) if n ≈ n, ⟦n⟧ $=_\alpha^{\emptyset}$ ⟦n⟧ by Lemma C.6.
• (≈tr) assume that n ≈ m because n ≈ n′ and n′ ≈ m. By i.h. one obtains two sequences of nodes and renamings, one from n ≈ n′ and one from n′ ≈ m. It is easy to see that the concatenation of the two sequences is enough to conclude.
• (≈sim) let n₁, n₂ and m₁, m₂ be the children resp. of n and m: we are going to prove the statement for nᵢ ≈ mᵢ. By i.h. from n ≈ m one obtains nodes $n = p^1, \dots, p^k = m$ and renamings $\Gamma^1, \dots, \Gamma^k$ such that $\llbracket p^j \rrbracket =_\alpha^{\Gamma^j} \llbracket p^{j+1} \rrbracket$ for 1 ≤ j < k. By the properties of α-equivalence up-to, these nodes must have all the same label, and therefore have children $p^j_i$ for i = 1, 2. By inversion of the rules of α-equivalence up-to, there are $\Gamma^j_i$ such that $\llbracket p^j_i \rrbracket =_\alpha^{\Gamma^j_i} \llbracket p^{j+1}_i \rrbracket$ for 1 ≤ j < k and i = 1, 2. Conclude with the two sequences obtained by taking i = 1 and i = 2. □

Corollary E.5. If for every n ∼ m, ⟦n⟧ $=_\alpha^{\Gamma}$ ⟦m⟧ for some Γ, then ≈ is a blind sharing equivalence.

Proof. It suffices to show that ≈ satisfies the Labels requirement. This follows from Lemma E.4 and from the fact that for all nodes p and q, ⟦p⟧ $=_\alpha^{\Gamma}$ ⟦q⟧ implies that p and q have the same label. □

Proof of (Lemma C.10). Let ∼ be a query on a term forest G. If ⟦n⟧ $=_\alpha^{\emptyset}$ ⟦m⟧ for every two queried nodes n ∼ m then
1. The query is well-scoped: ∼ is well-scoped;
2. The propagated query is sharing: ≈ is a blind sharing equivalence.

Proof.
1. For every n ∼ m, ⟦n⟧ $=_\alpha^{\emptyset}$ ⟦m⟧ implies by Lemma D.3.2 that ∅ covers (⟦n⟧, ⟦m⟧). By the definition of covering fv(⟦n⟧) = fv(⟦m⟧), and by Lemma D.1.3 n and m are in the same scopes.
2. By Corollary E.5. □

In the proof of Theorem E.7 we are going to use transitivity for parametric α-equivalence: note that the property does not hold for generic terms, but requires them to be commonly shareable.

Lemma E.6 (Transitivity of α up-to). Let t, s, u be commonly shareable, Γ a renaming for (t, s), Γ′ a renaming for (s, u), and Γ·Γ′ a renaming for (t, u). If t $=_\alpha^{\Gamma}$ s $=_\alpha^{\Gamma'}$ u, then t $=_\alpha^{\Gamma\cdot\Gamma'}$ u, where:

Γ·Γ′ ≔ {(x, z) | (x, y) ∈ Γ and (y, z) ∈ Γ′ and x ≠ z, OR: (x, z) ∈ Γ and (z, −) ∉ Γ′, OR: (−, x) ∉ Γ and (x, z) ∈ Γ′}.

Proof. By structural induction on t, s, u (that are of the same kind because alpha-equivalent by hypothesis):
• Variables: assume x $=_\alpha^{\Gamma}$ y $=_\alpha^{\Gamma'}$ z and Γ·Γ′ renaming for (x, z). A lot of cases:
– x = z: conclude by Same variables.
– x ≠ y ≠ z and x ≠ z: by Different variables. (x, z) ∈ Γ·Γ′ because (x, y) ∈ Γ and (y, z) ∈ Γ′.
– x = y ≠ z: by Different variables. (y, z) ∈ Γ·Γ′ because (−, y) ∉ Γ (since Γ is a renaming for (x, y) = (y, y)) and (y, z) ∈ Γ′.
– the remaining case is symmetric to the previous one.
• Applications: assume t₁t₂ $=_\alpha^{\Gamma}$ s₁s₂ $=_\alpha^{\Gamma'}$ u₁u₂. By inversion of Application, tᵢ $=_\alpha^{\Gamma}$ sᵢ $=_\alpha^{\Gamma'}$ uᵢ for i = 1, 2. Since Γ·Γ′ is a renaming for (t₁t₂, u₁u₂), then it is also for (tᵢ, uᵢ). By i.h., tᵢ $=_\alpha^{\Gamma\cdot\Gamma'}$ uᵢ. Conclude by Application.
• Lambda abstractions: assume λx.t $=_\alpha^{\Gamma}$ λy.s $=_\alpha^{\Gamma'}$ λz.u. Similarly as in the case for variables, there are a lot of cases:
– x = z: because λx.t and λz.u are commonly shareable, λx.t = λz.u. By Lemma D.3.1 λx.t $=_\alpha^{\emptyset}$ λz.u. By Lemma D.3.3, λx.t $=_\alpha^{\Gamma\cdot\Gamma'}$ λz.u.
– x ≠ y ≠ z ≠ x: by Different abstracted variables, t $=_\alpha^{\Gamma\cup\{(x,y)\}}$ s $=_\alpha^{\Gamma'\cup\{(y,z)\}}$ u. (Γ ∪ {(x, y)}) · (Γ′ ∪ {(y, z)}) = (Γ·Γ′) ∪ {(x, z)} because Γ and Γ′ are renamings. (Γ·Γ′) ∪ {(x, z)} is a renaming because: x ≠ z by hypothesis; Γ·Γ′ is by hypothesis a renaming for (λx.t, λz.u) and thus x and z do not occur in Γ·Γ′ because x ∈ bv(λx.t) and z ∈ bv(λz.u). (Γ·Γ′) ∪ {(x, z)} is in particular a renaming for (t, u) because: Γ·Γ′ is a renaming for (λx.t, λz.u); x ∉ bv(t) because λx.t is shareable; x ∉ bv(u) because λx.t and λz.u are commonly shareable and by reasoning on size; x ∉ fv(u) because by contradiction, x ∈ fv(u) implies x ∈ fv(λz.u) implies that x occurs in Γ·Γ′ by the requirement for Γ·Γ′ covering (λx.t, λz.u), impossible since x ∈ bv(λx.t). By i.h., t $=_\alpha^{(\Gamma\cdot\Gamma')\cup\{(x,z)\}}$ u. By Different abstracted variables, λx.t $=_\alpha^{\Gamma\cdot\Gamma'}$ λz.u.
– x = y ≠ z: since the terms are commonly shareable, it follows that t = s. By Same/Different abstracted variables, t $=_\alpha^{\Gamma}$ s $=_\alpha^{\Gamma'\cup\{(y,z)\}}$ u. Γ · (Γ′ ∪ {(y, z)}) = (Γ·Γ′) ∪ {(y, z)} because y does not occur in Γ since y ∈ bv(λy.s) and Γ is a renaming for (λx.t, λy.s). (Γ·Γ′) ∪ {(y, z)} is a renaming because: y ≠ z by hypothesis; y does not occur in Γ·Γ′ because Γ·Γ′ is a renaming for (λx.t, λz.u) = (λy.s, λz.u) and y ∈ bv(λy.s); z does not occur in Γ·Γ′, a renaming for (λy.s, λz.u), because z ∈ bv(λz.u). (Γ·Γ′) ∪ {(y, z)} is in particular a renaming for (t, u) because: Γ·Γ′ is a renaming for (t, u); t = s; and Γ′ ∪ {(y, z)} is a renaming for (s, u). Finally, by i.h. t $=_\alpha^{(\Gamma\cdot\Gamma')\cup\{(y,z)\}}$ u. By Different abstracted variables, λx.t = λy.t $=_\alpha^{\Gamma\cdot\Gamma'}$ λz.u.
– the other case is symmetric. □

The other main dependency of Proposition 3.8, stating that parametric α-equivalence propagates with the induced renamings.

Theorem E.7. If for all n ∼ m, ⟦n⟧ $=_\alpha^{\Gamma}$ ⟦m⟧ with Γ = $\Gamma^{\approx}_{n,m}$, then for all n ≈ m, ⟦n⟧ $=_\alpha^{\Gamma}$ ⟦m⟧ with Γ = $\Gamma^{\approx}_{n,m}$.

Proof. First note that by Corollary E.5 ≈ is a blind sharing equivalence. Therefore by Lemma C.8, Γ = $\Gamma^{\approx}_{n,m}$ is a renaming for (⟦n⟧, ⟦m⟧). Let's proceed by induction on the rules generating ≈:

• (≈ax) by the hypothesis.
• (≈ref) by Corollary E.2, $\Gamma^{\approx}_{n,n}$ = ∅. By Lemma C.6, ⟦n⟧ $=_\alpha^{\emptyset}$ ⟦n⟧.
• (≈tr) by i.h. Γ₁ = $\Gamma^{\approx}_{n,p}$ is a renaming for (⟦n⟧, ⟦p⟧) such that ⟦n⟧ $=_\alpha^{\Gamma_1}$ ⟦p⟧, and Γ₂ = $\Gamma^{\approx}_{p,m}$ is a renaming for (⟦p⟧, ⟦m⟧) such that ⟦p⟧ $=_\alpha^{\Gamma_2}$ ⟦m⟧. By Lemma E.6, ⟦n⟧ $=_\alpha^{\Gamma_1\cdot\Gamma_2}$ ⟦m⟧ if Γ₁·Γ₂ is a renaming for (⟦n⟧, ⟦m⟧), but we need to show ⟦n⟧ $=_\alpha^{\Gamma}$ ⟦m⟧ for Γ = $\Gamma^{\approx}_{n,m}$. Just note that Γ₁·Γ₂ ⊆ Γ by definition of $\Gamma^{\approx}_{\cdot,\cdot}$ and transitivity of ≈. And since Γ is a renaming for (⟦n⟧, ⟦m⟧), its subset Γ₁·Γ₂ is a renaming for (⟦n⟧, ⟦m⟧) as well. Conclude with ⟦n⟧ $=_\alpha^{\Gamma}$ ⟦m⟧ by Lemma D.3.3.
• (≈sim) The case n = m has already been covered by reflexivity, therefore we assume n ≠ m. Let n₁, n₂ and m₁, m₂ be the children resp. of n and m. By i.h. Γ = $\Gamma^{\approx}_{n,m}$ is a renaming for (⟦n⟧, ⟦m⟧) such that ⟦n⟧ $=_\alpha^{\Gamma}$ ⟦m⟧. Two cases:
– (Application nodes) By the rule Application of α-conversion up-to ⟦nᵢ⟧ $=_\alpha^{\Gamma}$ ⟦mᵢ⟧, but we need to show ⟦nᵢ⟧ $=_\alpha^{\Gamma_i}$ ⟦mᵢ⟧ for Γᵢ = $\Gamma^{\approx}_{n_i,m_i}$. By Lemma D.3.2, Γ is a cover on (nᵢ, mᵢ). We show that also Γᵢ covers (nᵢ, mᵢ). We show one direction of the requirement for a cover, the other is similar. Let nᵢ →∗ v and mᵢ ̸→∗ v; since Γ is a cover on (⟦nᵢ⟧, ⟦mᵢ⟧), there exists w such that mᵢ →∗ w and (⟦v⟧, ⟦w⟧) ∈ Γ. By definition of Γ, b(v) →+ n and b(w) →+ m with b(v) ≈ b(w), and thus b(w) →+ mᵢ →∗ w. Therefore (⟦v⟧, ⟦w⟧) ∈ Γᵢ by definition of Γᵢ. Conclude with ⟦nᵢ⟧ $=_\alpha^{\Gamma_i}$ ⟦mᵢ⟧ by Lemma D.3.4 because Γᵢ ⊆ Γ (by Lemma E.3).
– (Lambda nodes) Let Γ₁ = $\Gamma^{\approx}_{n_1,m_1}$ = {(⟦n₁⟧, ⟦m₁⟧)}. Γ₁ is a renaming for (⟦n₁⟧, ⟦m₁⟧), and ⟦n₁⟧ $=_\alpha^{\Gamma_1}$ ⟦m₁⟧. Let's turn to (⟦n₂⟧, ⟦m₂⟧). By the rule Different abstracted variables of α-conversion up-to, ⟦n₂⟧ $=_\alpha^{\Gamma\cup\Gamma_1}$ ⟦m₂⟧, but we need to show ⟦n₂⟧ $=_\alpha^{\Gamma_2}$ ⟦m₂⟧ for Γ₂ = $\Gamma^{\approx}_{n_2,m_2}$. By Lemma D.3.2, Γ ∪ Γ₁ is a cover. Let's show that Γ₂ is a cover too. We show one direction of the requirement for a cover, the other is similar. Let n₂ →∗ v and m₂ ̸→∗ v; since Γ ∪ Γ₁ covers (⟦n₂⟧, ⟦m₂⟧), there exists w such that m₂ →∗ w and (⟦v⟧, ⟦w⟧) ∈ Γ ∪ Γ₁. By definition of Γ ∪ Γ₁, b(v) →∗ n and b(w) →∗ m with b(v) ≈ b(w), and thus b(w) →+ m₂ →∗ w. Therefore (⟦v⟧, ⟦w⟧) ∈ Γ₂ by definition of Γ₂. Conclude with ⟦n₂⟧ $=_\alpha^{\Gamma_2}$ ⟦m₂⟧ by Lemma D.3.4 because Γ₂ ⊆ Γ ∪ Γ₁ (by Lemma E.3). □

Proof of (Lemma C.12). Let ∼ be a query on a term forest G. If ⟦n⟧ $=_\alpha^{\Gamma}$ ⟦m⟧ with Γ ≔ $\Gamma^{\approx}_{n,m}$ for every n ≈ m then ≈ is a sharing equivalence.

Proof. By Corollary E.5, ≈ is a blind sharing equivalence. In order to show that it is a sharing equivalence, it suffices to show:

• Free: assume v ≈ w for v that has no binder, and by contradiction v ≠ w. As above, by Theorem E.7 ⟦v⟧ $=_\alpha^{\Gamma}$ ⟦w⟧ for Γ = $\Gamma^{\approx}_{v,w}$, which implies (⟦v⟧, ⟦w⟧) ∈ $\Gamma^{\approx}_{v,w}$. But the definition of $\Gamma^{\approx}_{v,w}$ requires v and w to have a binder, absurd.
• Bound: assume v ≈ w and v has binder b(v). If v = w, then we conclude because ≈ is reflexive on b(v) by definition. Otherwise v ≠ w. By Theorem E.7 ⟦v⟧ $=_\alpha^{\Gamma}$ ⟦w⟧ for Γ = $\Gamma^{\approx}_{v,w}$, which implies (⟦v⟧, ⟦w⟧) ∈ $\Gamma^{\approx}_{v,w}$. By definition of $\Gamma^{\approx}_{v,w}$, v and w have binders in the ≈ relation. □

Proof of (Theorem C.13). Let ∼ be a query on a term forest G. If ⟦n⟧ $=_\alpha^{\emptyset}$ ⟦m⟧ for every two queried nodes n ∼ m then ∼ is well-scoped and ≈ is a sharing equivalence.

Proof. By Lemma C.10.1 ∼ is well-scoped. The rest is by Lemma C.11.2 and Lemma C.12. □

Proof of (Theorem 3.9). Let ∼ be a query on a term forest G. Then ⟦n⟧ =α ⟦m⟧ for every n ∼ m if and only if ∼ is well-scoped and ≈ is a sharing equivalence.

Proof.
• (⇒) For every n ∼ m: by Lemma C.2 ⟦n⟧ and ⟦m⟧ are commonly shareable; by Lemma C.6 the hypothesis ⟦n⟧ =α ⟦m⟧ implies ⟦n⟧ $=_\alpha^{\emptyset}$ ⟦m⟧. Conclude by Theorem C.13.
• (⇐) Let n ∼ m. By Theorem C.9, ⟦n⟧ $=_\alpha^{\Gamma}$ ⟦m⟧ with Γ = $\Gamma^{\approx}_{n,m}$. By Lemma E.1, $\Gamma^{\approx}_{n,m}$ = ∅. By Lemma C.5, ⟦n⟧ =α ⟦m⟧. □

We can finally turn to the proof of λ-universality of ≈.

Proof of (Proposition 3.8). Let ∼ be a well-scoped query. If there exists a sharing equivalence ≡ containing ∼ then the propagated query ≈ is the smallest sharing equivalence containing ∼.

Proof. ≡ is in particular a blind sharing equivalence, and so by Proposition 3.6 ≈ ⊆ ≡ and ≈ is the smallest blind sharing equivalence containing ∼. We only have to show that ≈ is a sharing equivalence. By Theorem C.9, ⟦n⟧ $=_\alpha^{\Gamma}$ ⟦m⟧ for every n ∼ m (with Γ = $\Gamma^{\equiv}_{n,m}$). By Lemma E.1, $\Gamma^{\approx}_{n,m}$ = $\Gamma^{\equiv}_{n,m}$ = ∅ for every n ∼ m. Hence by Theorem C.13, ≈ is a sharing equivalence. □
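The propagated query ≈ is generated from ∼ by the rules (≈ax), (≈ref), (≈tr) and (≈sim) used in the inductions of this appendix. As an illustration only, the following Python sketch computes such a closure by naive saturation. It is not the linear-time algorithm of the paper, and several points are assumptions: the relation is also closed under symmetry, (≈sim) relates the i-th children of related nodes whenever they have the same number of children, and the names `propagate` and `children` are hypothetical.

```python
def propagate(children, query):
    """Naively saturate a query into a relation closed under the rules.

    children: dict mapping each node to the list of its children.
    query: iterable of pairs (n, m) of nodes.
    Returns a set of pairs. Far from linear time: illustration only.
    """
    rel = {(n, n) for n in children}                 # (ref)
    rel |= {(n, m) for n, m in query}                # (ax)
    rel |= {(m, n) for n, m in query}                # symmetry (assumed)
    while True:
        new = set()
        for n, m in rel:
            # (tr): compose with every pair starting at m
            new |= {(n, p) for m2, p in rel if m2 == m}
            # (sim): relate corresponding children
            if len(children[n]) == len(children[m]):
                new |= set(zip(children[n], children[m]))
        if new <= rel:
            return rel
        rel |= new

# Two application nodes a = x y and b = z w, queried for equality.
forest = {"a": ["x", "y"], "b": ["z", "w"],
          "x": [], "y": [], "z": [], "w": []}
approx = propagate(forest, {("a", "b")})
```

On this forest the query a ∼ b propagates to x ≈ z and y ≈ w, but it does not relate x with w. Whether the result is a (blind) sharing equivalence still depends on the label and binder conditions discussed in the proofs.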


Proof of (Lemma C.11). Let ∼ be a query on a term forest G. If ⟦n⟧ $=_\alpha^{\emptyset}$ ⟦m⟧ for every two queried nodes n ∼ m then
1. The induced name pairing is empty: $\Gamma^{\approx}_{n,m}$ = ∅ for every two queried nodes n ∼ m.
2. Parametric α propagates, with the induced renaming: ⟦p⟧ $=_\alpha^{\Gamma'}$ ⟦q⟧ for every two propagatedly queried nodes p ≈ q with respect to the renaming Γ′ ≔ $\Gamma^{\approx}_{p,q}$ induced by the propagated query.

Proof.
1. By Lemma C.10.2 ≈ is a blind sharing equivalence. Conclude by Lemma E.1.
2. By Point 1 and Theorem E.7. □
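Theorem E.7, invoked in Point 2, handles the transitivity rule by composing renamings as in Lemma E.6. Here is a minimal Python sketch of the composition Γ·Γ′, transcribing the three clauses of its definition. Representing renamings as sets of pairs of names and the function name `compose` are assumptions made for illustration.

```python
def compose(gamma, gamma_p):
    """Composition Γ·Γ′ of two renamings given as sets of (name, name)."""
    lefts_p = {y for y, _ in gamma_p}    # names y with (y, -) in Γ′
    rights = {y for _, y in gamma}       # names y with (-, y) in Γ
    result = set()
    # Clause 1: (x, y) ∈ Γ, (y, z) ∈ Γ′ and x ≠ z.
    for x, y in gamma:
        for y2, z in gamma_p:
            if y == y2 and x != z:
                result.add((x, z))
    # Clause 2: (x, z) ∈ Γ and z is not renamed by Γ′.
    result |= {(x, z) for x, z in gamma if z not in lefts_p}
    # Clause 3: (x, z) ∈ Γ′ and x is not a target of Γ.
    result |= {(x, z) for x, z in gamma_p if x not in rights}
    return result

chained = compose({("x", "y")}, {("y", "z")})      # x to y, then y to z
undone = compose({("x", "y")}, {("y", "x")})       # x to y and back to x
passthrough = compose({("x", "y")}, {("u", "w")})  # unrelated pairs
```

Chaining x to y and y to z gives the single pair (x, z); a renaming followed by its inverse composes to the empty renaming because of the side condition x ≠ z; unrelated pairs pass through unchanged.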

F  Correctness and Completeness of the Algorithms

Proof of (Proposition 6.4). Let S be a good final state reachable from an initial state of query ∼. Then in S : 1. Every node has a canonic and there are no query edges. 2. ∼c is a blind sharing equivalence and coincides with the propagation ≈ of the initial query ∼.

Proof. Because the state is final, a = ∅.
1. By Dead have representatives and the hypothesis that a = ∅, c is defined on every node of the term forest. Moreover ∼q = ∅ and q = ∅ by Candidates are alive and the hypothesis that a = ∅.
2. By definition, ∼c is an equivalence relation on G iff every node has a canonic, which is true by Point 1. Moreover ∼qc = $\sim_c^{*}$ because ∼q = ∅ by Point 1. Therefore ∼qc = ∼c since $\sim_c^{*}$ = ∼c because equivalence relations are closed transitively. By Approximation, ∼c contains the query. To prove that ∼c is a blind sharing equivalence, it is sufficient to show that Label and Bisimulation hold:
• Label: it follows from the label invariant for good states;
• Bisimulation: it follows from the simulation invariant for good states, because ∼qc = ∼c;
Finally, by universality of ≈ (Proposition 3.6) and the fact that the blind sharing equivalence ∼c is contained in ≈ by Approximation, ≈ = ∼c. □
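The states manipulated in this appendix are tuples ⟨a, q, c, calls⟩, as spelled out in the proof of Theorem 6.5. To make the bookkeeping concrete, here is a small Python sketch of such a state together with the entry step of Kill(n) as that proof describes it: fail if n already has a canonic, otherwise set c(n) = n and push (n, ∅) on the call stack. The class and function names, the use of `None` for FAIL and the concrete container types are assumptions; the rest of the procedure is not modelled.

```python
from dataclasses import dataclass, field

@dataclass
class State:
    alive: set                                    # a: nodes still alive
    query: set                                    # q: pending query edges
    canonic: dict = field(default_factory=dict)   # c: node -> its canonic
    calls: list = field(default_factory=list)     # stack of (node, queue)

def kill_enter(state, n):
    """Entry step of Kill(n); returns the next state, or None for FAIL."""
    if n in state.canonic:            # c(n) already defined: failure
        return None
    canonic = dict(state.canonic)
    canonic[n] = n                    # c ∪ {(n, n)}
    calls = state.calls + [(n, frozenset())]   # calls + (n, ∅)
    return State(set(state.alive), set(state.query), canonic, calls)

def dying(state):
    """Nodes with an active call, in call order (derived from calls)."""
    return [n for n, _ in state.calls]

s1 = State(alive={"n", "m"}, query={("n", "m")})
s2 = kill_enter(s1, "n")
s3 = kill_enter(s2, "n")   # a second call on the same node fails
```

In this sketch the step leaves the alive set untouched and only extends the canonic assignment, in line with the monotonicity facts (c1 ⊆ c2, a2 ⊆ a1) stated at the beginning of the proof of Theorem 6.5.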

Lemma F.1 (Irreflexivity of ⇝). If n ⇝⁺ n for some n in a good state, then ≈ is not a blind sharing equivalence.

Proof. Let n be such that n ⇝⁺ n. Assume that ≈ is a blind sharing equivalence, and derive a contradiction. Because ∼qc ⊆ ≈ by Approximation, if n ⇝⁺ n then n (≈→≈)⁺ n. Conclude by Theorem 3.3.1. □

Proof of (Theorem 6.5). Let S be a good state. If S → S′, then:
• Completeness: if S′ = FAIL, then ≈ is not a blind sharing equivalence;
• Preservation of good states: otherwise S′ is good.

Proof of Theorem 6.5. We discuss separately each line of the algorithm, one procedure at a time. But first note the following, which hold after any successful state transition:
• c1 ⊆ c2, because canonical pointers are never deleted;
• (∼1c) ⊆ (∼2c) for the same reason;
• a2 ⊆ a1 because nodes are never marked alive;
For reasons of clarity, we are going to drop the term forest from (refined) states, denoting S = S1 = ⟨a1, q1, c1, calls1⟩ and S′ = S2 = ⟨a2, q2, c2, calls2⟩.

Procedure BlindSharingCheck
(1) We discuss line 1 together with the first two lines (without numbers on the side) of the procedure Kill(·). Every time Kill(n) is called in the while loop, either the algorithm fails, or the following state transition occurs:

S1 = ⟨a1, q1, c1, ∅⟩ → ⟨a1, q1, c1 ∪ {(n, n)}, [(n, ∅)]⟩ = S2

Let us show that the algorithm cannot fail here. It would fail if the canonic of n was already defined. Let us suppose that c1(n) is defined and show that a contradiction follows. Since n is alive, by Canonics die last for S1 and Alive canonics are dying for S1, c1(n) ∈ dying1, but dying1 is empty, because it is defined starting from calls1, that is itself empty. Absurd. Therefore the transition above takes place. Note that for the canonic relations we have
– canonic: (∼2c) = (∼1c ⊎ {(n, n)}), because c2 = c1 ⊎ {(n, n)} and c1(n) = undefined.
– extended canonic: it is easy to see that (∼2qc) = (∼1qc) because ∼qc is already closed reflexively by definition.
Now:
– Preservation of Invariants:
1. Propagated query: remember that (∼2c) = (∼1c ⊎ {(n, n)}) and (∼2qc) = (∼1qc). Because the invariant holds for S1, we only need to prove it for the new related pair n ∼2c n (in fact note that n ∼2c m iff n = m):
a. Label: holds because n has the same label as itself.
b. Simulation: holds because ∼qc is reflexive on every node of the term forest.
c. Approximation: trivial because (∼2qc) = (∼1qc).
2. Canonics: we check the invariants for n, because for the other nodes they follow from the hypothesis on S1.
a. Alive canonics are dying: n is alive and n is dying, as required.
b. Idempotency: c2(n) = n, as required.
c. Canonics die last: the invariant trivially holds because n is not dead.
d. Alive with canonic are queued: okay because n = n and in fact n ∉ queue2(n) = ∅.
3. Alive Nodes:
a. Alive nodes are downward closed: the set of alive nodes does not change, so the invariant is preserved.
b. Candidates are alive: the candidates and the set of alive nodes do not change, so the invariant is preserved.
c. Dying are still alive: dying2 = {n} and n is alive, as required.
d. Queues are alive: nothing new to prove because queue2(n) = ∅.
e. Dead have representatives: okay because a2 = a1.
4. Queues:
a. Queued nodes have right canonic: nothing to prove because queue2(n) = ∅.
b. Queues are sets: true because queue2(n) = ∅.
5. Dying Nodes:
a. Auto-canonic: c2(n) = n, as required.
b. Calls are on different nodes: dying2 = {n}, nothing to prove.
c. Dying order: dying2 = {n}, nothing to prove.

Procedure Kill
(2) We discuss line 2 together with the first two lines (without numbers on the side) of the procedure Kill(n). There are two cases: either the algorithm fails (because c1(n) is already defined) or the program state evolves (when c1(n) is undefined).
If the algorithm fails (because c1(n) is defined), we show that failure is correct, i.e. that ≈ is not a higher-order congruence. Let dying1 = [d₁, …, dₖ = d]. Since n is alive, by Canonics die last for S1 and Alive canonics are dying for S1 there exists dᵢ such that c1(n) = dᵢ, and so n ∼1c dᵢ (by Auto-canonic for S1). We now prove that ⇝¹ is cyclic. Note that n → d, hence dᵢ ∼1c n → d = dₖ, i.e. dᵢ ⇝¹ dₖ. Then, by Dying order for S1, dₖ (⇝¹)∗ dᵢ and so dᵢ (⇝¹)⁺ dᵢ: hence ≈ cannot be a higher-order congruence by Lemma F.1.
If the call does not fail, the following transition takes place:

S1 = ⟨a1, q1, c1, calls1⟩ → ⟨a1, q1, c1 ∪ {(n, n)}, calls1 + (n, ∅)⟩ = S2

If such a transition takes place observe that, as in the case of procedure BlindSharingCheck above, we have
– canonic: ∼2c = ∼1c ⊎ {(n, n)}, because c2 = c1 ⊎ {(n, n)}.
– extended canonic: (∼2qc) = (∼1qc) because ∼qc is already closed reflexively by definition.
Now:
– Preservation of Invariants: mostly as in the case of procedure BlindSharingCheck, the only minor changes are in the invariants for dying nodes.
1. Propagated query: exactly as in the case of procedure BlindSharingCheck.
2. Canonics: exactly as in the case of procedure BlindSharingCheck.
3. Alive Nodes:
a. Alive nodes are downward closed: the set of alive nodes does not change, so the invariant is preserved.
b. Candidates are alive: the candidates and the set of alive nodes do not change, so the invariant is preserved.
c. Dying are still alive: the new dying node n is alive and the old ones are as well by Dying are still alive.
d. Queues are alive: follows from Queues are alive for S1.
e. Dead have representatives: okay because a2 = a1.
4. Queues: it is essentially as in the case of procedure BlindSharingCheck. The only difference is that now there can be queues other than queue2(n) = ∅, but they are also in S1, so there is nothing else to prove.
5. Dying Nodes: here is the only point where things are slightly different with respect to procedure BlindSharingCheck, because now there is more than just one dying node. We have dying2 = dying1 ⊎ {n} and for the nodes in dying1 the invariant follows from the invariant for S1. For n:
a. Auto-canonic: c2(n) = n, as required.
b. Calls are on different nodes: by hypothesis c1(n) is undefined (otherwise the algorithm would fail) while by Auto-canonic for S1 all the nodes in dying1 have their canonical defined, and so n ≠ dᵢ for all dᵢ ∈ dying1.
c. Dying order: By hypothesis, n is a parent of d, that

(5) See procedure PushSetAndPropagate below.
(6) We discuss line 6 and the following one without a number. The following state transition occurs:

S1 = ⟨a1, q1, c1, calls1⟩ → ⟨a1 \ {h}, q1, c1, calls2⟩ = S2

The only difference between calls1 and calls2 is that queue1 (d ) = queue2 (d ) ⊎ {h}. Therefore clearly (∼2c ) = (∼1c ) and (∼2qc ) = (∼1qc ). Moreover h , d by Alive with canonic are queued for S 1 because h ∈ queue1 (d ) and s ∈ a1 (by Queues are alive for S 1 ). Now: – Preservation of Invariants: 1. Propagated query: a. Label: trivial, because (∼2c ) = (∼1c ) and (∼2qc ) = (∼1qc ). b. Simulation: trivial, because (∼2c ) = (∼1c ) and (∼2qc ) = (∼1qc ). c. Approximation: immediate because ∼1q = ∼2q and ∼1c = ∼2c . 2. Canonics: the canonical assignment does not change (i.e. c1 = c2 ) and the only node that is killed is h. Therefore all invariants are trivially preserved but Canonics die last that requires to prove that there is no alive node m such that c(m) = h. By absurdum, if such a node existed by Idempotency for S 1 one would have c(h) = h whereas c(h) = d by Queued nodes have right canonic. Therefore it would be the case h = d that is absurd (we already proved h , d). 3. Alive Nodes: a. Alive nodes are downward closed: a2 = a1 \ {h}. Therefore we need to prove that all ancestors of h in S 1 are dead. Because of Alive nodes are downward closed for S 1 , dead1 is upward closed and therefore it is sufficient to prove that h has no parent in a1 . Similarly to what we will discuss on line 7, this property was established on line 4 and it is preserved in the following lines because the algorithm never changes the term forest, and never makes a dead node alive. b. Candidates are alive: again, similar to what we will discuss for line 7. c. Dying are still alive: it follows from the property for S 1 , because the dying set does not change and we now prove that the only node h that dies was not dying in S 1 . Indeed c1 (h) = d by Queued nodes have right canonic and for every di in the dying set c1 (di ) = di by Auto-canonic. Therefore h could only be equal to the dying node d, but we already proved h , d. d. 
Queues are alive: h is the only node that dies and therefore it is sufficient to show that for every dying node di , h < queue2 (di ). By Queued nodes have right canonic for S 1 and Calls are on different nodes for S 1 , all queues have distinct elements, h ∈ queue1 (di ) by Queued nodes have right canonic for S 1 and thus h does not belong to the other queues. Therefore it

2

is, n → d, and therefore n { d. (3) See procedure PushSetAndPropagate below. (4) Identical to the proof for line 2, except for: 1

2494 2495 2496 2497 2498 2499

2504

2506 2507 2508 2509 2510

2449 2450

2503

2505

2444

2446

2502

– in that for line 2 to deal with failure we conclude n { d from n → d. Here we reach the same conclusion from n → h ∼1qc d; – similarly, Dying order for n holds because n = dk +1 → hk = s ∼c dk (while for line 2 it holds because n = dk +1 → dk = d).

2511 2512 2513 2514 2515 2516 2517 2518 2519 2520 2521 2522 2523 2524 2525 2526 2527 2528 2529 2530 2531 2532 2533 2534 2535 2536 2537 2538 2539 2540 2541 2542 2543 2544 2545 2546 2547 2548 2549 2550 2551 2552 2553 2554 2555 2556 2557 2558 2559 2560 2561

2500

21 2501

2562

PL’17, January 01–03, 2017, New York, NY, USA

belongs to no queue in S2, because it is removed from queue2(d).
e. Dead have representatives: h is the only node that dies, but c1(h) = d ≠ undefined by Queued nodes have right canonic for S1.
4. Queues: dying nodes are preserved and all queues stay the same except for queue(d), which shrinks (queue1(d) = queue2(d) ⊎ {h}). Hence the invariants are trivially preserved.
5. Dying Nodes: the set of dying nodes does not change, and so the invariants are trivially preserved.
(7) The line of the algorithm induces the following transition of states:

S1 = ⟨a1, q1, c1, calls2 + (d, ∅)⟩ → ⟨a1 \ {d}, q1, c1, calls2⟩ = S2

where we know that queued1 is empty because that is the condition to exit from the while-loop before the line under analysis. Note that many of the notions derived from S1 and S2 coincide:
– (∼1c) = (∼2c): because c1 = c2.
– (∼1qc) = (∼2qc): because (∼1c) = (∼2c) and ∼1q = ∼2q.
Now:
– Preservation of Invariants:
1. Propagated query: immediate, because (∼1c) = (∼2c) and (∼1qc) = (∼2qc).
2. Canonics:
a. Alive canonics are dying: trivial, because it holds for S1.
b. Idempotency: trivial, because it holds for S1.
c. Canonics die last: because it holds for S1 and d is the only new dead node, we only need to prove that for all n such that c1(n) = d, n is dead in S2. If n was dead in S1, it is still dead and there is nothing to prove. Otherwise, by Alive with canonic are queued for S1, either n = d (and thus it is dead in S2) or n ∈ queue1(d) = ∅ (absurd).
d. Alive with canonic are queued: trivial, because it holds for S1.
3. Alive Nodes:
a. Alive nodes are downward closed: a2 = a1 \ {d}. Therefore we need to prove that all ancestors of d in S1 are dead. Because of Alive nodes are downward closed for S1, dead1 is upward closed, and therefore it is sufficient to prove that d has no parent in a1. The property holds because it was established in line 2 and is preserved in the following lines, because the algorithm never changes the term forest and never makes a dead node alive.
b. Candidates are alive: ∼1q = ∼2q, but a2 = a1 \ {d}. Therefore we need to prove that there is no n such that d ∼q n. The property holds because it was established in line 3 and is preserved in the following lines because:
i. undirected arcs are only added by PushSetAndPropagate, and only to children of alive nodes;
ii. d has no alive parent nodes (and thus cannot participate in new undirected arcs), because the property was established in line 2 (before line 3) and is preserved in the following lines, since the algorithm never changes the term forest and never makes a dead node alive.
c. Dying are still alive: it follows from the invariant for S1, because the only newly dead node d has been removed from the call stack, which contained no duplicates because of Calls are on different nodes for S1.
d. Queues are alive: d is the only node that dies, and therefore it is sufficient to show that for every dying node di of S2, d ∉ queue2(di). Assume that d ∈ queue2(di). By Queued nodes have right canonic for S1, c1(d) = di. Thus by Dying are still alive for S1 and Alive with canonic are queued for S1 we obtain d = di, which is absurd because d has been removed from the call stack, which contained no duplicates because of Calls are on different nodes for S1.
e. Dead have representatives: d is the only node that dies, but c2(d) = d ≠ undefined by Auto-canonic for S1.
4. Queues: it follows from the invariant for S1, because S2 only lacks one dying node and its queue (the others being preserved).
5. Dying Nodes: it follows from the invariant for S1, because in S2 there is one dying node less (d = dk) and the others are preserved.

Procedure PushSetAndPropagate. The procedure PushSetAndPropagate is called on lines 3 and 5 of Kill. In both cases, we discuss the call to PushSetAndPropagate together with the line “delete undirected edge (−, n)”. Note that we carry out the proof on the nodes r, s, and n, with the convention that on line 3 they stand respectively for r, r, and n. In both calls to PushSetAndPropagate, the following facts hold:
1. d is the topmost element of dying1;
2. d ∼1c h, because: on line 3, d ∼1c d by Auto-canonic for S1; on line 5, d ∼1c h by Auto-canonic for S1 and Queued nodes have right canonic for S1;
3. d ∼1qc n, because of the item above and h ∼1q n;
4. n ∈ a1, by Candidates are alive for S1;
5. d ∈ a1, by Dying are still alive for S1.

Each call to PushSetAndPropagate(queue, r, n) either fails on line 10 or on line 12, or makes the state evolve.

• Failure on line 10: by hypothesis d ∼1qc n, and so d ≈ n (by Approximation), but they have different labels. Therefore Label fails for ≈, which then is not a higher-order congruence.
• Failure on line 12: let dying1 = [r1, ..., dk = d]. By hypothesis c1(n) ≠ d = dk. Since n is alive, by Canonics die last and Alive canonics are dying for S1, c1(n) is dying, i.e. there is i such that c1(n) = di ≠ dk. By Dying order for S1, dk ⇝1+ di. Now note that c1(n) = di implies di ∼1qc n (by Auto-canonic for S1), which together with d ∼1qc n produces di ∼1qc n ∼1qc d = dk ⇝1+ di, that is di ⇝1+ di, i.e. the dying order is cyclic. Therefore ≈ is not a higher-order congruence by Lemma F.1.

If the call does not fail there are two cases:
• c1(n) is defined. Then the only difference between S1 and S2 is that q2 = q1 \ {(h, n)}. But then, since c1(n) = d, by Auto-canonic for S1, d ∼1c n, which together with d ∼1c h shows that h ∼1c n. Hence ∼1qc = ∼2qc. It is easy to see that all invariants are preserved: the only ones worth discussing are Candidates are alive (which is immediate) and Approximation (which holds because ∼1qc = ∼2qc).
• c1(n) is undefined. Then the following state transition occurs:

S1 = ⟨a1, q1, c1, calls + (d, queue1)⟩ → ⟨a1, q2, c1 ⊎ {(n, d)}, calls + (d, queue1 ⊎ {n})⟩ = S2

where q2 depends on whether d has children or not:
– if d is a var-node, then q2 = q1 \ {(h, n)}. Note that in this case ∼1qc = ∼2qc since h ∼2c n (because d ∼2c n, from c2(n) = d and Auto-canonic for S2, and d ∼1c h);
– otherwise, q2 = q1 \ {(h, n)} ∪ {(d1, n1), (d2, n2)}. In this case ∼1qc ⊆ ∼2qc because: ∼1c ⊆ ∼2c, q1 \ {(h, n)} ⊆ q2, and h ∼2c n. Note that, because q· is a multi-relation, q1 \ {(h, n)} may still contain occurrences of the pair (h, n).
– Preservation of Invariants:
1. Propagated query: recall that (∼1c) ≠ (∼2c) because c2 = c1 ⊎ {(n, d)}.
a. Label: let m ∼2c m′: we need to prove that m and m′ have the same label. If m ∼1c m′, the requirement follows from Label for S1. Assume then m ≁1c m′: this means that necessarily d = c2(m) = c2(m′). Note that either m or m′ is n, because otherwise m ∼1c m′; hence three cases:
∗ m = m′ = n: trivial, because n has the same label as itself;
∗ m = n and m′ ≠ n: c1(m′) = d implies m′ ∼1c d (by Auto-canonic for S1), which implies by Label for S1 that m′ and d have the same label. In order to conclude, it suffices to show that d and m = n have the same label: this is the case because of the check on line 10;
∗ m ≠ n and m′ = n: symmetric to the case above.
b. Simulation: let m ∼2c m′. If m ∼1c m′, the requirement follows from Simulation for S1 because ∼1qc ⊆ ∼2qc. Assume then m ≁1c m′: this means that necessarily d = c2(m) = c2(m′). Recall: we need to prove that if m →i mi then m′ →i mi′ and mi ∼2qc mi′. Note that either m or m′ is n, because otherwise m ∼1c m′; hence three cases:
∗ m = m′ = n: trivial, because if n →i ni then n →i ni and ni ∼2qc ni (because ∼2qc is reflexive);
∗ m = n and m′ ≠ n: suppose m →i mi = ni. By the requirement Label proved above, m and m′ have the same label (because m ∼2c m′), hence m′ →i mi′ for some mi′. c1(m′) = d implies m′ ∼1c d (by Auto-canonic for S1), which implies by Simulation for S1 that d →i di and mi′ ∼1qc di. Because mi = ni ∼2q di and ∼1qc ⊆ ∼2qc, we conclude with mi ∼2qc mi′;
∗ m ≠ n and m′ = n: (almost) symmetric to the case above.
c. Approximation: we have ∼1qc ⊆ ∼2qc. Therefore we just need to show the second inclusion of the invariant. Note that ≈ contains ∼2qc iff it contains ∼1qc and it contains {(d1, n1), (d2, n2)}. The first follows from Approximation for S1. The second follows from d ≈ n by definition of ≈.
2. Canonics: we only have to consider the new case c2(n) = d, for which:
a. Alive canonics are dying: d is dying in S2 because it is so in S1.
b. Idempotency: c2(d) = c1(d) = d by Auto-canonic for S1.
c. Canonics die last: d is alive, so the hypothesis is false and the invariant trivially holds.
d. Alive with canonic are queued: d is dying and n ≠ d (because c1(d) = d by Auto-canonic for S1, while c1(n) = undefined) and n ∈ queue2(d), as required.
3. Alive Nodes:
a. Alive nodes are downward closed: the set of alive nodes does not change.
b. Candidates are alive: the new edges involve children of d and n. We already proved that d and n are alive. By Alive nodes are downward closed for S1, {di, ni} ⊆ a1 and we conclude.
c. Dying are still alive: the set of dying nodes does not change.
d. Queues are alive: the only new node in the queues is n, which we already proved to be alive.
e. Dead have representatives: immediate by Dead have representatives for S1.
4. Queues: we only have to consider the case queued2 = queued1 ⊎ {n}, for which:
a. Queued nodes have right canonic: c2(n) = d, as required.
b. Queues are sets: n is distinct from every other node m ∈ queue1(d) because, by Queued nodes have right canonic for S1, m has the canonic defined in S1 while n does not. The nodes in queued1 are pairwise distinct because the invariant holds on S1.
5. Dying Nodes:
a. Auto-canonic: preserved.
b. Calls are on different nodes: the set of dying nodes does not change.
c. Dying order: the set of dying nodes does not change, and ∼1qc ⊆ ∼2qc.

□
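As an informal aid to the case analysis of PushSetAndPropagate, the following Python sketch mirrors the outcomes discussed in the proof: the two failures, the case where c(n) is already defined, and the case where it is undefined. It is reconstructed from the proof text only, since Algorithm 1 is not reproduced in this appendix. The names `Node`, `CheckFailed`, `push_set_and_propagate`, the list-based representation of queues and query edges, the label strings, and the relative order of the two checks are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


class CheckFailed(Exception):
    """The propagated query cannot be a blind sharing equivalence."""


@dataclass(eq=False)  # nodes are compared by identity, as in a DAG
class Node:
    label: str                                    # assumed labels: "var", "lam", "app"
    children: List["Node"] = field(default_factory=list)
    canonic: Optional["Node"] = None              # c(n); None stands for "undefined"


def push_set_and_propagate(queue: List[Node], d: Node, n: Node,
                           edges: List[Tuple[Node, Node]]) -> None:
    """Handle one query edge reaching n while d is the topmost dying node.

    The caller is assumed to have already deleted the processed edge (-, n).
    """
    if d.label != n.label:                        # failure attributed to line 10
        raise CheckFailed("nodes with different labels are related")
    if n.canonic is not None:
        if n.canonic is not d:                    # failure attributed to line 12
            raise CheckFailed("node already has a different canonic")
        return                                    # c(n) = d: nothing else changes
    n.canonic = d                                 # c2 = c1 + {(n, d)}
    queue.append(n)                               # queue2 = queue1 + {n}
    edges.extend(zip(d.children, n.children))     # new query edges (d_i, n_i)


# Usage: relate two application nodes whose children are variables.
d = Node("app", [Node("var"), Node("var")])
d.canonic = d                                     # Auto-canonic: c(d) = d
n = Node("app", [Node("var"), Node("var")])
queue: List[Node] = []
edges: List[Tuple[Node, Node]] = []
push_set_and_propagate(queue, d, n, edges)
```

In the successful case with c(n) undefined, the sketch performs exactly the three updates that appear in the state transition of the proof: one new canonic assignment, one new queued node, and at most two new undirected edges.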

G Linearity of the Algorithms

Remark G.1. In every good non-final non-fail state S, |S| > 0. This is because |a| > 0, |c| ≤ |nodes(G)| and therefore (|a| + |q| − 2 × |c| + 2 × |nodes(G)|) > 0.
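The measure of Remark G.1, and the decrements used in the proof of Lemma 7.4, can be checked on cardinalities alone. The sketch below is only an illustration: the class `Sizes` and the helper names are hypothetical, and a state is abstracted to the four numbers |a|, |q|, |c| and |nodes(G)|. The last lines assume that an initial state has all nodes alive and no canonic assigned, which is not restated in this appendix.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Sizes:
    alive: int      # |a|: number of alive nodes
    edges: int      # |q|: number of undirected query edges
    canonics: int   # |c|: number of nodes with a canonic assigned
    nodes: int      # |nodes(G)|

    def measure(self) -> int:
        # |S| = |a| + |q| - 2*|c| + 2*|nodes(G)|
        return self.alive + self.edges - 2 * self.canonics + 2 * self.nodes


def set_canonic(s: Sizes) -> Sizes:
    """One more canonical representative is set."""
    return replace(s, canonics=s.canonics + 1)


def kill_node(s: Sizes) -> Sizes:
    """One more node is dead."""
    return replace(s, alive=s.alive - 1)


def consume_edge(s: Sizes) -> Sizes:
    """An edge (h, n) is deleted and c(n) was already d."""
    return replace(s, edges=s.edges - 1)


def propagate(s: Sizes, new_edges: int) -> Sizes:
    """An edge is deleted, c(n) is assigned, and x <= 2 edges are added."""
    if not 0 <= new_edges <= 2:
        raise ValueError("a node has at most two children")
    return replace(s, edges=s.edges - 1 + new_edges, canonics=s.canonics + 1)


s = Sizes(alive=4, edges=3, canonics=1, nodes=4)
deltas = {
    "set_canonic": set_canonic(s).measure() - s.measure(),
    "kill_node": kill_node(s).measure() - s.measure(),
    "consume_edge": consume_edge(s).measure() - s.measure(),
    "propagate_0": propagate(s, 0).measure() - s.measure(),
    "propagate_2": propagate(s, 2).measure() - s.measure(),
}

# Assumed initial state: every node alive, no canonic assigned.
initial = Sizes(alive=4, edges=3, canonics=0, nodes=4)
bound = initial.measure()   # equals 3 * |nodes(G)| + |q| under that assumption
```

Every entry of `deltas` is negative, so under the stated assumption the number of transitions is bounded by the initial measure, which is linear in |nodes(G)| and |q|.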


Proof of (Lemma 7.4). Let S be a good state of the blind sharing check. If S → S′ then |S′| < |S|.

Proof. First note that if S′ = FAIL, then |S| > 0 by Remark G.1. Now we discuss separately each procedure of Algorithm 1, similarly to the proof of Theorem 6.5 on page 20.

Procedure BlindSharingCheck: we discuss line 1 together with the first two lines (without numbers on the side) of the procedure Kill(n). If it does not fail, the state S′ after the call differs from S by having one more canonical representative set. Therefore |S′| = |S| − 2 < |S|.

Procedure Kill: we discuss separately each line of the procedure.
(2) Exactly like the case of the loop on line 1 of BlindSharingCheck.
(3) See procedure PushSetAndPropagate below.
(4) Exactly like the case of the loop on line 1 of BlindSharingCheck.
(5) See procedure PushSetAndPropagate below.
(6) S′ differs from S by having one more dead node: hence |S′| = |S| − 1 < |S|.
(7) S′ differs from S by having one more dead node: hence |S′| = |S| − 1 < |S|.

Procedure PushSetAndPropagate: it is called on lines 3 and 5 of Kill. In both cases, we discuss the call to PushSetAndPropagate together with the unnumbered line “delete undirected edge (−, n)”. There are three possible outcomes:
• the case of failure was already discussed above;
• if c(n) = d, then S′ differs from S only by having one fewer undirected arc, hence |S′| = |S| − 1 < |S|;
• if c(n) = undefined, then we have one fewer undirected arc, one more canonical assignment, and x more undirected arcs, where x ≤ 2. Hence |S′| = |S| − 1 − 2 + x < |S|.
□

Proof of (Corollary 7.5). Let S be an initial state of term forest G and query (edges) q. Then the blind sharing check on S terminates in a number of transitions linear in |nodes(G)| and |q|.

Proof. The measure is well-founded on good states (Remark 7.3). Conclude by Theorem 6.5 and Lemma 7.4. □

H The Name Check

Proof of (Theorem 8.1). Let ∼ be a well-scoped query on a term forest G passing the blind check, and let c be the canonic assignment produced by that check.
• If the name check fails, then there are no sharing equivalences containing ∼;
• otherwise ∼c is the smallest sharing equivalence containing ∼.
Moreover, the name check terminates in time linear in the size of G.

Proof. By Proposition 6.4, ∼c = ≈ is a blind sharing equivalence that includes the query. By Proposition 3.8, if it is also a sharing equivalence, then it is the smallest one; otherwise no sharing equivalence exists. Therefore we just need to show that the properties Free and Bound of sharing equivalences hold iff the λ-check is successful:
• Free is checked by line 2. The check fails iff there is a node v such that v ≠ c(v), v ∼c c(v) (by definition of ∼c), and one of the two has no binder. If it fails, then Free does not hold by definition. For the converse, Free does not hold if there exist two var-nodes v, w such that v ≠ w, v ∼c w, and one of the two (say v, w.l.o.g.) has no binder. Then, by definition of ∼c, let n := c(v) = c(w). If n = v, then the test will fail when processing w. Otherwise it will fail when processing v.
• Bound is checked by line 3. The check fails iff there is a node v such that v ≠ c(v), v ∼c c(v) (by definition of ∼c), and their binders are unrelated: c(b(v)) ≠ c(b(c(v))), and therefore b(v) ≁c b(c(v)). If it fails, then Bound does not hold by definition. For the converse, Bound does not hold if there exist two var-nodes v, w such that v ∼c w and their binders are unrelated: b(v) ≁c b(w). Therefore v ≠ w (otherwise the two binders would be related by reflexivity of ∼c). Then, by definition of ∼c, let n := c(v) = c(w), and thus v ∼c n ∼c w. Because b(v) ≁c b(w), the binder of at least one of the two nodes, say v w.l.o.g., is unrelated to the binder of n. Therefore the check will fail when processing v. □

Proof of (Theorem 8.2). Let ∼ be a well-scoped query on a term forest G. There is an algorithm, running in time linear in the sizes of G and ∼, that succeeds if and only if there exists a sharing equivalence containing ∼. Moreover, if it succeeds, it outputs a concrete (and linear) representation of the smallest such equivalence.

Proof. Consider the algorithm obtained by composing the sharing check and the λ-check algorithms. The resulting algorithm cannot diverge, and it works in time linear in G and ≈ because of Corollary 7.5 and because the λ-check is linear in the number of nodes. By Theorem 6.5, if the first algorithm fails, then ≈ (the propagated query) is not a blind sharing equivalence and, by universality of ≈ (Proposition 3.6), no sharing equivalence containing ∼ exists. Otherwise the algorithm reaches a good final state, and by Proposition 6.4 the computed c is a linear representation of ≈. By Theorem 8.1, if the second algorithm fails then there are no sharing equivalences that contain the query, and otherwise ∼c is an explicit representation of the smallest sharing equivalence that contains the query. □
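The two tests analysed in the proof of Theorem 8.1 can be sketched in Python as follows. This is a hedged reconstruction from the proof text, not the pseudocode of the paper: the function `name_check` and the dictionaries `canonic` (standing for c) and `binder` (standing for b, with free variables simply absent) are hypothetical names, and nodes are represented by strings for brevity.

```python
from typing import Dict, Iterable


def name_check(var_nodes: Iterable[str],
               canonic: Dict[str, str],
               binder: Dict[str, str]) -> bool:
    """Return True iff the Free and Bound tests succeed on every var-node.

    Each var-node v is compared only with its canonic c(v), so the loop
    performs a constant amount of work per var-node.
    """
    for v in var_nodes:
        cv = canonic[v]
        if cv == v:
            continue                      # v is its own representative
        bv, bcv = binder.get(v), binder.get(cv)
        if bv is None or bcv is None:     # Free: one of the two has no binder
            return False
        if canonic[bv] != canonic[bcv]:   # Bound: binders are not related by ~c
            return False
    return True


# The two copies of "lambda x. x": the binders l1 and l2 share a canonic.
identity_ok = name_check(
    ["x1", "x2"],
    canonic={"l1": "l1", "l2": "l1", "x1": "x1", "x2": "x1"},
    binder={"x1": "l1", "x2": "l2"},
)

# A free variable z identified with a bound variable x1: Free fails.
free_fails = name_check(
    ["z", "x1"],
    canonic={"z": "x1", "x1": "x1", "l1": "l1"},
    binder={"x1": "l1"},
)

# Variables bound by unrelated binders o and i: Bound fails.
bound_fails = name_check(
    ["v1", "v2"],
    canonic={"o": "o", "i": "i", "v1": "v1", "v2": "v1"},
    binder={"v1": "o", "v2": "i"},
)
```

Comparing each var-node only against its canonic, rather than against every related node, is what keeps the sketch linear; the converse directions in the proof explain why this comparison is enough to detect any violation of Free or Bound.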
