Pointwise Generalized Algebraic Data Types


Chuan-kai Lin

Tim Sheard

Department of Computer Science Portland State University Portland, Oregon, USA {cklin,sheard}@cs.pdx.edu

Abstract

In the GADT (Generalized Algebraic Data Types) type system, a pattern-matching branch can draw type information from both the scrutinee type and the data constructor type. Even though the type system can handle complex interactions between the two types, most programs require only simple interactions in the form of parametric instantiation and type indexing. To explore the tradeoffs related to GADT patterns, we define the Pointwise GADT type system, which restricts GADTs to the common case of parametric instantiation and type indexing. The Pointwise GADT type system still accepts a wide range of GADT programs, while rejecting a pathological function whose pattern-matching branches can make arbitrarily different assumptions about the type environment. We also state and prove several properties of the type system, which we speculate might be useful in helping researchers design better type inference algorithms.

Categories and Subject Descriptors D.3.3 [PROGRAMMING LANGUAGES]: Language Constructs and Features - Abstract data types; F.3.3 [LOGICS AND MEANINGS OF PROGRAMS]: Studies of Program Constructs - Type structure

General Terms Languages, Theory

1. Introduction

Generalized algebraic data types (GADTs) are an extension to algebraic data types used in functional languages like Haskell and ML [9]. GADTs are extremely useful: researchers have used them in a wide variety of programming tasks, including generalized tries [5], balanced trees [1], generic programming [5], monad libraries [11], and tagless language interpreters [5, 18]. The Glasgow Haskell Compiler has supported GADTs since March 2005 [22], and GADTs are used in major software projects such as Pugs [24], a leading Perl 6 implementation, and Darcs [23], a distributed revision control system with advanced merge features.

GADTs allow a pattern-matching branch to draw type information from two sources: the type of the case scrutinee, and the type of the data constructor in the pattern. The GADT type system, which combines these two types with unification, allows the scrutinee type and the constructor type to interact in very complex ways. But when we surveyed a wide range of examples that use GADTs, we discovered that the interactions, in practice, have a very simple structure. Type information can flow from the scrutinee type to the constructor type, which we call parametric instantiation, or type information can flow from the constructor type to the scrutinee type, which we call type indexing.

This discovery is significant for two reasons:

1. Characterizing the type interactions with parametric instantiation and type indexing helps programmers understand and use GADTs effectively, and
2. Deeper insight into how the scrutinee type and the constructor type interact may help researchers design better type inference algorithms for GADTs.

Intrigued by the second point, we decided to explore a type system that accepts only parametric instantiation and type indexing. This paper documents our findings. We make the following technical contributions:

• We identify parametric instantiation and type indexing as the most common ways in which the scrutinee type and the constructor type interact in a pattern (§2.4, §4.2),
• We capture the intuition behind parametric instantiation and type indexing by formally developing pointwise unifiers (§3) and the Pointwise GADT type system (§4),
• To further motivate the Pointwise GADT type system, we show how it rejects a pathological function whose pattern-matching branches can make arbitrarily different assumptions about the type environment (§2.6, §4.3), and
• To illustrate how the Pointwise GADT type system may help type inference researchers, we prove four properties about it that generalize existing results on algebraic data types with existential types (§5.2).

2. Background

In this section we introduce background information and the problems caused by typing pattern-matching branches with unification.

© ACM, (2010). This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in Proceedings of the 5th ACM SIGPLAN workshop on Types in Language Design and Implementation, January 2010. http://doi.acm.org/10.1145/1708016.1708024

2.1 Notation

In this paper we use the Haskell syntax [7] for our program examples, and we distinguish program examples from other kinds of text by setting them in teletype font. When we want to refer to a program entity, for example, a type without having any specific one in mind, we use the following meta-symbols:

    e, f, g, ...       Expressions
    u, v, w, ...       Types
    S, T, U, V, ...    Type constructors
    α, β, γ, ...       Type variables
    θ, η, σ, ...       Type substitutions

The type-constructor meta-symbols range over not only user-defined types, but also built-in ones like arrows and tuples. A type is either a type variable or a type constructor, which has a fixed arity, applied to arguments that are also types. Finally, we use the following mathematical notations in this paper:

    dom(θ)    Domain of the substitution θ
    tv(s)     Free type variables of s
    S # T     Sets S and T are disjoint
    S ⊎ T     Union of disjoint sets S and T

2.2 Algebraic Data Types

Algebraic Data Types (ADTs) in functional languages such as ML and Haskell [7] support parametric polymorphism: programmers can define data types in which some component types are specified through type parameters. The following Tree type, declared in modern Haskell syntax, illustrates this feature:

    data Tree a where
      Tip  :: forall a. a → Tree a
      Fork :: forall a. Tree a → Tree a → Tree a

The Tree type does not specify the type of the value wrapped in the Tip data constructor; instead it exposes that type through the argument a of the range type Tree a. (We call the last component of a data constructor type the range type, and its type arguments the range type arguments.) By instantiating the type variable a to Int, for example, a programmer can define trees of integers and functions that traverse trees of integers:

    tree1 :: Tree Int
    tree1 = Fork (Fork (Tip 1) (Tip 7)) (Tip 3)

    sumTree :: Tree Int → Int
    sumTree t = case t of
      Tip i    → i
      Fork l r → sumTree l + sumTree r

We call this kind of type instantiation parametric instantiation. Parametric polymorphism and parametric instantiation make data types more flexible and facilitate code reuse.

2.3 Generalized Algebraic Data Types

Generalized Algebraic Data Types (GADTs) extend ADTs by allowing a data constructor to specify its range-type arguments [8]. This feature, which we call type indexing, allows programmers to use type indices to describe the structure of values. The Term type illustrates type indexing:

    data Term a where
      RepInt  :: Int → Term Int
      RepBool :: Bool → Term Bool
      RepPair :: forall a b. Term a → Term b → Term (a, b)

Each data constructor of Term has a different range-type argument, and the type argument a of a term with type Term a represents the type of the object-value encoded by the term. Here are two examples:

    RepInt 3 :: Term Int
    RepPair (RepBool True) (RepInt 5) :: Term (Bool, Int)

When a case branch matches a GADT data constructor, the match induces a type refinement based on the type indices of the constructor. The branch body under the GADT pattern is typed under this refinement. For example, consider the following function eval:

    eval :: forall a. Term a → a
    eval e = case e of
      RepInt i    → i
      RepBool b   → b
      RepPair u v → (eval u, eval v)

From the type annotation, we know that the argument e has type Term a, and the result of eval has type a. However, since the range type of RepInt is Term Int, in the RepInt branch the type variable a is refined to Int, and thus we can use an integer i as the body of the branch.

2.4 Parametric Instantiation and Type Indexing

Parametric instantiation and type indexing are designed for complementary purposes. We can see the difference most clearly when we consider pattern-matching branches, because a pattern ties together the scrutinee type and the constructor range type through both parametric instantiation and type indexing. We explained in the previous subsection that the GADT type system allows the type checker to type the body of a pattern-matching branch against a refined type and under a refined type environment. The type checker computes the refinement using type information from two sources:

1. The type of the scrutinee of the case expression. This type comes from the context in which the case expression is type checked.
2. The range type of the data constructor that appears in the pattern. This type comes from the data constructor definition.

The type checker compares these two types and propagates type information in the specific-to-generic direction. For example, suppose that the two types are Maybe Int and Maybe t. Since Maybe Int is more specific than Maybe t, type information flows from the former to the latter, and the type checker computes the type refinement substitution [Int/t].

Parametric instantiation
Parametric instantiation happens when the scrutinee type is more specific than the constructor range type. For example, in the Tip i branch of the function sumTree, the scrutinee type is Tree Int while the constructor range type is Tree a. Since the former is more specific than the latter, type information flows into the constructor range type. We illustrate the situation with the following table.

    Pattern: Tip i         Scrutinee type:  Tree Int
    Constructor: Tip       Range type:      Tree a
    Parametric instantiation: [Int/a]

Type indexing
Type indexing, in contrast, happens when the scrutinee type is less specific than the constructor range type. For example, in the RepInt i branch of the function eval, the scrutinee type is Term a while the constructor range type is Term Int. Since the former is less specific than the latter, type information flows into the scrutinee type. We illustrate the situation with the following table.

    Pattern: RepInt i      Scrutinee type:  Term a
    Constructor: RepInt    Range type:      Term Int
    Type indexing: [Int/a]
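The declarations of §2.2 and §2.3 can be collected into a single module to watch both kinds of refinement type-check. The following is a minimal sketch, assuming GHC with the GADTs and ExplicitForAll extensions; ASCII arrows stand in for the typeset ones, and no definitions beyond those already shown are introduced.

```haskell
{-# LANGUAGE GADTs, ExplicitForAll #-}

-- Ordinary parametric data type: the range type is always Tree a.
data Tree a where
  Tip  :: forall a. a -> Tree a
  Fork :: forall a. Tree a -> Tree a -> Tree a

tree1 :: Tree Int
tree1 = Fork (Fork (Tip 1) (Tip 7)) (Tip 3)

-- Parametric instantiation: the scrutinee type Tree Int fixes a to Int.
sumTree :: Tree Int -> Int
sumTree t = case t of
  Tip i    -> i
  Fork l r -> sumTree l + sumTree r

-- GADT: each constructor chooses its own range-type argument.
data Term a where
  RepInt  :: Int -> Term Int
  RepBool :: Bool -> Term Bool
  RepPair :: forall a b. Term a -> Term b -> Term (a, b)

-- Type indexing: each branch refines a from the constructor's range type.
eval :: forall a. Term a -> a
eval e = case e of
  RepInt i    -> i
  RepBool b   -> b
  RepPair u v -> (eval u, eval v)
```

With these definitions, sumTree tree1 evaluates to 11 and eval (RepPair (RepBool True) (RepInt 5)) evaluates to (True, 5). The body i of the RepInt branch is accepted only because a is refined to Int inside that branch.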

Mixed parametric instantiation and type indexing
Parametric instantiation and type indexing need not happen in isolation: one part of the scrutinee type can be more specific than its pointwise counterpart (i.e., the type at the corresponding position) in the constructor range type, while another part of the scrutinee type can be less specific than its pointwise counterpart in the constructor range type. Consider the following example:

    data Z
    data S n

    data L n a where
      Cons :: forall n a. a → L n a → L (S n) a
      Nil  :: forall a. L Z a

    sum :: forall m. L m Int → Int
    sum xs = case xs of
      Cons y ys → y + sum ys
      Nil       → 0

The L data type is a list with an additional type argument n that tracks the length of the list. The type constructor Z represents the length zero, and the type constructor S represents a length increment. Let us look at the Cons branch in sum. Its scrutinee type is L m Int (from the type of the case scrutinee xs), and the range type of the data constructor Cons is L (S n) a. Here, type information flows both ways: the type variable a instantiates to Int (parametric instantiation), and the type variable m is refined to S n in the pattern-matching branch (type indexing).

    Pattern: Cons y ys     Scrutinee type:  L m Int
    Constructor: Cons      Range type:      L (S n) a
    Parametric instantiation: [Int/a]
    Type indexing: [S n/m]

This situation clearly calls for a general mechanism that can support both parametric instantiation and type indexing.

2.5 Typing Pattern Branches with Unification

The mechanism used by the GADT type system is to unify the scrutinee type with the constructor range type, and then apply the most-general unifier (mgu) to the type environment and the body type before typing the branch body. Here we show how it works for the Cons branch in sum, and we use the ∼ symbol to define a unification problem between two types:

    mgu(L m Int ∼ L (S n) a) = [Int/a, S n/m]

The substitution [Int/a] represents parametric instantiation, and [S n/m] represents type indexing. Since a most-general unifier turns the scrutinee type and the constructor range type into the same type, it naturally incorporates both parametric instantiation and type indexing.

2.6 Unification is Too Expressive

Typing pattern-matching branches with unification is problematic because unification is too powerful. Using a most general unifier sometimes produces unexpected results. We illustrate this problem with the following example:

    data Slide a b c where
      C1 :: forall a b. Slide a a b
      C2 :: forall a b. Slide a b b

    ex1 e y = case e of
      C1 → y == 'c'
      C2 → [if y then 5 else 7]

There is something very unusual about the ex1 function. Its argument y and its result both have different types across the two branches. Furthermore, none of these types appear in the types of C1 and C2.

    Pattern    Type of y    Result type
    C1         Char         Bool
    C2         Bool         [Int]

Surely ex1 must not be well-typed? Alas, it turns out that ex1 is well-typed. Here are three of the infinitely many types for ex1:

    ex1 :: forall q r. Slide (Char, Bool) (q, r) (Bool, [Int]) → q → r
    ex1 :: forall q r. Slide (Bool, Char) (r, q) ([Int], Bool) → q → r
    ex1 :: forall q r. Slide (Bool, [Char]) (r, [q]) ([Int], [Bool]) → q → r
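The first two of these signatures can be checked directly. The sketch below assumes GHC with the GADTs and ExplicitForAll extensions. The name ex1' for the second copy is our own and is needed only because a Haskell binding carries a single signature; the two bodies are identical.

```haskell
{-# LANGUAGE GADTs, ExplicitForAll #-}

data Slide a b c where
  C1 :: forall a b. Slide a a b
  C2 :: forall a b. Slide a b b

-- First signature: the C1 branch learns q ~ Char and r ~ Bool,
-- while the C2 branch learns q ~ Bool and r ~ [Int].
ex1 :: forall q r. Slide (Char, Bool) (q, r) (Bool, [Int]) -> q -> r
ex1 e y = case e of
  C1 -> y == 'c'
  C2 -> [if y then 5 else 7]

-- Second signature (hypothetical name ex1'): same body, and neither
-- signature is an instance of the other.
ex1' :: forall q r. Slide (Bool, Char) (r, q) ([Int], Bool) -> q -> r
ex1' e y = case e of
  C1 -> y == 'c'
  C2 -> [if y then 5 else 7]
```

Under these assumptions, ex1 C1 'c' evaluates to True and ex1 C2 True evaluates to [5], so the same function consumes a Char in one branch and a Bool in the other.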

The function ex1 is well-typed, not because of parametric instantiation or type indexing, but because of something else altogether. The repeated occurrences of type variables in the data-constructor range types (a in the range type of C1 and b in the range type of C2) reflect parametric instantiations back as branch-specific type indexing, so the type variables q and r can refine to different types in different pattern-matching branches. We illustrate the situation in the C2 branch with the following table (some parts, irrelevant to the discussion, are omitted):

    Pattern: C2          Scrutinee type:  Slide (q,r) (Bool,[Int])
    Constructor: C2      Range type:      Slide b b
    Parametric instantiation: [(Bool,[Int])/b]
    Reflected type indexing: [Bool/q, [Int]/r]

This example suggests that unification may be too powerful for our purpose of combining parametric instantiation with type indexing. Not only is the function ex1 well-typed in the GADT type system, but so are the following variations:

    ex1_1 e y = case e of
      C1 → y == "c"                 -- [Char], not Char
      C2 → [if y then 5 else 7]

    ex1_2 e y = case e of
      C1 → y == 'c'
      C2 → if y then 5 else 7       -- Int, not [Int]

In general, given expressions e_1 and e_2 such that e_i is well-typed under a type environment Γ_i where y has a monomorphic type t_i, you can create a well-typed variation ex1_x of the function ex1 as follows:

    ex1_x e y = case e of
      C1 → e_1
      C2 → e_2

The existence of such functions does not materially affect practical programming. Current implementations of the GADT type system require type annotations on all functions that pattern match over GADTs, so it is unlikely that a programmer could write such a function and slip it past the type checker by accident. However, for researchers who are interested in the type inference problem for GADTs, the existence of functions such as ex1 raises three serious issues:

1. Designing a type inference algorithm for ex1 is difficult because the types of the data constructors C1 and C2 are totally useless in computing the type refinement for each branch.
2. The function ex1 has an infinite number of types that are not instances of each other. Sulzmann et al. also demonstrated a GADT function that has an infinite number of types [21], but there is an important difference between the two examples. While their example has infinite variations in the environment types, ex1 has infinite variations in the scrutinee type, thus presenting a different challenge to type inference algorithms.
3. Even if someone comes up with an algorithm that can infer a type for functions such as ex1, such a type inference algorithm may be undesirable from a software engineering perspective. A type inference algorithm should be nothing more than a labor-saving device; if it can infer types for programs that are not obviously well-typed to programmers, then the inference algorithm might just turn out to be too clever for its own good.

Type inference for functions like ex1 is a hard problem, and it is also a pointless one, because virtually no one writes these kinds of functions (see our case studies in §4.2). Instead, a far simpler solution is to mend the type system so that a type inference algorithm can avoid these unusual functions. In this paper we propose the Pointwise GADT type system, which supports both parametric instantiation and type indexing like the plain GADT type system does, but rejects functions such as ex1.

3. Pointwise Unifiers

In this section we propose pointwise unifiers, which are special idempotent most-general unifiers that allow type information to flow only between the pointwise counterparts of two types.

3.1 Definition

Let us start with a way to identify a part of a type by address.

Definition 1. A path is a sequence of positive integers p = a_1, a_2, ..., a_n that serves as an "address" in a type. The expression t ▷ p (read "subterm of t at p") identifies a specific subterm of a type t at path p:

    t ▷ ε = t
    (T t_1 ... t_m) ▷ (a : r) = t_a ▷ r        if 1 ≤ a ≤ m
    (T t_1 ... t_m) ▷ (a : r) is undefined     if a > m
    α ▷ (a : r) is undefined

where ε represents the empty sequence, and (a : r) represents the number a followed by the sequence r. Path p is valid in t if t ▷ p is defined. If a path p is valid in both s and t, s ▷ p and t ▷ p are pointwise counterparts of s and t.

Example. Let us consider paths in s = Either (a,b,c) Bool, illustrated here as a type tree. The subscript next to each tree node label represents the path to that particular subtree.

    Either_ε
    ├── (,,)_1
    │   ├── a_1,1
    │   ├── b_1,2
    │   └── c_1,3
    └── Bool_2

In other words,

    s ▷ ε = Either (a,b,c) Bool
    s ▷ 1 = (a,b,c)        s ▷ 2 = Bool
    s ▷ 1,1 = a            s ▷ 1,3 = c

We are now ready to define pointwise unifiers.

Definition 2. A substitution θ is a pointwise unifier of s ∼ t if the following conditions hold:

A1. dom(θ) ⊆ tv(s, t),
A2. θ(s) = θ(t), and
A3. Let α ∈ dom(θ). If s ▷ p = α, then t ▷ p = θ(α). Similarly, if t ▷ p = α, then s ▷ p = θ(α).

If the unification problem s ∼ t has a pointwise unifier, we say that s and t are pointwise unifiable.

Example. We use two positive and three negative examples of pointwise unifiers to demonstrate how Condition A3 in Definition 2 works.

1. θ = [Int/a, Bool/b] is a pointwise unifier of (Int,Bool) ∼ (a,b).
2. θ = [Int/a, Int/b] is not a pointwise unifier of a ∼ b because a ∈ dom(θ) but θ(a) = Int ≠ b, violating A3.
3. θ = [a/c, a/b] is a pointwise unifier of (b,a) ∼ (a,c).
4. θ = [c/a, c/b] is not a pointwise unifier of (b,a) ∼ (a,c) because b ∈ dom(θ) but θ(b) = c ≠ a, violating A3.
5. θ = [Int/a, Int/b] is not a pointwise unifier of (Int,a) ∼ (a,b) because a ∈ dom(θ) but θ(a) = Int ≠ b, violating A3. This particular problem does not have a pointwise unifier.

3.2 Properties

In this subsection we state and prove some properties of pointwise unifiers. Since the definition of pointwise unifiers is symmetric, all theorems still hold if you switch s and t around. We start with three lemmas that allow us to decide, in a pointwise manner, whether two types are equal under substitution.

Definition 3. If a type t is built from a type constructor, then TC(t) extracts the name of the type constructor at the root of the type t. In mathematical notation,

    TC(T t_1 ... t_m) = T
    TC(t) is undefined if t is a type variable.

Lemma 1. Let types s, t, and substitution θ be given. θ(s) = t iff the following conditions hold for all paths p valid in s:

1. p is valid in t,
2. If s ▷ p is a type variable, then θ(s ▷ p) = t ▷ p.
3. If s ▷ p is built from a type constructor, then t ▷ p is also built from a type constructor, and TC(s ▷ p) = TC(t ▷ p).

Proof. (⇒) Trivial. (⇐) We conduct this part of the proof by structural induction.

BASE CASE. If s is a type variable, the only path valid in s is the empty path ε, which must also be valid in t. From θ(s ▷ ε) = t ▷ ε we know θ(s) = t.

INDUCTION STEP. If s is built from a type constructor U, then we can assume s = U s_1 ... s_m. From TC(s ▷ ε) = U = TC(t ▷ ε) we know t = U t_1 ... t_m, and we now prove that θ(s_a) = t_a for all 1 ≤ a ≤ m. Let a be given and r be a path valid in s_a.

1. r is valid in t_a because (a : r) is valid in s and t.
2. If s_a ▷ r is a type variable, then θ(s_a ▷ r) = θ(s ▷ (a : r)) = t ▷ (a : r) = t_a ▷ r.
3. If s_a ▷ r is built from a type constructor, then t_a ▷ r is also built from a type constructor, and TC(s_a ▷ r) = TC(t_a ▷ r). This condition holds because TC(s ▷ (a : r)) = TC(t ▷ (a : r)).

By induction, θ(s_a) = t_a, and therefore θ(s) = t.

Lemma 2. Let types s, t be given such that, for all paths q valid in s and t, TC(s ▷ q) = TC(t ▷ q) if they both exist. Then, if a path p is valid in s but not valid in t, there exists a prefix p′ of p such that t ▷ p′ is a type variable.

Proof. We conduct this proof by structural induction on t.

BASE CASE. If t is a type variable, then p′ = ε satisfies the lemma.

INDUCTION STEP. If t is built from a type constructor, then t = U t_1 ... t_m. Since p is valid in s but not valid in t, we can assume p = (a : r) where 1 ≤ a ≤ m. Since TC(s ▷ ε) = TC(t ▷ ε) = U, we can assume s = U s_1 ... s_m. We know that, for all paths q such that TC(s_a ▷ q) and TC(t_a ▷ q) are both defined, TC(s_a ▷ q) = TC(t_a ▷ q) because

    TC(s_a ▷ q) = TC(s ▷ (a : q)) = TC(t ▷ (a : q)) = TC(t_a ▷ q)

The path r is valid in s_a but not valid in t_a, so by induction there exists a prefix r′ of r such that t_a ▷ r′ is a type variable. Then t ▷ (a : r′) is a type variable, and p′ = (a : r′) satisfies the lemma.

Lemma 3. Let types s, t, and substitution θ be given. θ(s) = θ(t) iff the following holds for all paths p valid in both s and t:

1. If s ▷ p or t ▷ p is a type variable, then θ(s ▷ p) = θ(t ▷ p).
2. If s ▷ p and t ▷ p are both built from type constructors, then TC(s ▷ p) = TC(t ▷ p).

Proof. (⇒) Trivial by Lemma 1. (⇐) We conduct this proof by reduction to Lemma 1. Let p be a path valid in s, then p may relate to s and t in the following ways:

1. p is valid in t, and either s ▷ p or t ▷ p is a type variable. Then θ(s ▷ p) = θ(t ▷ p) = θ(t) ▷ p.
2. p is valid in t, and both s ▷ p and t ▷ p are built from type constructors. Then TC(s ▷ p) = TC(t ▷ p) = TC(θ(t) ▷ p).
3. p is not valid in t. By Lemma 2, there exists a prefix p′ of p such that p′ is valid in t, t ▷ p′ is a type variable, and θ(s ▷ p′) = θ(t ▷ p′). Without loss of generality, let p be the concatenation of p′ and q. If s ▷ p is a type variable, then

       θ(s ▷ p) = θ((s ▷ p′) ▷ q) = θ(t ▷ p′) ▷ q = θ(t) ▷ p

   If s ▷ p is built from a type constructor, then

       TC(s ▷ p) = TC((s ▷ p′) ▷ q) = TC(θ(t ▷ p′) ▷ q) = TC(θ(t) ▷ p)

Applying Lemma 1 to s and θ(t) proves that θ(s) = θ(t).

The next two theorems state that a pointwise unifier is an idempotent most-general unifier.

Theorem 4. Let θ be a pointwise unifier of s ∼ t. θ is idempotent.

Proof. Let α ∈ dom(θ). Without loss of generality, we assume that there exists a path p such that α = s ▷ p. By Definition 2 and Lemma 3,

    θ(α) = θ(s ▷ p) = t ▷ p = θ(t ▷ p) = θ(θ(α))

Therefore θ is idempotent.

Theorem 5. Let θ be a pointwise unifier of s ∼ t. θ is a most-general unifier of s ∼ t.

Proof. Let η be a unifier of s ∼ t. We show that, for any type variable α, η(α) = (η ∘ θ)(α), so θ is a most-general unifier of s ∼ t.

Let α be given. If α ∉ dom(θ), then (η ∘ θ)(α) = η(α).

If α ∈ dom(θ), then α ∈ tv(s, t). Without loss of generality, we assume that α ∈ tv(s), so there exists a path p such that s ▷ p = α. By condition A3, t ▷ p = θ(α), and because η unifies s and t, η(t ▷ p) = η(s ▷ p). Then (η ∘ θ)(α) = η(θ(s ▷ p)) = η(t ▷ p) = η(α).

Since η = η ∘ θ for all unifiers η, θ is a most-general unifier of s ∼ t.

Definition 4. We define θ|_S as the substitution θ with its domain restricted to the set of type variables S. Given types s, t, and a path p valid in both s and t, we define the substitution (θ ▷ p) as θ|_tv(s ▷ p, t ▷ p).

We use (θ ▷ p) only when s and t are clear from the context. The next theorem shows how a pointwise unifier of two types relates to the pointwise unifiers of their pointwise counterparts.

Theorem 6. Let θ be a pointwise unifier of s ∼ t. For every path p that is valid in both s and t, (θ ▷ p) is a pointwise unifier of s ▷ p ∼ t ▷ p.

Proof. Trivial by applying the definition of pointwise unifiers.

The next theorem states that a unifier that transfers information in only one direction is a pointwise unifier.

Theorem 7. Let types u, v, and a substitution σ be given such that dom(σ) # tv(u) and dom(σ) ⊆ tv(v). If u = σ(v), then σ is a pointwise unifier of u ∼ v.

Proof. Trivial by applying the definition of pointwise unifiers.

The next theorem states that, if two pointwise-unifiable types do not share type variables, you can factor their pointwise unifier into two separate substitutions.

Theorem 8. Let θ be a pointwise unifier of s ∼ t, and

    θ_s = θ|_tv(s)        θ_t = θ|_tv(t)

If tv(s) # tv(t), there exists a type u such that:

    θ_s(s) = θ(s) = θ(t) = θ_t(t)        θ_s(u) = t        θ_t(u) = s

In other words, the following diagram commutes.

            θ_t
        u ───────→ s
        │          │
    θ_s │          │ θ_s
        ↓          ↓
        t ───────→ θ(t) = θ(s)
            θ_t

Proof. It is trivial to prove that

    θ_s(s) = θ(s) = θ(t) = θ_t(t)

Let u = s ⋏_θ t, where we define the prune operator (⋏_θ) as follows:

    α ⋏_θ y = α        if α ∈ dom(θ)
    x ⋏_θ α = α        if α ∈ dom(θ)
    (T x_1 ... x_n) ⋏_θ (T y_1 ... y_n) = T (x_1 ⋏_θ y_1) ... (x_n ⋏_θ y_n)

We can prove, by induction, that the following properties hold:

1. u is uniquely defined,
2. If path p is valid in u and u ▷ p = α is a type variable, then α ∈ dom(θ), and either s ▷ p = α or t ▷ p = α, and
3. If path p is valid in u and u ▷ p is built from a type constructor, then TC(s ▷ p) = TC(t ▷ p) = TC(u ▷ p).

We omit the proof. Now we use these properties to prove that θ_s(u) = t (the proof for θ_t(u) = s is similar and omitted). Let p be a path such that u ▷ p is a type variable. If u ▷ p = s ▷ p, then θ_s(u ▷ p) = θ_s(s ▷ p) = θ(s ▷ p) = t ▷ p because θ is a pointwise unifier of s ∼ t. If u ▷ p = t ▷ p, then θ_s(u ▷ p) = θ_s(t ▷ p) = t ▷ p as t ▷ p ∉ dom(θ_s). Applying Lemma 1 proves θ_s(u) = t.

3.3 Pointwise Unification

In this subsection we present a pointwise unification algorithm. Intuitively, the algorithm finds type variables that should be in the domain of a unifier, and checks that all occurrences of those type variables satisfy condition A3. It works in three phases: counterpart collection, conflict resolution, and unifier generation. We present each phase in turn, along with the lemmas we use to prove the correctness of pointwise unification at the end of this subsection.

Counterpart collection
This phase computes s ▷ t, the unifiable pointwise counterparts of s and t that have all matching top-level type constructors stripped off. Here are some examples:

    (x → y) ▷ (Int → Bool) = {(x, Int), (y, Bool)}
    (x → x) ▷ (y → z) = {(x, y), (x, z)}
    (Bool, b, Char) ▷ (a, Int, a) = {(Bool, a), (b, Int), (Char, a)}
    (Either a Int) ▷ (Either b Bool) = ⊥

Counterpart collection for the last example fails because Int and Bool are not unifiable. Here is the formal definition:

    α ▷ t = {(α, t)}
    t ▷ α = {(t, α)}
    (T s_1 ... s_n) ▷ (T t_1 ... t_n) = (s_1 ▷ t_1) ∪ ··· ∪ (s_n ▷ t_n)
    (T_1 s_1 ... s_m) ▷ (T_2 t_1 ... t_n) = ⊥        if T_1 ≠ T_2

The symbol ⊥ indicates failure. We define ⊥ ∪ e = e ∪ ⊥ = ⊥, so any local failure implies global failure. When two pairs (x, y) and (y, x) are both present in a set, we arbitrarily keep one and drop the other, so for example {(x, y), (y, x)} becomes {(x, y)}. Counterpart collection does not check any property specific to pointwise unifiers; s ▷ t = ⊥ implies that s and t are not unifiable.

Lemma 9. s ▷ t ≠ ⊥ iff for all paths p such that TC(s ▷ p) and TC(t ▷ p) both exist, TC(s ▷ p) = TC(t ▷ p).

Lemma 10. If (x, y) ∈ (s ▷ t), then x or y is a type variable, and there exists a path p such that x = s ▷ p and y = t ▷ p.

Lemma 11. Given path p valid in s and t, if s ▷ p or t ▷ p is a type variable, then (s ▷ p, t ▷ p) ∈ (s ▷ t) or (t ▷ p, s ▷ p) ∈ (s ▷ t).

These three lemmas are easily proved by induction over the definition of s ▷ t.
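The subterm operation of Definition 1 and the counterpart collection phase can be sketched in Haskell as follows. This is an illustrative sketch rather than the authors' implementation: the names Type, TVar, TCon, subtermAt, and collect are our own, arrows and tuples are encoded as ordinary constructors, Nothing stands for ⊥ (or for an undefined subterm), and a list kept free of duplicates stands for a set.

```haskell
-- A type is a type variable or a type constructor applied to types.
data Type = TVar String | TCon String [Type]
  deriving (Eq, Show)

-- Subterm of a type at a path (Definition 1); path components are 1-based.
subtermAt :: Type -> [Int] -> Maybe Type
subtermAt t [] = Just t
subtermAt (TCon _ ts) (a : r)
  | a >= 1 && a <= length ts = subtermAt (ts !! (a - 1)) r
subtermAt _ _ = Nothing          -- invalid path

-- Counterpart collection; Nothing plays the role of ⊥.
collect :: Type -> Type -> Maybe [(Type, Type)]
collect s t = fmap dedupe (go s t)
  where
    go x@(TVar _) y = Just [(x, y)]
    go x y@(TVar _) = Just [(x, y)]
    go (TCon c xs) (TCon d ys)
      -- fixed arity is assumed, so the length check is only defensive
      | c == d && length xs == length ys =
          concat <$> sequence (zipWith go xs ys)
      | otherwise = Nothing

-- Keep the first of any repeated pair, treating (x, y) and (y, x) alike.
dedupe :: [(Type, Type)] -> [(Type, Type)]
dedupe = foldl add []
  where
    add acc (x, y)
      | (x, y) `elem` acc || (y, x) `elem` acc = acc
      | otherwise = acc ++ [(x, y)]
```

For instance, encoding x → y as TCon "->" [TVar "x", TVar "y"], collecting it against the encoding of Int → Bool yields the pairs (x, Int) and (y, Bool), while collecting Either a Int against Either b Bool yields Nothing.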

Conflict resolution
The conflict resolution phase enforces conditions A2 and A3 in Definition 2. This phase is defined by a rewrite system ⟼ that orients a set of pairs, so that the first component of each pair is a type variable that appears nowhere else. The set rewrites to ⊥ if no such orientation exists. Some examples:

    {(x, Int), (y, Bool)} is in normal form
    {(x, y), (x, z)} ⟼* {(y, x), (z, x)}
    {(Bool, a), (b, Int), (Char, a)} ⟼ ⊥

And here is the definition.

    {(α, β)} ⊎ E ⟼ {(β, α)} ⊎ E            if α ∈ tv(E), β ∉ tv(E)
    {(α, β)} ⊎ E ⟼ ⊥                       if α ≠ β, {α, β} ⊆ tv(E)
    {(T s̄, α)} ⊎ E ⟼ {(α, T s̄)} ⊎ E        if α ∉ tv(E) ∪ tv(T s̄)
    {(T s̄, α)} ⊎ E ⟼ ⊥                     if α ∈ tv(E) ∪ tv(T s̄)
    {(α, T s̄)} ⊎ E ⟼ ⊥                     if α ∈ tv(E) ∪ tv(T s̄)

Conflict resolution fails if s ▷ t ⟼* ⊥. Note that each rewrite rule transforms only one element in the set; this property reflects the pointwise nature of pointwise unifiers.

Lemma 12. The rewrite system ⟼ is confluent and strongly normalizing over finite sets of pairs of types.

Proof. We establish strong normalization through a measure n(E) which counts the number of one-step rewrites for E. The value n(E) is finite for every finite E, every rewrite reduces n(E) by at least 1, and n is always nonnegative. Thus every rewrite sequence must be finite. We establish confluence through local confluence and strong normalization. Proof of local confluence is trivial because each element in E can trigger at most one rewrite, and the rewrite associated with an element (x_1, y_1) is not affected by replacing another element (x_2, y_2) with (y_2, x_2).

Lemma 13. If X ≠ ⊥ and X ⟼* ⊥, then X ⟼ ⊥.

Proof. We prove this by induction over the rewrite sequence.

BASE CASE. If X ⟼* ⊥ happens in one step, then X ⟼ ⊥.

INDUCTION STEP. If X ⟼* ⊥ happens in n steps (n > 1), then there exist X′ and X″ such that X ⟼* X′ ⟼ X″ ⟼ ⊥. We show that X′ ⟼ X″ ⟼ ⊥ implies X′ ⟼ ⊥, and applying the induction principle completes the proof. There are six rewrites for X′ ⟼ X″ ⟼ ⊥; since they are all similar, we will prove only one case and omit the others. Suppose

    X′ = {(α′, β′)} ⊎ E′ ⟼ {(β′, α′)} ⊎ E′ = X″
    X″ = {(α″, β″)} ⊎ E″ ⟼ ⊥
    α′ ∈ tv(E′), β′ ∉ tv(E′), α″ ≠ β″, {α″, β″} ⊆ tv(E″)

Then clearly β′ ≠ α″, β′ ≠ β″, (α″, β″) ∈ E′, thus

    X′ = {(α″, β″)} ⊎ Y′        α″ ≠ β″, {α″, β″} ⊆ tv(Y′)

And therefore X′ ⟼ ⊥.

Unifier generation
This phase turns a properly-oriented set of type pairs into a substitution. If E ≠ ⊥,

    sub_E(α) = t        if (α, t) ∈ E
    sub_E(α) = α        otherwise

Pointwise unification
We have now introduced all three phases of the pointwise unification algorithm. To compute a pointwise unifier of two types s and t, simply compute s ▷ t ⟼* E such that E is in normal form. If E ≠ ⊥, then sub_E is a pointwise unifier of s ∼ t. The following theorem proves the soundness and completeness of the algorithm.
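The conflict resolution and unifier generation phases can be sketched in Haskell as below. This is a sketch under stated assumptions, not the authors' code: Type, tv, resolve, and subE are hypothetical names, Nothing stands for ⊥, and the input is assumed to be a pair set produced by counterpart collection. Because each pair triggers at most one rewrite and swapping a pair does not change which variables occur in the others, the sketch examines every pair once against the rest of the set.

```haskell
import Data.List (nub)

data Type = TVar String | TCon String [Type]
  deriving (Eq, Show)

-- Free type variables of a type.
tv :: Type -> [String]
tv (TVar a)    = [a]
tv (TCon _ ts) = concatMap tv ts

-- Conflict resolution: orient each pair so that its first component is a
-- type variable occurring nowhere else; Nothing plays the role of ⊥.
resolve :: [(Type, Type)] -> Maybe [(Type, Type)]
resolve pairs = mapM step set
  where
    set = nub pairs
    -- variables of every pair other than the one being examined (tv(E))
    others p = concat [tv x ++ tv y | q@(x, y) <- set, q /= p]
    step p@(TVar a, TVar b)
      | a /= b && a `elem` e && b `elem` e = Nothing          -- both occur elsewhere
      | a `elem` e && b `notElem` e = Just (TVar b, TVar a)   -- swap
      | otherwise = Just p
      where e = others p
    step p@(t@(TCon _ _), TVar a)
      | a `elem` (others p ++ tv t) = Nothing                 -- variable is reused
      | otherwise = Just (TVar a, t)                          -- swap
    step p@(TVar a, t@(TCon _ _))
      | a `elem` (others p ++ tv t) = Nothing                 -- variable is reused
      | otherwise = Just p
    step _ = Nothing   -- two constructors: not produced by collection

-- Unifier generation: read an oriented pair set as a substitution.
subE :: [(Type, Type)] -> String -> Type
subE e a = case [t | (TVar b, t) <- e, b == a] of
  (t : _) -> t
  []      -> TVar a
```

Applied to the three example sets above, resolve leaves {(x, Int), (y, Bool)} unchanged, turns {(x, y), (x, z)} into {(y, x), (z, x)}, and returns Nothing for {(Bool, a), (b, Int), (Char, a)}.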

Theorem 14. If types s, t are pointwise unifiable, counterpart collection and conflict resolution both succeed. In addition, if s ▷ t ↦* E such that E is in normal form and E ≠ ⊥, then sub_E is a pointwise unifier of s ∼ t.

Proof. We start with the first statement. Assuming that s, t are pointwise unifiable, there exists a pointwise unifier θ of s ∼ t. We know that TC(s ▷ p) = TC(t ▷ p) for all paths p such that TC(s ▷ p) and TC(t ▷ p) both exist (Lemma 3), so s ▷ t ≠ ⊥ (Lemma 9).

Now we prove that s ▷ t ↦̸* ⊥. From Lemma 13, it is sufficient to prove that s ▷ t ↦̸ ⊥. Let us consider each possible rewrite to ⊥ as follows:

1. If s ▷ t = {(α, β)} ⊎ E′, then there exists a path p such that α = s ▷ p and β = t ▷ p (Lemma 10). If α ≠ β, then either α ∈ dom(θ) or β ∈ dom(θ). Without loss of generality we assume α ∈ dom(θ). For all paths q, s ▷ q = α implies t ▷ q = β, and t ▷ q = α implies s ▷ q = β, so α ∉ tv(E′), and {(α, β)} ⊎ E′ ↦̸ ⊥.

2. If s ▷ t = {(T s, α)} ⊎ E′, then there exists a path p such that T s = s ▷ p, α = t ▷ p (Lemma 10), and α ∈ dom(θ) (Lemma 3). For all paths q, s ▷ q = α implies t ▷ q = T s, and t ▷ q = α implies s ▷ q = T s. Therefore α ∉ tv(E′), and {(T s, α)} ⊎ E′ ↦̸ ⊥.

3. If s ▷ t = {(α, T s)} ⊎ E′, then the same reasoning as in the previous case shows that {(α, T s)} ⊎ E′ ↦̸ ⊥.

The analysis shows that s ▷ t ↦̸ ⊥, therefore s ▷ t ↦̸* ⊥.

We now prove the second statement: sub_E is a pointwise unifier of s ∼ t. We start by showing that sub_E is a substitution. Let (α, y) ∈ E and (α, y′) ∈ E. If y ≠ y′, then E ↦ ⊥, which contradicts the assumption that E is in normal form. Therefore y = y′ and sub_E(α) is uniquely defined. Finally, we check sub_E against Definition 2.

A1. dom(sub_E) ⊆ tv(E) = tv(s ▷ t) = tv(s, t).

A2. θ is a unifier of x ∼ y for any (x, y) ∈ s ▷ t, and from Lemma 3 and Lemma 11, we know θ is a unifier of s ∼ t.

A3. Let α ∈ dom(sub_E), and without loss of generality assume α ∈ tv(s). Then there exists a path p such that s ▷ p = α. We first prove p is valid in t by contradiction. Assume to the contrary that p is invalid in t. Since s ▷ t exists, there is a prefix q of p such that t ▷ q = β. Then,

    {(β, s ▷ q), (α, sub_E(α))} ⊆ E   or   {(s ▷ q, β), (α, sub_E(α))} ⊆ E

Since q is a prefix of p, α ∈ tv(s ▷ q), so E is not in normal form, contradicting our assumption. Therefore p is valid in t. Now we prove t ▷ p = sub_E(α) by contradiction. Assume to the contrary that t ▷ p ≠ sub_E(α). Then, given E ≠ ⊥,

    {(α, t ▷ p), (α, sub_E(α))} ⊆ E   or   {(t ▷ p, α), (α, sub_E(α))} ⊆ E

Both cases imply that E is not in normal form, contradicting our assumption. Thus t ▷ p = sub_E(s ▷ p). The case where α ∈ tv(t) is similar and omitted here.

Therefore sub_E is a pointwise unifier of s ∼ t.

This proof shows that pointwise unification is sound and complete. Given two input types, pointwise unification computes their pointwise unifier if it exists, and the algorithm fails if the input types are not pointwise unifiable.

\[
\textsc{Var}\ \frac{x : \forall\alpha.\, t \in \Gamma \qquad s = inst[\alpha](t)}{\Gamma \vdash x : s}
\qquad
\textsc{Cons}\ \frac{C : \forall\alpha.\, t \qquad s = inst[\alpha](t)}{\Gamma \vdash C : s}
\]

\[
\textsc{Lam}\ \frac{\Gamma\{u : s\} \vdash e : t}{\Gamma \vdash \lambda u \,.\, e : s \to t}
\qquad
\textsc{App}\ \frac{\Gamma \vdash f : t_1 \to t_2 \qquad \Gamma \vdash e : t_1}{\Gamma \vdash f\ e : t_2}
\]

\[
\textsc{Let}\ \frac{\Gamma\{u : s\} \vdash e : s \qquad \alpha = tv(s) \setminus tv(\Gamma) \qquad \Gamma\{u : \forall\alpha.\, s\} \vdash d : t}{\Gamma \vdash \mathbf{let}\ u = e\ \mathbf{in}\ d : t}
\]

\[
\textsc{Let-A}\ \frac{\Gamma\{u : \forall\alpha.\, s\} \vdash e : s \qquad \alpha \mathrel{\#} tv(\Gamma) \qquad \Gamma\{u : \forall\alpha.\, s\} \vdash d : t}{\Gamma \vdash \mathbf{let}\ u = e : \forall\alpha.\, s\ \mathbf{in}\ d : t}
\]

\[
\textsc{Case}\ \frac{\Gamma \vdash e : s \qquad \Gamma \vdash_p p_i \to c_i : s \to t}{\Gamma \vdash \mathbf{case}\ e\ \mathbf{of}\ \{\, p_i \to c_i \,\} : t}
\]

\[
\textsc{Pat}\ \frac{C : \forall\alpha.\, w \to T\,s \qquad \alpha \mathrel{\#} tv(\Gamma, u, t) \qquad \theta = pwu(T\,u \sim T\,s) \qquad \theta(\Gamma\{x : w\}) \vdash c : \theta(t)}{\Gamma \vdash_p C\ x \to c : T\,u \to t}
\]

Figure 1. The Pointwise GADT type system.

4. The Pointwise GADT Type System

Parametric instantiation and type indexing are both easy for programmers to understand because they have a simple structure: type information flows only between the pointwise counterparts of the scrutinee type and the constructor range type. In this section we define the Pointwise GADT type system, which accepts only pattern matches that follow this simple structure of information flow. To avoid confusion, we will refer to the type system proposed by Peyton Jones et al. [8] as plain GADTs.

4.1 Definition

Figure 1 shows the definition of the Pointwise GADT type system. It is almost identical to the plain GADT type system, except that the PAT rule uses pointwise unification (pwu) instead of standard unification. Since every pointwise unifier of s ∼ t is also a most-general unifier of s ∼ t (Theorem 5), Pointwise GADTs are a restriction of plain GADTs: every expression that is well-typed in Pointwise GADTs is also well-typed in plain GADTs.

4.2 Expressiveness

The Pointwise GADT type system is more expressive than ADTs, but less expressive than plain GADTs. Can it accept the plain GADT programs that practical programmers write? As a rough gauge on how typical plain GADT programs fare in Pointwise GADTs, we studied the Omega programs that Sheard prepared for the 2006 Spring School on Generic Programming [16] and the 2007 Central European Functional Programming School [17].¹ These program examples cover a wide range of applications, all involving significant use of GADTs:

• Manipulation of values in different units,
• Tagless term evaluation,
• N-way zip (two implementations),
• Type-indexed paths in a binary tree,
• AVL-tree node insertion and deletion, and
• Witness for integer-arithmetic theorems.

Our study finds that Pointwise GADTs are expressive enough to type all of these computations. The AVL-tree example, which uses GADTs to enforce the balance invariant, requires some refactoring (see §4.3); the Pointwise GADT type system accepts all other examples as they were written.² This result is, in some sense, not surprising: since programmers typically encode different properties of a data value with different type indices, pointwise unification, which compares only the pointwise counterparts of the scrutinee type and the constructor range type, should be all that is necessary to type programs written in this manner.

¹ We studied all the programs in the lecture notes, except those that depend on other features of the Omega language, such as staged computation and type-level functions.

² Data for all the case studies are available online at http://web.cecs.pdx.edu/~cklin/pointwise/.

4.3 Programs Pointwise GADTs Reject

Another way to study the expressiveness of Pointwise GADTs is to study programs that are well-typed in plain GADTs, but not in Pointwise GADTs. In this subsection we first show three such program examples, and then we explain how to rewrite them as well-typed Pointwise GADT programs.

First example. Our first example is the function ex1 from §2.6 (reproduced here).

    data Slide a b c where
      C1 :: forall a b. Slide a a b
      C2 :: forall a b. Slide a b b

    ex1 e y = case e of
      C1 → y == 'c'
      C2 → [if y then 5 else 7]

Theorem 15. ex1 is not well-typed in Pointwise GADTs.

Proof. Assume to the contrary that ex1 is well-typed in Pointwise GADTs. From the code we can see that its argument e has type Slide u v w, where u, v, and w are types. We perform the proof by analyzing what types u, v, and w can be.

1. The types u, v, and w are not all type variables.
Proof. Assume to the contrary that they are all type variables. Then e has five possible types (modulo type variable renaming):

    Slide p p p
    Slide p p q
    Slide p q p
    Slide p q q
    Slide p q r

None of them leads to a valid type of ex1. For example, assume e has type Slide p q r. The only options we have for the type of y are p, q, r, s (another type variable), or T z, and none of them allows y to have type Char in the C1 branch and type Bool in the C2 branch. Therefore, u, v, and w are not all type variables.

2. The types u, v, and w are identical.
Proof. Without loss of generality, assume that u is not a type variable. Since the C1 branch is well-typed, we know that the following unification problem has a pointwise unifier θ.

    Slide u v w ∼ Slide a a b

Since u is not a type variable, θ(a) = u, and by Definition 2 we know u = v. Applying a similar argument to the C2 branch shows v = w.

3. The branch bodies of the C1 and the C2 branches must be well-typed under the same type environment as the case expression.
Proof. Consider the C1 branch. Given that t is not a type variable, the unification problem Slide t t t ∼ Slide a a b has only one pointwise unifier θ = [t/a, t/b]. Since we know {a, b} # tv(Γ), θ(Γ) = Γ. The case for the C2 branch is similar and omitted.

The final fact requires that the argument y of ex1 has the same type in the C1 and the C2 pattern-matching branches, contradicting our assumption. So ex1 is not well-typed in Pointwise GADTs.

Even though the proof is quite involved, there is an intuitive explanation of why ex1 is not well-typed in the Pointwise GADT type system. Typing ex1 in plain GADTs requires associating a type variable in the constructor range type with different types in the scrutinee type. Using the first type we gave in §2.6, the type variable a in the range type of C1 corresponds to both (Char,Bool) and (q,r). Pointwise unification requires that a type variable must always correspond to a single type. The following diagram illustrates this requirement: since (Slide a a b) ▷ 1 = (Slide a a b) ▷ 2, u = v must hold. This requirement prevents us from using the aforementioned trick to give a type for ex1 in Pointwise GADTs, thus ex1 is not well-typed in Pointwise GADTs.

[Diagram: paths 1 and 2 of Slide a a b both lead to a; θ relates each occurrence to its pointwise counterpart in Slide u v w, namely u at path 1 and v at path 2.]
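The same requirement can be traced through the conflict-resolution rules of the pointwise unification algorithm. The following worked instance is only a sketch: it assumes that counterpart collection pairs the subterms of the two types at equal paths (as Lemma 10 suggests), and it uses a hypothetical scrutinee type Slide Char v w in which v and w are distinct type variables.

\[
\begin{aligned}
\mathit{Slide}\ \mathit{Char}\ v\ w \;\triangleright\; \mathit{Slide}\ a\ a\ b
  &= \{(\mathit{Char}, a),\ (v, a),\ (w, b)\} \\
\{(\mathit{Char}, a)\} \uplus \{(v, a),\ (w, b)\}
  &\longmapsto \bot
  && \text{because } a \in tv(\{(v, a),\ (w, b)\})
\end{aligned}
\]

If instead v = Char, the first two pairs coincide and the set is {(Char, a), (w, b)}. Now a ∉ tv({(w, b)}) ∪ tv(Char), so the pair is reoriented to (a, Char), and the normal form {(a, Char), (w, b)} gives the substitution [Char/a, b/w].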

Second example. Here is our second example of a plain GADT program that Pointwise GADTs reject:

    data Split a b where
      D1 :: Split Int Int
      D2 :: forall a b. Split (Int, a) (b, Bool)

    ex2 :: forall x. Split x x → x
    ex2 e = case e of
      D1 → 7
      D2 → (3, True)

Theorem 16. ex2 is not well-typed in Pointwise GADTs.

Proof. Assume to the contrary that ex2 is well-typed in Pointwise GADTs. From the code we can see that its argument e has type Split u v, where u and v each represent a type. To make ex2 well-typed, the following unification problems must have pointwise unifiers:

    Split u v ∼ Split Int Int
    Split u v ∼ Split (Int, a) (b, Bool)

The existence of pointwise unifiers requires that u and v be different type variables, and there is no way to express the result type of the function. So ex2 is not well-typed in Pointwise GADTs.

Third example. While performing the AVL-tree case study, we discovered a programming style that can produce ill-typed Pointwise GADT programs. Since the AVL-tree example is quite complicated, we choose to demonstrate this programming style with a manufactured example.

    data Equ a b where
      Equ :: forall a. Equ a a

    data WrapE d where
      WrapE :: forall c d. Equ c d → c → WrapE d

    ex3 :: WrapE Int → Int
    ex3 e = case e of
      WrapE w i → case w of
        Equ → i+1

In WrapE, the (Equ c d) argument witnesses that the existentially-quantified type variable c and the type argument d are the same type. In ex3, the type argument d in the type of e is instantiated to Int, and pattern matching on Equ, which substitutes Int for c, introduces indirect type-information flow.

    Pattern: Equ, Scrutinee Type:       Equ c Int
    Constructor: Equ, Range Type:       Equ a a
    Parametric instantiation:           [Int/a]
    Reflected type indexing:            [Int/c]

Resolution. Do these three rejected program examples reflect any serious limitations in the expressiveness of the Pointwise GADT type system? In our opinion the answer is no. In the function ex1, plain GADTs allow us to type the C1 and the C2 branches under arbitrarily different type environments. We see this extreme level of flexibility as more of a curse than a blessing, and programmers who need to write the function ex1 are well advised to rewrite it in the following way:

    data W a where
      W1 :: Char → W Bool
      W2 :: Bool → W [Int]

    ex1a :: forall r. W r → r
    ex1a e = case e of
      W1 y → y == 'c'
      W2 y → [if y then 5 else 7]

The function ex1a is clear, easy to understand, and well-typed in the Pointwise GADT type system. The second example is not as objectionable as ex1, but it is also easily rewritten into a well-typed Pointwise GADT program:

    data United a where
      U1 :: United Int
      U2 :: United (Int, Bool)

    ex2a :: forall x. United x → x
    ex2a e = case e of
      U1 → 7
      U2 → (3, True)

The third example is perhaps the most important because it is inspired by the AVL-tree case study. We can rewrite ex3 in two ways. The first option is to inline the Equ witness and express the equality c=d directly in Wrap. This refactoring is simple but may lead to some code duplication.

    data Wrap d where
      Wrap :: forall d. d → Wrap d

    ex3a :: Wrap Int → Int
    ex3a e = case e of
      Wrap i → i+1

The second option is to hide the instantiation [Int/d] in a GADT witness IsInt d and examine it only inside the Equ branch.

    data IsInt d where
      IsInt :: IsInt Int

    ex3b :: IsInt u → WrapE u → Int
    ex3b k e = case e of
      WrapE w i → case w of
        Equ → case k of
          IsInt → i+1

The more general type of the scrutinee e in ex3b is pointwise unifiable with the Equ constructor range type.

    Pattern: Equ, Scrutinee Type:       Equ c d
    Constructor: Equ, Range Type:       Equ a a
    Type indexing:                      [a/c, a/d]

5. Pointwise Baseline

Up to this point, we studied the Pointwise GADT type system entirely through program examples. While program examples provide useful intuition and serve as a firm foundation to the typing derivations, this mode of study also has its problems. First, it is hard to come up with program examples that reveal unexpected behavior of a type system, because doing so requires intentionally violating good programming practices that have been drilled into our heads over the years. Second, program examples add a lot of incidental complexity to the discussion. A program example contains many trivial details, such as names of data constructors, names of variables and their binding sites, variable shadowing (or the lack thereof), and expressions that exercise variables in specific ways. These details are only marginally relevant to the important question, which is how the Pointwise GADT type system deals with pattern-matching branches in case expressions. In this section we raise the level of abstraction by introducing Pointwise Baseline, which models how the Pointwise GADT type system types pattern-matching branches.

5.1 Definition

Pointwise Baseline is an abstract model of how the Pointwise GADT type system checks case expressions. We designed it to model the Pointwise GADT PAT type rule: it retains the structure of the type rule (the relationship between types) but removes all distractions (the program being typed).

Definition 5. A Pointwise Baseline is a sequence of quadruples of types B = {(c_1, g_1, h_1, z_1), ..., (c_n, g_n, h_n, z_n)} together with a pair of types (r, z) such that the following conditions hold for each quadruple of types (c_i, g_i, h_i, z_i) in B:

1. tv(r, z) # tv(c_i, g_i),
2. There exists a pointwise unifier σ_i of r ∼ c_i such that
3. σ_i(g_i) = h_i and σ_i(z) = z_i.

Example. Consider the sum function in §2.4 (reproduced here):

    data Z
    data S n

    data L n a where
      Cons :: forall n a. a → L n a → L (S n) a
      Nil :: forall a. L Z a

    sum :: forall m. L m Int → Int
    sum xs = case xs of
      Cons y ys → y + sum ys
      Nil → 0

The typing of its case expression corresponds to this pointwise baseline:

    i          c_i          g_i          h_i              z_i
    1 (Cons)   L (S n) a    (a, L n a)   (Int, L n Int)   (L (S n) Int, Int)
    2 (Nil)    L Z a        ()           ()               (L Z Int, Int)

    (r, z) = (L m Int, (L m Int, Int))

The number of quadruples (c_i, g_i, h_i, z_i) in the set is the same as the number of pattern-matching branches in the case expression, and each quadruple represents the types related to a branch. The types r and z are related to the entire case expression.

What does it all mean? The best way to understand the formulation of Pointwise Baseline is to relate it to the PAT type rule in Figure 1 (reproduced here):

\[
\textsc{Pat}\ \frac{C : \forall\alpha.\, w \to T\,s \qquad \alpha \mathrel{\#} tv(\Gamma, u, t) \qquad \theta = pwu(T\,u \sim T\,s) \qquad \theta(\Gamma\{x : w\}) \vdash c : \theta(t)}{\Gamma \vdash_p C\ x \to c : T\,u \to t}
\]

There is no strict one-to-one correspondence between types in a pointwise baseline and types in the PAT type rule. Instead, each part of a pointwise baseline corresponds to all types that play a particular role in the PAT rule. There are six roles in three groups, and we list them here with the corresponding Pointwise Baseline symbols in square brackets (so θ [σ_i] means that θ in the PAT rule corresponds to σ_i in Pointwise Baseline):

1. The first group includes types that are used to generate the branch type refinement θ [σ_i]. The scrutinee type T u [r] comes from the type environment. The constructor range type T s [c_i] depends on the pattern and is specific to each branch.

2. The second group includes types governed by parametric instantiation. They are the constructor argument types w [g_i], and the constructor argument types under branch type refinement θ(w) [h_i].

3. The third group includes types governed by type indexing. The branch body type t and the types in the type environment Γ play the same role, because they are all subject to type indexing. We call types in this role [z] the environment types. The environment types under branch type refinement, namely θ(t) and θ(Γ), play the other role [z_i] in the group.

Pointwise Baseline contains all six roles from the PAT type rule.

5.2 Properties

Let {(c_1, g_1, h_1, z_1), ..., (c_n, g_n, h_n, z_n)} and (r, z) be a pointwise baseline. In this subsection we present four theorems on the baseline. The first theorem states that type indexing affects only parts of the environment type (i.e., type of the branch body or types in the environment Γ) that share type variables with the scrutinee type.

Theorem 17. Let q be a path valid in z such that tv(r) # tv(z ▷ q). Then, for all 1 ≤ i ≤ n, z_i ▷ q = z ▷ q.

Proof. Let 1 ≤ i ≤ n be given. Since dom(σ_i) ⊆ tv(r) ∪ tv(c_i) and tv(z) # tv(c_i), we know dom(σ_i) # tv(z ▷ q), and therefore z_i ▷ q = σ_i(z ▷ q) = z ▷ q.
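Theorem 17 can be checked concretely on the Cons branch of the sum baseline from §5.1. The Haskell sketch below is an illustration only: the names Ty, at, apply, and theorem17At are hypothetical, paths are assumed to be lists of 1-based argument positions, and pairs are encoded with an assumed constructor name "(,)".

```haskell
-- Types: a type variable, or a type constructor applied to arguments.
data Ty = TVar String | TCon String [Ty]
  deriving (Eq, Show)

-- A path is a list of 1-based argument positions (assumed encoding).
type Path = [Int]

-- Subterm of a type at a path; Nothing if the path is not valid.
at :: Ty -> Path -> Maybe Ty
at t [] = Just t
at (TCon _ ts) (i : p)
  | i >= 1 && i <= length ts = at (ts !! (i - 1)) p
at _ _ = Nothing

-- Apply a substitution given as an association list.
apply :: [(String, Ty)] -> Ty -> Ty
apply s (TVar a)    = maybe (TVar a) id (lookup a s)
apply s (TCon c ts) = TCon c (map (apply s) ts)

-- Type variables of a type.
tv :: Ty -> [String]
tv (TVar a)    = [a]
tv (TCon _ ts) = concatMap tv ts

int :: Ty
int = TCon "Int" []

list :: Ty -> Ty -> Ty            -- the type L n a
list n a = TCon "L" [n, a]

pair :: Ty -> Ty -> Ty            -- assumed encoding of pairs
pair x y = TCon "(,)" [x, y]

-- Scrutinee type r and environment type z of the sum baseline.
r, z :: Ty
r = list (TVar "m") int
z = pair (list (TVar "m") int) int

-- Cons branch: sigma1 is a pointwise unifier of r ~ c1 = L (S n) a.
sigma1 :: [(String, Ty)]
sigma1 = [("m", TCon "S" [TVar "n"]), ("a", int)]

g1, h1, z1 :: Ty
g1 = pair (TVar "a") (list (TVar "n") (TVar "a"))
h1 = pair int (list (TVar "n") int)
z1 = pair (list (TCon "S" [TVar "n"]) int) int

-- Theorem 17 at path q: if tv(r) # tv(z at q), then z1 and z agree at q.
theorem17At :: Path -> Bool
theorem17At q = case at z q of
  Just zq | all (`notElem` tv r) (tv zq) -> at z1 q == Just zq
  _                                      -> True  -- premise does not hold
```

Here apply sigma1 g1 and apply sigma1 z reproduce the table entries h1 and z1, and theorem17At [2] inspects the Int component of z, which shares no type variable with r and is therefore left unchanged by the branch refinement.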

The next theorem describes how type indexing may refine a type variable that does appear in both the environment type and the scrutinee type.

Theorem 18. Let p be a path valid in r and q be a path valid in z such that r ▷ p = z ▷ q = α. Then, for all 1 ≤ i ≤ n, either z_i ▷ q = z ▷ q, or p is valid in c_i and z_i ▷ q = c_i ▷ p.

Proof. Let σ_i be the pointwise unifier of r ∼ c_i as per Definition 5. We consider whether α is in the domain of σ_i:

1. α ∉ dom(σ_i). Since σ_i(z) = z_i, we know z_i ▷ q = σ_i(z ▷ q) = z ▷ q.

2. α ∈ dom(σ_i). Since σ_i is a pointwise unifier of r ∼ c_i, c_i ▷ p exists and c_i ▷ p = σ_i(α). Then, from σ_i(z) = z_i, we know z_i ▷ q = σ_i(α) = c_i ▷ p.

This case analysis completes the proof.

The next two theorems are mirror images of the previous two. While the previous two theorems describe type indexing, the next two theorems describe parametric instantiation. Since these two theorems are structurally identical to the previous two, we present them without proof. The next theorem restates the standard restriction on existential types: parametric instantiation cannot affect any type variable that does not appear in the constructor range type.

Theorem 19. Let 1 ≤ i ≤ n be given, and q be a path valid in g_i such that tv(c_i) # tv(g_i ▷ q). Then h_i ▷ q = g_i ▷ q.

The last theorem generalizes the standard restriction on existential types: parametric instantiation cannot affect any type variable that does not have a pointwise counterpart in the scrutinee type.

Theorem 20. Let 1 ≤ i ≤ n be given, p be a path valid in c_i, and q be a path valid in g_i such that c_i ▷ p = g_i ▷ q = α. Then, either h_i ▷ q = g_i ▷ q, or p is valid in r and h_i ▷ q = r ▷ p.

The following example shows how we can use these theorems to show that a certain pointwise baseline does not exist.

Example. The problem of typing the case expression in the function ex1 (§2.6 and §4.3, reproduced here)

    data Slide a b c where
      C1 :: forall a b. Slide a a b
      C2 :: forall a b. Slide a b b

    ex1 e y = case e of
      C1 → y == 'c'
      C2 → [if y then 5 else 7]

corresponds to this pointwise baseline where r and z are unknown:

    i        c_i           g_i   h_i   z_i
    1 (C1)   Slide a a b   ()    ()    (Slide a a b, Char, Bool)
    2 (C2)   Slide a b b   ()    ()    (Slide a b b, Bool, [Int])

Using this pointwise baseline, we now present an alternate proof that ex1 is not well-typed in Pointwise GADTs.

Let us assume to the contrary that ex1 is well-typed in Pointwise GADTs, and the types r and z exist. If z is a type variable that does not appear in r, then by Theorem 17, z_1 = z, which is false because z_1 is not a type variable. If z is a type variable that appears in r, then by Theorem 18, either z_1 = z, or there exists a path p such that z_1 = c_1 ▷ p. Both equations are false because z_1 is not a type variable, and z_1 does not appear in c_1. So z must be built from a type constructor, and by Lemma 1, z must be a 3-tuple. Using the same strategy, we can show that z ▷ 2 is not a type variable, and it must be built from a type constructor. In this case Lemma 1 requires Char = TC(z ▷ 2) = Bool, which is obviously false. Thus we conclude that the type z does not exist, and ex1 is not well-typed in Pointwise GADTs.

5.3 Discussion

The previous subsection shows the usefulness of Pointwise Baseline. It allows us to state properties of the PAT type rule concisely, and it allows us to show, with concise arguments, that some programs are not well-typed in the Pointwise GADT type system.

There has been much speculation about the logical structure of type indexing; some previous work treats it as logical implication [19, 20, 21], while another compares it to simultaneous rigid E-unification [15]. Our approach presents a more accurate picture than the previous attempts, because Pointwise Baseline is actually derived from the PAT type rule.

We presented four theorems on Pointwise Baselines in this section. The first two theorems show that there is a rigid structure on how type indexing may affect the environment types: type indexing can only substitute a part of the constructor range type for a type variable that appears in the scrutinee type. The third theorem reiterates the standard restriction on existentially-quantified type variables, and the fourth theorem generalizes the restriction to Pointwise GADTs. We plan to investigate how these results can be used to improve existing GADT type inference algorithms.

6. Related Work

The earliest work we know on GADT-like data structures in programming languages is Silly Type Families by Augustsson and Petersson [2]. While many of the important features of GADTs are mentioned in the paper, the authors state “Even if this extension allows a few more programs to be written and type checked, it is by no means magic.” The authors conclude that they do not know how to use the feature to write many useful programs.

By 2002, the ability to define algebraic data types where the range of constructor functions could mention specific concrete types (rather than only generic type variables) was back in demand. This demand was met by an interesting confluence of events. Two separate groups, Cheney and Hinze [4] and Baars and Swierstra [3], independently demonstrated how to define an equality type witness in Haskell (both papers were based upon ideas in an earlier work by Weirich [25]). Both Cheney and Hinze (First Class Phantom Types [5]) and Sheard and Pasalic (Equality-Qualified Types [18]) immediately recognized that embedding type equality witnesses into ordinary algebraic data types creates a GADT-like data structure. The thesis The Role of Type Equality in Meta-Programming by Pasalic [13] demonstrated that much of higher-order logic could be embedded in Haskell using this technique alone. At the same time, Xi et al. (Guarded Recursive Datatype Constructors [26]) developed a similar capability based upon the use of type-refinement rather than equality witnesses.

It soon became apparent that all these ideas were the same, and could be described by using Xi’s idea of type refinement. The term GADT first appeared in a paper by Peyton Jones et al. [8]. In this paper, Peyton Jones et al. focus on formalizing a type system that supports GADTs as well as ordinary ADTs, type inference, and constrained types. A flurry of papers followed, all based upon this principle: require functions that pattern-match on GADTs to be annotated with type information, but support polymorphism and type inference on the rest of the program. Each used a different mechanism to capture the case branch refinement:

• Wobbly types (Peyton Jones et al.) [8] captures the refinement as a set of equality constraints, but solves these constraints locally and immediately at each case branch. While this was done in a broader Haskell context which supported class-based constrained types, the interaction between the two types of constraints was left for future work.

• Constraint solving (Stuckey and Sulzmann) [20] generates constraints by propagating type annotation information while traversing the whole program syntactically. Constraint solution is postponed and done globally after collecting all constraints.

• HMG(X) (Simonet et al.) [19] extends the constraint solving approach to Hindley-Milner style polymorphism. Done in the context of Objective Caml, the system also deals with OCaml-specific issues such as subtyping and implication constraints. A general constraint solver was left for future work.

• Herbrand constraint abduction (Sulzmann et al.) [21] was a general (though incomplete) technique to solve GADT-based constraints. Sulzmann et al. presented examples with an infinite number of maximal types for the first time, thus demonstrating that inferring Hindley-Milner style types for programs over GADTs was undecidable.

• Stratified type inference (Pottier and Régis-Gianas) [14] was an extension of earlier work by Pottier and colleagues. Their contribution, local shape inference, propagates type annotation information better than their previous work.

The OutsideIn algorithm by Schrijvers et al. [15] is closely related to our work. Both groups recognize that type inference for plain GADTs is too hard, but take different approaches in restricting the plain GADT type system. We want to build a foundation for future GADT type inference research, so we designed Pointwise GADTs with maximal expressiveness in mind. Schrijvers et al., however, want to support complete type inference, so they designed their restricted GADT type system with simplicity in mind. Since their restricted type system never propagates type information from a GADT pattern matching branch to the outside environment, it rejects nearly all annotation-free programs that use type indexing (i.e., programs that make essential use of GADT patterns). Clearly, their restricted type system is significantly less expressive than Pointwise GADTs. Given that the OutsideIn type inference algorithm is sound and complete with respect to their restricted GADT type system, OutsideIn requires type annotations for most functions with GADT patterns, and hence the GADT type inference problem remains open. In addition to the papers we discussed, anonymous reviewers also suggested some other relevant work [6, 10, 12], which we intend to explore in the future.

7. Conclusions and Future Work

In this paper we focused on the mechanism that the GADT type systems use to support parametric instantiation and type indexing. We showed that, although unification gets the job done, it is too powerful for this purpose, because it allows the plain GADT type system to accept some programs that programmers may not expect to be well-typed. To remedy this problem, we proposed the Pointwise GADT type system, which works just like the plain GADT type system, except that it uses pointwise unifiers to support parametric instantiation and type indexing. Since a pointwise unifier propagates information only between the pointwise counterparts of the unified types, programmers can easily see why a program is well-typed in the Pointwise GADT type system.

Even though Pointwise GADTs are less expressive than plain GADTs, our case studies indicate that most extant plain GADT programs are also well-typed in Pointwise GADTs. We attribute this discovery to the conjecture that programmers naturally think about the interaction between a scrutinee type and a constructor range type in a pointwise fashion.

We proposed Pointwise Baseline as an abstract model of how Pointwise GADTs type pattern-matching branches in case expressions, and the model shows that Pointwise GADTs perform both parametric instantiation and type indexing in a highly structured manner. We plan to study the properties of the Pointwise GADT type system in greater depth, and to investigate how these properties can help us improve the effectiveness of type inference algorithms.

Acknowledgments

This work is partially supported by the National Science Foundation under grants CCF–0541447 and CCF–0613969. We want to thank James Hook, Mark P. Jones, Tom Harke, Tim Chevalier, and Creighton Hogg for their feedback on drafts of this paper.

References

[1] Jim Apple and Wes Weimer. Simulating dependent types with guarded algebraic datatypes. Online at http://www.cs.virginia.edu/~jba5b/singleton/ (accessed January 23, 2009), August 2008.

[2] Lennart Augustsson and Kent Petersson. Silly type families. Online at http://web.cecs.pdx.edu/~sheard/papers/silly.pdf (accessed September 26, 2009), September 1994.

[3] Arthur I. Baars and S. Doaitse Swierstra. Typing dynamic typing. In Proceedings of the Seventh ACM SIGPLAN International Conference on Functional Programming, volume 37(9) of ACM SIGPLAN Notices, pages 157–166. ACM, October 2002.

[4] James Cheney and Ralf Hinze. A lightweight implementation of generics and dynamics. In Proceedings of the 2002 ACM SIGPLAN Workshop on Haskell, pages 90–104. ACM Press, 2002.

[5] James Cheney and Ralf Hinze. First-class phantom types. Technical Report TR2003-1901, Cornell University, July 2003.

[6] Thierry Coquand. Pattern matching with dependent types. In Proceedings of the 1992 Workshop on Types for Proofs and Programs, pages 66–79, June 1992.

[7] Simon Peyton Jones, editor. Haskell 98 Language and Libraries: The Revised Report. Cambridge University Press, Cambridge, UK, May 2003.

[8] Simon Peyton Jones, Dimitrios Vytiniotis, Stephanie Weirich, and Geoffrey Washburn. Simple unification-based type inference for GADTs. In ICFP’06: Proceedings of the Eleventh ACM SIGPLAN International Conference on Functional Programming, pages 50–61, New York, NY, USA, September 2006. ACM Press.

[9] Simon Peyton Jones, Geoffrey Washburn, and Stephanie Weirich. Wobbly types: type inference for generalized algebraic data types. Technical Report MS-CIS-05-26, University of Pennsylvania, July 2004.

[10] Andrew Kennedy and Claudio V. Russo. Generalized algebraic data types and object-oriented programming. In Proceedings of the 20th Annual ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications, OOPSLA 2005, pages 21–40. ACM Press, October 2005.

[11] Chuan-kai Lin. Programming monads operationally with Unimo. In ICFP’06: Proceedings of the Eleventh ACM SIGPLAN International Conference on Functional Programming, pages 274–285, New York, NY, USA, September 2006. ACM Press.

[12] Bruce J. McAdam. On the unification of substitutions in type inference. In Implementation of Functional Languages: 10th International Workshop, IFL’98, volume 1595 of LNCS, pages 137–152. Springer, 1999.

[13] Emir Pasalic. The Role of Type Equality in Meta-Programming. PhD thesis, OGI School of Science & Engineering, Oregon Health & Science University, 2004.

[14] François Pottier and Yann Régis-Gianas. Stratified type inference for generalized algebraic data types. In POPL’06: Conference record of the 33rd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 232–244, New York, NY, USA, January 2006. ACM Press.

[15] Tom Schrijvers, Simon Peyton Jones, Martin Sulzmann, and Dimitrios Vytiniotis. Complete and decidable type inference for GADTs. In Proceedings of the 14th ACM SIGPLAN International Conference on Functional Programming, pages 341–352, New York, NY, USA, September 2009. ACM Press.

[16] Tim Sheard. Generic programming in Omega. In Roland Backhouse, Jeremy Gibbons, Ralf Hinze, and Johan Jeuring, editors, Datatype-Generic Programming, volume 4719 of LNCS, pages 258–284. Springer, 2006.

[17] Tim Sheard and Nathan Linger. Programming in Omega. In Zoltán Horváth, Rinus Plasmeijer, Anna Soós, and Viktória Zsók, editors, Central European Functional Programming School, volume 5161 of LNCS, pages 158–227. Springer, 2007.

[18] Tim Sheard and Emir Pasalic. Meta-programming with built-in type equality. In Proceedings of the Fourth International Workshop on Logical Frameworks and Meta-Languages, pages 106–124, July 2004.

[19] Vincent Simonet and François Pottier. A constraint-based approach to guarded algebraic data types. ACM Transactions on Programming Languages and Systems, 29(1):1–56, January 2007.

[20] Peter J. Stuckey and Martin Sulzmann. Type inference for guarded recursive data types. The Computing Research Repository (CoRR), abs/cs/0507037, July 2005.

[21] Martin Sulzmann, Tom Schrijvers, and Peter J. Stuckey. Type inference for GADTs via Herbrand constraint abduction. Technical Report CW507, Department of Computer Science, K. U. Leuven, Leuven, Belgium, January 2008.

[22] The GHC Team. The Glorious Glasgow Haskell Compilation System User’s Guide, Version 6.4, March 2005.

[23] The Darcs project. Online at http://www.darcs.net/. Accessed July 14, 2009.

[24] The Pugs project. Online at http://www.pugscode.org/. Accessed July 14, 2009.

[25] Stephanie Weirich. Type-safe cast: Functional Pearl. In Proceedings of the ACM SIGPLAN International Conference on Functional Programming (ICFP’00), volume 35(9) of ACM SIGPLAN Notices, pages 58–67, New York, NY, USA, September 2000. ACM Press.

[26] Hongwei Xi, Chiyan Chen, and Gang Chen. Guarded recursive datatype constructors. In Proceedings of the 30th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 224–235, New York, NY, USA, January 2003. ACM Press.
