[[UPDATE: The pdf and docs versions above have been fixed [see comment below]. This hasn't been, may never be.]]
The Bounded Lattice Type System (BLTS) is at the core of the programming language I'm (slowly) working on. This is an attempt to extract a description of just the type system.
The whole thing fits together like a jigsaw (or sudoku) puzzle. You might think that you like some bits and not others but I strongly suspect that it is not possible to change the core ideas without it falling apart. It took years to get to this current point, and it was only well after the last fix that I noticed that there were no more nasty interactions (that I know about).
In the types of Type Theory, elements can only have one type. In BLTS a number can be an integer, and a rational and a Gaussian (complex) integer. An animal can be a dog, and a mammal, and a quadruped.
Procedures in BLTS are not normally total. If they can't return the required result they fail, and this often leads to alternative action. This is informational failure, which is the only sort under consideration in this document.
The Lattice
A bounded lattice is a partial order where every set of elements (including the empty set) has a least upper bound (join) and a greatest lower bound (meet). In BLTS the types form a partial order with the isA relation. So, using the examples mentioned, we have: Integer isA Rational; Integer isA ComplexInteger; Dog isA Mammal; Dog isA Quadruped. The isA relationships always come with a pair of procedures converting up and down. The up procedure always succeeds (for parameters of the right type), but the down procedure naturally fails when the value is not of the lower type.[1]
Types are characterized by their properties. Properties are monotonically decreasing. If Rational has the property "denominator" then Integer must have that property as well, and they must agree. The meaning of agreement is covered below.
The join and meet in BLTS are Union and Intersection. Union of the empty set (of types) is the type Empty which has no members (is uninhabited). Intersection of the empty set is the type Any (Union of all types). Union(X,Y) has values which can be either an X, or a Y, or both if they fall in the Intersection. So given xy:Union(X,Y), one can ask if it's an X with xy:X, or if it's a Y with xy:Y. Proponents of parametricity will say that these are the only properties there should be. However we'll see that that is not the case in general in BLTS.
To find Intersection(X,Y), find the set of types directly (immediately) below both, and take the Union of that set. The asymmetry between the definitions of Union and Intersection is, I think, illusory, and I believe these rules give the free bounded lattice that contains the partial order generated by the transitive closure of the isA relationships. Confirmation or disproof of this would be welcome.
Combining
This is going to seem like a strange detour, but actually it is a key solution to various problems.
BLTS utilises a procedure, caseP, which performs the function of a case/switch statement. It is curried. The first parameter is a set of procedures (actually a bag, but repeats won't change the result). This is like the arms of a case statement in other languages, except that the match tests have moved inside the procedures. The result of caseP is a procedure which takes a parameter, and passes that parameter to all of the procedures in the provided set. The common expectation is that one branch will succeed, all the others will fail, and the final result is the result returned by that successful procedure. However if two (or more) procedures return the same result then that is ok as well.
But what if the results returned are of a type that does not support equality (does not conform to Distinguishable). In this case we allow a weaker condition: Combinable. If a type supports Combinable then there is a property 'combine' which allows 2 or more of that type to be combined. In particular, procedures are Combinable. For types that support equality the combination is the value if all are equal, else it fails.
For procedures the combine procedure takes a set (bag) of the procedures to be combined and passes the parameter for the combined procedure to each of the procedures being combined. If the results are Combinable (includes Distinguishable) then the result is the combine of the results of the successful procedures. Yes that's right: the combine for procedures is precisely caseP.
That was a surprising result, but then caseP started turning up all over the place.
Properties
Properties are not duck typing. All properties come from Behaviours (spelt with or without the 'u') which are collections of properties and the laws they must obey. One aspect of agreement between properties is that they come from the same Behaviour.[2]
Suppose we have the property 'add' from the Behaviour 'Monoid', and both Integer and Rational conform to Monoid. The isA relationship has the effect that the two 'add' implementations are combined for both types. If the parameter to add is a pair of Integers then both of the combined original procedures succeed, and the results are checked to be equal. If the pair contains a non-Integer Rational then the 'add' property of Integer will fail, and the result will come from the 'add' property of Rational just as before. So the only effect of combining them is to check the compatibility. One would naturally hope to optimize this away most of the time.
We note that properties must be of a Combinable type. This is not expected to be an onerous restriction. Combinable is inherited in all standard types, so that, for example, a Tuple type whose component types are all Combinable will be Combinable.
Diamonds
Suppose we have defined a ComplexRational type giving the following diamond problem:
As we go up our partial order, our transitive closure generates the obvious induced up and down procedures. But what happens when we have a diamond as in the diagram. We require that both upward routes give the same induced up and down procedures. I'm sure you're ahead of me. Naturally we combine them, which ensures that when both paths succeed they give a compatible result.
Properties of Unions
Consider Union(Rational,ComplexInteger). If no types have been declared above Rational and ComplexInteger then the 'add' property is not defined for the Union. But now suppose we have defined ComplexRational with the obvious isA relationships, as above. For Union to give the join we must have:
The Union must be below ComplexRational. We construct the up and down arrows in an obvious way with combine[3]. But ComplexRational supports the add property, so everything below must also, including the Union. However it just inherits ComplexRational's add, and doesn't make the Union itself a monoid (which is good because it isn't).
To understand the value of this, consider the list [7,8/3,2+5i]. It is naturally a list of the Union of the types which is Union(Rational,ComplexInteger). If we sum the list we want to get a ComplexRational if that is defined.
Why don't we just say that ComplexRational is the Union of Rational and ComplexInteger? The answer is that there might be multiple types above both Rational and ComplexInteger. Let's add the type Cat with Cat isA Mammal and Cat isA Quadruped, and consider the list [cheshireCat, snoopy, pluto]. If forced to make a premature decision it would be a List( Mammal) or a List( Quadruped). But by making it a List( Union( Dog, Cat)) it isA either of these if that is required.
Which brings us to another interesting point. What is the type of the empty list, []. Since the type of an explicit list is the Union of the set of types in the list, and since the Union of the empty list is the type Empty, we see that [] has type List(Empty). And since Empty isA X for all types X, we have (with reasonable assumptions about active declarations) that List(Empty) isA List(X). Which is exactly what you want to be able to use the empty list in any appropriate situation. I prefer this to the way the empty list is handled in other languages.
No home types
If x is both an Integer and a Rational, there is no sense in which we can ask "Is x really a Rational with denominator 1, or is it really an Integer". If this were not so then the programmer could encode hidden variables in that type information which would be most undesirable.
Out of Scope
Tuples (product types) are also used heavily in my language. Naturally the 0-Tuple is Unit. There is much more, but it is not closely connected to the lattice nature of the type system and the role of combine. So I should stop here for today. But I can't resist one more diversion.
Implementations of Types
All types are either primitive or implemented in terms of some other type. So the full Nat type of natural numbers might be implemented using List(UInt64) where UInt64 is a primitive type of 64 bit unsigned integers provided by the target hardware, and List types might be implemented using various possible primitive interfaces to memory. This "implemented by" relationship has absolutely nothing to do with the isA relation.
Primitive types could arise in various ways. One that mathematicians seem strangely fond of is the type of hereditary (pure) sets which in ZF(C) set theory are sets whose only members (if any) are other hereditary sets. It is possible to implement a model of the natural numbers with such sets. But this is an "implemented by" situation. There should be no suggestion that the natural numbers are such sets. Rather the natural numbers are defined by their properties: zero and successor, and laws governing those. To know that you've implemented the natural numbers you need to define the properties and prove the laws of natural numbers using the properties and laws of the type with which they are implemented.
[1] These procedures don't change the types of any values, and seemingly don't do anything other than failing at times in a downward direction. In the practice of an actual programming language there is a meaningful interaction with the implementations of the types.
[2] A type can conform to a Behaviour in more than one way. The different ways are indexed by some type that supports equality. Properties are thus specified with a Behaviour, an index and a property name. We will ignore this subtlety.
[3] The up arrow is the combine of (1) down to Rational and up to ComplexRational, with (2) down to ComplexInteger and up to ComplexRational. We see that this must succeed one way or the other (or both if it is an Integer). Down works the other way and will fail iff both initial down movements fail.
Had join and meet reversed ... fixed
ReplyDeleteIt was a good idea to write this, because it clarified my mind. My desire to say "you can't distinguish between integer 7 and rational 7/1 is impossible sadly. So back to the status quo ante: (1) going up is painless, and indeed a subtype is a valid implementedBy, so no action at all needed; (2) going down is like convertsTo, only happens when explicitly required (e.g. by : op). The `: op checks type (current or up) but doesn't do down or convertsTo.
ReplyDelete