Friday, October 3, 2014

Wombat Type Hierarchy

It is natural to regard some types as being contained in others. An Integer is a Rational. A Dog or a Cat is a Mammal or a Quadruped. Programmers expect this to mean that they can use an Integer where a Rational is expected, and so on. If you follow this through to its logical conclusion you are bound to get the Wombat solution outlined below, or something very close. Note that this is about lossless conversion which can be reversed in a “case” statement. Wombat also allows for lossy conversion with convertsTo which will be covered in a separate post.
The relationship between types is declared (and tested) with variants of the “isA” operator. When declaring that Int isA Rational we provide a pair of functions:
  1. The “to” function creates a Rational with denominator 1;
  2. The “from” function fails if the denominator is not 1, else returns the numerator.
The effect of declaring Int isA Rational is that Int acquires all the semantics of Rationals. For example if r:Rational then we can get the denominator as r.denominator. This must now apply to Ints as well. By default the Int value will be converted to Rational, then the denominator will be extracted. It is possible and easy to add a specialization so that for Int values it returns 1 directly, though it seems unlikely that it would get much use.
Also, all overlapping semantics (such as, in this case, addition and multiplication) have to be compatible. This is a delicate matter, particularly if the types come from modules from different sources, so we’ll defer it to a later post.
Another motivating example would be similar to OO style subtyping. Suppose we have types Mammal, Quadruped, Cat and Dog such that Cat isA Mammal; Dog isA Mammal; Cat isA Quadruped; Dog isA Quadruped. Now what can we make of this list:
[ myDog myCat anotherDog ]
What is it a list of? It could be a list of Mammals, or a list of Quadrupeds. Should we make the programmer decide? What if he wants to pass this to a procedure that expects a list of Mammals, but later wants to pass it to a procedure that expects a list of Quadrupeds? It is stuff like this that drives people away from static typing to dynamic typing.
In Wombat: for every set of Types there is a Union which is the smallest Type that is above all of them. So our list above is a list of Union(Dog,Cat). This automatically behaves correctly with respect to declared relations. So Union(Dog,Cat) isA Mammal and Union(Dog,Cat) isA Quadruped. Note that all the semantics of Mammal and of Quadruped are also compatibly in Dog and Cat. So there is no problem bestowing them on Union(Dog,Cat). In a Union where the components have no common higher isA relation defined, such as Union(Dog,Int), then there are no defined semantics other than those from Union.
To fill out our list example we need to have a rule that whenever X isA Y then List(X) isA List(Y). Which gets into yet more delicate areas that we’ll leave for another day.
On the other side we have Intersection(Mammal,Quadruped) which is the largest type that is below each of them, and gets rid of those pesky bipedal mammals. There has to be at least one type that is declared to be below all the types in the Intersection, otherwise the Intersection is Empty.
The rules for the interaction between Union and Intersection need to be extended in obvious ways (such as associativity) and this makes Types into a lattice (http://en.wikipedia.org/wiki/Lattice_(order)).
Wombat’s type-assertion/conversion operator is “:”, and it can go up or down or even sideways. So myDog:Mammal always succeeds (calling the “to” conversion), myMammal:Dog calls the “from” conversion and may fail (allowing other options to be tried in a case statement). Finally myMammal:Quadruped will go down to the Intersection (possibly failing in the bipedal case) then up. Only when the Intersection is Empty does “convertsTo” come into play.
Yet another future post will show how the type hierarchy gives Wombat named parameters with default values.

[update 2016-09-28: That last promise was never fulfilled. It's too easy for a post. If you want named arguments to a procedure, use a Struct parameter. Structs allow default values which have this effect: a Struct without the defaulted field will convertTo any type with extra fields that have defaults. Voila.]