Friday, September 19, 2014

Wombat Types and their implementations


Wombat has quite a few little ideas in it, and one big idea. The big idea is this:

Types are defined by their semantics. A type can have multiple implementations with different performance characteristics.
Since changing implementation doesn’t change the semantics, the programmer doesn’t specify the implementation as part of the language. Programmers can influence the compiler’s implementation choices with pragmas.
An important addendum is that implementations need not cover all values in the type. There does have to be an implementation that covers all possibilities, and this is specified when the type is created. Here are some examples:
  • Int is the type of integers.
    • The comprehensive implementation is BigInt, which is, in turn, implemented by List Int64.
    • The normal implementation is Int64.
If an Int64 operation overflows then the operation is rerun giving a BigInt result and the whole continuation of everything that follows is switched to one using the BigInt implementation for calculations depending on that value that overflowed.
  • Array(Assignable X) is the type of traditional arrays from imperative languages: i.e. the bounds don’t change after creation but each entry can be updated. Wombat is functionally oriented, but not prescriptive. This example is the one that inspired the type/implementation split because the comprehensive implementation isn’t something one would normally want.
    • The comprehensive implementation is just an array of addresses in memory where an X is stored. The addresses are not (necessarily) laid out in any neat way.
    • The normal implementation is just a start address plus, in the 1d case, a length (and maybe a stride). Slightly more complex in higher dimensions.
This case is discussed in other blog posts: “Solving a 30 year puzzle” and “some history”.
  • Any value known at compile time can be implemented by Unit. I.e. we don’t need to allocate any space on the stack to pass it as a parameter. Though it is a slightly complicated example, let us consider the expression s.length from the previous post. This is actually String.asProc(s).length, and in a normal way String.asProc(s) is just a closure returned from within the String.asProc code, and inside which s is locked down. In order for the compiler to use the zero length implementation of the Atom parameter, there has to be a specialization of that closure for when the parameter is exactly .length. Likely String.asProc is just one big case statement and this specialization will just throw away all the code except the little bit returning the length. So we see that for practical purposes we end up with the same result as in languages where a method does map to a special sort of procedure.
I won’t go on here, except to say that this is a very fruitful concept, and more examples will appear.
I gave a talk on this once. The visiting expert at the end asked a question which I couldn’t understand, and hence answered incoherently. Afterwards I realised that he was suggesting that this was the same as interfaces/traits. Obviously my talk did not do a good job of explaining the concept. Or maybe the visiting expert was jet-lagged. Anyway this reminds me: if anyone wants me to give a Wombat talk, let me know and I’ll figure out if I can get there.

[update 2014-09-27: I should have mentioned the interconversion rules for implementations, which is a key thing that is different from interfaces/traits where no interconversion is expected. Every new implementation has to give a pair of functions for converting to/from some other implementation (commonly the initial universal implementation). The "to" direction has to be 1-1, and the "from" direction fails for values that aren't covered by the new implementation.]

No comments:

Post a Comment