Sunday, December 23, 2018

post-talk plans


The talk was not one of my best. I do this thing where I get the talk to a usable level quite early, planning to improve it. But then the thought of changing it makes me nervous. It could have done with animations and better use of the presentation hardware, and more planning to leave room for the interesting stuff at the end.

I've decided to give up on space as the separator in explicit Lists, though cute. So comma is always used for operations that create a compound value, overriding the comma operator that creates Tuples.

Had an interesting discussion of subtyping and equality. I can see that it is useful to also have a system where you take a type and add some extra data and methods, and have a forgetful procedure/coercion to go back to the original type, and some automatic inheritance of properties. However I'm wary of introducing the confusion that eventuates in OO where parent and child have the same property with different meanings, e.g. equality.

Another interesting discussion with the guy advocating weak algebraic effects. That only works with left to right evaluation. Haskell was forced to monads because of lazy evaluation. I hope to avoid that with the TimeOrder type. To reduce the need for !s I will introduce EffectfulProcedures (`!{ }) which force left to right evaluation (i.e. calls to them have an implicit ! unless there is an explicit one).

I was disappointed not to get any useful discussion on proofs.

I'm preparing two docs: Wombat Tour and Wombat Operators to cover my ambitious implementation plan.

Sunday, November 4, 2018

Upcoming talk, compiler runs real programs

I'm giving a talk at SAPLING18 which is an informal meeting following http://aplas2018.org/. See at the bottom of the "tracks" menu, but the link is down as I write. I have preliminary "death by powerpoint" slides https://docs.google.com/presentation/d/1a72bKhCxfNBj7lqPVWfqLPDW_QPgQEKy7N0LM8RCo0o/edit?usp=sharing. Hopefully they'll be improved for the actual talk. I'll also be at the main conference. I'm shy, but happy to talk to anyone. I'll also be there on the Saturday before the meeting if anyone wants to have coffee or some such. My email is robert.kenneth.smart@gmail.com. To finish off wombat I need an effects system, and a proof system, but I'm happy to talk about any other aspects of programming language design.

I'm happy to give Wombat-related talks in Australia. I did give one last year at the Melbourne FP meetup: https://www.meetup.com/en-AU/Melbourne-Functional-User-Group-MFUG/events/zhtdllywkbjb/. An attendee was disappointed that there was no code to play with. That inspired me to write the current compiler which now correctly runs some programs, such as factorial:

%/include wombat.wh

    `fact = { case $:Nat of [
                { $ = 0; 1}
                { $ = `n; n*fact(n-1) } # n-1 fails if n==0
              ]
            };
    print (fact 4)

And it can even run simple programs backwards, though not factorial. Check it out from github: https://github.com/rks987/marsupial. (cd src; python3 compiler.py test.w)

Thursday, October 4, 2018

Effects and Proofs

Running procedures backwards is nice at times, particularly for matching in case statements. However it is only going to work when there is no information loss (procedure is 1-1), and most complex procedures aren't. I think I know how to run more complex things, like factorial, backwards (to stop arms of a case going into infinite loops you run the cases in coroutines, and when one succeeds kill the others and restart them with the improved parameter). However that will very rarely be useful. The current scheme probably hits the sweet spot in usefulness.

So I should move onto more important things: Effects and Proofs.

My simple-minded idea of effects is that they arise in types where the semantics includes the time. For example an Assignable(X) has a stream of assignments with associated times, and a stream of accesses with associated times, and the semantics specify that the result of an access is equal to the value of the most recent assignment. Similarly IO operations typically need to be done in some order (not necessarily linear: some can be done in parallel). The plan is to provide a framework for this, but leave any specific cleverness to the types themselves (and their related operational properties).

First we introduce a new (special) type TimeOrder. This always starts as a future time, then turns into an actual time when passed. Time related operations use the ! operator. There is also !< and any other time related operators should include an !.

  • `t ! expression means that the expression should happen at TimeOrder t. This is infix !.
  • ! expression allocates a nameless time. This time specifies that the expression is executed in lexical order relative to other ! expressions in the current closure. This is prefix !.
  • t1 !< t2 means that t1 must happen before t2.
So consider:
`t1 ! expr1; `t2 ! expr2; `t3 ! expr3; t1 !< t3; t2 !< t3
This says that t1 is before t3, and so is t2, but t1 and t2 can be in either order (or concurrent). TimeOrder entities can be passed around like any other. If a call to a procedure is not done with a bang (!), then, even if it has IO or other effects internally, it will be treated like any other expression and may be executed in any order or not at all.

It is not yet clear whether this simple treatment can be used by libraries to achieve the control of effects that we are coming to expect.

The next issue is proofs. I need proofs, and it seems that the language should be well designed to allow them, but I can't seem to figure it out.

Propositions are dependent types which are proved by constructing an inhabitant. Wombat supports dependent types and uninhabited types. Proofs are programs, i.e. procedures. That shouldn't be a problem.

Let's suppose that a % sign indicates a proposition as a type. Let's consider 1 <% 2 (natural numbers). This type is inhabited. Maybe it is useful to think about what the inhabitants are like and what relation they have to each other. Proofs are programs, meaning a closure in Wombat. So I guess it takes no input and generates an object of the type. Suppose we have some builtin stuff (axioms):

  • zeroLessThanNonZero takes n:Nat and returns an element of the dependent type 0<%n.
  • moveLessUp takes an n<%m and return an element of type succ(n)<%succ(m).
Now our proof of 1<%2 is { 1 |> zeroLessThanNonZero |> moveLessUp } where x|>f means f(x).

Now suppose we have a proposition "for all abc there exists xyz such that prop(abc,xyz)". This also has a (different) procedural view where forall corresponds to input and there exists corresponds to output. That should suit Wombat where we access to both in the body of the closure. So first of all we have something like: { $=`abc; `$=`xyz; code-generating prop(abc,xyz) }. Then we have to turn that into a proof. We must be pretty close.

Thursday, July 19, 2018

Compiler works (minimally)!!!

Version 0.0.1 of marsupial/wombat has successfully run on one specific program ("print 42").

At this point I should say a lot more and there should be docs, etc. But all that will have to wait.

The code is at: https://github.com/rks987/marsupial

The next version, 0.0.2, will aim to run the factorial program backwards. Then I'll retire unless I find some collaborators.

The compiler is in python 3. All the bugs would have been found by type checking. [Update: 0.0.2 development have revealed a few "real" logic bugs, but I certainly plan for the actual compiler to use python type-checking.]

Saturday, March 24, 2018

The Bounded Lattice Type System

I've written a little doc on the Bounded Lattice Type System (BLTS) that I'm trying to implement. The pdf is at https://drive.google.com/file/d/1j8JkphjBrTrx-WrQPNbO8VlC2wIE8uSK/view?usp=sharing. This comes from the google docs doc: https://docs.google.com/document/d/1fSxGKq5aD7BMOUGJwH0_jkvVi6w1doRXldDIgJISSWQ/edit?usp=sharing. This is an uglier version:
[[UPDATE: The pdf and docs versions above have been fixed [see comment below]. This hasn't been, may never be.]]

The Bounded Lattice Type System
The Bounded Lattice Type System (BLTS) is at the core of the programming language I'm (slowly) working on. This is an attempt to extract a description of just the type system.
The whole thing fits together like a jigsaw (or sudoku) puzzle. You might think that you like some bits and not others but I strongly suspect that it is not possible to change the core ideas without it falling apart. It took years to get to this current point, and it was only well after the last fix that I noticed that there were no more nasty interactions (that I know about).
In the types of Type Theory, elements can only have one type. In BLTS a number can be an integer, and a rational and a Gaussian (complex) integer. An animal can be a dog, and a mammal, and a quadruped.
Procedures in BLTS are not normally total. If they can't return the required result they fail, and this often leads to alternative action. This is informational failure, which is the only sort under consideration in this document.

The Lattice

A bounded lattice is a partial order where every set of elements (including the empty set) has a least upper bound (join) and a greatest lower bound (meet). In BLTS the types form a partial order with the isA relation. So, using the examples mentioned, we have: Integer isA Rational; Integer isA ComplexInteger; Dog isA Mammal; Dog isA Quadruped. The isA relationships always come with a pair of procedures converting up and down. The up procedure always succeeds (for parameters of the right type), but the down procedure naturally fails when the value is not of the lower type.[1]
Types are characterized by their properties. Properties are monotonically decreasing. If Rational has the property "denominator" then Integer must have that property as well, and they must agree. The meaning of agreement is covered below.
The join and meet in BLTS are Union and Intersection. Union of the empty set (of types) is the type Empty which has no members (is uninhabited). Intersection of the empty set is the type Any (Union of all types). Union(X,Y) has values which can be either an X, or a Y, or both if they fall in the Intersection. So given xy:Union(X,Y), one can ask if it's an X with xy:X, or if it's a Y with xy:Y. Proponents of parametricity will say that these are the only properties there should be. However we'll see that that is not the case in general in BLTS.
To find Intersection(X,Y), find the set of types directly (immediately) below both, and take the Union of that set. The asymmetry between the definitions of Union and Intersection is, I think, illusory, and I believe these rules give the free bounded lattice that contains the partial order generated by the transitive closure of the isA relationships. Confirmation or disproof of this would be welcome.

Combining

This is going to seem like a strange detour, but actually it is a key solution to various problems.
BLTS utilises a procedure, caseP, which performs the function of a case/switch statement. It is curried. The first parameter is a set of procedures (actually a bag, but repeats won't change the result). This is like the arms of a case statement in other languages, except that the match tests have moved inside the procedures. The result of caseP is a procedure which takes a parameter, and passes that parameter to all of the procedures in the provided set. The common expectation is that one branch will succeed, all the others will fail, and the final result is the result returned by that successful procedure. However if two (or more) procedures return the same result then that is ok as well.
But what if the results returned are of a type that does not support equality (does not conform to Distinguishable). In this case we allow a weaker condition: Combinable. If a type supports Combinable then there is a property 'combine' which allows 2 or more of that type to be combined. In particular, procedures are Combinable. For types that support equality the combination is the value if all are equal, else it fails.
For procedures the combine procedure takes a set (bag) of the procedures to be combined and passes the parameter for the combined procedure to each of the procedures being combined. If the results are Combinable (includes Distinguishable) then the result is the combine of the results of the successful procedures. Yes that's right: the combine for procedures is precisely caseP.
That was a surprising result, but then caseP started turning up all over the place.

Properties

Properties are not duck typing. All properties come from Behaviours (spelt with or without the 'u') which are collections of properties and the laws they must obey. One aspect of agreement between properties is that they come from the same Behaviour.[2]
Suppose we have the property 'add' from the Behaviour 'Monoid', and both Integer and Rational conform to Monoid. The isA relationship has the effect that the two 'add' implementations are combined  for both types. If the parameter to add is a pair of Integers then both of the combined original procedures succeed, and the results are checked to be equal. If the pair contains a non-Integer Rational then the 'add' property of Integer will fail, and the result will come from the 'add' property of Rational just as before. So the only effect of combining them is to check the compatibility. One would naturally hope to optimize this away most of the time.
We note that properties must be of a Combinable type. This is not expected to be an onerous restriction. Combinable is inherited in all standard types, so that, for example, a Tuple type whose component types are all Combinable will be Combinable.

Diamonds

Suppose we have defined a ComplexRational type giving the following diamond problem:
As we go up our partial order, our transitive closure generates the obvious induced up and down procedures. But what happens when we have a diamond as in the diagram. We require that both upward routes give the same induced up and down procedures. I'm sure you're ahead of me. Naturally we combine them, which ensures that when both paths succeed they give a compatible result.

Properties of Unions

Consider Union(Rational,ComplexInteger). If no types have been declared above Rational and ComplexInteger then the 'add' property is not defined for the Union. But now suppose we have defined ComplexRational with the obvious isA relationships, as above. For Union to give the join we must have:
The Union must be below ComplexRational. We construct the up and down arrows in an obvious way with combine[3]. But ComplexRational supports the add property, so everything below must also, including the Union. However it just inherits ComplexRational's add, and doesn't make the Union itself a monoid (which is good because it isn't).
To understand the value of this, consider the list [7,8/3,2+5i]. It is naturally a list of the Union of the types which is Union(Rational,ComplexInteger). If we sum the list we want to get a ComplexRational if that is defined.
Why don't we just say that ComplexRational is the Union of Rational and ComplexInteger? The answer is that there might be multiple types above both Rational and ComplexInteger. Let's add the type Cat with Cat isA Mammal and Cat isA Quadruped, and consider the list [cheshireCat, snoopy, pluto]. If forced to make a premature decision it would be a List( Mammal) or a List( Quadruped). But by making it a List( Union( Dog, Cat)) it isA either of these if that is required.
Which brings us to another interesting point. What is the type of the empty list, []. Since the type of an explicit list is the Union of the set of types in the list, and since the Union of the empty list is the type Empty, we see that [] has type List(Empty). And since Empty isA X for all types X, we have (with reasonable assumptions about active declarations) that List(Empty) isA List(X). Which is exactly what you want to be able to use the empty list in any appropriate situation. I prefer this to the way the empty list is handled in other languages.

No home types

If x is both an Integer and a Rational, there is no sense in which we can ask "Is x really a Rational with denominator 1, or is it really an Integer". If this were not so then the programmer could encode hidden variables in that type information which would be most undesirable.

Out of Scope

Tuples (product types) are also used heavily in my language. Naturally the 0-Tuple is Unit. There is much more, but it is not closely connected to the lattice nature of the type system and the role of combine. So I should stop here for today. But I can't resist one more diversion.

Implementations of Types

All types are either primitive or implemented in terms of some other type. So the full Nat type of natural numbers might be implemented using List(UInt64) where UInt64 is a primitive type of 64 bit unsigned integers provided by the target hardware, and List types might be implemented using various possible primitive interfaces to memory. This "implemented by" relationship has absolutely nothing to do with the isA relation.
Primitive types could arise in various ways. One that mathematicians seem strangely fond of is the type of hereditary (pure) sets which in ZF(C) set theory are sets whose only members (if any) are other hereditary sets. It is possible to implement a model of the natural numbers with such sets. But this is an "implemented by" situation. There should be no suggestion that the natural numbers are such sets. Rather the natural numbers are defined by their properties: zero and successor, and laws governing those. To know that you've implemented the natural numbers you need to define the properties and prove the laws of natural numbers using the properties and laws of the type with which they are implemented.

[1] These procedures don't change the types of any values, and seemingly don't do anything other than failing at times in a downward direction. In the practice of an actual programming language there is a meaningful interaction with the implementations of the types.
[2] A type can conform to a Behaviour in more than one way. The different ways are indexed by some type that supports equality. Properties are thus specified with a Behaviour, an index and a property name. We will ignore this subtlety.
[3] The up arrow is the combine of (1) down to Rational and up to ComplexRational, with (2) down to ComplexInteger and up to ComplexRational. We see that this must succeed one way or the other (or both if it is an Integer). Down works the other way and will fail iff both initial down movements fail.