Monday, June 17, 2024

The new operator system

Syntax

The new operator syntax takes its lead from another language (Agda?). However, that language's system is less comprehensive, so my solution is messier.

Instead of a separate way of introducing operators, we just allow a special sort of identifier.

Wombat2 operators are identified by the sequence of mandatory suboperators (such as +, or if-then-else), plus identifying whether an operand is required or absent in each slot (after each, plus one to the left of the first suboperator). We also need the left and right precedence at the start and end, which are decimal numbers greater than 0.

To simplify matters I'm removing subsuboperators and optional suboperators. Optional suboperators can be handled by just having multiple similar operators, though there is the danger of combinatorial explosion. These capabilities can be added to the scheme later if desirable. We do still need repeating suboperators.

Firstly we make the rule that a suboperator can't include any digits.

Now we construct our identifier as follows:

  • We have an initial backquote (`). This means two backquotes on first use, but first-use backquotes are not required at top (file) level where operators are likely to be defined.
  • We then have alternating numbers and suboperators, starting and ending with numbers.
  • A zero in a number spot means that no operand is present there. A non-zero number means that there is an operand required there. 1 should be used on internal operand positions.
  • The left and right numbers are precedence numbers. If a decimal dot is included it must have digits on both sides.
  • A repeating subop is indicated by following it with \*. This is zero or more times, so if you want one or more times then include the first instance separately. Note that if a repeating subop is the last subop then both it and the last mandatory subop need right precedence numbers.
  • \S is the whitespace suboperator and \J is the juxtaposition suboperator. \\ is a single \.
So we can just write:
  • `5+5.1 = myAddProcedure # greater right precedence means left associative
  • `0if1then1else3 = myIfProcedure
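
As a rough illustration of the identifier rules above, here is a minimal Python sketch that splits an operator identifier into its suboperators, operand slots and precedences. The function name parse_operator and the shape of the result are my own assumptions, not part of Wombat2, and the sketch deliberately ignores repeating suboperators and the \S, \J and \\ escapes.

```python
import re
from decimal import Decimal

# A number is digits, optionally followed by a dot with digits on both sides.
NUMBER = re.compile(r"[0-9]+(?:\.[0-9]+)?")
# A token is either a number or a run of non-digits (a suboperator).
TOKEN = re.compile(r"[0-9]+(?:\.[0-9]+)?|[^0-9]+")


def parse_operator(ident):
    """Split an identifier such as `5+5.1 into its parts (hypothetical helper)."""
    if not ident.startswith("`"):
        raise ValueError("operator identifier must start with a backquote")
    tokens = TOKEN.findall(ident[1:])
    # Tokens alternate by construction, so it is enough to check both ends
    # are numbers and that there is at least one suboperator in between.
    if (len(tokens) < 3
            or not NUMBER.fullmatch(tokens[0])
            or not NUMBER.fullmatch(tokens[-1])):
        raise ValueError("expected number, suboperator, number, ...")
    numbers = [Decimal(t) for t in tokens[0::2]]
    return {
        "subops": tokens[1::2],
        # Zero means "no operand in this slot"; anything else means required.
        "operands": [n != 0 for n in numbers],
        "left_precedence": numbers[0],
        "right_precedence": numbers[-1],
    }


add = parse_operator("`5+5.1")
cond = parse_operator("`0if1then1else3")
```

Here add has the single suboperator + with operands on both sides and a right precedence greater than its left, while cond has no operand in its leftmost slot because that number is 0.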

Of course you aren't allowed to define new ones of these that are incompatible with an existing one, such as by having different precedence, or different repeating groups.

Translation

We prefer to associate operators with relevant Behaviours. Consider x+y. We convert x+y to Monoid.add(T).op(x,y). The question is how we work out the type T.

The answer is that we search downwards from the type of x and from the type of y, until we find a pair of types such that there is a type above both that conforms to Monoid.add.

Firstly we'll suppose that our types are: Integer, Rational and ComplexInteger (i.e. Gaussian Integer), with Integer below both of the others, and all conforming to Monoid.add. Suppose we try to add a Rational and a ComplexInteger. There is no type that is above both so we have to move down. There are two solutions. If we move Rational down to Integer then ComplexInteger is above (or equal to) both. If we move ComplexInteger down to Integer then Rational is above both. So we have two possible operations and either or both will work if one of x and y can coerce to an Integer (either because it is Rational with denominator 1, or ComplexInteger with imaginary part 0). So we combine the operations, giving a result that is a Union of Rational and ComplexInteger, and which fails if neither wants to be an Integer.

Now suppose there is a type ComplexRational that is above both Rational and ComplexInteger. Then we don't need to go down at all. So we use Monoid.add(ComplexRational).op. This coerces both x and y up to ComplexRational and the addition happens there and always succeeds.
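
The search in the two cases above can be sketched in Python. This is only my reading of "search downwards": I assume we try the fewest total steps down first and stop at the first distance where some pair of types has a conforming type above (or equal to) both. The names up_set, down_levels and find_targets, and the dictionary encoding of the type ordering, are hypothetical and not how Wombat2 represents types.

```python
from itertools import product


def up_set(t, above):
    """All types above or equal to t; `above` maps a type to its immediate supertypes."""
    seen, stack = {t}, [t]
    while stack:
        for s in above.get(stack.pop(), ()):
            if s not in seen:
                seen.add(s)
                stack.append(s)
    return seen


def down_levels(t, above):
    """Yield the sets of types reached after 0, 1, 2, ... steps down from t."""
    below = {}
    for sub, sups in above.items():
        for s in sups:
            below.setdefault(s, set()).add(sub)
    level = {t}
    while level:
        yield level
        level = {b for u in level for b in below.get(u, ())}


def find_targets(tx, ty, above, conforming):
    """Candidate types T for Monoid.add(T), preferring the fewest steps down."""
    xs, ys = list(down_levels(tx, above)), list(down_levels(ty, above))
    for total in range(len(xs) + len(ys) - 1):
        found = set()
        for i in range(min(total, len(xs) - 1) + 1):
            j = total - i
            if j >= len(ys):
                continue
            for a, b in product(xs[i], ys[j]):
                common = up_set(a, above) & up_set(b, above) & conforming
                # Keep only the lowest common types.
                found |= {c for c in common
                          if not any(d != c and c in up_set(d, above) for d in common)}
        if found:
            return found
    return set()


# First case: Integer is below Rational and ComplexInteger, nothing above them.
above1 = {"Integer": {"Rational", "ComplexInteger"}}
conf1 = {"Integer", "Rational", "ComplexInteger"}
two_way = find_targets("Rational", "ComplexInteger", above1, conf1)

# Second case: ComplexRational sits above both Rational and ComplexInteger.
above2 = dict(above1, Rational={"ComplexRational"}, ComplexInteger={"ComplexRational"})
one_way = find_targets("Rational", "ComplexInteger", above2, conf1 | {"ComplexRational"})
```

In the first case two_way contains both Rational and ComplexInteger, matching the two operations that get combined into a Union. In the second case one_way contains only ComplexRational, because no step down is needed.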

The plan is to do sensible things that are close to the way run-time typing works in dynamically typed languages.

Postscript

Note that Monoid.add means the same as Monoid(.add). Since .add is an atom, this means Monoid is a Proc(Atom,Behaviour). This lets us distinguish different ways that a type can be a Monoid. Presumably there will be a Monoid.mul and a Monoid.max and others.