Arrays, Final(ly)

I think the final piece of the array puzzle has finally fallen into place. The resolution I reached when I last wrote about arrays (on whether all data Models have array capability or arrays are a distinct data Model) still stands.

The final piece involves a slight syntax change and — much more importantly — the full rationale for the array syntax. Not really having one was a bother; it made arrays seem arbitrary and patched on. The rationale brings them more fully into the BOOL fold.

Let me start with the syntax change. The new version looks like this:

*array <*model-name> [dimension(s)] array-name

The syntax change involves two things. First, the array’s Model declaration and its dimension declaration swap places: the Model, previously the third item, now comes second, and the dimensions, previously second, now come third. Second, angle brackets (< >) now surround the array’s Model. The angle brackets declare a binding to whatever they enclose.

An actual array definition might look like this:

*array <*int> [10] numbers

Which creates an array, named numbers, consisting of ten (uninitialized) *int Objects. A two-dimensional array might look like this:

*array <*any> [8,8] board

Which creates an 8×8 matrix of Objects of any type. (The *any Model plays the same role here as in Action input parameters: it says “any Object can be bound here!”)

The array above can be declared without the Model binding:

*array [8,8] board

Without a binding, the array is created as if there were a binding to the *any Model. The two definitions above are effectively identical.

The array dimension declaration is optional if, and only if, the array dimensions are provided by an initializer List:

*array <*int> numbers = 1,2,3,4,5,6,7,8,9,10

The above is very similar to the first example in that both create an array, named numbers, which has ten *int Objects. But the first one has uninitialized *int Objects, whereas in the one directly above each Object has an initialized value.

However, the following is not allowed:

*array numbers = 1,2,3,4,5,6,7,8,9,10

Even with an initializer List, the compiler can’t be sure what Object Model was desired for the members of that List. The *any Model is virtual; it’s not possible to create Object Instances of it. Even if the List values had clear types, the syntax is not allowed. (The compiler shouldn’t have to try to figure out the List types in the first place, and if it did it would also then have to ensure they were all the same type. Can of worms best left sealed.)
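To make those rules concrete, here is a rough sketch in Python (not BOOL, and not how any real BOOL compiler works) of the checks a declaration has to pass. The function declare_array, the BoolSyntaxError class, and the dictionary it returns are all hypothetical names made up for illustration. The case where both dimensions and an initializer List are given isn't covered above, so the sketch simply keeps the declared dimensions.

```python
class BoolSyntaxError(Exception):
    """Hypothetical error type standing in for a BOOL compile-time error."""


ANY = "*any"  # stand-in for BOOL's virtual *any Model


def declare_array(name, model=None, dims=None, init=None):
    """Model one array declaration; return a plain dict describing it."""
    if init is not None and model is None:
        # *array numbers = 1,2,3  is rejected: with no binding the members
        # would be *any, and *any cannot be instantiated.
        raise BoolSyntaxError("an initializer List needs a Model binding")
    if dims is None and init is None:
        # Dimensions are optional only when an initializer List supplies them.
        raise BoolSyntaxError("dimensions are required without an initializer List")
    if dims is None:
        dims = [len(init)]  # one dimension, sized by the List
    return {
        "name": name,
        "model": model if model is not None else ANY,  # no binding means *any
        "dims": list(dims),
        "values": list(init) if init is not None else None,  # None = uninitialized
    }


# The declarations from the examples above:
numbers = declare_array("numbers", model="*int", dims=[10])
board = declare_array("board", dims=[8, 8])
filled = declare_array("numbers", model="*int", init=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
```

The point of the sketch is only that the three rules are independent checks: a missing binding defaults to *any, missing dimensions need an initializer List, and an initializer List needs an explicit binding.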

The angle brackets (< >) indicate a binding, and that binding is part of whatever is being defined. In the case of an array, they bind to the *array Object Instance and become a part of that Instance. The array “knows about” the Model of its members.

A Model definition that sub-classes another Model also uses a binding:

**2dpoint
.   *int x = 0
.   *int y = 0

**3dpoint <*2dpoint>
.   *int z = 0

In this case, the Model, *2dpoint, binds to, and becomes part of, the new Model, *3dpoint. Object Instances of the new Model know about the bound one, which creates the “is-a” relationship.
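As a loose illustration of how a bound Model could produce the “is-a” relationship, here is a small Python sketch. The Model class, its bound attribute, and the is_a and members methods are hypothetical, and the idea that members are inherited through the binding is my assumption about what sub-classing implies rather than something spelled out above.

```python
class Model:
    """Hypothetical stand-in for a BOOL Model definition."""

    def __init__(self, name, members, bound=None):
        self.name = name
        self.own_members = dict(members)  # members declared in this definition
        self.bound = bound                # the Model named in angle brackets, if any

    def is_a(self, other):
        """Walk the chain of bindings looking for `other`."""
        model = self
        while model is not None:
            if model is other:
                return True
            model = model.bound
        return False

    def members(self):
        """Own members plus those reached through the binding (assumed)."""
        inherited = self.bound.members() if self.bound is not None else {}
        return {**inherited, **self.own_members}


point2d = Model("*2dpoint", {"x": 0, "y": 0})
point3d = Model("*3dpoint", {"z": 0}, bound=point2d)  # **3dpoint <*2dpoint>
```

Here point3d.is_a(point2d) holds because the binding is stored on the new Model, while the reverse check fails because *2dpoint knows nothing about *3dpoint.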

Model Actions defined outside a Model definition also make use of a Model binding to link the Action to its Model:

@@swap:: <*2dpoint>
>>  *2dpoint that
<<  !
.   *2dpoint temp = !
.   set:! that
.   set:that temp

This creates a new Action (method) for the *2dpoint Model or overrides any existing one (if the Model allows overrides). The Action implements the (possibly new for this Model) swap: Message.

The same rationale applies in all cases. The angle brackets create a binding from the object being defined to the object enclosed in brackets. What the defined object does with this binding depends on the object. Obviously, a binding only applies to definitions that know what to do with one. An unexpected binding is a syntax error (assuming situations that allow bindings are syntactically different — if they turn out not to be, then it’ll have to be a run-time error).
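One way to picture “what the defined object does with this binding depends on the object” is a lookup from the kind of definition to the role its binding plays. The Python below is only a sketch under that assumption; BINDING_ROLES, define, and BindingError are hypothetical names, and the sketch doesn't take a side on whether the error is raised at compile time or run time.

```python
class BindingError(Exception):
    """Hypothetical error for a binding on a definition that can't use one."""


# The three kinds of definition shown above, and what each does with a binding.
BINDING_ROLES = {
    "array": "Model of the array's members",     # *array <*int> [10] numbers
    "model": "parent Model (is-a)",              # **3dpoint <*2dpoint>
    "action": "Model the Action is attached to", # @@swap:: <*2dpoint>
}


def define(kind, name, binding=None):
    """Record a definition, rejecting a binding the kind doesn't expect."""
    if binding is not None and kind not in BINDING_ROLES:
        raise BindingError(f"{kind} definition '{name}' does not accept a binding")
    role = BINDING_ROLES[kind] if binding is not None else None
    return {"kind": kind, "name": name, "binding": binding, "role": role}


array_def = define("array", "numbers", binding="*int")
plain_def = define("object", "max")  # no binding, so nothing to check
```

Calling define("object", "max", binding="*int") would raise BindingError, which corresponds to the unexpected-binding case.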

The rationale of the array dimension declaration has always bothered me, mostly because the obvious syntax, the square brackets, means “label” in BOOL. So one either has to equate “label” with “dimension” in some rational fashion, or one has to find a different syntax for one or the other.

I like the label syntax, so I really want to keep it. But nothing else really works with arrays. The square bracket syntax is so deeply ingrained in so many languages. The thing is, BOOL labels apply to the following object (BOOL has strong pre-fix sensibility).

In the original syntax, the dimension declaration preceded the array Model, and one could make the rationale that the label indicated to the Model that it should make that many copies. (Not my favorite rationale.)

Now the label precedes the array name, and the rationale is that it attaches to the defined Object and provides its dimensions. That’s kind of exactly how labels should work, so the rationale fits well with the overall BOOL rationale. (And now I can stop using the word, “rationale”!)

The final note is that, of course, array dimensions can be provided by an object, rather than by a literal:

*int max = 40
*array <*string> [max] buffer

I like the syntax! It feels good to me, and so arrays finally are fully specified!

