Draft

Robustness in Push (and Klapaucius)

Draft of 2018.10.10

May include: GPPush&c.

For most of a year, I’ve been wrestling with a design “problem” in the Push language specification and implementation I’ve been calling Klapaucius.

If you’re just tuning in for the first time: Push is a language designed as a substrate for genetic programming. It’s extremely simple to describe the interpreter dynamics, and therefore very easy to implement as an interpreter in some other (human-written) language. Several of the features that its creators made central are intended specifically to permit Push code containing any recognized tokens in any order to be executed successfully. Further, the core behavior of the interpreter is so simple, even a novice programmer can make a lot of headway in building a new interpreter from scratch.

The implicit purpose of this syntactic and semantic robustness in the Push language’s behavior is to foster as many possible partial solutions to coding tasks as possible, during automated searches for novel code—especially those that productively use “interesting” Computer Science structures like looping, recursion, type systems and so on. Anybody who’s ever written code by hand will balk at the notion of a language that executes regardless of token order, as they think about how brittle their familiar human-coding languages are in comparison. Consider how well your Ruby or Clojure or Haskell or Javascript code would be if you shuffled the positions of all the names and operators.

But Push (and languages like it) aren’t made for the clarity of a human programmer’s thoughts on the domain, they’re made to increase the reachability of many possible approaches. So for example, in Push—as originally defined in the Push3 standard—we can execute code like this, where x and y are variables containing some kind of integer numerical value (maybe?)

(8 y :integer-dup x :integer-lessthan? :integer-negative? :boolean-and)

…and what will happen is (1) easy to describe, and (2) easy to work out by hand. Let me walk through the steps of executing this code, starting from a Push interpreter where I’ve already “loaded” the variables x and y with their values of 11 and -3, and “loaded” the program onto the :exec stack (further details will become apparent by example, I hope):


:x       :: 11
:y       :: -3
:exec    :: (8 y :integer-dup x :integer-lessthan? :integer-negative? :boolean-and)
:integer :: ()
:boolean :: ()

:x       :: 11
:y       :: -3
:exec    :: (y :integer-dup x :integer-lessthan? :integer-negative? :boolean-and)
:integer :: (8)
:boolean :: ()

:x       :: 11
:y       :: -3
:exec    :: (:integer-dup x :integer-lessthan? :integer-negative? :boolean-and)
:integer :: (-3 8)
:boolean :: ()

:x       :: 11
:y       :: -3
:exec    :: (x :integer-lessthan? :integer-negative? :boolean-and)
:integer :: (-3 -3 8)
:boolean :: ()

:x       :: 11
:y       :: -3
:exec    :: (:integer-lessthan? :integer-negative? :boolean-and)
:integer :: (11 -3 -3 8)
:boolean :: ()

:x       :: 11
:y       :: -3
:exec    :: (:integer-negative? :boolean-and)
:integer :: (-3 8)
:boolean :: (true)

:x       :: 11
:y       :: -3
:exec    :: (:boolean-and)
:integer :: (8)
:boolean :: (true true)

:x       :: 11
:y       :: -3
:exec    :: ()
:integer :: (8)
:boolean :: (true)

One might say that this Push program “determines” whether (y < x) AND (y < 0), based on the observation the :boolean value left on top of that stack at the end of execution of code is the result of that particular computation. Note though that Push code is generally developed under selection, and that while in this case we “get” a true on the :boolean stack, we also get an 8 on the :integer stack. So while we may say it’s “calculating” (y < x) AND (y < 0), we should also remember it’s simultaneously “calculating” 8, which is the constant hard-coded in the original code.

There is no notion of “returning a value” in Push. Each step of code execution changes the state of one or more stacks and their contents (and possibly the interpreter state on some higher level). We “prepare” an interpreter by setting up some state, and we run code for some potentially arbitrarily number of steps—by convention, until the :exec stack is empty, or until we get worried it will never empty (for instance if we’re in an long recursive state)—and then we look at some or all of what’s happened in the meantime.

We could also say that, if x is set to some other kind of value (a floating-point number, or a boolean bit, or a string, or nothing at all), other stuff will happen:


:x       :: "foo"
:y       :: -3
:exec    :: (8 y :integer-dup x :integer-lessthan? :integer-negative? :boolean-and)
:integer :: ()
:boolean :: ()

:x       :: "foo"
:y       :: -3
:exec    :: (y :integer-dup x :integer-lessthan? :integer-negative? :boolean-and)
:integer :: (8)
:boolean :: ()

:x       :: "foo"
:y       :: -3
:exec    :: (:integer-dup x :integer-lessthan? :integer-negative? :boolean-and)
:integer :: (-3 8)
:boolean :: ()

:x       :: "foo"
:y       :: -3
:exec    :: (x :integer-lessthan? :integer-negative? :boolean-and)
:integer :: (-3 -3 8)
:boolean :: ()

:x       :: "foo"
:y       :: -3
:exec    :: (:integer-lessthan? :integer-negative? :boolean-and)
:integer :: (-3 -3 8)
:boolean :: ()
:string  :: ("foo")

:x       :: "foo"
:y       :: -3
:exec    :: (:integer-negative? :boolean-and)
:integer :: (8)
:boolean :: (false)
:string  :: ("foo")

:x       :: "foo"
:y       :: -3
:exec    :: (:boolean-and)
:integer :: ()
:boolean :: (false false)
:string  :: ("foo")

:x       :: "foo"
:y       :: -3
:exec    :: ()
:integer :: ()
:boolean :: (false)
:string  :: ("foo")

Notice

  1. we get a new “result” on the :boolean stack, which is technically the result of calculating (y < y) AND (8 < 0) (neither of which will ever be true),
  2. we end up with no number on the :integer stack,
  3. we have a place for the :string to go, when it’s encountered, or we make a new one.

Further, Push fails silently if the arguments specified for any instruction token are not available in the locations (the stacks) defined internally for that instruction. An instruction like :integer-lessthan? is hard-coded in its definition such that it “wants” to take two argument items from the :integer stack, and will push its result (if the arguments were there) onto the :boolean stack. Consider this one:


:x       :: "foo"
:y       :: "bar"
:exec    :: (8 y :integer-dup x :integer-lessthan? :integer-negative? :boolean-and)
:integer :: ()
:boolean :: ()

:x       :: "foo"
:y       :: "bar"
:exec    :: (y :integer-dup x :integer-lessthan? :integer-negative? :boolean-and)
:integer :: (8)
:boolean :: ()

:x       :: "foo"
:y       :: "bar"
:exec    :: (:integer-dup x :integer-lessthan? :integer-negative? :boolean-and)
:integer :: (8)
:boolean :: ()
:string  :: ("bar")

:x       :: "foo"
:y       :: "bar"
:exec    :: (x :integer-lessthan? :integer-negative? :boolean-and)
:integer :: (8 8)
:boolean :: ()
:string  :: ("bar")

:x       :: "foo"
:y       :: "bar"
:exec    :: (:integer-lessthan? :integer-negative? :boolean-and)
:integer :: (8 8)
:boolean :: ()
:string  :: ("foo" "bar")

:x       :: "foo"
:y       :: "bar"
:exec    :: (:integer-negative? :boolean-and)
:integer :: ()
:boolean :: (false)
:string  :: ("foo" "bar")

:x       :: "foo"
:y       :: "bar"
:exec    :: (:boolean-and)
:integer :: ()
:boolean :: (false)
:string  :: ("foo" "bar")

:x       :: "foo"
:y       :: "bar"
:exec    :: ()
:integer :: ()
:boolean :: (false)
:string  :: ("foo" "bar")

Notice that in the last two steps of execution, where we try to execute :integer-negative? and :boolean-and, we are missing one or both arguments. So we just ignore the shortage, and move on.

Push instructions (1) know where (which other stacks, by name) to get their instructions, and (2) fail silently at runtime if all of those items are not forthcoming, without consuming arguments.

This weird behavior—“weird” in terms of more familiar programming languages—gives Push a remarkable ability. Notice that all of the following programs also execute, and leave various things on top of various stacks when the :exec stack is empty:

  • (8 :integer-dup :integer-lessthan? :integer-negative? :boolean-and x y)
  • (:integer-dup :integer-lessthan? :integer-negative? :boolean-and x y 8)
  • (:integer-dup :integer-negative? :boolean-and x y 8 :integer-lessthan?)
  • (y :integer-negative? x 8 :integer-lessthan? :boolean-and :integer-dup)
  • and so on

Each one does (or silently fails to do) slightly different stuff, in slightly different order. The interpreter is able to “adapt” to what would be syntax errors or semantic crashes in other languages.

This is crucial to the Push language’s usefulness as a substrate for genetic programming, because genetic programming is by definition messing around with the order and contents of programs and hoping they do something. As long as something happens, and we can assign a relative score to that something as if it were a partial—although quite possibly “stupid”—solution to the task of interest, we can make progress. We can change tokens in these Push programs to arbitrary new tokens; we can shuffle multiple programs together or concatenate them into a bigger one, we can halt execution somewhere mid-way and still expect something to have happened, and so on.

Which is one reason I am willing to say: Push is robust.

Certainly compared to other programming languages. Imagine taking two small Javascript or Java or Haskell or Lisp programs, each of which works, and then shuffling the instructions together like a deck of cards. How does that make you feel? How does it make you feel to say, aloud to yourself right now, “Yeah of course that will run!”

Every Push program will run. That’s the point.

Benefits at cost: type handling by fiat

The way Push interpreters tend to solve this need for robustness can only partially be ascribed to the “fail silently if the arguments are missing” thing. Indeed, that’s not especially helpful. Perhaps the primary way Push does the heavy lifting is the way Push handles “types” as if they were “stacks”.

(Scare quotes intentional.)

You’ll notice that in the examples above, I put integer-valued things like 8 and the value of x onto the stack called :integer, and I put the strings "foo" and "bar" onto a stack called :string, and true and false onto a stack called :boolean.

One of the core assumptions in Push interpreters is that there is some kind of global “route handler”, which is type-sensitive. That is, whenever the interpreter pops a token off the :exec stack that is recognized as an integer literal, it will be pushed to the :integer stack. Whenever it’s a boolean literal, it will be pushed to :boolean, and so forth.

If you watched those execution traces carefully, above, you’ll recall the places where the literal 8 was popped from the :exec stack, and pushed onto the :integer stack. You’ll also recall where x was popped, interpreted as the bound value, and then that value was pushed to the appropriate stack—for example, when x was 11, the 11 went to :integer, and when x was "foo", the "foo" went to :string.

That last one might make you squint a little.

But notice that there was other routing going on, and that it was a very different sort. When the instruction :integer-lessthan? was executed, you probably noticed that it popped (or “wanted” to pop) two integer arguments from :integer, and then pushed a :boolean result. What’s up with that?

That’s a very good question.

It turns out that by convention Push interpreters have been written—for more than fifteen years—so that each instruction is responsible internally for its own self-defined “routing”. That is, the :integer-add instruction is written so that it “takes” two :integer items and pushes a new one; the :code-swap instruction is written so that it takes two :code items and pushes them back in reverse order; the :float-to-integer instruction is written so that it “takes” a :float item and creates a new :integer item.

That is, every instruction contains a separate router for its own use.

And did you notice anything else odd in my explanation just now? I said, “…it ‘takes’ two :integer items and pushes a new one,” not “integers”. I very carefully said that :float-to-integer “takes a :float item”—meaning it goes to the stack named :float and obtains an item there.

And in general, including my own code in most instances, there is no real type-checking whatsoever in most Push interpreters you can download today. Where things come from as arguments, and where things go as results? That’s down to convention only. Interpreter authors are very careful to avoid runtime type errors in the code executed within an instruction, but to be honest there is just no capacity in the definition of Push itself which obliges all the numbers on the stack named “:integer” to actually be integers.

I don’t believe there’s any centralized facility for the arguments or results of any Push instruction to be type-checked. This first started to concern me several years ago. In the Klapaucius interpreter I was working on back then, I was implementing a square root function. As you know, the square root of a negative number is a complex number, not a real. But in the language I was using to write the interpreter at the time, the built-in square root function was perfectly happy to produce a complex result when handed a negative argument. And because the code for my square root Push instruction would happily push “the result” onto the stack named :integer without checking whether it was in fact a real integer, I would eventually end up with a complex value on a stack that all the other instructions assumed would only include real integer values.

So the instructions that used integers for indexing collections would find a complex number, and squawk. Or the instructions for iterating would try to step from towards , and squawk. Or worse: the instructions that subtracted or multipled :integer values, which were also written internally so they never bothered to check the type of the arguments, would be fine, and so the complex values would propagate through the stack of what would nominally be :integer items, and far, far down the road some other instruction remote in time and space would pick one up, and squawk.

Where “squawk” means “raise a runtime exception”.

Did you notice how I was saying how intentional Push’s original design was, in its efforts to avoid runtime errors? It turns out that we were ignoring something important by squashing and redistributing it, somewhat unquestioningly, down into all these hundreds of separate decentralized accidents waiting to happen at runtime for some other reason.

There’s a problem here besides the obvious, “It’s very tricky to avoid these runtime errors for edge cases”. A usability problem.

Push is also intended to be very extensible. That is, because we’re trying to help people discover patterns and algorithms in domain-specific fields of interest, we want to empower them to add new “types” to the Push core. If you want to explore a financial time series domain, it should be easy for you to understand how to make numbers into time-series, and vice versa. It should be easy for you to implement time-series-specific versions of instructions Push provides for all its core types: the combinators like swap and dup, the introspection instructions that count the number of items on stacks, the type-conversion instructions that construct and deconstruct your new thing from its components.

But as Push ontologies get “too big”, those of us who have been working in the language for many years start to get squeamish. Sure, you can have an ordered collection of integers, but when you add your time-series will you also want an ordered collection of time-series items? Oh wait, do you mean your time-series objects might not only have integer values, that it might have null values sometimes, or maybe strings? Oh wait, do you mean that the values associated with each time point might be a collection itself? Is it a collection of things all the same type, or do you want to allow anything to be in any collection? What about empty collections? If I break down a time-series object into individual values, where should I put those values?

Or (this is an old problem): We have scalar numbers, and there’s a stack for those (and maybe two). We also provide a built-in type for vectors of numbers. We also have vectors of booleans, and vectors of strings, and vectors of characters (which I guess are strings). But you want to implement a :matrix type in Push, to do some linear algebra stuff. So is a :matrix just a vector-of-vectors-of-numbers-all-the-same-length? You can construct a :matrix out of :vector-of-numbers items, but what will you do when they’re not all the same length? What will you do when you try to invert a :matrix, and that produces complex values? What will you do when your :matrix-multiply instruction has to find argument matrices of the correct size in a stack that doesn’t permit filtering?

There needs to be something automatic watching over all these robustly-generated structures and saying what they are and where they go. Not just the fallible, easily-confused human author of this crucial new code library.

My example of matrices is salient. There’s one more crucial place where Push stumbles over robustness, and it’s in the nature of the data structures it provides, and how they interact with other types. The items I’ve shown in the sketched execution above are all obviously “atomic” in some familiar sense. 8 is a single integer, and false is a single boolean value. But it rapidly becomes necessary in most domains to use more structured data: ordered collections of things, and key-indexed collections, and sets, and maybe mixtures and nested versions of all those.

And as soon as you have a collection of “stuff”, you quickly realize you need to manage what kinds of stuff is in that vector. A geometric point [8 false] is hard to draw a pixel in, and an entry in a time series of price information that looks like ["10:11:12am" :green] is not going to plot well in your financial project.

Push programs are intentionally able to change things around all over the place. That’s the benefit, that’s where all the useful exploratory potential arises. But it’s a pain in the ass to try to draw the complicated line around what playful Push is allowed to do, and what it’s not.

Take indices, for example. If I say I want to read just the 88th item in a list of 100 items, both of us are comfortable in most programming languages. If I say I want the 88th item in a 8-item collection, then you should squint—but also realize that in Push we want robust behavior and so there should be some convention by which this is possible. If I say I want the 9.3rd item, what will you suggest? (It turns out many languages, Clojure for example, are fine with this, too).

How about the item under the key :green? Again, there are somewhat-exotic languages where this is permitted. But recall that we’re trying to give Push, as a core design principle, as much leeway as possible to discover and manipulate novel data structures and algorithms.

What we do, as authors of Push interpreters, when we say “:integer-add tries to get two arguments from the stack called :integer” us remarkably similar what I’m discussing here. Should you (or the interpreter running code) be “permitted” to store a string under a boolean key in an object that’s a :vector-of-floats?

We permit human-written instruction code in Push interpreters to do that kind of bullshit, without much more than a frown and a knowing smile about the mysterious runtime errors that will pop up some day.

What needs to change, I think, is something about the interpreter’s internal representations of data and type and structure, and what the Push interpreter itself is doing.

Tagspaces: approximate matching, via Spector, via Holland

Some time back—several years, as I recall—Lee Spector said he was chatting with John Holland, and John reminded Lee of one way that programming languages and natural systems solve problems of “indexing” very differently. In biological systems, “search” is localized, and approximate. That is, actual physical things in cells (and the growing tips of neurons in brains, and the ants from a colony) that are looking for something are located and moving in a real space of possibilities. The simplest and most productive (and robust) design pattern seems to be: knowing what you want, if you don’t see it where you are, go forward; when you get to it (or close enough) that what you see matches what you want, know you’ve found it.

In human-writable programming languages, we’re very used to the notion of “trivial lookups”: When we have an Array called a in Ruby, we can easily say a[2] or a[-2] and know we’re looking up the third (because of zero-based indexing) item from the start, or the second item from the end. When we have a Hash called h and ask for h[:bar], we know we’re going to get the actual contents of the item stored under the key :bar.

We never consider how those items are found in the array and the hash, we just expect smart computer people have optimized it for performance (whatever that meant to them at the time), and it will just work. We don’t consider the decisions made “inside” the programing language’s compiler or interpreter for that kind of direct access; it’s just the way we’ve learned to do it, as human programmmers.

But notice also that there’s an aspect of operant conditioning here. When we write a[2] or h[:bar] we’re asking for those particular indexed things because we, the human programmers, have a mental model of cause-and-effect in our minds which assumes that those locations occur, and that what we want in the moment is located there, and that what we get as a result will match our expectations. The context already exists. We, the human programmers looking things up in these structures, might be wrong about the labeling or the contents of those containers, and as a result we’ll maybe get a runtime error for accessing an array with an index out of bounds, or handling a nil result looking up a nonexistent key in a hash.

But this exactness in indexing deeply informs with the way we separate our coding practice into “ideation” and then “implementation” and then “error-handling”.

Think now of the automated algorithms we’re building Push and its descendants to discover. Those algorithms aren’t “written” in the same way a human writes code, and yet they tend to be constrained by that standard incremental approach to finding specific things in specific places. These automatically-generated “programs” are born whole in many cases—or piecewise in an arbitrary order—and (as I’ve harped on above) we’re taking pains to leave these new ways of discovering working code always do something.

Natural systems, as opposed to these abstract systems which tacitly serve human coders’ and designers’ habits of thought, are not thus constrained. And indeed, we find in writing these languages for GP that absolute indexing in collections eventually becomes brittle. It’s the point where many partial solutions fall down, because the difference between the item at location 1 and the item at location 2 (especially when no human is overseeing ontological purity) can be so arbitrary, so uncorrelated.

Because we want a some consistency with human-written code (for many reasons), we still find it useful to think in terms of “functions” and “instructions” manipulating particular kinds of arguments, to produce familiar kinds of results. We still say “this instruction adds two numbers to produce a sum”, and we also still say when we describe biochemical systems that “this reaction combines these two compounds to produce this product.” What John saw as he described dynamics in emergent and self-organized complex systems was that natural systems do not say “this reaction combines the compound in this designated index with the one in this other indexed location, and places the result in this named field,” but rather it uses a sort of search-driven approximate matching—approximate in space, in time, in accuracy—to “find” those components in the first place.

Lee’s insight was a very interesting data structure I like very much, and one I’m trying hard here to preserve and extend in new work. It was the notion of a tagspace.

Several versions of Push, including Klapaucius, provide a tagspace structure already. The core idea is relatively simple using numerical tags, though of course there are open design questions at edges. There are very interesting and very general definitions possible, but let me start with an extremely simple approach first, using a “slight” generalization of an indexed key-value store as a starting point.

First, let’s look at a “traditional” Hash or struct or Dictionary or whatever your favorite human-programmer’s language has for key-value maps. In many languages, we often use arbitrary items as keys, but here we’ll constrain the rules so that keys must be scalar numerical values. Further, we’ll maintain the collection in a state where the keys are sorted in numerical order.

Here’s an empty tagspace, using the most explicit “hash” notation I know, which is that from Ruby: {}.

As I said above, when we store an item in a tagspace, we use a numerical key. Don’t worry where the key comes from now, or what its associated value should be. I’ll use letters, but they can be anything.

I store the value a under key 93, getting an updated tagspace that looks something like {93 => a}. So far things work as you would expect.

Next, I’ll store a value b under key 222, producing an unremarkable result {93 => a, 222 => b}. You’re not too confused, I am sure. There was no key already matching 222 in the tagspace, so I made a new one and stored the value there.

One more. I said that the keys are maintained in sorted order, and so when I now store value c under the numerical key -7.2, you will be unsurprised to see the result as {-7.2 => c, 93 => a, 222 => b}. The smallest key, the middle key, the largest key, in that order.

And, to demonstrate how similar these can be to your standard data structures of note (at least in Klapaucius), let’s write the value d to the key 93. And hey! It works just as one might expect, producing {-7.2 => c, 93 => d, 222 => b}.

In the version of tagspace behavior I’m describing here, the behavior diverges from that of a “traditional” key-value map when we’re looking things up. It may help to imagine these three numerical keys as being spread out on a number line. Something like this, where I’ve dropped in a few other familiar numerical landmarks to set the context:


(-∞) ... (-7.2) ... (0) ... (93) ... (222) ... (+∞)

When items are stored in this kind of tagspace, they’re associated with these specific positions on a numerical continuum, so let me draw the keys as “landmarks” on the number line, and the associated values as sort of dangling off the number line where the keys have been placed:


(-∞) ... (-7.2) ... (0) ... (93) ... (222) ... (+∞)
            (c)              (d)       (b)

When I look up a key in this kind of tagspace, I use a numerical index, and return the item stored under the first key equal to, or larger than, that index. There are some rules about “wrapping around” that I will add below, but in the meantime, just watch:

  • the item indexed by -7.2 is (obviously) c
  • the item indexed by 12 is d, because 12 falls above landmark -7.2, and below the next fixed key-landmark at 93
  • the item indexed by 93.001 is b, because again we have “started” above the landmark at 93, and we “slide up”

Now, the item indexed by 223 is… slightly more complicated. You may already be intuiting that for reasons of robustness, when we look up items from index numbers higher than 222, we maybe want to return -7.2. That’s correct! We do in fact wrap the search around.

Note that this has an interesting side-effect: any lookup in a non-empty tagspace will always return a value.

Similarly, the item indexed by -8.2 “should” probably be c, stored under -7.2. Because we “land” in an “empty part” and slide up to the first defined landmark. That one is more obvious and intuitive, I hope.

But now an interesting Mangle of Practice dynamic kicks in: If you play for a while, you’ll notice that indices “landing” arbitrarily far outside the range of existing “landmarks” are a bit more complicated, because one wants a sense of “balance” among the landmarked items. First, notice that the subdivisions in this numeric range are in some sense proportional to the gaps between the landmarks, and if we imagine “arbitrary” (uniformly sampled random) indices landing in the span between the smallest and largest landmark values, the ones following the biggest gaps will be looked up most often. That’s cool and interesting, especially because it provides some extra behavior standard uniform key-value stores don’t offer!

But I labeled (-∞) and (+∞) on purpose. If we decided to only “slide up and wrap around”, we’d essentially be saying that—under conditions of uniform random sampling—the first landmarked item is infinitely more likely to be chosen than any other. Any number larger than 93 would slide the “cursor” up to (+∞) and back around to (-∞) and up to -7.2, and any number less than -7.2 would also reach that point. The other landmarks have finite “spans” before them, but the first one has all the excess space from both the top and the bottom of the line.

Your next impulse, as a programmer, might be to do something with modulo. That is, maybe you’d subtract to calculate the range between the top and bottom landmarks, and then take a query’s numerical value modulo that range. That does some work “balancing” out the “unfair” weight of the first landmark, but now we’ve made it infinitesimally unlikely to ever pick it! For any positive or negative index outside the range of the landmarks, taking the value modulo the range will force the index into [0,max_landmark]. Now, the only way we can ever pick the first item is if the index value is an exact integer multiple of the range itself, and that’s very unlikely to happen. We’ll almost always get a result slightly higher than 0….

As a result, in the Klapaucius implementation, I add a “balancing” term, which is super ad hoc but at least works for the tasks I was targeting. Suffice to say we give “some” space to the first landmark, depending on the average span allotted to all the others, in such a way that every landmark has a finite and comparable probability of being chosen by an arbitrary index value.

Despite the kludges, I have found these simple numerically-indexed tagspace structures to be much more robust under evolutionary search than traditional discretely-indexed ones (arrays and maps). Thinking always about arbitrary code, not human-written stuff fraught with habit and consistency, just imagine how often a random program would try to look up the 9e12th or -88th item from a list of five items. Imagine, for any kludge solution you can come up with to avoid runtime errors for standard array lookups with arbitrary numerical indices, that you’d still have to consider whether writing to the 88th item in an array “really” adds 87 empty slots beforehand, or what…. And if you want some time to scratch your head, imagine providing “standard” key-value map data structures, but think of how they’d “work” under conditions where arbitrary keys might arise. How likely does it sound that storing and then retrieving items from an arbitrary index of any type or value would occur?

A: Not very likely.

When we try to force human-writable data structures to behave robustly—in the senses I’ve outlined above—we must decide whether we prefer to be confused or wrong when our code tries to handle the tacit work we humans were expected to do.

This project surfaces the fact that we human programmers have been trained to use the strict syntax of familiar traditional data structures, and at the same time tacitly conditioned so that we are willing to do the mental work it takes to hold in our minds a model of what is “in” each location in a store, and what that thing’s “address” should be, and what should happen if we ask for a “bad address” (out of range, undefined, whatever). All of the traditional data structures demand we programmers act as a kind of “external temp storage” for important semantic and pragmatic information. And to be honest, somebody reading another person’s code has to do all that same work, and discover and store that extra information again while they try to understand what was intended….

Which should make you think of the problem of naming things.

In the context of automatically generating code, I find we’re much better off when we change the behavior of data structures to weed out the human baggage-carrying. I understand better than most that it can be tempting to impose constraints on what our search process can “say”, just to preserve the familiar and traditional behavior of data structures in the results. Here are some trivially simple FizzBuzz programs I evolved in Clojush, and I am still challenged to say what some of them are doing to solve the problem….

But the fundamental goal of GP should not be to write code like people do, but rather to find new stuff. We should want it to do its work far enough outside of the box that it provokes us, but not so far that we cannot learn from it new ways to build entirely new things for ourselves.

Do you notice, in my explanation of tagspace structures, how many kludges I’ve already imposed? These things I’ve described are constrained to use numeric keys, but Lee’s original descriptions permitted any ordered type, and indeed any pattern-matching approximation system. What might it mean to let indices “be” both integers and strings? What is “adding some slack to balance spaces between landmarks” going to mean in non-numeric continua? Should we use lexical ordering, or some kind of other distance? As I recall, many “distance” measures for strings are not necessarily metrics, nor are they trivial to measure….

Further, I didn’t even implement an important feature of the original tagspace sketches Lee produced: Notice that in the implementation I described above, assignment is exact, but retrieval is approximate. When I write a new thing to a given index, I simply create the new landmark as given, or if that particular landmark already exists in the tagspace I update the existing one’s value. What might it mean to have “approximate writing”? When (in “programming”) do we really need to create a new landmark, as opposed to overwriting an old one? Is it the case that a process should always return a value from a non-empty tagspace, or should there also be a “no really look at that landmark only and don’t slide up” instruction? In software development design patterns we’re trained (see that word, again) to think of “getters” and “setters”, but just this minor change in behavior—opening up what an “index” means—implies there should be “inserters” and “searchers” as well.

What do we want? What does the thing we’re building want?

Klapaucius: more paint gets you to the corner faster

In building Klapaucius (which was the sixth or seventh Push language interpreter I’d created), I tried to compensate for some of the shortcomings I found in Push as it existed in the only public specifications at the time: Push3, and the production codebase from Lee’s lab, Clojush.

I consolidated the traditional Push notions of :integer and :float into a single :scalar type, and added a separate and integrated :complex type. There were consequences regarding the way :integer values were used as indices of collections. But those were relatively painless.

I abstracted as much as I could concerning “vectors of some type”, and “sets” and “code blocks” and some other collections, and automated (somewhat) the process of building “derived types” and the associated instructions. One can add a new “item” type manually, give it an “attribute” like “listable”, and that programmatically get an entire “vector of [item]” type with all its accompanying expected functionality fully-formed. So if you add a :color type, and give at an attribute :listable, magically there will also appear a :colors vector type, representing an ordered collection of only :color items. So that’s useful for domain modeling.

I smoothed over some of the rough spots where Push instructions required specific “types” as if they were type-checked, but without actually doing the work of type-checking in the interpreter. For example, when indexing collections and iterating over a range. It turns out that several human-writable programming languages are perfectly happy to give you the 9.3rd item from an array, and Klapaucius’s Push is as well. Where Klapaucius goes farther is in permitting you to get the rd item of a 10-item array, or the th, and in a standard way across all scalars.

And there were some generalizations I won’t go into here concerning local variables and how they work. A lot of research-focused Push interpreters gloss over the usability of “input values” and “results” and local variables and such, thinking I suppose correctly that (1) defining inputs and outputs is often a matter of user preference and therefore complicated and problem-specific, and (2) it’s fragile when it comes to extending these to new user-specified types and instructions. One wants to let users give an “image” as an input and get an “image” as an output… but then the user will have to provide an “image” type, and “color” and “pixel” and all the stuff to hook those into the primitive types already present in Push…. But it gets worse when your code has no internal way of saying some particular blob of primitives “is” an image, rather than a matrix or a vector or a color object, and also has no way to refer to it as “color #1” vs “color #2”. The idea of all these new types and the program’s ability to refer to items of those types, indirectly and with runtime-generated names, is all tied up in knots.

And so on.

But at a certain point, the complexity hit me again. Even in my code, each instruction is left being “responsible” for where it gets its arguments, and where it puts its results. The only type-checking in the Push interpreter, as a rule, lives at the place where the :exec stack routes items it pops from the executing program.

So for a while I did that. Instead of pushing the results of all the instructions to specific stacks, I tried pushing those results to the :exec stack, where the interpreter would subsequently route them according to its consistent rules, to the correct “address” stack. That way, the instruction :scalar-sqrt could be confident that there would be only :scalar values on the stack of that name, because a complex result from taking the square root of a negative number would be pushed to :exec and then routed to :complex not :scalar.

Ah, but group theory.

Because of course every :scalar is also a :complex number. So what happens, now, when we take the square root of 4, vs the square root of (4+0i)? Is one a :scalar value of 2, and the other a :complex value of (2+0i), or are they both :scalar (because of implicit “type simplification”), or are they both :complex (with implicit “type promotion”)? This is the problem in many languages with implicit type-casting systems, so it’s not just Push.

Just as problematic as the programmer’s habit of using “just the normal primitive types like int8 and float64” can be problematic for somebody who doesn’t want to model their “real domain” in terms of numbers of bits, so is the alternative of pooling them all together with the fancy Standard Library extension. Floating-point representations suffer for reasons we all know, involving precision and accuracy. But arbitrary-precision numeric types are problematic as soon as you divide 1/3, complaining that the resulting repeating decimal never terminates. Either one will cause troubles as soon as you include exponentiation—and why wouldn’t you include exponentiation? All of a sudden you’re not only worried about when an Integer becomes a BigInteger (stuff most human-writable languages take care of behind the scenes), but you’re worried about when a number “becomes” Infinity, and whether you should include a value of Infinity at all, and if you do whether dividing by zero produces Infinity or some other thing called NaN….

Foreshadowing: Standard numeric types will not save you. They’re made for mindful programmers, not for running arbitrary code. Just like indexing, they are in practice “offloading” tacit knowledge from the implementation code into your damned head, because juggling all the aspects of hardware and compilation and ad hoc decisions somebody made in an RFC in the 1970s for an Algol compiler is too hard to remember.

And then, one day, I needed a :date. That was the camel’s back.

Somehow, the growing network of interconnected instructions makes the resulting whack-a-mole project of “smoothing over the rough places where language designers have pushed important rules from code to programmers’ heads” is more complicated in Push than in would be in human-sensible languages. Because there are no human brains around to do that work, unless you count the person writing the language itself. The problems with numeric types aren’t “edge cases” when people write programs in Ruby or Clojure, because when people write them they (1) have been trained to stay away from some of those problems, and (2) they are forced to fix bugs by hand.

What if we map the square root function over the vector [-9 (4+0i)]. What’s that? If I (a hyoo-man) am writing this program, then I’ll either explore that question and its answer in a unit test (and will have to adapt the specification for map or sqrt to suit what I desire to happen), or I’ll have to debug something in production it it happens that I’ve never thought of the problems this brings up before the production code discovers it for me.

And so on. A thousand little cuts.

You (am Experienced Programmer) may now have the sense that all one needs to do to avoid this sort of problem is a bunch of abstract theoretical type system design, so that we can ensure that every integer goes to a place called :integer, and every vector-of-booleans manipulated by some kind of function or instruction is definitely of the correct type, and so on.

“Be more careful,” is what I hear you saying. “Be more rigorous.”

I smile, somewhat sadly. But no. Because one of the other incredibly useful[-seeming] features of Push is the ability (like Lisp) to treat “code as data”. You can see a facet of this even in the simple traces above: there is a “special” stack called :exec where the program being run is placed, and from which tokens are popped and interpreted.

But some instructions—especially those involved in control structures like conditional execution, iteration and recursion—will write stuff back to the :exec stack. The “type” of that stack is “any executable code”. And oh hey look at this, there’s also another stack, called :code, which is idiomatically used “as if” its type was “quoted code”, and which also can hold anything of any type. And then there are those “variable bindings” I mentioned: Do those have a fixed type? If so, how does that work, since… and so on.

There are a lot of “and so on” problems here.

This is not at all a complaint; it’s admiration I’m expressing, if anything. I’ve had a lot of success using the Klapaucius interpreter for much more complex problems than those that are handled by earlier Push implementations. It’s more stable, more extensible, and more consistent internally than others.

Rather than complaining, I’m just noticing that these increasing problems with “type” and and usability all form what we in the software development world call a “code smell”.

There’s an abstraction here, trying to let me know what it wants me to do. I am hoping that noticing it and listening to it will improve matters in terms of the “big goals”—usability, evolvability, and so on. But while I hope for that, I am certain that by making the change it seems to imply, I can at least simplify my ongoing process of understanding what’s wrong more generally.

types and places

I’ve written before and at length about how Push mixes up the notion of type (in the computer science sense of the term) and address. The language and its cognates rely on all those instructions to reliably and consistency pop the right arguments and push the right results to the right stacks. But as I’ve repeated above, in Push the “right” place doesn’t always mean “by type”, it means “by convention”.

Those conventions break down when we want to add new types. Because in practice, when we reach a certain size and scale of both ontology and program, we want and need a “number” to sometimes be a “probability” or an “index” or a “count”. The convention in Push has been that there should (sometimes) be a unique type and stack for this sort of thing, or (sometimes) that there should be an obligation for the user to model and interpret slight complications (like “probability values”) in terms of existing primitives (like “floats”).

But this is just the obligation to the “user” of Push. What about the Push interpreter itself? Is the vector [9.0 (4+0i)] something you should call a “vector-of-integers”? Is an “index”? Why is not a “matrix”, but might possibly be? If you let a new :matrix type “be” a simple vector-of-vectors-of-numbers, then is almost certainly going to pop up some day….

Similarly, as I’ve added those “handy” automatically-derived types, like vector-of-item, there’s a new pressure to combinatorially multiply all the types into derived subtypes. For every kind of thing, must there also be a vector-of-that? What about a set-of-that? How would I build a tree-of-that—a necessity in our models of some domains? Should it be a vector-of-that-and-also-nested-vectors-of-that?

It feels as though there’s an abstraction regarding vectors (ordered collections), trees (ordered collections and sub-collections), matrices (ordered collections of collections of specified size), sets, sets of key-value pairs, ordered key-value sets where the values are any of exactly five specified types and where the key must be orderable….

Oh but wait! Suppose (since it’s true) what I want to do is also support a reasonable-seeming functional programming system in Push. That is, I want somehow to define some block of Push code as a “function” f_88, and then run code like ([1 3 9 "foo"] f_88 map), and have the “function” f_88 “applied” to all the items in the vector [1 3 9 "foo"]. What is the “type” of f_88? And sheesh, what’s that damned vector with a string and some numbers? Where will the results be put? How will it be sorted out?

And worse, how will a human adding new types and domain-specific stuff to the language cope with wiring up their new things in a consistent and reliable way? It’s not just stacks. If you add a “point” type so you can do some graphics manipulations or plane geometry, you’re going to also have to convert those pairs of numeric values back and forth to standard Push numbers, somehow.

The design goals of the Push language have in some sense succeeded at building in some robustness, especially in the form of runtime crashes. But at the expense of extensibility. It implements a deeply interconnected mesh of “primitive types” and instructions moving things around among stacks safely, but adding even one new type and the requisite domain-specific instructions can quickly cause unexpected results: either the type will be too isolated, and “ontologically starve” during the execution and search over randomly-generated code for lack of primitive arguments, or you will have to make the new thing so deeply interconnected with the existing system that you risk causing new runtime errors in unexpected places…. And then you want to talk about :date objects. Need to. It’s a Goldilocks problem. Decisions made for the best of intentions have consequences downstream.

Gödel would be smiling.

But I would be willing to assure him: I don’t want to model everything consistently, just enough to be useful.

Where we’re at

The process of developing this sort of framework is not a single design-implement-test-debug story, it’s a conversation between we (the “authors”), and the thing itself as it communicates to us the details about the real world and our assumptions about it that we’ve missed. Abstraction never gets stuff right. All models are bad, but some useful. All that aphoristic bullshit is true bullshit we should always remember as we build new bullshit in new ways.

But here are some of my design problems with Push as it lives today, boiled down to pithy contrasts:

  1. Instructions care deeply about the types of arguments, but only the :exec router checks types of things that get put onto other stacks.
  2. Collections have complicated “type” which can’t just be in reference to a stack; if I add a :string item to a vector that only contains scalars, I have implicitly changed that collection’s type. That implicit thing is one cause of pain for users adding new types or instructions to model their projects.
  3. A Push instruction is not a function in the sense of functional programming paradigms. Indeed, their behavior is defined entirely in terms of what we’d normally call “side-effects”: things disappear from the interpreter state, and then new things get moved around in the interpreter as a result of some secret internal calculations that happen within the span of the instruction “transaction”.
  4. Push code does not have “type”, except insofar (and in a hand-wavy way) as the instructions it contains “want” to “consume” arguments not present in the code, and insofar as the stacks themselves end up with new stuff.
  5. Some Push instructions act on the interpreter or whole stacks directly. That is, they modify the interpreter state observed by other instructions, change routing rules, or rearrange the contents of particular stack. In Klapaucius, they can also construct new stacks, or destroy them.
  6. Oh god, collections. In Klapaucius there is a :tagspace collection “type”. There is also a general ordered :vector “type”, and “vectorized” collections of many other atomic types which are automatically generated. And there’s a :set type, which is an unordered collection, but to be honest I made no attempt to make typed sets because it was tiring and confusing and it smelled bad. Then there are a sort of unordered “map” structure, which is intended to suit for keyword-keyed domain-specific “objects” or “structs”, and which can be automatically produced from a template of “other types” associated with “names”. This in itself is problematic and troubling for any Push language going this direction, because there is also the question of mixtures of those collection types, trees, that sort of thing. Oh and also there are already several implicit collection types in Push: the “code block” (which has special behavior in the interpreter cycle, and which also is intimately involved but not coequal with the things on the :code stack…) and there’s the “stack” itself. Push, though stack-based and fundamentally “about” stacks and stacklike behavior, has no way to do anything to any stack explicitly from the code. It’s a fucking mess.
  7. The primary implementation of Push, the Clojush interpreter from Lee’s lab, doesn’t have runtime-defined variables at all. It has input variables, but they’re weird sorts of “standard” one-item collections (like they are in most human-readable languages). In Klapaucius, in a nod towards abstraction and enhanced structure manipulation, I created a new thing called a binding, which is a fancy way of saying that every variable is itself a stack of items. When a new value is assigned to a binding, it’s pushed onto the stack; when a binding is dereferenced, the top item is peeked (not popped, as with arguments from stacks). But then there’s the question—not insignificant—of whether a variable can “have” a “type”.
  8. There’s no question that Push wants to have some kind of type system—it’s how users think. Nor is there any real question of whether it should have some kind of centralized type-checking system for pushing results to and popping arguments from stacks. Because it’s annoying when there’s a problem and confusing to add a new “type” without realizing that the place to put the type-checking is in all the new instructions you have to write to handle that new type.

Those last two are huge. Imagine wanting (as I do) to do a little project involving strategies for playing the game checkers (draughts). What types and instructions do you really “want” to evolve a checkers player? There’s a checkers board, and there are pieces, and there is the notion of turns, and there is probably some kind of history of turns, but there are also “imagined” turns and the consequences of moves, and there are calls of some sort to some kind of checkers-simulator API that your “strategy” is going to have to use in order to “imagine” future moves…. It’s horrible. Modeling it is trivial on paper, and no more complicated than a brief conversation with a patient friend. But acting on that model, in Push, entails wiring it into the existing types and stacks.

I mean: is a checker board a kind of “array”? Is it a matrix? Is it something that has a state of where the pieces are? What is a “king”? And that’s just domain modeling. Imagine the instructions you will need to concoct to make these new “types” interoperable with the “standard” types: vectors, strings, booleans, numbers, collections of all kinds….

And for the coup de grace, consider that many of us (me included) are really interested in finding “programs” that are actually agents, and which themselves develop rules for manipulating their own state. Lee Spector calls this process “autoconstruction”, and in his hands it involves giving the “programs” not only a way of addressing the task at hand, but also giving them some kind of way of updating their internal state between games (or across “generations”), of manipulating their own internal constants or of producing the “genome” of their own “offspring”.

In summary: The arc of Push language development has been excellent for research and ideation on artificial life and genetic programming, and it sucks as soon as one needs to do anything practical. Set aside the problem of understand how a program you might discover “works” to solve a task—the work one needs to do to actually model the domain is overwhelming. It will only get worse in the future.

But note well: This is not different from the problems exposed in other human-writable programming languages. What is different is that when we concoct those human-writable languages, humans are the place we tacitly offload the hard bits. We make people remember the syntax, we force them to follow arbitrary semantic rules (like which index is viable in an array of finite size) and constraints (like don’t expect a value from an indexed lookup to an empty slot in a container), and we hide and smooth over the inner problematic kludges until a runtime error squishes out the side (don’t try to convert the rational number 1/3 to BigDecimal in Clojure).

In Push and derived languages, we’ve done as much as we can with the notion of adding functionality to support those tasks human-writeable languages foist on modelers and coders and debuggers. But mostly by accretion. Maybe it’s time to consolidate and purge a bit again.

What do we want?

We (where “we” is “me” I suppose) want a language which is like Push in these ways:

  • a fundamentally stack-based representation1
  • robust to arbitrary code order
  • robust to missing arguments
  • capable of creating and maintaining complex internal states during evaluation (like Push is, with its stacks)
  • capable of building and using local variables at runtime (as Klapaucius is)
  • size limits on arbitrary structures, to avoid memory errors at runtime (most Push interpreters just limit the size of structures you can make, with runtime failure if for example you concatenate two 128k-item vectors)
  • capture and surface (for introspection and optimization during search) runtime errors that are permitted, such as division by zero, overflows, missing arguments, and so on (as Klapaucius does)

But what we’re missing is these:

  • simple way of creating new types and instructions without concern about runtime type errors
  • consistent way of building collections—ordered, sets, key-value pairs, tagspace-like—automatically
  • simple (but possibly opinionated) way of defining and constructing domain-specific types
  • integration of those tacit “pseudo-types”, the “stack” and “code block”
  • disambiguation of “type” and “stack”
  • ability to define instructions based on what is actually needed for type-safety, not just in terms of where arguments are expected to be found
  • ability to construct proper functions at runtime
  • ability (because this is for genetic programming, not people) to manipulate the state and behavior of the interpreter at runtime—including the possibility of constructing novel internal variables and types during runtime execution
  • interoperability with standard OS and domain modeling types: images, text, files, ports, data analysis, YAML, and so forth
  • interoperability with accepted standards of mathematical modeling, including not only standard numeric types, but also abstract higher-order categories
  • more!

That seems like a lot. Maybe it is. But Klapaucius has well over 1000 instructions defined, and several dozen explicit types, and it’s fascinating and useful and I’ve gotten a lot out of it, but it has reached the point where the underlying Push language feels like the obstacle.

So it’s time for the next phase: K-Push.

K-Push

K-push, or “Klapaucius Push” (named in reference to K-pop) is my attempt to cope with these problems, and further simplify the use of GP for searching for novel algorithms and structures.

It’s still in the sketching stages, but here are some notes:

stacks

I’ve been thinking a lot about collections. Consider those die-hard core ideas from Push: the “stack” and the “code block”. In Push (including Klapaucius), each stack has a name. That is, the collection of stacks is treated as a key-value map: when I ask for the top thing from the :x stack, I look in the collection of stacks for the one labeled :x and pop the top item.

A stack is like a special kind of vector, isn’t it? We have “vector” types in Clojush and Klapaucius, but we manage them separately for some reason. Sometimes those “vector” types have a kind of ad hoc type checking, so that we can (in Clojush) have a :vector-of-strings or (in Klapaucius) a :scalars stack, with its own special filtering behavior. A vector that contains any item that is not a string doesn’t count as a :vector-of-strings, and a vector that contains any item not a :scalar can’t be a :scalars.

But where does that checking “live”? The vector itself doesn’t “know” what it contains; if you do a type-check on one, you need to examine every item and ensure they are all compliant.

That would technically be true of a stack, too, if we ever bothered to check the type of things we push onto it. Does a stack “know” what kind of thing it is supposed to contain, and can it provide any insurance that some random instruction hasn’t snuck in a ringer? Not in Clojush, and not in Klapaucius either.

A “code block” is also “like” a special kind of vector, as it happens. It’s a vector-of-code (aka “stuff”). But except for the fact that we’ve always written a code block in parentheses and a vector in square brackets, and except for the niggling problem that they are treated differently by the interpreter when they are popped from the :exec stack, they don’t do anything.

But a “code block” in any other stack—whether it’s the :code stack or in some variable’s stack—doesn’t “act” in any particular way. It’s just… some stuff. In order. It’s really the behavior of the interpreter loop that manipulates code blocks in any way other than expected. There’s no tree-like nested structure for items on other stacks.

And then there’s the question of routing. When I write code in a new instruction’s definition that pushes an item like 99 into a name-specified stack in Push, then it goes onto that damned stack. Nobody checks to see if the stack likes 99, and nobody checks when it’s created that it “should” be there. When the interpreter pops 99 from the :exec stack, then it gets type-checked and routed as if we know that all integers go there. Duh.

And then there are vectors proper, and vectors-of-type, and sets, and [nonexistent but desired] sets-of-type, and sorted-key-value :tagspace items, where the key has to be a :scalar, and unsorted-key-value structured types. And all the ways they need to interconvert with one another.

There’s also one more thing: When the interpreter encounters a token representing instruction (in any Push language) or a :ref (in Klapaucius), special stuff happens. The instruction’s associated code is executed, or the bound value of the reference is retrieved. It’s almost as if there is an implicit key-value map there too: each instruction has a key, and the interpreter “recognizes” valid instructions and references (variable names) by consulting this list of keys.

So the access of stacks works like accessing a key-value store by stack name. The instruction handling and local variables act like we have a key-value store where we look them up by name. Stack contents themselves act like vectors, in that they’re ordered and the items don’t have to be unique. Push vectors and code blocks are also like vectors (intentionally, here): ordered and permitting repeats. The Klapaucius :tagspace data structure is like a sorted-map, with numeric values for keys (and some other odd behavior for lookups). Indeed, in practice the most obvious conversion from a :vector to a :tagspace is to simply assign numerical keys to the items in the :vector that are equal to the items’ indices, putting “landmarks” at the index values. In Klapaucius a :vector-of-thing is like any other vector, but somewhere there is also a “recognizer” that also knows how to check all the contents for consistency with the specified type, so that when you drop a :string item into a :booleans vector, you get a generic multi-type :vector by default. And the Klapaucius :set is like a vector, sortof, except it doesn’t permit duplicates.2

The “missing” type of data structure, the odd one out as it were, is the “compound type” unordered key-value structure. But of course it’s the most useful of all when it comes to domain modeling. We need something that will work for a domain-specific thing like a :color (where there are :red and :green and :blue and :alpha), or :image or :employee or :bookcase.

That is, honestly, we need something closer to “objects”.

This fundamental notion is “missing” from this ontology because there’s really no obvious way to make the jump from the (very useful) :tagspace with its robust approximate lookup dynamics, to something with specified keys like that. And if one were to generalize the :tagspace structure, what would it mean to have one with numerical keys, and then write something to key "foo"? Does "foo" sort on the set of strings but not numbers? Would a key of {:red 99 :green 33 :blue 132} or (->Point 9 3), which are unsortable, be… just unordered? How does “approximate matching” work for those? What happens if I attempt to look up item (->Point 9 3) in a :tagspace containing keys [8 111 9131 "b" \h]?

What if

Suppose we think of all collections as stacks. Seriously. Maybe “enhanced stacks”, since there are extras required, but me try, at least.

Suppose a stack is a stack. OK. But a stack (any stack) can have an associated type filter. You cannot push any item onto a stack which its type filter rejects. If there is a stack with a filter “only positive integers”, and you try to push "Sophocles" onto it, it will hand it back to the pushing process.

Suppose also that a “code block” is a stack. That is, the Push code (7 (true false 9) :x :scalar-add) is a stack of code containing 4 items: a number, a stack, a variable name, and an instruction. That second item, itself a stack, contains three items.

When in Push we send something to a specific stack, we use a unique address. The :integer stack is, implicitly, found “under the key” string “"integer"” in some kind of table. In K-Push, we make that table explicit.

Suppose that the stacks are stored therefore in a key-value map, by name.

Suppose further that the instructions are also stored in such a key-value map. When the interpreter encounters something that appears in that list of keys, it will find a value containing code that can be executed.

Note: Does this mean we can write KPush instructions in KPush itself? Maybe so. We’ll come back to that.

Now in Klapaucius I insisted that variables are also a kind of stack. That is, when we declare an “input” or “output” or “local” variable, there is a stack created. But note that in Klapaucius, the behavior of “dereferencing a variable name” means peeking at the top value (leaving it in place), while the act of “consuming an argument for an instruction” means popping the top value of some stack.

If we’re to generalize behavior, then these KPush “super stacks” need to have both modes somehow.

So imagine for a moment where we are just with “standard Push”, under these changes in simply describing the data structure we use. We’ve got some kind of key-value store, and it looks something like this, with stacks being in parentheses, and code blocks in parentheses, and “vectors” in brackets, and maps in curly braces, and sets in hash-braces….


:exec     (item item () item (item item) item ...)
:scalar   (scalar scalar scalar ...)
:scalars  ([scalar scalar] [scalar scalar scalar] [] ...)
:complex  ({:re scalar :im scalar} {:re scalar :im scalar} ...)
:boolean  (boolean boolean boolean ...)
:booleans ([boolean boolean] [] [boolean] ...)
:string   (string string string ...)
:tagspace ({-7723 item 732 item 3e12 item Infinity item} {-9 10 9 10} ...)
:code     (item item () item (item item) item ...)
:set      (#{item} #{item item} ...)
:chars    ([char char] [] [char char char] ...)
:point    ({:x scalar :y scalar} {:x scalar :y scalar} ...)
:vector   ([item item] [item item item] ...)
...
:x1       (item item () item (item item) item ...)
:x2       (scalar scalar scalar scalar ...) ;; "typed"
:x3       (complex complex complex complex ...) ;; "typed"
:y        (item item () item (item item) item ...)
:a842     () ;; local
:a7131    (scalar scalar scalar ...) ;; "typed" local
...
:inst-1   (code for inst-1)
:inst-2   (code for inst-2)
:inst-3   (code for inst-3)
...

When we run the interpreter loop in Push style, what are we doing?

  1. We consume (pop) the top item from the :exec stack.
  2. If it’s a key for one of the other things in that collection, we look up the associated thing—maybe some code, maybe some bound values—and we “do” that thing. If it’s code, we run it. If it’s a binding, we peek at the top value and return that to :exec.
  3. If it’s a “code block”, we unwrap the items and push them back in the same order onto :exec.
  4. If it’s a code literal, then we… well, in most Push implementations I’ve ever seen, we start at the top of the list of stacks and we check to find the first matching “type recognizer”. And if we find one, we push the thing onto that stack.

In Push interpreters, this is all handled by a variety of non-overlapping data structures of course. Because Push itself developed over many iterations, and new things were added to its ontology to handle good ideas and make it more usable and evolvable. There’s a lot of contingency here.

But having made some progress, and seen the rough spots, do we need to keep it around?

Let me speculate for a moment. Look at the example I’ve sketched: I’ve got a :point stack there. See it? It contains items with the structure {:x scalar :y scalar}. I’m using shorthand here; maybe to be (vaguely and approximately) consistent with other type specification syntax in “real” languages, I could use something like this: {«x:scalar» «y:scalar»}. That is, “there’s a thing called x that is a scalar”.

Stacks of stacks

Here’s a wild thought. Let me riff on it a while.

When we write an instance of one of these structures in human-parseable programming languages, we assume that all assignments are unique. But what if the assignments in this kind of structure are all stacks. That is, a point is not (going back to Clojure-like syntax){:x 9 :y 111} but is rather {:x (9) :y (111)}. And that when we “assign” a new value to the :x of this point, what we’re really doing is pushing the old value down in the stack. So for example, if we set :x to a new value of 88, we’ll get {:x (88 9) :y (111)}.

That seems nice, somehow. Something’s good here.

But I threw away important information, didn’t I? There really needs to be a constraint on :x and :y in the :point structure. It needs to be {«x:scalar» (88 9) «y:scalar» (111)}. It should be “impossible” to push false to :x in this structure, if you ask me. I don’t know yet what should happen if you try, but I know it should never go onto that stack.

Hey, hang on a minute. What happens if I make this adjustment to the “cartoon Push” above?


«exec:item»          (item item () item (item item) item ...)
«scalar:scalar»      (scalar scalar scalar ...)
«scalars:[scalar]»   ([scalar scalar] [scalar scalar scalar] [] ...)
«complex:complex»    ({:re scalar :im scalar} {:re scalar :im scalar} ...)
«boolean:boolean»    (boolean boolean boolean ...)
«booleans:[boolean]» ([boolean boolean] [] [boolean] ...)
«string:[char]»      (string string string ...)
«tagspace:tagspace»  ({-7723 item 732 item 3e12 item Infinity item} {} ...)
«code:item»          (item item () item (item item) item ...)
«set:set»            (#{item} #{item item} ...)
«chars:[char]»       ([char char] [] [char char char] ...)
«point:point»        ({:x scalar :y scalar} {:x scalar :y scalar} ...)
«vector:[item]»      ([item item] [item item item] ...)
...
«x1:item»            (item item () item (item item) item ...)
«x2:scalar»          (scalar scalar scalar scalar ...) ;; "typed"
«x3:complex»         (complex complex complex complex ...) ;; "typed"
«y:item»             (item item () item (item item) item ...)
«a842:item»          () ;; local
«a7131:scalar»       (scalar scalar scalar ...) ;; "typed" local
...
«inst-1:code»        (code for inst-1)
«inst-2:code»        (code for inst-2)
«inst-3:code»        (code for inst-3)
...

That’s interesting. So there’s a general item type, of course. There always was. But also there’s an interesting thing that can happen here with [scalar], a syntactic sugar to indicate “an array of only scalar items”. But there doesn’t seem to be a lot of improvement for things like «tagspace:tagspace», does there?

Can one write a type specification for a Klapaucius :tagspace? Maybe. I think it would be something like: it’s an “array” of keys and values in pairs, with the keys all numeric and unique. Since I stole some syntax from Swift before, let me look briefly at its specification syntax for Dictionary types…. OK how about this, if I mix up Clojure’s curly braces and Swift’s type specification? A :tagspace item’s type might be defined something like {scalar: item}, which is to say, “a key-value map using scalars as keys and any item as values”. What’s missing here is the notion of sortedness of the keys, though. And also their implicit uniqueness—there cannot be multiple keys of the same value.

But of course I’ve already suggested that :tagspace is good, but insufficient in the way I’ve currently implemented it in Klapaucius. A KPush “tagspace” needs to be able to handle arbitrary key items somehow. Sort of like the way the interpreter is able to look up items on stacks, or references, or instructions’ code, even though I’ve drawn all those things in my “cartoon” as being in the same place.

Hmmm….

But let me finish this thought about structured types and generalizing stacks. At the moment the notion I’ve had for :tagspace items is that each assigned value is unique (or absent). But hey… is this another place for a stack?

Maybe every non-literal value in KPush is a stack, by default? Not just bindings, but also the fields of structured types, and stacks themselves, and code blocks?

I’m convinced that a vector can be a stack, without loss of generality, and maybe with some benefits. And let’s look at the hash type in Clojure: It’s “really” an ordered collection of pairs of key and value, with a constraint of uniqueness on the keys.

Well shit. What is “uniqueness of keys”, really? What is {(3 1) (9 3) (7 2 11) ("foo" 88)}? It’s a perfectly good Clojure map, since the keys are unique. But we could treat those keys as if they were peekable bindings, couldn’t we? That is we can use the top items in the key stacks as the “current” key, so that indexing by 3 produces 9, and indexing by 7 produces "foo".

Further, this permits the approximate tagspace dynamics: Indexing by any value 3 or less will produce 9, indexing by any value more than 3 up to and including 7 produces "foo", and indexing by values larger than 7 will return 9 again.

It works as long as the type of the key stack is still scalar. I wonder what we can get away with if it isn’t.

Say this is a KPush tagspace:

{(1 2) (3 4) (5 6 7) ("8" 9) ("apple") (11) ("canvas") () (#{:foo}) ([:bar {:x 9}])}

If we treat the top item in each key stack as “live”, then the visible keys would be [1 5 "apple" "canvas" #{:foo}]. That is, some are numbers, some are strings, and one is a set.

To write a new item to an existing value stack, I would assign it using the exact (visible) key in a given key stack. So if I assign a value 777 to key 1, then the stack (3 4) would become (777 3 4).

To write an item to a new value stack, I assign it using a key not currently visible. So if I assign a value 777 to key 2, I’d start a new stack with just 2 in it, and get

{(1 2) (3 4) (2) (777) (5 6 7) ("8" 9)...}

Note though that I can use any orderable key to assign values sloppily if I wish. That is, I can assign 777 “approximately” to a numeric key of 2.2, and the cursor will “slide up” among the numbers (because of the type of the key I’ve used) to find the key (5 6 7), producing a new pair {(2.2 5 6 7) (777 "8" 9)}. The 2.2 key is pushed onto the key at which it first matches.

But I can also do the same thing with the string keys. If I use the string key "bar" without approximation (sliding up), it will find there is no exact matching visible key, and produce a new pair {("bar") (777)}. If I assign it approximately, it will see the (sorted) string keys ["apple" "canvas"] and so it will push its value (and itself) onto the latter to produce {("bar" "canvas") (777) }.

Finally, there are the unsortable items that can be used as keys: the set #{:foo} for example. For these, there can be no “approximate” matching. If a key matches exactly, it will be used; if it doesn’t exist, then there’s no way to establish a reasonable order over the keys at all.

Thus, we can let the Push :tagspace have a bit more freedom: it can use arbitrary keys, for reading and writing. We can extend its behavior with instructions suitable for manipulating the stacks of keys and values.

Further, we can treat it as a stack of pairs of stacks. That is, as in Clojure and other familiar programming languages (like Ruby) that permit interplay between ordered collections and unordered hashes, we can think of this KPush tagspace object as though it were also a vector of pairs, addressable by index. And so on.

So there is a thing, a super stack. It has stuff on it.

If I draw it like a “vector”, it might look like this: [8 6 3 "foo" 6 8 "bar"].

If I draw it like a “set”, it might look like this: #{8 6 3 "foo" ø ø "bar"}, where ø is some kind of notation for “hidden items”.

If I change it into a tagspace with numeric keys based on the indices, it might look like this: {(0) (8) (1) (6) (2) (3) (3) ("foo") (4) (6) (5) (8) (6) ("bar")}

If I change it into a tagspace by taking it as pairs, it might look like this:


{(8) (6)
 (3) ("foo")
 (6) (8)
 ("bar") ()}

Let’s look some things up by index.

The rd item of [8 6 3 "foo" 6 8 "bar"] is 6 (sliding up), because "foo" is the item at index 3.

The rd item of #{8 6 3 "foo" ø ø "bar"} is "bar" (sliding up), because "foo" is the item at index 3, and the items at 4 and 5 are hidden (thus, unavailable).

The rd item of {(0) (8) (1) (6) (2) (3) (3) ("foo") (4) (6) (5) (8) (6) ("bar")} is 6 (approximating with sliding up). It’s nothing (there is no value with that key) if we’re not approximating.

The item looked up by rd {(8) (6) (3) ("foo") (6) (8) ("bar") ()} is a bit more complicated to get: First we gather the type-appropriate pairs, and sort them: {(3) ("foo") (6) (8) (8) (6)}. If we don’t slide up, then there is no value. If we do, then the value is 8.

Next time

  • If we’re actually going to permit arbitrary keys, then there is an obligation to be able to query an arbitrary object and discover whether it’s “orderable” or not. That is, whether there is a strict order we can impose on its type. For scalars, sure; for vectors, not so much.
  • What happens with failed writes? What do we do if we try to write a prohibited item to a stack with a strong constraint? Does it go to the next place it can fit? To a pile of “failures”? Does it generate an :error?
  • In nested stacks (such as those in :exec), what sorts of tree manipulations are possible? Permitted? What kinds of indexing might we do? Is there a useful (or interesting) role for a tree as a primitive here as well?
  • Can we specify the “kind” (more than type) of arguments of instructions in KPush, and have that specification suffice to draw arguments from named stacks, bindings, or other places? In other words, what are we “doing” when we “gather arguments”? For example, if we implement :scalar-add as something that “asks for” two items matching «scalar», where do we look, and in what order?
  • If we can specify KPush instructions in KPush itself (more notes than I have energy for, here), it seems we can also specify “functions” at runtime just as easily. That is, if we can concoct arbitrary blocks of imperative KPush code that implement the Push vocabulary, we can let GP generalize for mappable functions.
  • How will functional programming ideas work with these weird generalized stack/tagspace things?
  • The problem of matrices is still pressing. Whatever happens, numerically structured data needs to be facilitated without too many kludges. What will it mean to multiply two matrices, if we’re “fetching” the arguments of an instruction from remote collections of some sort? If it’s the top two matrices, then they’ll likely be incompatible. If it’s “sloppy matrix math”, then it’s not familiar for users. If it results in some kind of filtering (as with the tagspace lookup sketch)… well, maybe that will work. I mean, when we’re looking through a tagspace for an item matching “bar”, we filter out all non-string visible keys; if we’re looking through a collection for a matrix of a certain size, we should be justified filtering out all incompatible matrices, yes? Maybe?

All of this feels fraught with even more unexpected consequences. But necessary. We’ll see in time, perhaps.

  1. I’ve built a few non-stack based languages for GP through the years, not least ReQ—which I’ll return to some day—which is a queue-based language, and duck (a duck-typed language with weird closures all through it), and Maarten Keijzer’s push-forth. But Push and K-Push “should be” similar to one another, I think. There’s a simple model here, and the consistency and continuity is useful. 

  2. Interestingly, there’s no reason we couldn’t by default permit the K-Push :set type to have an order. It’d just be a bit more overhead.