Expressiveness, “contraction” and the “edge of chaos” in genetic programming
Draft of 2012.02.16 ☛ 2015.03.17 ☛ 2016.07.22
I’ve spent about the last six weeks writing a “simple” language for genetic programming projects. As with the Nudge language I wrote (with Brian Kerr, Bill Merrill, Bob Kuehne, James Sweeney, Trek Glowacki and Jesse Sielaff) a few years back, I’m going through the effort to surface the subjective experience of design and project management in genetic programming projects.
Quite a bit more on that in a bit. At any rate, an interesting thing just happened, on that subjective experience front.
First, a brief bit of background (because the book isn’t available yet, even in draft):
When GP was first popularized by John Koza and his colleagues back about 20 years ago, there was a huge (nay, overwhelming) founder bias in the way problem-solving was approached. The “language” chosen in those early days was eminently practical for the hardware and software development constraints in play, to say nothing of the cultural context of being Very Abstruse Computer Science from the 1990s.
In other words, everything was pretty much Scheme. At least very very Lisp-like.
The cultural influence of these early design decisions can’t be overemphasized. I mean, look up GP anywhere, in books or software projects, and you’ll see trees. Over twenty years, people have “experimented with” linear representations, and stack-based ones, and grammatical evolution in more traditional languages, and some really funky representations like the data-flow systems of Cartesian GP.
But you can’t see the forest for the trees. Indeed, unless things have seriously changed in the couple-three years since I stopped paying attention, the major conferences all probably still spend most sessions talking about variations on tree-handling algorithms.
Now trees have a few interesting traits, which I’ve been reminded of through my recent experiences building the Duck language (which is not tree-based, much, sort of).
First of all, trees tend to have one base node, or root—at least the way human beings tend to define them by analogy. If you look at the diagram so prominently shown at Wikipedia’s “genetic programming” example, you see a tree with some arithmetic function at its root.
And this makes sense, of course, when you’re a classically-trained Computer Scientist. Because everybody knows you describe syntactically correct programs using some kind of a grammar, and the easiest way to approach it is to say a program is a kind of
[expression | literal], and that an expression is, for example, something like
[[literal | expression] operator [literal | expression]], and that a literal is maybe
[int | bool | float | string] and an operator is maybe
[+ | * | - | ÷] and so on and so on.
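To make the recursive-growth picture concrete, here’s a minimal sketch of growing random trees from exactly that sort of grammar. (The function names, the literal range, and the depth cap are all my illustrative assumptions, not anybody’s production GP code.)

```python
import random

OPERATORS = ["+", "*", "-", "/"]  # the operator production

def random_tree(depth=0, max_depth=4):
    # A program is [expression | literal]; past the depth cap we
    # always pick a literal so the recursion terminates.
    if depth >= max_depth or random.random() < 0.5:
        return random.randint(0, 9)  # the literal production
    # The expression production: [arg operator arg]
    return [random_tree(depth + 1, max_depth),
            random.choice(OPERATORS),
            random_tree(depth + 1, max_depth)]

print(random_tree())
```

Run it a few times and you get exactly the sort of one-rooted arithmetic trees that Wikipedia diagram shows, which is rather the point.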
From the mighty acorn doth the tiny oak tree grow, or something like that.
But there’s a very particular three-way tension in all genetic programming work:
You want to design a representation language or process, suitable for the representation of answers to your problem. It needs to be sufficiently expressive to capture both the “core concepts” of domain modeling and the “obvious toolkit” normal humans can use to at least understand a prospective solution. In other words, your representation language has to be able to accept inputs that are appropriate for your project, act on them algorithmically in ways that are appropriate for your project’s cultural context, and produce outputs that can be assessed in terms of that context.
Because the concept is so often ignored in computer science, here are some explicit examples (not all recommended): “We model stock prices as integers (in cents), and keep track of the price over the last five days.” “The lawnmower is modeled as an agent on a lawn that is a grid, that can move one step forward, turn, or reverse one square.” “The [answers] specify the arrangement of circuit components, which in turn are connected by [heuristics] before the resulting model is sent to the simulator for evaluation.”
You want to design a second language/process, suitable for describing the effective dynamic search for better answers to your problem. In other words, you can only use “crossover” when answers written in your representation language can be chopped up and glued back together; you can only do “hill-climbing” when you have some sense of what a “close” or “far” change in your code is; you can “repair constraint violators” only if you have the resources, and so on. I’ll argue that there’s a language of search, and it’s tied to the language of representation in many ways.
This is the thing almost every word ever written about GP is about, so I feel unmotivated to provide further examples.
You want to design a third language/process, suitable for the unfolding goals of your project. This is the language you, personally, use to describe what it is you’re doing, and why. It’s how you adapt and plan and modify the project. This is not merely the language in which the project is specified, but the words and concepts that describe how you play your role in the project as a process.
Because the concept is so often ignored in computer science, here are some explicit examples (not all recommended): “I find that the mutation rate is too high to let the search converge to a sufficiently good answer.” “I removed trigonometric functions from the representation because they were wasting time.” “The answers quickly grew too large, so we added selection pressure to reduce them in size.” “We showed the best answer to the customer, and they laughed at the nine digits of precision on the numbers.” “I tried that, and it didn’t work.”
To my knowledge, GP folks assume the first one is pretty much a given (trees!). Not to be rude, but they often also act as though the third one was some sort of given: after all, research is a matter of Pure Reason and Inspiration and Funding and Student Interns, not “process”!
So they focus—as so many nerds are wont to do—on the second one. Algorithms for search: How More to Make Go Fast Now.
Every solution constrains further problem-solving
I imagine only a few of us in the field have experimented with alternative representation languages. I bet most of those have quickly run into a problem that standard trees can cause.
The last three times I’ve found myself designing a language for representation from scratch, I’ve run into one in particular which I haven’t heard expressed anywhere. Took me three rounds to notice, though.
Here are some of those assumptions (note that I’m talking about S-expressions, like you see in almost every GP book, grown recursively according to a simple grammar):
Every tree returns a value. Because of course why would you want to write a program trying to fit a curve through some points that didn’t give you a number as a result?
Randomly growing trees samples the set of “possible” trees in a representative way. Because, you know, you’re sampling them randomly, so they must all be in there somewhere. Right?
Strongly typed programming is your friend, since you don’t want the result of some badly-chosen subtree to bollix up a calculation upstream with a runtime error, getting you back to “return a value”.
And the one that brings me here today:
Recursion and iteration with trees makes me feel icky and strange.
So I guess I’ve written about eight languages that were specifically intended for general and specific genetic programming projects over the last decade or so. Some were tree-based, some were linear, some (like Nudge and Duck) stack-based, some have been “interesting” in ways that put Cartesian GP to shame—remind me to show you Tangle in a few months.
In almost every case (except maybe Tangle), my overarching subjective experience has run about like this:
- Oh look, a rationalization for writing a qualitatively different new language for GP!
- It should be non-brittle! Sure, this is making it harder to read, but easier to use in search! Crossover!
- It should do arithmetic, and have conditional operations of some sort, and maybe manipulate complex structures like trees or arrays or stuff, because you know people don’t describe their goals in terms of scalar numbers.
- Hey look: I threw all the simple stuff in, and I’m running millions of random programs!
- Hmmm. [puts hand on beard] Hang on….
- Almost every random script I run finishes in Order(N) steps. And thus: they suck at solving interesting problems.
See, just like everybody else (and more, maybe), I learned a valuable—and dangerous—habit back 20 years ago when I started fitting data with S-expression arithmetic.
Arithmetic is contractive—or whatever the right technical term I can’t be bothered to look up is. Look at addition: two numbers enter, one number leaves. Subtraction, multiplication, division. Fancy graduate-level math like big-sigma summation over a set? Contraction. Hell, it’s not just arithmetic: think about [almost] every Excel spreadsheet function, where no matter how abstruse, the result is never bigger than the arguments.
Trees are a way of drawing contractions like this. The hundreds of millions of spreadsheet users are a valid stress test of the notion: work is mostly about shuffling tree structures around. (Well, OK, Directed Acyclic Graphs, but guess what? Odds are, they’re contractive too!) Hell, calculus and algebra are about reducing expressions too: throw them in the mix. Database searches and set theory? I’m gonna guess “yup”. Statistics? Hell, by definition of the term.
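The “two numbers enter, one number leaves” point fits in a few lines of code. Here’s a hedged sketch (the tuple encoding and the tiny operator set are my assumptions, not any real GP system’s) of evaluating a contractive tree: every operator consumes its children’s values and hands back exactly one value, so an N-node tree collapses to a single number in N steps.

```python
def evaluate(tree, counter):
    # counter is a one-element list so the step count survives recursion.
    counter[0] += 1                      # one step per node visited
    if isinstance(tree, (int, float)):
        return tree                      # a literal is already a value
    op, left, right = tree               # assumed encoding: (op, left, right)
    a = evaluate(left, counter)
    b = evaluate(right, counter)
    return {"+": a + b, "-": a - b, "*": a * b}[op]  # two in, one out

steps = [0]
result = evaluate(("+", ("*", 2, 3), ("-", 10, 4)), steps)
print(result, steps[0])  # one scalar out; steps equal to the node count (7)
```

No matter how big the tree, it terminates, it shrinks, and it hands you one number: exactly the behavior that makes curve-fitting pleasant and everything else boring.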
But you know what? Powerful as they are for the expression of complex models, purely contractive schemes suck for writing useful programs.
Recursion and iteration with trees makes me feel icky and strange.
It’s easy to write a computer language for people to use that includes recursion and/or iteration. It’s harder to write a representation language for genetic programming, because the very notion of “random programs” (let alone mixing up chunks of those) brings up the Halting Problem, and the Not Answering Problem, and (my favorite) the What the Fuck is this Here Doing? Problem.
[If you’ve never seen a WTFITHD? Problem: I recently evolved some Duck programs to score Bowling Games. I’ll talk about the experience in detail elsewhere, but I want to mention that it took me most of two days to figure out how some of the answers were “solving” the task. Turns out they were running random chunks of code and accumulating a bunch of stuff, and then counting how many things they had left over afterwards, because they wanted a way of “storing” the number 103. I still have no idea what special value 103 has towards scoring bowling games, but I also haven’t yet found a way to ask the inventors directly….]
It’s harder to write a representation that does recursion and iteration because when you chop things up and mix them at random, you’re forced to cope with the inevitable infinite loops, and iterations that fill memory exponentially, and the ghost of Alan Turing just stands there pointing and smirking wryly. You’re forced to cope with “answers” that don’t say anything at all, or which “return” 812 numbers. All kinds of weird stuff.
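For what it’s worth, the blunt standard coping mechanism is a hard step budget: every evolved program gets a fixed allowance of interpreter steps, so “never halts” is demoted to just another ordinary failure mode. Here’s a toy sketch; the instruction encoding (a list of functions operating on a stack) is an illustrative assumption and has nothing to do with Duck’s actual internals.

```python
class StepBudgetExceeded(Exception):
    """Raised when an evolved program runs past its step allowance."""

def run_with_budget(program, budget=1000):
    # Each "instruction" is a function taking (stack, pc) and
    # returning the next program counter; the loop polices steps.
    steps, pc, stack = 0, 0, []
    while pc < len(program):
        steps += 1
        if steps > budget:
            raise StepBudgetExceeded(f"killed after {budget} steps")
        pc = program[pc](stack, pc)
    return stack

# A one-instruction program that jumps back to itself forever:
loop_forever = [lambda stack, pc: pc]
try:
    run_with_budget(loop_forever, budget=100)
except StepBudgetExceeded as err:
    print(err)  # the infinite loop becomes a catchable failure
```

Notice what the budget buys you, and what it costs: the system no longer hangs, but now you have to decide, in that third language, what a reasonable allowance is and what to do with the programs that blow it.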
Math, as it happens, isn’t hard. Shopping is much more complicated. [Or rather, I realize later: Shopping doesn’t let you squish all the hard parts into a set of handy algorithmic received wisdom, like mathematics and genetic programming (to date) does. To shop usefully, I’ll wager one must be far more fully immersed in that third, reflective language that talks about why.]
After all this rambling I merely want to surface a trivial point, because it’s what writing the third language and scratching my beard the third time surfaced in me: I think that in order to juggle the representational benefits of iteration and recursion against the horrors of coping with the negative search consequences of non-contractive processes, you’re forced to step into the third realm and actually think about how to run a project.
Nerds don’t like to think—let alone talk—about how they solve problems. We don’t like to surface our processes. And we hate questioning the utility of our planning decisions: just look at any academic software if you’re unsure.
So we spend a lot of time looking at problems we can solve with trees, or other contractive representations (like even weird ones like Cartesian GP can be). Looking under the lamp post for the keys we lost.
Not because writing recursion and iteration are hard.
Because they force you to think. And thinking—this is the Dirty Little Secret of Genetic Programming—isn’t “automated”. Having to think and talk about how anybody solves a problem usefully is at odds with all the hype and self-delusion about what Genetic Programming is supposed to be about.
Though I think there is hope. Strangely enough, we nerds also seem to love systems like these: systems which, like genetic programming more generally, force people to frame things in that third language as well.
My Duck language, as of this writing, has all sorts of amazing stuff it can do. Map and reduce, fancy concatenative things, higher-order functions.
But it’s so awfully, awfully contractive at the moment.
The interesting thing, after a half-dozen years and three go-rounds of the same subjective experience?
I only need to change one minor incidental behavior—minuscule to the point of triviality in comparison to all the rest of the last six weeks’ work—to change it from a boring old do-nothing to a flexible, expressive language that can… well, do shopping in addition to math.
And I find that interesting. Not the detail of what it is I’m gonna do. Just that it’s one dinky little (central, transformative) thing that jumps Duck from boring old consistent contractive order, to the lively edge of chaos.
And that this all is a story in the third language.