A quick Artificial Chemistry in Clojure

Draft of 2016.07.27

May include: GP ↘ artificial life ↗ learning in public ↖ &c.

I’ve been going to technical conferences for a long time, and along the way I’ve seen a lot of great talks. There are a few classes of “great talk”, of course: there’s the one that makes me change the way I approach a whole class of problem in the future; the one that makes me want to change the way I talk about my work in the future; the one that makes me realize there’s a whole sub-discipline out there that I never knew about before; the one that makes me realize there’s been a subtle shift in the direction of my own field (or should be)… and so on.

There are also some talks that blend a few of those together, and throw in a pile of making me say “Whaaaa?!” as well.

I remember vividly, more than a decade back, when Wolfgang Banzhaf gave a presentation on this work he did with Christian Lasarczyk. He gets a gleam in his eye, like he already knows there’s going to be “Whaaa?!” noises before he’s done. And he starts with something along the lines of, “Well! I think I may be speaking today of something a little bit different from what you are used to, but I think also very interesting!” Smiling all the time, but then again I have rarely seen Wolfgang not smiling. But extra smiling….

And then he quickly reminds us of his extensive work with linear genetic programming. Now most of you who have ever heard of genetic programming have probably only heard of the tree-based symbolic regression stuff that John Koza popularized in the 1990s, probably with its inevitable circles-and-arrows diagrams that have arithmetic and numbers in them. Some of you who hang out with me have probably heard of the stack-based Push-like languages that Lee Spector popularized (and I work with, too). Linear GP is another, somewhat different animal, but one at least as powerful. In Linear GP, “programs” are composed of a sequence of imperative calculation-and-assignment operations.

Here. Let me lift and extend a quick example from Wolfgang and Christian’s paper, and we’ll do some more in a bit.

Linear GP is a representation scheme for genetic programming that feels a lot “closer to the metal” than others. Imagine there are a collection of registers, which is just a nerdier way of saying “named variables”—but these can in fact be high-efficiency memory registers on the damned chip itself. And imagine also we can put values in those registers, and read values back out. And finally, imagine we can construct and apply an arbitrary sequence of imperative operations to the contents of those registers, like this:

R1 = R2 + R4
R7 = R3 * R1
R2 = R2 / R6
R4 = R0 - R4
R3 = R1 - R1
R6 = SHIFT_LEFT R3
R8 = NOT R4
R11 = R9 CONCAT R2
R99 = R3 POS? R2 : R9 
...

That’s an example of how Linear GP programs can look. They tend to have a very Space Age Assembly Code aesthetic to them, if you ask me, and along with that package they have a small-and-robust Lunar Lander appeal. I remember being taught in the 1970s and 80s to write programs like this: DO THIS! NOW! GO THERE! DO THAT AGAIN! NOW! And we liked it.

It’s assumed that some values—maybe some constants, maybe some input values—have been “put into” the registers before we begin. Maybe some of them are different types, maybe types are strict or dynamic, maybe there are failure modes like crashing or just NOOPing when there’s a “syntax error”.

But you can probably see that by picking register names and operations out of the proverbial urn, you can quickly and succinctly build a “program” that will “do a bunch of stuff” (change values) when you run it. That’s how I got those last few lines I added: I picked them from the “proverbial urn”.¹

Since you’ve already been exposed at least a little bit to me ranting about Genetic Programming more generally, you should be able by now to intuit how one might evolve these structures. You start by picking a random program from the urn. You set up the registers with some constants maybe, and you place the input values for a training case in some particular registers. Then you yell GO NOW and the little steps are all run, top to bottom, and the values in the registers are different afterwards, and you have established a priori that the “output” is going to be R99, and so you examine R99 and you find a value of 81.2.

And then you compare 81.2 to the desired output for that training case, which let’s say was 777, and you can assign an error measure and try some other training cases with that same prospective answer, or try some other answers. You can chop these answers up by line and mix them together to do “recombination”, and you can change the names of the registers or the operations to do “mutation”. All kinds of stuff. Hey look at that, you’re able to evolve programs.

All in service of the goal of finding a novel program that “does what you want it to”. And it’ll be one—as long as you didn’t get too weird when constructing things—that is pretty much already written in machine-runnable code. Linear GP can be used to evolve bytecode or Assembly or whatever you want. It’s a powerful technique, and one more people should explore.

But… back to 2004.

Wolfgang gave all of us in the room in Ann Arbor a very similar review. Then he dropped the bomb on us. As I recall it, he gave a little extra grin and said something a lot like, “Now, instead, we will be constructing useful programs in which we run the steps in arbitrary and random order.”

Whaaaa?

No, seriously, that was it.

Probably because he is very professional about these things, Wolfgang didn’t say, “Let’s just see what happens!” If you read the book chapter I linked to you’ll encounter several rational arguments involving the rise of asynchronous processors, and robustness of data processing, and parallel algorithm discovery and loads of authoritative-sounding stuff like that. There are also quite reasonable historical contexts for why this is maybe a good “next step” for various ongoing research programs in Artificial Life—including specific Artificial Chemistry research programs—that revolve around just this sort of thing. All kinds of stuff from Walter Fontana for instance on the origins of life and complex systems carrying out “calculations” in nonlinear environments while at the same time self-organizing, and so on….

So there are reasons. Of course there are reasons. But we are here, in my place, and here I get to say: Let’s just try it and see what happens.

Here’s how it’ll be:

Almost everything I’ve already described in my explanation of Linear GP will remain true. Registers. Those will have values in them. A “program” will still be a collection of individual steps, each reading from registers, doing some work, writing the results to other registers. The steps will be run, each will say DO THIS NOW! GO! in its own turn, and in the end I will examine the contents of some pre-assigned “output” register to see what the value is.

The difference that takes us from “program” to “artificial chemistry” is simple enough: Instead of running the program from top to bottom, one step at a time, we will run the artificial chemistry by picking a random line of the program, doing that, picking another random line—with replacement—doing that one, and so on. For a long time.

Say this is my entire “program” (it’s the same one from before, without any more lines implied):

R1 = R2 + R4
R7 = R3 * R1
R2 = R2 / R6
R4 = R0 - R4
R3 = R1 - R1
R6 = SHIFT_LEFT R3
R8 = NOT R4
R11 = R9 CONCAT R2
R99 = R3 POS? R2 : R9 

I just won’t be running it in line order. Instead we will “run” it by picking a number from \(\{1,2,3,4,5,6,7,8,9\}\), doing that step, picking another number from \(\{1,2,3,4,5,6,7,8,9\}\), doing that step, and so on, many times:

user=> (take 100 (repeatedly #(inc (rand-int 9))))
(5 9 4 6 1 8 3 7 3 7 8 4 1 1 9 2 3 1 6 1 6 8 2 4 7 6 5 5 6 8 2 4 1 5 5 1 5 4 8 8 1 5 6 1 6 3 5 6 6 8 9 7 7 2 9 8 6 1 3 8 3 6 6 8 2 9 4 4 9 1 7 4 4 9 7 6 1 3 9 6 5 6 5 3 7 8 8 6 6 4 5 2 1 9 2 4 3 3 9 5)

And then, after running a few hundred or a few thousand steps, I’ll look at R99 and see if it’s the answer.

So by now you should be thinking or saying “Whaaaa?!” Especially if you write programs for a living. Maybe a little spluttering, too.

How can that do anything at all? Go ahead and splutter; it’s perfectly natural, and we all do it now and then.

Let’s just try it and see what happens.

Trying it

Wolfgang and Christian call these little random chemistry thingies “Register Machines”, so I’ll do the same. What I’ll try to do today (maybe even before the repair man comes to replace our gas meter) is construct a Register Machine thingie, and maybe poke it enough to get a sense of how it works, and then tomorrow (or as soon as mutually convenient and as long as I don’t blow up in the meantime, because gas meter) I’ll spend some time evolving them to do stuff.

So looking over Table 11-1 in the book chapter, I can already see a few milestones to aim for in building some running code. I write those on 3⨉5 cards, so I can look over them as I work:

There are 30 or so “connection registers”, which I suppose are the things I’ve already labeled R1 and so forth
There are 11 “evolved constants”, which I suppose I can treat either as read-only registers with values tucked into them
The experiments seem to use a few different functions: add, sub, div, mult, pow, and, or, not. Interestingly some are logical operators, some arithmetic, so I… guess that means our connection registers and constants are actually dynamically typed?
initial from-the-urn register machines will have 100 steps in them
I suppose each step should be an arbitrary function from the list, and each argument should be chosen at random from the union of the constants and connection registers, and each output value should be chosen at random only from the connection registers (since I can’t write to the constants)
there are two problems spelled out in the paper, and I may as well try to emulate that for now
one problem described in the book chapter is the approximation of the function \(sin(x)\), which may seem trivial but realize we’re cobbling it together from the arithmetic and logic functions I listed, and don’t have access to transcendental functions and stuff
The other problem described in the chapter is a “thyroid data” classification problem… and that’s a bit more problematic.

I look at the link from the bibliography… and discover it’s an obsolete URI for [the UCI data repository site], not the data set as such. Poking around in the search interface there, I discover this Thyroid Dataset, but the book chapter describes

It contains 3772 training and 3428 testing samples, each measured from one patient. A fitness case consists of a measurement vector containing 15 binary and 6 real valued entries of one human being and the appropriate thyroid function (class).

And yup, the dataset seems to match that, since its main page says

# A Thyroid database suited for training ANNs 

** 3 classes 
** 3772 training instances, 3428 testing instances 
** Includes cost data (donated by Peter Turney) 

Whew. I was worried for a minute that the dataset had disappeared or been superseded somehow. I can’t really complain about the mis-linked data, since to be honest I think I copy-edited the book chapter, and it was my responsibility to catch that the link was a bit too general. So that’s on me.

Finally, it looks as if (by fiat) when we “run” a RegisterMachine, we take five random steps for every item in the “program”. If the “program” is 100 steps long, as it will be in an initial random guess, then I suppose we take 500 random steps to “run” the chemistry it encodes, not counting loading the registers at the beginning.

The rest of the details from Table 11-1 are all about searching, so I’ll leave those for next time. That seems like plenty to do, just building these little RegisterMachine things and running them.

Testing, all the way, of course.

Continued in “Making a Register Machine”

Fun trick you can use to amuse yourself in classes and seminars: Whenever somebody writes or says “drawn from an urn”, hear it as pulled out of my ass ↩