# When do continuous spans overlap?

Draft of 2016.07.02

May include: Push ↘ GP ↗ learning in public ↖ &c.

This morning I thought I’d spend a “few minutes” to address an old issue in the Klapaucius Push interpreter. Long ago, in some other genetic programming interpreter—I no longer recall which one—I remember implementing a type which described a *continuous range of scalar values*. I think it was called a `span`

, but it might also have been called a `range`

, but now I’m working in Clojure and a `range`

is a core function describing something rather different. I want mine to be a bit more abstract and even lazier than the Clojure function: more of a static structure, honestly.

In the last year it’s become surprisingly easy to add a new type to the Klapaucius interpreter, though as I look over it I see some places to improve and simplify even more. But that’s for another day. The basic workflow is (today):

- assuming the new type requires a Clojure
`record`

to manage it, create a new file like`push.type.definitions.span`

, where the`defrecord`

and any associated type-specific “helper” functions will be placed, plus a`recognizer`

predicate function that will tell me whether an arbitrary Clojure thingie is a Push`:span`

or not - create a Midje test file in
`push.type.definitions.span_test`

, and start implementing tests for the core functionality of the type - create a file like
`push.type.item.span`

, which actually constructs the Push type itself, defines its instructions with the Klapaucius`build-instruction`

macro and PushDSL code, adds`aspect`

-based automatically-generated instructions if you want any, and finally invokes`make-type`

(or`make-module`

) to compile it all into a single record - build a Midje test file like
`push.instructions.base.span_test`

to exercise the new type’s instructions and router, and make sure an arbitrary Push interpreter does what’s expected whenever any instruction is called - actually wire the fully-tested type (or module) into the interpreter templates, especially the one I’ve been calling
`one-with-everything`

^{1}

Actually, now I look at this it doesn’t seem quite as “simple” as I imagined. It’s *simpler* than it has ever been, but I think very soon I will need to fix it quite a bit more. It should be three steps, easy to understand and explain and implement.

But not today.

## Spans

The `:span`

type I have in mind represents a continuous set of real numbers, a “region of the number line”. As you probably remember from High School math, we speak of these regions as being “open” or “closed” on each end, too, depending on whether the number at the end is strictly part of the set or is *just outside* the set. For example, I’m envisioning the `:span`

includes the numbers 3 and 4 and every other number in between; does not include the number 3, just everything between it and 4 (and also 4); includes all the numbers between 3 and 4, but neither of those.

Unlike the mathematical convention—and mainly because I know random programs and evolutionary search are going to do this even if I don’t want them to—I’m willing to permit *orientation* here as well. That is, is not *equal to* in my scheme, but they do *coincide with* one another, in the sense they contain the exact same set of numbers. This doesn’t imply anything about the mathematics; it’s honestly just an arbitrary design decision I’ve made to see what happens (if anything).

I figure I’ll spend some time building some “simple infrastructure” (famous last words) in `push.type.definitions.span`

, because I know/remember a few things I want this type to do. Most of the simplest stuff comes “for free” when I add a Push aspect like `equatable`

or `moveable`

later on, so I can focus on the type-specific behavior here. What I have in mind are mainly *relations* between `:span`

items and `:scalar`

values:

- does the
`:span`

contain the`:scalar`

? - does the
`:span`

overlap another`:span`

? - does the
`:span`

contain another`:span`

?

Those seem pretty straightforward, but (foreshadowing!) I am starting to guess that with orientations and open vs closed ends, there might be some edge cases to handle.

## Literal edge cases

Getting started is pretty easy, although I think for a bit about some possible architectures for the basic Clojure `Span`

record, to handle open vs closed ends. Obviously every `Span`

will have a `:start`

and `:end`

, because duh. But either or both of those could be “open” or “closed”, and one wonders how to handle that. One “simple” way is to add two new fields to the record that specify each end’s openness, something like `:start-open?`

and `:end-open?`

. Another way might involve making a variety of different `Span`

“subclasses”, like `Span`

and `HalfSpan`

and `OpenSpan`

, but that idea makes me back away. Another *other* way might involve leaving `Span`

with a single `:start`

and `:end`

, and then pushing “open” vs “closed” somehow *down into* the values of `:start`

and `:end`

. This last one is interesting, but odd; I wonder if there is such a thing as an “open number” vs “closed number”, because at least up to the point where this occurred to me, I only thought numbers were numbers.

While the last one feels philosophically *interesting*, I’m actually not here to do much philosophy. The first one feels like the simplest way to proceed, so I add

(defrecord Span [start end start-open? end-open?])

This is just raw Clojure, and to be honest I never like using the “free” constructor `->Span`

outright, so I add a simple wrapper as well, like this:

(defn make-span "Takes `:start` and `:end` scalar numeric arguments and by default returns a closed `Span` record. Optionally takes `:start-open?` and `:end-open?` boolean keyword arguments, which can create a partially or fully open span." [start end & {:keys [start-open? end-open?] :or {start-open? false end-open? false}}] (->Span start end start-open? end-open?) )

I test that a bit while I’m writing it *and for the most part beforehand*, but I’m recording this narrative after the fact so let me show you how the tests end up. Just to give a sense of how Midje testing works, in case you’re landing here out of order:

(ns push.type.definitions.span_test (:use midje.sweet) (:use [push.type.definitions.span])) (fact "make-span returns a Span record with closed bounds" (make-span 9 6) => (->Span 9 6 false false) (make-span 1 1) => (->Span 1 1 false false)) (fact "make-span can take optional keyworded open bounds arguments" (make-span 9 6 :start-open? true) => (->Span 9 6 true false) (make-span 1 1 :end-open? true) => (->Span 1 1 false true) (make-span 2 3 :start-open? true :end-open? true) => (->Span 2 3 true true)) (fact "Span getters work as expected" (:start (make-span 3 4)) => 3 (:end (make-span 3 4)) => 4 (:start-open? (make-span 3 4)) => false )

Like I said, I’ve written most of these tests *before* I wrote code to make them pass, using the errors and failures produced by Marick’s excellent Midje framework to direct me what is needed after each new test is added.

The question of whether a number falls “within” a `:span`

seems like a good starting point after this infrastructure is done. Here are some tests I end up with:

(fact "span-include? works for closed spans" (span-include? (make-span 3 4) 3.5) => true (span-include? (make-span 3 4) 3) => true (span-include? (make-span 3 4) 4) => true (span-include? (make-span 3 4) 2) => false (span-include? (make-span 3 4) 7) => false (span-include? (make-span 4 3) 3.5) => true (span-include? (make-span 4 3) 3) => true (span-include? (make-span 4 3) 4) => true (span-include? (make-span 4 3) 2) => false (span-include? (make-span 4 3) 7) => false )

And the code I need to make those pass:

(defn span-include? "Takes a Span record and a scalar, and returns true if the scalar falls strictly within the Span" [span n] (let [s (:start span) e (:end span)] (not= (compare s n) (compare e n))))

I’m pretty pleased with that `compare`

sitting there. The Clojure `compare`

function is one of my favorite algorithmic gimmicks, it turns out. For numeric arguments^{2} it returns `-1`

if the first argument is strictly less than the second, `1`

if it’s strictly larger, and `0`

if they’re equal. Just sketching a bit, I realize that if the number is outside the span on the left *or* right, it will be (respectively) less than or greater than both ends of the `:span`

, regardless of the orientation of that `:span`

. If it “lands on” either end, it’ll be equal and not-equal….

*Oh hang on.*

Right, so a `:span`

can have identical `:start`

and `:end`

. Oh, and also those can be *open*. So if the number falls *on* an endpoint, whether it’s “inside” the `:span`

will depend whether that point is open or not, and *also* whether the span is zero-length.

Time for some more tests! I realize as I’m writing the first test how annoying and wordy it is to write `(make-span x y :start-open? true :end-open? true)`

, so I throw in a new function I haven’t even written yet, called `make-open-span`

, which I assume will just by default set both ends to “open” status.

(fact "span-include? works for open spans" (span-include? (make-open-span 3 4) 3.5) => true (span-include? (make-open-span 3 4) 3) => false (span-include? (make-open-span 3 4) 4) => false (span-include? (make-open-span 3 4) 2) => false (span-include? (make-open-span 3 4) 7) => false (span-include? (make-open-span 4 3) 3.5) => true (span-include? (make-open-span 4 3) 3) => false (span-include? (make-open-span 4 3) 4) => false (span-include? (make-open-span 4 3) 2) => false (span-include? (make-open-span 4 3) 7) => false ) (fact "span-include? works for empty spans" (span-include? (make-open-span 3 3) 3) => false (span-include? (make-open-span 3 3) 4) => false (span-include? (make-open-span 3 3) 2) => false ) (fact "span-include? works for partially open spans" (span-include? (make-span 3 4 :end-open? true) 3) => true (span-include? (make-span 3 4 :end-open? true) 4) => false ) (fact "span-include? works for point spans" (span-include? (make-span 3 3) 3) => true (span-include? (make-span 3 3) 4) => false )

I also realize as I’m writing some of these that I’ve left room for there to be such a thing as an “empty span”: one with identical `:start`

and `:end`

values, and either endpoint open. Well, sortof: one has to wonder what a `:span`

“really is” where both ends are identical values, but one is open and one is closed, like . I elide this issue for the moment, just by saying “we’ll burn that bridge when we get to it,” and thinking something like “openness trumps closedness”.

Anyway, there’s some wrestling after this, and a few more definitions of helper functions and predicates, and I end up with the following function, which seems to handle all these edge cases:

(defn make-open-span "Takes `:start` and `:end` scalar numeric arguments and returns a doubly-open `Span` record." [start end] (->Span start end true true)) (defn span-empty? "Takes a Span and returns `true` if it has identical `:start` and `:end`, and either is open" [span] (and (= (:start span) (:end span)) (or (:start-open? span) (:end-open? span)))) (defn span-include? "Takes a Span record and a scalar, and returns true if the scalar falls strictly within the Span" [span n] (let [s (:start span) e (:end span) so? (:start-open? span) eo? (:end-open? span)] (cond (and (= s n) so?) false (and (= e n) eo?) false (and (not (span-empty? span)) (= s e n)) true :else (not= (compare s n) (compare e n)))))

This conditional result in the `span-include?`

gets a bit complicated. The first two lines essentially say: “Is the point exactly on the end of the `:span`

, and is that end also open?” The last one is the same old one that worked fine for the case of a non-degenerate closed `:span`

. The third clause was pretty tricky, and to be honest I’m not entirely comfortable with it even yet. It exists to handle the possibility that the span is degenerate (same `:start`

and `:end`

) but *not* empty, when the point lands exactly on one (and thus both) of the endpoints.

Literally edge cases!

That said, I can’t think of any more situations, for whatever combination of degeneracy, openness, emptiness, and so forth.

## Overlapping spans

The actual point of the `:span`

Push type is to handle some interesting Computational Geometry benchmarks I’ve been throwing at GP systems for the last few years. These involve complex hulls and the “shadows” of polygons projected onto points in 2d spaces… plus a whole bunch of other stuff…. But the point here is that my notional “span” can be thought of as the *shadow* of some geometric object in a plane, and the goal of such a notion is whether and how much that “shadow” overlaps some other object. I’ve already handled the case where the “shadow” does or doesn’t include a 1d point. Now it’s time to handle the case(s) in which one `:span`

overlaps another.

The easiest one for me to get my head around is the case where both spans are fully closed, and non-degenerate, and either overlapping on one end or nested inside one another.

(fact "span-overlap? works as expected for closed spans" (span-overlap? (make-span 3 5) (make-span 4 6)) => true (span-overlap? (make-span 3 5) (make-span 2 4)) => true (span-overlap? (make-span 3 5) (make-span 17 77)) => false ) (fact "span-overlap? works as expected for nested spans" (span-overlap? (make-span 3 8) (make-span 4 6)) => true (span-overlap? (make-span 3 4) (make-span 2 8)) => true )

Makes sense, right?

I write a *very* simple `span-overlap?`

function here that invokes `span-include?`

to see whether the endpoints of one `:span`

fall within the other. This works, but it fails immediately for the very next test I write so I won’t even bother showing it to you. I suspect you could already write it yourself, if you wanted.

But here’s the first bad thing:

(fact "span-overlap? can notice touching-but-not-overlapping open spans" (span-overlap? (make-open-span 3 4) (make-span 4 6)) => false (span-overlap? (make-span 4 3) (make-open-span 2 3)) => false (span-overlap? (make-span 4 3) (make-span 2 3)) => true )

Aha, nope, that fails badly with just “is the end point of one inside the other” testing. The reason makes me hesitate and consider my fundamental design choice for the `Span`

record and the `span-include?`

functions: the failures occur when the *open* endpoint of one span falls inside the other, because when I check with `span-include?`

to determine whether a point is inside a span (including overlapping its endpoint), there is no concept of whether the point is itself “open” or not. What does that even mean, to have an “open” point?

Madness, I tell you. But madness we will have to work with.

I add a few more checks to my clean and elegant `span-overlap?`

predicate, to handle this failing test, and immediately—*while I’m writing the code to fix this failure*—recognize there’s another edge case that will fail as well. And then that case reminds me of another, and so on…. Let me write all of those here, too:

(fact "span-overlap? can notice touching-but-not-overlapping open spans" (span-overlap? (make-open-span 3 4) (make-span 4 6)) => false (span-overlap? (make-span 4 3) (make-open-span 2 3)) => false (span-overlap? (make-span 4 3) (make-span 2 3)) => true ) (fact "span-overlap? works as expected for overlapping open spans" (span-overlap? (make-open-span 3 8) (make-open-span 4 6)) => true (span-overlap? (make-open-span 3 8) (make-span 4 6)) => true (span-overlap? (make-span 3 8) (make-open-span 4 6)) => true (span-overlap? (make-open-span 3 4) (make-open-span 2 8)) => true (span-overlap? (make-open-span 3 4) (make-span 2 8)) => true (span-overlap? (make-span 3 4) (make-open-span 2 8)) => true ) (fact "span-overlap? works as expected for nested open spans" (span-overlap? (make-span 3 8) (make-open-span 4 6)) => true (span-overlap? (make-open-span 3 4) (make-span 2 8)) => true ) (fact "span-overlap? works for identical spans" (span-overlap? (make-span 3 4) (make-span 3 4)) => true (span-overlap? (make-span 3 4) (make-span 4 3)) => true (span-overlap? (make-open-span 3 4) (make-open-span 3 4)) => true )

I am sparing you some (much, actually) thrashing around here, just presenting you after the fact with where I ended up. Some of my tests were written wrong—always a sign that things are getting overwhelming and a place where you should start to get concerned about the usefulness of your abstractions. Several of my tests also didn’t catch other edge cases.

I start to think seriously about the new `clojure.spec`

system for defining logical relationships, but I don’t actually want to use alpha software (as of this writing) or learn a whole new representation today. So I do that stupid thing (not foreshadowing this time) and *try to pay better attention*.

Along the way, I realize I also need to check the possibility that the two `Span`

objects are not identical, but do *coincide*, in the sense that they are oriented in opposite directions with the same endpoints and openness. So I write a `span-coincide?`

predicate as well.

I also discover that it’s more succinct and/or convenient to ask whether an endpoint is *closed* rather than *open*, so I end up with some helpers that check that for me as well. These are `start-closed?`

and `end-closed?`

, which simply contradict the built-in `:start-open?`

and `:end-open?`

fields of the `Span`

record.

Here are the implementations of `span-coincide?`

and `span-overlap?`

that I ended up with:

(defn span-coincide? "Takes two spans, and returns `true` if they are identical, or if the reverse of the second is identical to the first" [span1 span2] (or (= span1 span2) (= span1 (span-reverse span2)))) (defn span-overlap? "takes two Span records, and returns true if they strictly overlap (taking into account openness of ends)" [span1 span2] (let [s1 (:start span1) e1 (:end span1) s2 (:start span2) e2 (:end span2)] (cond (and (span-include? span1 s2) (start-closed? span2)) true (and (span-include? span1 e2) (end-closed? span2)) true (and (span-include? span2 s1) (start-closed? span1)) true (and (span-include? span2 e1) (end-closed? span1)) true (and (span-include? span1 s2) (span-include? span1 e2)) true (and (span-include? span2 s1) (span-include? span2 e1)) true (span-coincide? span1 span2) true :else false )))

I’m not happy with it at the moment, but I can’t see a way to refactor it. I *do* think it’s interesting, though, how it explicitly “unfolds” the combinatorics of the edge cases in this problem into individual cases. Is the endpoint being tested open or closed? Are both end points of one span inside the other? Are the two `Span`

items “the same”?

But I also dislike the visible repetition, and I want to refactor some or all of that away. I wonder how I can do that… and also how the “few minutes” to implement this type became an entire Saturday morning.

### A bit of refactoring, and a whole new problem

I’d rather at least some of that repetition were removed to a separate function, just for clarity. So I pull out the thing that checks whether one end of a span is closed and falls inside the other, and also pull out the thing that checks to see if one span is surrounded by the other:

(defn contains-closed-end? "Takes two spans. Returns `true` if either end of the second one is closed AND falls within the first." [span1 span2] (let [s2 (:start span2) e2 (:end span2)] (or (and (span-include? span1 s2) (start-closed? span2)) (and (span-include? span1 e2) (end-closed? span2))))) (defn span-surrounds? "Takes two spans, and returns `true` if the first one completely surrounds the second one; that is if both ends fall strictly within the first span." [span1 span2] (let [s2 (:start span2) e2 (:end span2)] (and (span-include? span1 s2) (span-include? span1 e2)))) (defn span-overlap? "Takes two spans, and returns `true` if they strictly overlap (taking into account openness of ends)." [span1 span2] (let [s1 (:start span1) e1 (:end span1) s2 (:start span2) e2 (:end span2)] (cond (contains-closed-end? span1 span2) true (contains-closed-end? span2 span1) true (span-surrounds? span1 span2) true (span-surrounds? span2 span1) true (span-coincide? span1 span2) true :else false )))

I start feeling overwhelmed by all the edge cases, and have a sneaking suspicion I’ve missed something. I throw in a couple of tests that I’m *sure* will catch something I’ve slipped up on… and one of the most obvious-feeling tests fails right away.

(fact "span-overlap? works when an open end overlaps another" (span-overlap? (make-open-span 3 5) (make-open-span 4 6)) => true )

That is, an open-ended `:span`

overlapping another one doesn’t trigger any of the conditions I’ve set up yet. *Dagnabbit.*

This starts to feel a bit like Whack-a-Mole. So I need another predicate, I suppose, which tests that an endpoint of one span falls “extra-strictly” within the other. “Extra-strictly” means, I guess, that the point should be detected by `span-include?`

but *also* should not exactly equal either end point, even if both are closed.

Here’s the whole pile of stuff I’ve ended up with, just to get things to pass for all the edge cases I can imagine (so far) of points and `:span`

overlaps, with open and closed ends on the `:span`

items &c &c:

(defn span-include? "Takes a Span record and a scalar, and returns `true` if the scalar falls strictly within the Span" [span n] (let [s (:start span) e (:end span) so? (:start-open? span) eo? (:end-open? span)] (cond (and (= s n) so?) false (and (= e n) eo?) false (and (not (span-empty? span)) (= s e n)) true :else (not= (compare s n) (compare e n))))) (defn contains-closed-end? "Takes two spans. Returns `true` if either end of the second one is closed AND falls within the first." [span1 span2] (let [s2 (:start span2) e2 (:end span2)] (or (and (span-include? span1 s2) (start-closed? span2)) (and (span-include? span1 e2) (end-closed? span2))))) (defn fully-contains-end? "Takes two spans. Returns `true` if either end of the second one falls within the first, and is not identical with either endpoint of the first." [span1 span2] (let [s1 (:start span1) e1 (:end span1) s2 (:start span2) e2 (:end span2)] (or (and (span-include? span1 s2) (not= s1 s2) (not= e1 s2)) (and (span-include? span1 e2) (not= s1 e2) (not= e1 e2))))) (defn span-surrounds? "Takes two spans, and returns `true` if the first one completely surrounds the second one; that is if both ends fall strictly within the first span." [span1 span2] (let [s2 (:start span2) e2 (:end span2)] (and (span-include? span1 s2) (span-include? span1 e2)))) (defn span-overlap? "Takes two spans, and returns `true` if they strictly overlap (taking into account openness of ends)." [span1 span2] (let [s1 (:start span1) e1 (:end span1) s2 (:start span2) e2 (:end span2)] (cond (contains-closed-end? span1 span2) true (contains-closed-end? span2 span1) true (fully-contains-end? span1 span2) true (fully-contains-end? span2 span1) true (span-surrounds? span1 span2) true (span-surrounds? span2 span1) true (span-coincide? span1 span2) true :else false )))

There’s a kind of code smell here I don’t like, but I also don’t know what to do about it. I’m writing finer and finer detectors, some of which (like this one) overlap the others in an almost-nested but not-quite-nested way. It feels a lot like the “open” vs “closed” abstraction is making a mess somehow. Makes me grumpy and frustrated, to be frank… but even more important, it makes me worry I’m missing some other kind of edge case that will also blow up, and that this new edge case will be so esoteric that I don’t even notice the failure.

## Moving on…

That feeling of frustration means I should probably step away and maybe sleep on the whole thing. For the moment, all the tests pass, and I can’t think of any more that exercise this particular functionality. For something that I imaginatively thought would take “a few minutes”, it’s been a long day already.

Hell, I haven’t even gotten to the *interesting* stuff, which is the real point of these little objects: binary operators over them. Let me spend some Me Time now just listing them here, speculatively, so I have something nice and optimistic to look at in the morning.

Thinking back to the half-remembered implementation I did before, I recall things like this:

`:span-extent`

takes two`:span`

items, and returns a new one that’s the maximum length from the leftmost of all four endpoints, to the rightmost. The orientation of the result is*always*left-to-right.`:span-net`

takes two`:span`

items, and returns a new one that connects the`:start`

of the first to the`:end`

of the second (ignoring all excursions and jumps in between)`:span-gap`

is like`:span-net`

but it connects the`:end`

of the first to the`:start`

of the second.`:span-space`

connects the first`:start`

to the second`:start`

`:span-error`

connects the first`:end`

to the second`:end`

`:span-reverse`

swaps the`:start`

and`:end`

, and openness of one`:span`

item`:span-close`

makes both ends of a`:span`

closed`:span-open`

makes both ends of a`:span`

open`:span-subtract`

takes two`:span`

items; any portion of the first one overlapping the second one is deleted; if the result is an empty`:span`

, both`:start`

and`:end`

become the old`:end`

`:span-xor`

takes two`:span`

items; imagine drawing the first,`:start`

to`:end`

, and then setting a “pen” back down at the`:start`

of the second which “draws” on unoccupied empty space, but “erases” occupied space. Thus any region of mutual overlap between the two is removed from both. Note that*two*`:span`

s can result, an that the orientation of both is preserved.`:span-intersection`

is something like`:span-xor`

, but instead of drawing lines with a self-erasing pencil, draw them with a pencil that only makes a “real mark” wherever at least two lines cross; thus, the new`:span`

is composed of only the part of the arguments that overlaps, and has an orientation the same as the second one; again, if the result is empty, its`:start`

and`:end`

are the same as the`:end`

of the second argument- etc…

Yeah, that should keep me busy for a few more minutes. I’ll do a bit more tomorrow, I think.

Continued in “Adding a Span type to Push”…