Rethinking the Push span as an undirected interval

Draft of 2016.07.05

May include: Push ↘ GP ↗ learning in public ↖ &c.

Continues “Adding a Span type to Push, Part 2”

Yesterday, in between difficulties framing edge case tests and handling all the combinatorics of the Push Span type, I was discussing the work I’ve done so far with friends online. Several of them asked, “Why are they directed? Does the direction make a difference, or is it a red herring?” It’s always a good idea to have that sort of question in mind, not least because of the First Law of Holes.

So I spent some time this morning thinking about the reasons and use case for this Push type. The fundamental concept is the manipulation of continuous mathematical intervals, which are not directed. I think the notion of letting :start and :end take arbitrary scalar values (thus of preserving a sort of “orientation”) was a mix of my aversion to premature optimization and limitation, plus a long-wanted feature of handling exterior algebraic operations.

I’m willing to back out some or all of the work done so far. That’s what the testing infrastructure is all about, and I certainly have been experiencing some pain as I’ve worked because of the “extra” combinatorial stuff. What would I need to do?

Most deeply but most simply, I think probably I should probably call this type an Interval rather than a Span. That’s simple to “fix”, in the sense of global find-and-replace. The less simple thing to change is to remove the notion of orientation.

I can imagine :start becoming :min and :end becoming :max. Both will still have the possibility of being closed or open, so there will also need to be some kind of :min-open? and :max-open? flags, I suppose. Glancing over the functions already written, I imagine there might not be much change to infrastructure either, except that :make-interval will probably sort its numeric arguments before doing pretty much what it already does? Or should it balk in some way when given numeric arguments in the “wrong” order? In other words, should (make-interval 7 2) become (->Interval 2 7), or (->Interval 7 7), or raise an exception?

I think the first choice makes me more comfortable. That is, the first two arguments would be sorted when constructing a new Interval. But then what happens if the constructor call is (make-interval 7 2 :min-open? true)? I suppose the resulting mathematical interval would be \((2,7]\), since it says “min” right there. If I were to leave this argument as :start-open?, then it feels like it would also work, but in this case it would refer to the first argument, and since those would be rearranged in the resulting Interval… no, that’s too fiddly.

Seems legit. Now… how much rejiggering would there be for the algorithms I’ve written so far? Remarkably little, I think: there would be fewer test cases, once I’d established that (make-interval 7 2) is identical to (make-interval 2 7), I could skip those with confidence. The question of open ends still arises, but there may be a few places where the algorithms get simpler because the concepts are simpler. The function span-coincide?, for example, is no longer necessary… but it does seem to have a relative I might end up using, which tells me whether one :interval has the same :min and :max as another….

I think we’ll see shortly.

The last question, of course, is whether to rewrite the codebase (for this type, of course) from scratch, or to replace things feature by feature?

I think (and thank goodness for git that I can say this) I’ll start by reworking the code in push.types.definitions.span to refer to interval throughout, and fix tests as I go.

A real name changer

First thing seems to be changing the names of things from Span to Interval, with all necessary capitalization variations. The most annoying part, since this is Clojure, is changing the namespace definitions and filenames to match one another. I do that carefully, and then (on the assumption that the words are more or less substitutable throughout) I try a slightly-less-careful global find-and-replace.

Not a wise decision. Thank you, git stash!

You know, there’s really not a huge amount of stuff here. The most useful parts are in the tests, not the codebase itself. I think very seriously about eliminating all of the codebase, but retaining the tests and mindfully changing them to work the way I expect this new type definition should work. That makes a bit more sense than just trying to patch two day’s worth of code, especially when I have been complaining about my discomfort with it.

With a bit more care, this time all I do is change the filenames and namespace definitions to say interval where they said span before, and run the tests.

Much more slowly and carefully this time, I first duplicate all four span-related files to refer to interval, and change their namespace dependencies in the minimal possible way to refer to one another instead of the older span files, and (just because it’s the only other place the path appears) change the definition of span? (the predicate function in push.types.definitions.interval) to check that the item is a push.types.definitions.interval.Span. Like I said, the minimal changes. And all the tests still pass.

Now I delete the four older span files, and run all the tests again. I did pretty well: there was one place where it tripped up, a different reference to the span? checker function in the definition of the Push type. But once that’s fixed, I’ve renamed the files and all tests pass.

I wonder what I can do next, safely and without shrieking?

Later…

Well, while I was away just now I cleaned up the rest of push.type.definitions.interval, changing variable names from span to interval, and deleting a few things that no longer make any sense. For instance, :span-coincide? can’t be carried forward as it was, since by definition an Interval record can’t be “reversed” any more. Though there is some room, potentially, for interval items that have the same :min and :max values, but different openness. Maybe. I’m skeptical, to be frank.

But all the tests for the definitions.interval file pass for the moment, with only minimal twiddling. I recall that some of my instruction tests were the ones that caught bugs in the definition files, so I won’t burden you with the code until I’ve also revived and rewritten that part. Back in a jiffy….

Later…

That actually went very well. Barbara paired with me on the tricky parts, because there were several changes that had to be made to convert the old Span record to a new Interval record: :start became :min, :end became :max, :start-open? became :min-open? (and the concepts are not identical!), :end-open? became :max-open?, and then there were several things like span-reverse and span-coincide? that no longer apply at all to the Interval concept.

Here’s where the definition file ended up:

(ns push.type.definitions.interval)


;; Interval records


(defrecord Interval [min max min-open? max-open?])


(defn interval?
  "a type checker that returns true if the argument is an Interval record"
  [item]
  (instance? push.type.definitions.interval.Interval item))



(defn make-interval
  "Takes `:min` and `:max` scalar numeric arguments and by default returns a closed `Interval` record. Optionally takes `:min-open?` and `:max-open?` boolean keyword arguments, which can create a partially or fully open interval."
  [low high & {:keys [min-open? max-open?] :or {min-open? false max-open? false}}]
  (->Interval (min low high) (max low high) min-open? max-open?)
  )



(defn make-open-interval
  "Takes `:min` and `:max` scalar numeric arguments and returns a doubly-open `Interval` record."
  [low high]
  (->Interval low high true true))



(defn min-closed?
  "Takes an Interval record, and returns true if its `:min-open?` is false, and vice versa"
  [interval]
  (not (:min-open? interval)))



(defn max-closed?
  "Takes an Interval record, and returns true if its `:max-open?` is false, and vice versa"
  [interval]
  (not (:max-open? interval)))



(defn interval-empty?
  "Takes an Interval and returns `true` if it has identical `:min` and `:max`, and either is open"
  [interval]
  (and (= (:min interval) (:max interval))
       (or (:min-open? interval) (:max-open? interval))))



(defn interval-include?
  "Takes an Interval record and a scalar, and returns true if the scalar falls strictly within the Interval"
  [interval n]
  (let [s   (:min interval)
        e   (:max interval)
        so? (:min-open? interval)
        eo? (:max-open? interval)]
    (cond
      (and (= s n) so?) false
      (and (= e n) eo?) false
      (and (not (interval-empty? interval)) (= s e n)) true
      :else (not= (compare s n) (compare e n)))))



(defn contains-interval-end?
  "Takes two intervals. Returns `true` if either end of the second one falls _strictly_ within the first, whether or not it is open."
  [interval1 interval2]
  (let [s1       (:min interval1)
        e1       (:max interval1)
        s1open   (make-open-interval s1 e1)
        s1closed (make-interval s1 e1)
        s2       (:min interval2)
        e2       (:max interval2)]
    (or
      (if (:min-open? interval2)
        (interval-include? s1open s2)
        (interval-include? interval1 s2))
      (if (:max-open? interval2)
        (interval-include? s1open e2)
        (interval-include? interval1 e2)))))




(defn interval-subset?
  "Takes two intervals, and returns `true` if the second one is a subset of the first."
  [interval1 interval2]
  (let [i1closed (make-interval (:min interval1) (:max interval1))
        s2   (:min interval2)
        e2   (:max interval2)]
    (and (if (:min-open? interval2)
            (interval-include? i1closed s2)
            (interval-include? interval1 s2))
         (if (:max-open? interval2)
            (interval-include? i1closed s2)
            (interval-include? interval1 e2))
         )))



(defn interval-overlap?
  "Takes two intervals, and returns `true` if they strictly overlap (taking into account openness of ends)."
  [interval1 interval2]
  (let [s1 (:min interval1)
        e1 (:max interval1)
        i1open (make-open-interval s1 e1)
        s2 (:min interval2)
        e2 (:max interval2)
        i2open (make-open-interval s2 e2)]
    (cond
      (interval-empty? interval1) false
      (interval-empty? interval2) false
      (= i1open i2open) true
      (contains-interval-end? interval1 interval2) true
      (contains-interval-end? interval2 interval1) true
      (= interval1 interval2) true
      :else false
      )))

And here is the current Push type definition for the :interval type:

(ns push.type.item.interval
  (:use     [push.instructions.dsl]
            [push.type.core]
            [push.instructions.core])
  (:require [push.instructions.aspects :as aspects]
            [push.type.definitions.interval :as interval]
            ))


(def interval-empty?
  (build-instruction
    interval-empty?
    "`:interval-empty?` pops the top `:interval` item and pushes `true` if it's empty: that is, if the ends are equal, AND at least one end is open"
    :tags #{:interval}
    (consume-top-of :interval :as :arg)
    (calculate [:arg] #(interval/interval-empty? %1) :as :result)
    (push-onto :boolean :result)
    ))


(def interval-include?
  (build-instruction
    interval-include?
    "`:interval-include?` pops the top `:interval` and top `:scalar`, and pushes `true` if the `:scalar` value falls (strictly!) within the `:interval`, taking openness of the ends into account"
    :tags #{:interval}
    (consume-top-of :interval :as :i)
    (consume-top-of :scalar :as :number)
    (calculate [:i :number] #(interval/interval-include? %1 %2) :as :result)
    (push-onto :boolean :result)
    ))


(def interval-overlap?
  (build-instruction
    interval-overlap?
    "`:interval-overlap?` pops the top two `:interval` items and pushes `true` if they overlap, even in a single point: that is, if their union is non-empty, taking into account which ends are open"
    :tags #{:interval}
    (consume-top-of :interval :as :arg2)
    (consume-top-of :interval :as :arg1)
    (calculate [:arg1 :arg2] #(interval/interval-overlap? %1 %2) :as :result)
    (push-onto :boolean :result)
    ))


(def interval-subset?
  (build-instruction
    interval-subset?
    "`:interval-subset?` pops the top two `:interval` items (call them B and A, respectively) and pushes `true` if B is a subset of A. That is, if both ends of B fall strictly within A, or they are identical."
    :tags #{:interval}
    (consume-top-of :interval :as :arg2)
    (consume-top-of :interval :as :arg1)
    (calculate [:arg1 :arg2] #(interval/interval-subset? %1 %2) :as :result)
    (push-onto :boolean :result)
    ))


(def interval-type
  (-> (make-type  :interval
                  :recognized-by interval/interval?
                  :attributes #{:numeric :set})
        aspects/make-equatable
        aspects/make-movable
        aspects/make-printable
        aspects/make-quotable
        aspects/make-repeatable
        aspects/make-returnable
        aspects/make-storable
        aspects/make-taggable
        aspects/make-visible 
        (attach-instruction , interval-empty?)
        (attach-instruction , interval-include?)
        (attach-instruction , interval-overlap?)
        (attach-instruction , interval-subset?)
  ))

I have a feeling of accomplishment, and I think the resulting code is conceptually a lot cleaner. For instance, Barbara noticed when I was updating :span-surrounds? that the concept doesn’t really seem to make as much sense when speaking of mathematical intervals, and so we went with :interval-subset?. This also reinforces the unidirectionality of the concept: \(B \subseteq A\) conveys a clearer and more rigorous meaning, if you ask me, than “A surrounds B”.

Continued in The Interval Push type in Klapaucius