Draft

The Interval Push type in Klapaucius

Draft of 2016.07.06

Continues “Rethinking the Push Span type”

I feel somehow as if I were on the right track again. Interesting how a bit of frustration and code smell can divert you from a path towards delivering working features, and (as several people have pointed out) it’s always good to be reminded of the First Law of Holes.

Back to the track.

In the last episode, with a very few changes and some cunning improvements in naming, I managed to rework the Span structure I had been developing into a more elegant Interval. I like this much better because it has a robust and common interpretation—the first idea I had, which became Span, was derived from this same source, but I did one of my ill-considered “why not?” moves when it came to the bounds. The Span was defined as a continuous set between a :start and :end point, and thus could be oriented in either direction, which surprisingly led to several confusions on my part and (I think) pushed the combinatorial set of edge cases for testing over a magical limit my brain couldn’t handle.

More in keeping with its mathematical origins, the new Interval has a :min and :max rather than :start and :end, and as a result it handles a lot of the mental overhead about numerical geography for me. The :min will be less than (or equal to) the :max, so all Interval records are implicitly ordered in the same way.

As a bonus, at least for me, the word “interval” is actually more appropriate, and comes with some nice robust lists of use cases and algebraic habits. For instance, interval arithmetic is feasible, in addition to my scant subset of instructions inspired by Span manipulations.

My goal today is to finish and release a working type in the master branch of Klapaucius. Once I’ve reached the point where it “feels” simple and straightforward to add new instructions to the type—which I sense I might have now—I get a sense that some or all of the “weirder” instructions might be left aside as project Issues for another day. One of the core design goals in Klapaucius itself to include sufficient diversity of instruction behavior and interconnectedness that there’s a prospect of reaching a solution by at least two different paths. This is a bit vague and philosophical, but in practice it boils down to being most concerned for interconversion of types.

I think I’ll work on that next.

Making an Interval

The Klapaucius library recently went through a big rewrite, in which the traditional Push :float and :integer were consolidated into a single :scalar type. Things like :interval become much easier to construct as a result, since one doesn’t need to pick :integer or :float (or :rational or…) endpoints to build one.

(def interval-new
  (build-instruction
    interval-new
    "`:interval-new` pops the top two `:scalar` items (`B` and `A`, respectively) and creates a new `:interval` item [A,B]. Both ends are closed."
    :tags #{:interval}
    (consume-top-of :scalar :as :arg2)
    (consume-top-of :scalar :as :arg1)
    (calculate [:arg1 :arg2] #(interval/make-interval %1 %2) :as :result)
    (push-onto :interval :result)
    ))

and an accompanying test:

(tabular
  (fact ":interval-new builds one out of two :scalar values"
    (register-type-and-check-instruction
        ?set-stack ?items interval-type ?instruction ?get-stack) => ?expected)

    ?set-stack  ?items       ?instruction     ?get-stack  ?expected

    :scalar    '(2 3)        :interval-new     :interval  (list (s/make-interval 2 3))
    ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
    :scalar    '(3 2)        :interval-new     :interval  (list (s/make-interval 2 3))
    ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
    :scalar    '(3 3)        :interval-new     :interval  (list (s/make-interval 3 3))
    ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
    )

This instruction consumes :scalar values to build an :interval, but it doesn’t leave the possibility of either end being open. In my testing I’ve made do with make-interval and make-open-interval, so I wonder if I could take that route in Push as well, maybe with something that toggles the openness of the ends. Seems like a plan.

I implement a Push instruction called :interval-newopen, which calles make-open-interval, and discover a bug left over from the conversion of Span to Interval, in which the :min and :max weren’t sorted. Feels like I should consolidate that behavior somehow, but I don’t immediately see a place to do so.

Later…

I decide to make this work using two :boolean values, for the moment. In a Slack conversation about it, Matthew Weitzel wonders (as I do) whether something like binary filemode would work as well, using a :scalar reduced to a sequence of bits, but I think that is actually a nice thing to make an Issue in the project: there are several places I can see that being a way for programs to summarize structured binary states, like with filtering collections.

But with :interval-new, :interval-newopen and :interval-rebracket

(def interval-rebracket
  (build-instruction
    interval-rebracket
    "`:interval-rebracket` pops the top `:scalar` item and two `:boolean` values (`B` and `A`, respectively). The `:min-open?` value is set to `A`, the `:max-open?` value is set to `B`, and the resulting `:interval` is pushed as a result."
    :tags #{:interval}
    (consume-top-of :boolean :as :max?)
    (consume-top-of :boolean :as :min?)
    (consume-top-of :interval :as :i)
    (calculate [:i :min? :max?]
        #(interval/make-interval 
            (:min %1) (:max %1) :min-open? %2 :max-open? %3) :as :result)
    (push-onto :interval :result)
    ))

This seems to be working pretty well. I am reminded there was a list of possible :span instructions I’d sketched back in the very first stage of this work. I want to revisit that now before tacking interval arithmetic.

Interacting intervals

There was a thing called :span-extent, which I think is better called :interval-hull now. That is, it takes two :interval items and returns a new one with the maximum extent of both, preserving the inclusiveness of the ends.

This turns out to be rather painless:

(def interval-hull
  (build-instruction
    interval-hull
    "`:interval-hull` pops the top two `:interval` items and pushes a new `:interval` whose extent is the hull of the arguments. That is, it reaches from the smallest `:min` to the largest `:max`, preserving inclusiveness of the ends."
    :tags #{:interval}
    (consume-top-of :interval :as :i2)
    (consume-top-of :interval :as :i1)
    (calculate [:i1 :i2]
        #(interval/make-interval 
            (min (:min %1) (:min %2))
            (max (:max %1) (:max %2))
            :min-open? (and (:min-open? %1) (:min-open? %2))
            :max-open? (and (:max-open? %1) (:max-open? %2))) :as :result)
    (push-onto :interval :result)
    ))

Here I’m taking advantage of the fact that the :min of the result will be the min of both :min values. Confusing nomenclature, but valid. The openness is slightly more challenging, but basically if two ends coincide, I only want the result to be open if both arguments were. That is, if the endpoint of an Interval is included as part of that interval, I want it to be part of the result as well.

Next, there was an idea in :span-net, which was thinking of connecting the start of one Span to the end of another. This depends, in a real way, on the fact that Span was oriented; I can no longer think about these with Interval, as such. Also, there were four variations on this theme: start->end, end->start, start->start and end->end.

I’m tempted to consolidate these into a sort of crossover operation on Interval. That is, it takes two as arguments, but produces all four results. Let some other process sort through them to discover the useful one. There’s a long-running suspicion of mine that far too many Push instructions reduce the number of literals on hand, too, resulting in a sort of argument starvation of running Push programs. I’m going to err on the side of diversification here.

The result is wordier than I expect, but that’s honest work; it returns a code block of four :interval items.

(def interval-crossover
  (build-instruction
    interval-crossover
    "`:interval-crossover` pops the top two `:interval` items (call them `B` and `A`, respectively) and pushes a code block containing the four FOIL `:interval` items onto `:exec`. The results (in order) are `(make-interval (:min A) (:min B))`, `(make-interval (:min A) (:max B))`, `(make-interval (:max A) (:min B))` and `(make-interval (:max A) (:max B))`, and they preserve the openness of the points included the resulting intervals."
    :tags #{:interval}
    (consume-top-of :interval :as :B)
    (consume-top-of :interval :as :A)
    (calculate [:A :B]
      #(interval/make-interval
        (:min %1)
        (:min %2)
        :min-open? (:min-open? %1)
        :max-open? (:min-open? %2)) :as :first)
    (calculate [:A :B]
      #(interval/make-interval
        (:min %1)
        (:max %2)
        :min-open? (:min-open? %1)
        :max-open? (:max-open? %2)) :as :outer)
    (calculate [:A :B]
      #(interval/make-interval
        (:max %1)
        (:min %2)
        :min-open? (:max-open? %1)
        :max-open? (:min-open? %2)) :as :inner)
    (calculate [:A :B]
      #(interval/make-interval
        (:max %1)
        (:max %2)
        :min-open? (:max-open? %1)
        :max-open? (:max-open? %2)) :as :last)
    (calculate [:first :outer :inner :last] #(list %1 %2 %3 %4) :as :foil)
    (push-onto :exec :foil)
    ))

Yes, I could extract the four calculations to a Clojure function, but that seems as if it would mask the details. In this case, I am being carefully explicit about which :min-open? and :max-open? I’m working with in each case. It’s confusing, but I don’t think it would be any less confusing if it were all in some other function. Could be that someday it irritates me enough to change. We’ll see.

Set theory stuff

I think that now Interval really represents a formal set (as opposed to the weird “oriented set” that Span captured), I’d like to have :interval-union, :interval-intersection and :interval-subtract, which are just implementations of the set-theoretic operations.

The :interval-intersection instruction feels easier, not least because the result will always be either nothing (if there is no overlap) or a set I can represent as a single :interval. The result of :interval-union is more than one :interval, and I don’t have a clear sense at the moment what that should be: should it return a code block, or a :set of :interval items, or an :intervals tuple? I don’t know.

Anyway, I’ll elide the inevitable edge-case chasing, though there was a bunch of it. Here’s the function I wrote in push.type.definitions.interval

(defn interval-intersection
  "Takes two Interval records, and returns a new one that contains their intersection, or nil if there is none."
  [i1 i2]
  (if (interval-overlap? i1 i2)
    (let [min1 (:min i1)
          min2 (:min i2)
          max1 (:max i1)
          max2 (:max i2)
          sorted (sort-by
                    first
                    [ [min1 (:min-open? i1)]
                      [min2 (:min-open? i2)]
                      [max1 (:max-open? i1)] 
                      [max2 (:max-open? i2)]])
          new-min (nth sorted 1)
          new-max (nth sorted 2)
          ]
      (make-interval
        (first new-min)
        (first new-max)
        :min-open? (if (= min1 min2)
                       (or (:min-open? i1) (:min-open? i2))
                       (second new-min))
        :max-open? (if (= max1 max2)
                       (or (:max-open? i1) (:max-open? i2))
                       (second new-max))    ))
    nil))

A lot of the work being done here arises because the Interval record (intentionally?) dissociates the numerical value of :min from whether it’s open or closed, so I build some tuples of [number open?] pairs so that when I actually sort the numbers the openness of the “overlapping” ends can be managed. I won’t go into it in detail; it’s basically a somewhat more succinct way of handling the same edge cases as the interval-overlap? function.

Indeed, I wonder if maybe there’s a simpler more Clojure-flavored way of doing this, just by returning the overlap itself instead of a strictly true/false result from interval-overlap?. Maybe. Not today.

Now for interval-union. I think that if the two Interval items don’t overlap, then the result should be a code block containing them both. If they overlap at all (or have no intervening space, like which have no gaps or overlaps), then the result should be a single Interval reaching from the smallest :min to the larges :max, and it should retain the openness of the constituent arguments.

Maybe I’ll return a list of results in any case.

Later…

It’s certainly quite a bit simpler than it was when these were Span items, but it’s not… well, it’s still not succinct. The edge case for interval-union is interesting. It’s the one I described above, like . This isn’t detected by the predicate I already have, interval-overlap?, so I needed a new predicate I called interval-snug?. Using that, I could cover all the edge cases in interval-union, like so:

(defn interval-snug?
  "Takes two Interval records, and returns `true` when they do not overlap, and the max of the first is the same as the :min of the second, and at least one of those ends is closed. `[2,3)` and `[3,4)` are snug. NOTE: this only checks whether i1 is snug to i2, not vice versa"
  [i1 i2]
  (and (not (interval-overlap? i1 i2))
       (and (= (:max i1) (:min i2))
       (or (max-closed? i1) (min-closed? i2)))))




(defn interval-union
  "Takes two Interval records. If they are strictly discontinuous, they are returned in the order given in a list. If they have no gap between them, the list will contain one Interval that is their union."
  [i1 i2]
  (let [min1 (:min i1)
        min2 (:min i2)
        max1 (:max i1)
        max2 (:max i2)
        sorted (sort-by
                  first
                  [ [min1 (:min-open? i1)]
                    [min2 (:min-open? i2)]
                    [max1 (:max-open? i1)] 
                    [max2 (:max-open? i2)]])
        new-min (first sorted)
        new-max (last sorted)]
    (cond
      (interval-overlap? i1 i2)
        (list
          (make-interval
            (first new-min)
            (first new-max)
            :min-open? (if (= min1 min2)
                         (and (:min-open? i1) (:min-open? i2))
                         (second new-min))
            :max-open? (if (= max1 max2)
                         (and (:max-open? i1) (:max-open? i2))
                         (second new-max))))
      (or (interval-snug? i1 i2) (interval-snug? i2 i1))
        (list
          (make-interval
            (first new-min)
            (first new-max)
            :min-open? (second new-min)
            :max-open? (second new-max)))
      :else
        (list i1 i2))))

But boy, that second one is a rambler, isn’t it?

I should refactor it.

Later…

I can’t seem to get any foothold, to be honest. Maybe I’m too close to it, or have spent too much time invested in this particular (rambling) structure, but whenever I try to extract a function or reduce the number of clauses, I find something that’s too tightly coupled to the part I’m trying to reduce.

For instance. there are several occurrences of code that looks (generally) like this:

  ...
            :min-open? (if (= min1 min2)
                         (and (:min-open? i1) (:min-open? i2))
                         (second new-min))
  ...

That is, “if two particular numbers are the same, then the openness of this interval is such-and-such.” But both the arguments and outcomes would need to passed in as arguments if I did that, and… meh.

It’s going to want to be simplified another day, I think. I can’t see a place to set a tool yet.

Tomorrow—since today is long gone—I will add some simple arithmetic to the Interval type, and be done I think.

Continued in “Finishing the Interval Push type in Klapaucius”