UDispatch #5: What is TSD?

So. What is TSD? It’s a small library whose purpose is to make it easy to collect, render, compare, diff, and test tree-shaped data. Once we get past basic programming, it is very common to have to create orchestras of objects that form conceptual trees. Note, these aren’t literal trees. Hmmm. Better talk about that.

Consider my mp3 collection. A Recording has a number of fields. They include primitives, pseudo-primitives (like String and Date), and references to other objects, like Performers, Label, Producers, etc. Those referenced objects themselves include both embedded fields and references to still other objects. A Performers includes a list of Performer objects, a common collective name, possibly some Session objects, and so on.

None of this is explicitly modeled as a Tree interface or class, nor does the app navigate that object graph using tree-like notation. When it wants to show who performed a recording, it just de-references the Performers field. One could say "OH MY GOD EVERYTHING IS A TREE", and try to code it as such. That’s fine and it’s doable, but it’s going to create a great deal of weight. You’re essentially now programming in a new language, TinkerPop or some such, and you gain generality and lose naturalness.
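Here’s a rough sketch of what that looks like in Kotlin. The class names come from the description above, but every field name (title, collectiveName, members, and so on) is my own invention for illustration. Notice there’s no Tree type anywhere, just ordinary references.

```kotlin
// Hypothetical domain classes: field names are invented for this sketch.
data class Performer(val name: String)
data class Performers(val collectiveName: String, val members: List<Performer>)
data class Label(val name: String)
data class Recording(
    val title: String,
    val performers: Performers,
    val label: Label
)

// No tree notation: the app just follows the ordinary references it needs.
fun whoPerformed(recording: Recording): String =
    recording.performers.members.joinToString(", ") { it.name }
```

The shape is a tree if you squint, but the code never says so, and that’s the point.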

There are cases where that tradeoff says "yes", but there are plenty where it says "no no no". Generality is awesome. But easy natural expression is far more important for most programming problems.

So this data is "tree-shaped". It’s not a tree. It’s just grokkable as a tree. There’s a single root object at the top, and it spreads out and eventually bottoms out into a bunch of leaf objects at the bottom. The nodes are of different types from one another, usually "by level".

(Most object graphs aren’t true trees anyway, they’re just DAGs, directed acyclic graphs. Multiple higher-level nodes can point at the same lower-level node, normal in a DAG, forbidden in a true tree.)
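A tiny sketch of that DAG point, using hypothetical Label and Recording classes: two higher-level nodes pointing at the very same lower-level node.

```kotlin
// Hypothetical classes, just enough to show a shared reference.
class Label(val name: String)
class Recording(val title: String, val label: Label)

// True when both recordings point at the very same Label object.
// This checks identity (===), not equality.
fun sharesLabel(a: Recording, b: Recording): Boolean = a.label === b.label
```

Two recordings built with the same Label instance share a node. In a true tree, each recording would need its own copy.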

Now, spoze you’re a TDD’er. I am. That means you have to repeatedly set up various object graphs like this, manipulate them, then test the results. Further, you want to grok the setup and grok the results, especially when the test fails. There are a lot of over-simple examples of these sorts of tests on the internet, used to illustrate various library features.

Some of the simplifications: 1) that’s one test, you’ll have hundreds. 2) the trees have two layers, not 5 or 9. 3) the tests are built around simple equality. 4) when the tests fail you’ll have no idea why. 5) they don’t handle irrelevance well.

Even setting aside TDD, there’s an issue with just rendering. It’s normal in all development styles to want to see the state of these tree-shaped object collections. (This is "print debugging", of course, which is unfairly maligned.)

But if you want to "dump" tree-shaped datasets like this, you’re going to have to write that code yourself. What’s wrong with that? a) it’s tedious, b) there’s housekeeping, c) it feels like you’re not working on your story, and d) there’s no standard to the output. Now what you could do is add ‘toJson()’ to every class. Now you’ve got a JSON root, which is one representation of a tree, and you can print it, pretty-print it, diff it, log it, and so on.
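To make the toJson() idea concrete, here’s a minimal hand-rolled sketch. The classes and field names are hypothetical, there’s no JSON library involved, and it skips string escaping entirely, so treat it as an illustration rather than something to ship.

```kotlin
// Hypothetical classes with a hand-written toJson() on each one.
// Assumption: values contain no characters that need JSON escaping.
class Performer(val name: String) {
    fun toJson(): String = """{"name":"$name"}"""
}

class Performers(val collectiveName: String, val members: List<Performer>) {
    fun toJson(): String {
        // Each member renders itself; the parent only stitches the pieces together.
        val memberJson = members.joinToString(",") { it.toJson() }
        return """{"collectiveName":"$collectiveName","members":[$memberJson]}"""
    }
}

class Recording(val title: String, val performers: Performers) {
    fun toJson(): String =
        """{"title":"$title","performers":${performers.toJson()}}"""
}
```

Every class repeats the same quoting and nesting housekeeping, which is exactly the tedium being complained about.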

That’s better, for sure. It was in fact the experience that got me started on TSD. But it’s not quite enough. JSON or XML are "human readable", but they’re not necessarily human grokkable, and grokkability is key.

Once those trees get big? Look out. If you’ve lived in web-land, you for sure have seen JSON with hundreds or even thousands of characters. Further, depending on context, most of those characters are noise.

So. Now you see the problem. I want to non-tediously enhance my objects so they can make real trees out of tree-shaped data. I want to make those trees easy to make, test, compare, and render.

Tsd intrudes on your existing object class by requiring a mix-in interface with one method:

fun toTsd(output: TsdOutput)

The TsdOutput is seen by your class like a kind of print stream or console that, basically, expects key-value pairs with indentation. To your class, it looks a lot like you’re printing a key-value ASCII tree using just four methods. There are overloads for all the basic types and for the toTsd interface.

Under the hood, of course, TsdOutput isn’t printing anything yet. It’s collecting the data and making it into a true tree. Once it’s that, the rest of the library can do all sorts of things with it.
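Here’s one way that collect-now-render-later idea might look. This is a sketch under heavy assumptions, not the library’s API: only the toTsd(output: TsdOutput) signature comes from above. The interface name Tsd, the TsdNode class, the leaf and child methods, and the render and toTree helpers are all invented for illustration, and there are two methods here rather than four.

```kotlin
// Hypothetical node of the "true tree": a key, an optional value, and children.
class TsdNode(val key: String, val value: String? = null) {
    val children = mutableListOf<TsdNode>()
}

// The mix-in interface. The name Tsd is an assumption.
interface Tsd {
    fun toTsd(output: TsdOutput)
}

// Hypothetical collector: to the caller it feels like printing key-value
// pairs, but it is really building a tree of TsdNode objects.
class TsdOutput(rootKey: String) {
    val root = TsdNode(rootKey)
    private val stack = mutableListOf(root) // the last element is the current parent

    // Invented method: record a key-value pair under the current parent.
    fun leaf(key: String, value: Any?) {
        stack.last().children.add(TsdNode(key, value.toString()))
    }

    // Invented method: open a nested node and let the child describe itself.
    fun child(key: String, item: Tsd) {
        val node = TsdNode(key)
        stack.last().children.add(node)
        stack.add(node)
        item.toTsd(this)
        stack.removeAt(stack.lastIndex)
    }
}

// One of the things you can do once you have a real tree: an indented dump.
fun render(node: TsdNode, depth: Int = 0): String {
    val line = "  ".repeat(depth) + node.key + (node.value?.let { ": $it" } ?: "")
    return (listOf(line) + node.children.map { render(it, depth + 1) }).joinToString("\n")
}

// Hypothetical domain classes opting in via the mix-in.
class Label(val name: String) : Tsd {
    override fun toTsd(output: TsdOutput) {
        output.leaf("name", name)
    }
}

class Recording(val title: String, val label: Label) : Tsd {
    override fun toTsd(output: TsdOutput) {
        output.leaf("title", title)
        output.child("label", label)
    }
}

// Collects any Tsd object into a tree rooted at rootKey.
fun toTree(rootKey: String, item: Tsd): TsdNode {
    val output = TsdOutput(rootKey)
    item.toTsd(output)
    return output.root
}
```

With these definitions, render(toTree("recording", Recording("Demo", Label("Example Records")))) produces four lines: "recording", then "title: Demo" and "label" indented one level, then "name: Example Records" indented two. The domain classes never touch indentation or nesting themselves.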

So now you have the basic concept of the tsd library. The first time I coded it, maybe 2014 or 2015, it was in Java. I found it very useful. My particular favorite feature was an advanced one: a UI that popped up on fails showing me both whole trees and a diff tree.
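To give a feel for what a diff over trees can buy you, here’s a small sketch. It is not the old Java code or anything from the library: the Node class and the diff function are hypothetical, the result is a flat list of lines rather than a diff tree, and it assumes sibling keys are unique.

```kotlin
// Hypothetical tree node for this sketch; not the library's type.
data class Node(
    val key: String,
    val value: String? = null,
    val children: List<Node> = emptyList()
)

// Walks two trees in parallel and reports differences as path-prefixed lines:
// "+" exists only in actual, "-" exists only in expected, "~" value changed.
// Assumption: sibling keys are unique, so children can be matched by key.
fun diff(expected: Node?, actual: Node?, path: String = ""): List<String> {
    val key = expected?.key ?: actual?.key ?: return emptyList()
    val here = if (path.isEmpty()) key else "$path/$key"
    if (expected == null) return listOf("+ $here")
    if (actual == null) return listOf("- $here")

    val changes = mutableListOf<String>()
    if (expected.value != actual.value) {
        changes += "~ $here: ${expected.value} -> ${actual.value}"
    }
    // Visit every child key seen on either side, in first-seen order.
    val keys = (expected.children.map { it.key } + actual.children.map { it.key }).distinct()
    for (k in keys) {
        changes += diff(
            expected.children.find { it.key == k },
            actual.children.find { it.key == k },
            here
        )
    }
    return changes
}
```

Instead of a bare "not equal", a failing test could report something like "~ recording/label/name: A -> B", which says where the trees diverge.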

That kinda thing is of course not bound into tsd’s core, depending on UI and JUnit in non-portable ways. But I’ll be doing it again, and putting it into an adjunct library, and hopefully others can do the same for their environment.

Tsd is open source under the MIT license, here:

https://github.com/GeePawHill/tsd

It doesn’t do anything yet. First I had to learn how to do multi-project Gradle builds. But I’ll be working on it, here and there now and then, and sharing notes. If you want to help, ping me!

The next step for me? Play around to see if Kotlin can express those four basic APIs in a way that makes the toTsd methods look and feel as obvious as possible.
