Friday, July 17, 2015

Pouring concrete

If there were 10 commandments of automated testing, number 1 must surely be "Thou shalt test what, not how".

Imagine if I wrote a sort routine, and further imagine I wrote a bubble sort in particular.
In a misguided attempt to test it, I decided I would make sure it did the correct swaps. So I come up with test data, and figure out what items get swapped, and then I modify my sort to pass in a "swap" delegate, so that I can intercept this and test that the correct swaps occur.

I run my tests, and I see that is it good. Each call to swap happens just as I expect, and I know exactly what my code is doing.

Only it is not good.

As my data sets increase, bubble sort struggles, and then I read about Quicksort, or Radix sort, and I decide to replace my bubble sort.

All my tests fail. They all need to be fixed by figuring out what swaps will now occur.
My tests have poured concrete on the implementation, and made it impossible to change without dynamiting the unit tests.

Let's rewind and put in the rule that tests must test results, not actions.
If I sort data, it must be sorted. I don't care how it sorts it. I just want it sorted.

My tests have less detail, and for sure, there could be some hidden bug, that still leaves my data sorted, but does not operate in manner I expected. But my data is sorted. Now I don't need that swap delegate. I pass in my data, I get back the sorted result, and I confirm that the sorted result is indeed sorted.

When I replace my bubble sort with a New and Better Sort, lo and behold, all my tests still run, and hopefully pass.

Test that the data is sorted, not how the data gets to be sorted.

Sadly many test tools facilitate the how testing. The ability to write A.CallTo(()=>swap(a,b)).MustHaveHappened; or some such thing is nothing other than pouring concrete deep and heavy all over the implementation. Avoid the temptation, the road to hell is paved with concrete.

Tuesday, June 9, 2015

The devil is in the details...

Dijkstra: "The purpose of abstraction is not to be vague, but to create a new semantic level in which one can be absolutely precise". It's a favourite quote of a good friend. I'm growing to like it more and more by the day.

Perhaps I'm getting old, or perhaps it's because I'm working more and more on larger teams, with code that I did not write, but I find that I cannot deal with 50+ details at the same time any more.

If the details bleed from another part of the code, and I was forced to do a depth first reading of the code at every function call, I might be somewhat uncharitable towards the original author.

Imagine a circular buffer - in my world, it should have an IsEmpty() function. Few would argue this. If I had to wander into the internals of CBuffer and start comparing first, and last, and remembering if it's Empty when they are the same, or when they are 1 away from each other, don't forget to MOD by the size of the buffer, where was I going with this, hold on, let me go back over that again...

The abstraction IsEmpty() is very precise. In a sense it's far more precise than any comparison of first and last pointers, since it expresses exactly what I want to know, not how to go about finding out. And I can simply trust IsEmpty(), because it will (should) have unit tests. But if I am repeatedly adding this comparison code to my class, then every time I add it, I can goof it.

You would think that this is software 101, and nobody needs reminding about this. But again and again it creeps into code, usually more complex code than a simple buffer, but it does creep in.

It creeps in in little ways, like the train wreck antipattern. It creeps in when there are comparisons that imply something else. The little comment along side "// only set if" is a giveaway that we have been far too intimate with another class, when we should be maintaining a professional distance.

Thursday, June 4, 2015

Do one thing... sort of

This was always confusing advice. Do one thing.

So how do I open a file, read it, close it, parse it, and then process the input.
That's a lot more than one thing. Something has got to "do several things"

Like most soundbites, Do One Thing is correct, but incomplete. You have to take into account levels of abstractions. So do one thing at a given level of abstraction.

A call to ProcessConfigFile can reasonably kick off calls to open, read, close, parse and process.
But it should not actually _do_ the Reading, Parsing and the Processing for example. You should call someone else to _do_ these.

If it does Reading, Parsing and Processing, then you have them bound together. So now if you want to process already parsed data from an block of memory or from a network connection, your Process code needs to be "fished out" of the Read Parse code.

So at one level ProcessConfigFile does indeed do one thing. At a more detailed level, each function or class it calls or uses will also do one thing.

If you aggregate the functionality of others classes or functions, then you don't want to mix in any functionality of you own.  Mixing low level details and high level structures tends to hide what is important.  If you have a 30 line function which contains 20 lines of string formatting and 4 lines of structurally important code, it starts to become easy to miss the big stuff for the detail.

Friday, April 24, 2015

Let it Throw

Oh the code that calls is frightful
But the tests are so delightful
And since we've no place to go
Let It Throw! Let It Throw! Let It Throw!

It doesn't show signs of crashing
And the logs files aren't thrashing
'Cause the logging's turned way down low
Let It Throw! Let It Throw! Let It Throw!

The argument's getting old
And I want to get home tonight
But if you'll review the code
We can push this to the client site

The RC's slowly building
And, guys, the tests are passing
But we'll catch and log it so,

Let It Throw! Let It Throw! Let It Throw!