Thursday, October 6, 2011

Consistency Of Abstraction

I'm reading through Robert C. Martin's "Clean Code" at the moment. It's a good book that gives examples of what not to do, explains why, and shows how to do it better.

Many of the things he says are, like many great ideas, utterly obvious once they have been pointed out. Yet clearly they are not universally adhered to.

Consistency of Abstraction
A simple point he makes is that we should have Consistency of Abstraction in a function.

Simply put, if you have high level calls like "GetAvailableGridHosts()" then in the same function there is no place for low level code, such as iterating through the HostObjects and adding the HostNames of the hosts to a formatted string buffer.
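To make the problem concrete, here is a rough sketch of what such a mixed function might look like. Everything in it is an assumption for illustration: the Host class, its fields, and describe_available_hosts() are made up, and get_available_grid_hosts() is just a Python-style stand-in for the GetAvailableGridHosts() call mentioned above.

```python
from dataclasses import dataclass


@dataclass
class Host:
    # Hypothetical host object; the field names are assumptions.
    host_name: str
    available: bool


def get_available_grid_hosts(hosts):
    """High-level call: return only the hosts that are available."""
    return [host for host in hosts if host.available]


def describe_available_hosts(hosts):
    # High level: ask for the available hosts.
    available = get_available_grid_hosts(hosts)

    # Low level: string-building detail sitting in the same function.
    buffer = ""
    for index, host in enumerate(available):
        if index > 0:
            buffer += ", "
        buffer += host.host_name

    # Back to high level: produce the description.
    return "Available hosts: " + buffer
```

The reader has to drop from "get the hosts" down to index checks and separators, then climb back up again.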

Do the right thing and take the loop out into a separate function. Give it a GOOD NAME. Spend a while thinking about the name. Call it FormatHostNamesAsPrettyString() or FormatHostNamesAsCSV(). You may have naming conventions, and you may think these names are rotten, but if you at least start to think about the name, and why you like these or don't, then you are heading in the right direction.

Now your function is dealing with ideas at a consistent level of abstraction. The reader is not jumping from high-level ideas down to low-level details and back.
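A minimal sketch of the result might look like the following. The names are again assumptions: Host and describe_available_hosts() are hypothetical, and format_host_names_as_csv() is a Python-style rendering of the FormatHostNamesAsCSV() name suggested above.

```python
from dataclasses import dataclass


@dataclass
class Host:
    # Hypothetical host object; the field names are assumptions.
    host_name: str
    available: bool


def get_available_grid_hosts(hosts):
    """Return only the hosts that are available."""
    return [host for host in hosts if host.available]


def format_host_names_as_csv(hosts):
    """Join the host names into one comma-separated string."""
    return ", ".join(host.host_name for host in hosts)


def describe_available_hosts(hosts):
    # Every line here now sits at the same level of abstraction.
    available = get_available_grid_hosts(hosts)
    return "Available hosts: " + format_host_names_as_csv(available)
```

The formatting function can only see the list it is handed, so it is obvious what data it works on, and the top-level function reads almost like a sentence.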

Part of the advantage of small, clear, and concise functions is reuse, but a huge part of the advantage is the improved readability of the code.

This loop that does "stuff" in a larger function also has a scope problem. It's not clear exactly what data it operates on. It could be using or modifying any local variable in that function. Its scope is too large.

It's harder to test. How do you write a unit test for a loop in the middle of a bigger function?
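Once the loop lives in its own function, the answer is easy. Here is a sketch of what the tests could look like, assuming a hypothetical format_host_names_as_csv() helper and Host class, and plain assert-style test functions of the kind a runner such as pytest would pick up.

```python
from dataclasses import dataclass


@dataclass
class Host:
    # Hypothetical host object; only the name matters for formatting.
    host_name: str


def format_host_names_as_csv(hosts):
    """Join the host names into one comma-separated string."""
    return ", ".join(host.host_name for host in hosts)


def test_no_hosts_gives_empty_string():
    assert format_host_names_as_csv([]) == ""


def test_single_host_has_no_separator():
    assert format_host_names_as_csv([Host("alpha")]) == "alpha"


def test_several_hosts_are_comma_separated():
    hosts = [Host("alpha"), Host("beta"), Host("gamma")]
    assert format_host_names_as_csv(hosts) == "alpha, beta, gamma"
```

None of these tests needs a grid, a connection, or anything else the bigger function might depend on.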

The Downhill Slope
Add another loop or two, and suddenly the function becomes very hard to follow. Now the function does not fit on the screen. Now you can't see where a variable is set at the same time as where you next use it.

Break the function up into smaller functions and then it's easier to test, each individual function has smaller scope, and is easier to read and debug.

Potential Trade-offs
The trade-off is lots of small functions. This does have some CPU overhead: jumps, cache misses and so on. Badly done, you have to go to each function to see what it does. But if the name is good, and the function does one thing and respects its name, you probably don't have to look in all the functions; you can quickly find the ones you need to know about.

In real terms, I don't think that I have ever sat down and gone through code and thought to myself - "If only (s)he hadn't broken the code up into such small functions". There are specialized areas where the time to make the extra function calls matters, but they are few and far between.
