The Universal Law Of Good Software Engineering

There is a universal law behind well-written programs. Once you’ve got it, you will shudder each time you see someone or something not obeying it. It’s nothing new and by no means invented or first described by me – with this article I only try to explain it to more people, with my individual approach. Maybe it’s not usually described in such a big context, but you’ll surely recognize some of the derived rules I’ll mention.

Everything should describe or do exactly one thing – do it well – and nothing more.

This is easy to understand, but in practice hard to master.

I’ll present a bunch of motivating examples first, then we’ll dive into the application in software engineering.

Tools

I guess just like most software engineers I have a toolbox in my cellar that I seldom use, but that’s always good for bad anecdotes:

If you only have a hammer, every problem looks like a nail.

I don’t want to hammer screws. I don’t want to cut with a hammer. So I have some screwdrivers and a knife. And they do exactly one thing well: The screwdrivers screw. The knife cuts. And there is a reason why there are no tools that try to provide many services at once: If you think about it a bit, they’d be impractical, heavy and hard to understand.

I simply don’t want to think about how I have to pull some screwdriver on my imaginary supertool to get a wrench while I’m juggling a heavy washing machine.

Writing

Many years ago I read a few books about writing. The bottom line (fiction or non-fiction doesn’t matter):

Every word must have a special meaning. It must either advance the story or explain something important or be there for grammatical reasons.

A good editor strikes through every word that is superfluous. The same goes for unnecessary paragraphs and chapters.

Usually we don’t even notice that this rule is followed in books because it’s so common.

A textbook about Type Systems where in the Lambda Calculus chapter a medieval hero arrives and suddenly slays the lambda abstraction operator because it may look innovative to kill the main character before the end of the book? No way. Useless in the context of a computer science textbook – would never make it into print.

I think the art of writing is much closer to our craft than one might first think: In fact we’re writing programs for ourselves and our colleagues, and to think about the problem at hand while we write. In the end the machine needs only ones and zeros – it doesn’t care about our programming style at all.

Maths

I wouldn’t mix up the calculation of my income tax with the calorie counting of the day. It would be possible, and maybe I’d get away unharmed as long as I provide units (like Euro or Joule) and use them to separate the different types of values. But it would feel completely unnatural and wouldn’t make much sense.

Another example: Once a week I tutor a refugee in maths for his apprenticeship. He had the misfortune of not being able to go to school after finishing elementary school. So I’m teaching him junior high school level maths that he had no chance to learn before.

While teaching I catch myself reiterating over and over again: “Make small steps to solve equations. Split the problem as small as possible. Don’t try to calculate too much at once.” That’s what maths is like when you’re explaining it: It’s defining a problem with small bits of known facts and solving the exercise by combining them in small steps.

Thinking more formally: Every operator, function or concept in maths does exactly one thing and does it well. Arithmetic operators do their thing – only that – and they do it well.

1*1 doesn’t launch any missiles or place orders on Amazon. That’s why you can combine arithmetic operators to describe almost every mathematical problem in our daily lives.

Unfortunately, in programming it’s very tempting to do the opposite, because on many occasions you get or hold different kinds of data at once. Then you very easily end up mixing calculations that don’t belong together, simply because the data is available in the same place (e.g. object, file or database). Credit card numbers and birthdays are very different concepts. But they may, for example, suddenly appear in the same line of code because they both somehow belong to a person’s data structure. And then one has to be very careful that things don’t get mixed up.
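One way to keep such values from mixing is to give each concept its own small type. The following is only a minimal sketch in Python; the `Euro` and `Kilocalories` classes are hypothetical names made up for this illustration and are not taken from any library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Euro:
    """An amount of money (hypothetical type for this sketch)."""
    amount: float

    def __add__(self, other: "Euro") -> "Euro":
        if not isinstance(other, Euro):
            # Refuse to add anything that is not money.
            return NotImplemented
        return Euro(self.amount + other.amount)


@dataclass(frozen=True)
class Kilocalories:
    """An amount of food energy (hypothetical type for this sketch)."""
    amount: float

    def __add__(self, other: "Kilocalories") -> "Kilocalories":
        if not isinstance(other, Kilocalories):
            # Refuse to add anything that is not food energy.
            return NotImplemented
        return Kilocalories(self.amount + other.amount)


income_tax = Euro(10.0) + Euro(5.0)              # fine: money plus money
breakfast = Kilocalories(300.0) + Kilocalories(150.0)  # fine: energy plus energy
# Euro(10.0) + Kilocalories(300.0) raises a TypeError instead of
# silently producing a meaningless number.
```

Each type does one thing: it describes one kind of value and only knows how to combine with its own kind.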

Exercise: Try to find a meaningful calculation with income taxes and calories and send it to us. I guess this would be a lot of fun. 🙂

Computer Science

Unix

It’s a personal thing, but I prefer to read the most successful books and papers of the past over wasting my time with the hippest blogs and magazines du jour. There is a reason why they were successful and are still popular. Things don’t change much. Even in our industry the best ideas survive for decades and it’s generally more useful to understand the core ideas than to struggle with their marketing-overloaded applications. The marketing buzz will change in a few years anyway – the ideas are here to stay.

Once you have understood the principles of clean system design, functional programming, databases and distributed systems, “cloud-based functional-reactive microservices with NoSQL databases at a PaaS provider” won’t scare you at all. It’s not a revolution, it’s old principles newly applied to solve problems.

Speaking of microservices: I looked up a concise definition by Martin Fowler (always a great source for explanations of key concepts):

“In short, the microservice architectural style is an approach to developing a single application as a suite of small services, each running in its own process and communicating with lightweight mechanisms, often an HTTP resource API. These services are built around business capabilities and independently deployable by fully automated deployment machinery. There is a bare minimum of centralized management of these services, which may be written in different programming languages and use different data storage technologies.”
http://www.martinfowler.com/articles/microservices.html

What does this have to do with Unix? Hidden in this quote is the Unix Philosophy. From the late 60s and early 70s!

Google it! Look at Wikipedia. It’s really worth looking up – I’ll wait here. 🙂
https://en.wikipedia.org/wiki/Unix_philosophy#Do_One_Thing_and_Do_It_Well

Same stuff – Technology changed, Principles survived.

“(i) Make each program do one thing well. To do a new job, build afresh rather than complicate old programs by adding new features.

(ii) Expect the output of every program to become the input to another, as yet unknown, program. Don’t clutter output with extraneous information. Avoid stringently columnar or binary input formats. Don’t insist on interactive input.

(iii) Design and build software, even operating systems, to be tried early, ideally within weeks. Don’t hesitate to throw away the clumsy parts and rebuild them.

(iv) Use tools in preference to unskilled help to lighten a programming task, even if you have to detour to build the tools and expect to throw some of them out after you’ve finished using them.”
M. D. McIlroy, E. N. Pinson, and B. A. Tague. “UNIX Time-Sharing System: Foreword”. The Bell System Technical Journal, 57 (6, part 2), 1978, page 1902.

The vocabulary changed. A “job” is now a “service”. “Avoid stringently columnar or binary input formats” means “use strings” (and HTTP is text-based). “Centralized management” was back then an operating system with a shell terminal. And I guess “automated deployment machinery” wasn’t a topic because there was no Internet, just an operator who changed tapes manually. But the key idea stayed the same.

Later the Unix Philosophy was summed up as:

“This is the Unix philosophy: Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface.”
Peter H. Salus. A Quarter-Century of Unix. Addison-Wesley. 1994.
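To make the “programs that work together over text streams” idea concrete, here is a small sketch in Python rather than in a real shell. The function names `grep`, `upper` and `head` are only borrowed from the spirit of the Unix tools; they are simplified stand-ins written for this illustration, and the log lines are made up.

```python
from itertools import islice
from typing import Iterable, Iterator


def grep(lines: Iterable[str], needle: str) -> Iterator[str]:
    """Keep only the lines that contain `needle`."""
    return (line for line in lines if needle in line)


def upper(lines: Iterable[str]) -> Iterator[str]:
    """Convert every line to upper case."""
    return (line.upper() for line in lines)


def head(lines: Iterable[str], count: int) -> Iterator[str]:
    """Pass through at most the first `count` lines."""
    return islice(lines, count)


# Made-up input for the sketch.
log = [
    "error: disk full",
    "ok",
    "error: no route",
    "error: timeout",
]

# Each stage consumes lines and produces lines, so stages combine freely,
# much like `grep error | tr a-z A-Z | head -n 2` in a shell.
first_errors = list(head(upper(grep(log, "error")), 2))
print(first_errors)  # ['ERROR: DISK FULL', 'ERROR: NO ROUTE']
```

None of the three functions knows about the others. They only agree on the interface: a stream of text lines in, a stream of text lines out.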

It’s all very obvious and to be fair: Martin Fowler explicitly mentions the relationship of microservices and Unix in his article: “We do not claim that the microservice style is novel or innovative, its roots go back at least to the design principles of Unix.” (http://www.martinfowler.com/articles/microservices.html)

Exercise: Think about how (iii) of the Unix Philosophy relates to our “modern” Agile world.

Compilers

To many people nothing in computer science is as complicated as a compiler. The standard textbook even has a dragon on its cover and is called the dragon book. The dragon of course stands for complexity fought by a brave knight – the reader. [1]

No doubt serious “production-ready” compilers consist of hundreds of thousands of lines of code and are stuffed with a lot of theory. But if you take a deeper look, compilers aren’t written by evil geniuses in tin suits but by hundreds of developers. And their complexity is split up into many very small compiler passes. And guess what: Each pass does one thing and it does it well!

The type checker pass checks only types. And it does so by checking the type of one language construct at a time, with a method dedicated to checking exactly that kind of construct. The combination of all these small dedicated methods adds up to a complete type checker. And when a new language construct is implemented, only a small method is mixed in. That way a book full of heavy theory is split up into clear, simple and easy-to-understand thingies.
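A toy version of such a type checker can be sketched in a few lines of Python. The miniature language below (numbers, booleans, addition and an if-expression) and all names in it are invented for this illustration; a real compiler is of course far richer.

```python
from dataclasses import dataclass


# A made-up miniature expression language.
@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class Bool:
    value: bool


@dataclass(frozen=True)
class Add:
    left: object
    right: object


@dataclass(frozen=True)
class If:
    cond: object
    then: object
    otherwise: object


# One small function per language construct.
def check_num(node: Num) -> str:
    return "int"


def check_bool(node: Bool) -> str:
    return "bool"


def check_add(node: Add) -> str:
    if check(node.left) != "int" or check(node.right) != "int":
        raise TypeError("operands of + must be int")
    return "int"


def check_if(node: If) -> str:
    if check(node.cond) != "bool":
        raise TypeError("condition must be bool")
    then_type = check(node.then)
    if then_type != check(node.otherwise):
        raise TypeError("both branches must have the same type")
    return then_type


# Supporting a new construct means adding one function and one entry here.
CHECKERS = {Num: check_num, Bool: check_bool, Add: check_add, If: check_if}


def check(node: object) -> str:
    """Dispatch to the function dedicated to this kind of node."""
    return CHECKERS[type(node)](node)


print(check(If(Bool(True), Add(Num(1), Num(2)), Num(0))))  # int
```

Every `check_*` function knows the typing rule of exactly one construct; the complete checker is just their combination.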

There is some movement to split up compiler passes even further into so-called “nanopasses”, so that you really have only one of these small thingies in each pass. “A Nanopass Framework for Commercial Compiler Development” (Andrew W. Keep, R. Kent Dybvig) is a very good introduction.

Another example: Monadic Parser Combinators make parsing rules so elegantly combinable that I’ll surely blog about them in the near future. Of course, again: Each parsing rule does exactly one thing (parse a very specific pattern) and does it well …
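To give a taste of the idea, here is a deliberately simplified sketch in Python. It uses plain combinator functions and is not the monadic formulation found in Haskell libraries; all names (`char`, `digit`, `many1`, `seq`, `fmap`) are defined here for illustration only. A parser is a function that takes a string and returns either a pair of parsed value and remaining input, or `None` on failure.

```python
from typing import Callable, Optional, Tuple

Parser = Callable[[str], Optional[Tuple[object, str]]]


def char(expected: str) -> Parser:
    """Parse exactly the given character."""
    def parse(text: str):
        if text.startswith(expected):
            return expected, text[len(expected):]
        return None
    return parse


def digit(text: str):
    """Parse a single ASCII digit."""
    if text and text[0] in "0123456789":
        return text[0], text[1:]
    return None


def many1(parser: Parser) -> Parser:
    """Apply `parser` one or more times and collect the results."""
    def parse(text: str):
        values = []
        result = parser(text)
        while result is not None:
            value, text = result
            values.append(value)
            result = parser(text)
        return (values, text) if values else None
    return parse


def seq(*parsers: Parser) -> Parser:
    """Apply the parsers one after another; fail if any of them fails."""
    def parse(text: str):
        values = []
        for parser in parsers:
            result = parser(text)
            if result is None:
                return None
            value, text = result
            values.append(value)
        return values, text
    return parse


def fmap(parser: Parser, transform: Callable) -> Parser:
    """Transform the value of a successful parse."""
    def parse(text: str):
        result = parser(text)
        if result is None:
            return None
        value, rest = result
        return transform(value), rest
    return parse


# Bigger rules are built purely by combining smaller ones.
number = fmap(many1(digit), lambda digits: int("".join(digits)))
addition = fmap(seq(number, char("+"), number), lambda parts: parts[0] + parts[2])

print(addition("12+30"))  # (42, '')
```

Each rule parses one very specific pattern, and the combinators themselves also do just one thing each: repeat, sequence, or transform.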

[1] : If you aren’t eager to fight dragons while wearing a knight’s armor: There is a very well written and very approachable book by Terence Parr: Language Implementation Patterns. It skips a lot of theory and gets you going in a breeze. And its patterns are suitable for most compiler problems you’ll likely face in your professional life.

Functional Programming

Functional programming is all about combining the smallest functions. To be combinable it’s crucial that each of them – guess what – does one thing, does it well and nothing more.

With Haskell as my secret favourite programming language, I’d argue that the “purer” a language is (the more it separates and restricts side effects), the better its functions can be combined, and thus the better the language is.

But, to be fair: Functional programming languages that make bigger concessions about program state and side effects are usually very carefully designed to keep these away from the basic building blocks. So – in practice – it may not matter that much.

It’s quite interesting that the basic building blocks (such as map or filter) have been adopted by most mainstream languages. Let’s take that as a sign of how appealing “One thing – Well” is in practice.
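As a tiny illustration in Python, whose built-in `map` and `filter` are exactly such adopted building blocks (the helper functions `is_even` and `square` are made up for this example):

```python
def is_even(number: int) -> bool:
    """Decide one thing: is the number even?"""
    return number % 2 == 0


def square(number: int) -> int:
    """Compute one thing: the square of the number."""
    return number * number


# filter selects, map transforms; neither knows what the other does.
squares_of_evens = list(map(square, filter(is_even, range(10))))
print(squares_of_evens)  # [0, 4, 16, 36, 64]
```

Because `is_even` and `square` have no side effects and do only one thing, they can be reused in any other combination without surprises.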

Principles

If you haven’t done it before: Read at least one good book about Clean Code. It will change your life. I’ve seen too many people sticking to collecting half-truths at work. That’s always dangerous because they’re usually not well reasoned and explained. You could end up doing nonsense for the rest of your life without even noticing.

Maybe I’ll write another blog post about which books I recommend to whom. For now: If you pick one from Robert C. Martin (better known as Uncle Bob), Andy Hunt & David Thomas (Pragmatic Dave) or Kent Beck, you can’t go far wrong – it may only be that another book would align better with your way of thinking.

Reading more than one book won’t hurt either – having different points of view on the same topic is always a good thing.

So – having a bunch of books on my desk – let’s check how their rules relate to the universal principle. (Picking a rule from one book does not imply it’s not in the others – the point is that many rules from many books finally culminate in one central idea.)

Don’t Repeat Yourself / Code Duplication

Our head of Development, Torsten, always tells us that programming is mostly “all about removing duplication”. The “Don’t repeat yourself” rule is very similar: Don’t type the same expression twice. Something written twice means duplication, right? So it’s the same from different points of view.

The book The Pragmatic Programmer: From Journeyman to Master by Andrew Hunt and “Pragmatic Dave” (David Thomas) defines the DRY (Don’t Repeat Yourself) principle as follows:

“Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.”
(TODO: page ?)

However having “one source of information” and “no copied code” is so fundamental and obviously a good thing that it’s found in every Clean Code book I know.

Following DRY consistently leads to “abstraction” because every concept is encapsulated once so that higher levels can easily use it. And that’s what our whole industry is built upon: All the way from the smallest transistor gates – layer by layer of abstraction – to AIs that win games of Go.
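A minimal sketch of “one authoritative representation” in Python: the VAT rate below is an assumed example value, and the function names are invented for this illustration. The point is only that the piece of knowledge lives in exactly one place and everything else builds on it.

```python
# The single, authoritative place where this piece of knowledge lives.
# 19 is an assumed example rate, not a statement about any tax law.
VAT_PERCENT = 19


def vat_cents(net_cents: int) -> int:
    """VAT for a net amount, in whole cents (rounded down)."""
    return net_cents * VAT_PERCENT // 100


def gross_cents(net_cents: int) -> int:
    """Gross amount: builds on vat_cents instead of repeating the rate."""
    return net_cents + vat_cents(net_cents)


print(vat_cents(10_000))    # 1900
print(gross_cents(10_000))  # 11900
```

If the rate changes, there is exactly one line to edit, and `gross_cents` is already a small layer of abstraction on top of `vat_cents`.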

Choose a small orthogonal set of primitives [for interfaces]

From: The Practice of Programming by Brian W. Kernighan and Rob Pike, page 105.

In one sentence:

“An interface should provide as much functionality as necessary but no more, and the functions should not overlap excessively in their capabilities.”

Later the authors write about interfaces: “Do one thing, and do it well.” (page 105)

This is not by accident: Brian Kernighan and Rob Pike played a central role in the development of Unix.

Functions should do one thing

From: Clean Code – A Handbook of Agile Software Craftsmanship by Robert C. Martin. Rule G30.

Because then they are easy to combine and reason about.

On page 35 of the same book in the paragraph: “Do One Thing”:

“The following advice has appeared in one form or another for 30 years or more.

FUNCTIONS SHOULD DO ONE THING. THEY SHOULD DO IT WELL. THEY SHOULD DO IT ONLY.”

Surprise. Surprise. 🙂

What I really like about Uncle Bob’s landmark book is that he does not only explain how things should be, but also how you can easily get there by adhering to simple rules.

“So, another way to know that a function is doing more than “one thing” is if you can extract another function from it with a name that is not merely a restatement of its implementation.” (page 36)

In his presentations he often refers to this as: “Extract [methods] till you drop!”
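Here is a small before-and-after sketch of such an extraction in Python. The invoice example and all function names are made up for this illustration; they are not taken from the book.

```python
from typing import List


def invoice_total_before(csv_line: str) -> str:
    """Does three things at once: parsing, summing and formatting."""
    amounts = []
    for field in csv_line.split(","):
        amounts.append(float(field))
    return f"{sum(amounts):.2f} EUR"


# After extraction: each helper has a name that says more than its body.
def parse_amounts(csv_line: str) -> List[float]:
    """Turn a comma-separated line into a list of amounts."""
    return [float(field) for field in csv_line.split(",")]


def format_euro(amount: float) -> str:
    """Render an amount with two decimals and the currency."""
    return f"{amount:.2f} EUR"


def invoice_total(csv_line: str) -> str:
    """Now only describes the steps, one level of abstraction down."""
    return format_euro(sum(parse_amounts(csv_line)))


print(invoice_total("10,2.5,7.5"))  # 20.00 EUR
```

Note that summing was not extracted into a function of its own: a wrapper around `sum` would merely restate its implementation, which is exactly the stopping criterion from the quote above.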

One Assert per Test

“There is a school of thought that says that every test function in a JUnit test should have one and only one assert statement. […] Those tests come to a single conclusion that is quick and easy to understand.”
Clean Code – A Handbook of Agile Software Craftsmanship, page 130.

This is later refined into “Single Concept per Test”, as it sometimes makes sense to map one logical assertion to a few written assert statements.
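The book talks about JUnit; the same idea can be sketched with Python’s `unittest` module. The function under test, `split_name`, is a hypothetical helper invented for this example.

```python
import unittest


def split_name(full_name: str):
    """Hypothetical function under test: split 'First Last' into two parts."""
    first, _, last = full_name.strip().partition(" ")
    return first, last


class SplitNameTest(unittest.TestCase):
    def test_splits_first_and_last_name(self):
        first, last = split_name("Ada Lovelace")
        # Two assert statements, but a single concept: the name is split.
        self.assertEqual(first, "Ada")
        self.assertEqual(last, "Lovelace")

    def test_ignores_surrounding_whitespace(self):
        # A different concept gets its own test.
        self.assertEqual(split_name("  Ada Lovelace "), ("Ada", "Lovelace"))
```

Saved to a file, the tests can be run with `python -m unittest`. When one of them fails, its name alone already tells you which concept is broken.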

I picked this one to show that “One thing – Well” is not only valid for functions or classes and interfaces. It is valid for tests, too.

It also holds for collections of tests: test classes. They should test “one feature – Well” with a bunch of tests. If your test class consists of thousands of lines because you’re testing a whole domain (e.g. everything with customers), you have piled up a heap of complexity that’ll be hard to understand and maintain.

Single Responsibility Principle

The SOLID principles are a popular set of rules for object-oriented design. S stands for Single Responsibility. Every class should do one job and do it well …

Again from Uncle Bob’s Clean Code book:

“The Single Responsibility Principle (SRP) states that a class or module should have one, and only one, reason to change. This principle gives us both a definition of responsibility, and a guidelines for class size. Classes should have one responsibility — one reason to change.” (page 138)

To relate this to the “Functions should do one thing” rule: Of course classes can do more than one thing. They encapsulate data, so “one function per class” would lead to heavy copying and to a nightmare when it comes to multi-threading. Scattered data would also break the “single source of information” (DRY) rule.

So, a class that handles the registration of a customer is very likely to adhere to the SRP. A facade that acts as an interface to all customer-related actions is okay, too. But a class that changes due to a bunch of unrelated new requirements is a problem – most likely hard to understand, hard to maintain and an anthill full of bugs.
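A sketch of what this could look like in Python. All class names are invented for this illustration, and the in-memory storage and the “outbox” list stand in for a real database and a real mail system.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Customer:
    email: str
    name: str


@dataclass
class CustomerRepository:
    """Reason to change: how customers are stored."""
    _by_email: Dict[str, Customer] = field(default_factory=dict)

    def add(self, customer: Customer) -> None:
        if customer.email in self._by_email:
            raise ValueError(f"{customer.email} is already registered")
        self._by_email[customer.email] = customer

    def find(self, email: str) -> Optional[Customer]:
        return self._by_email.get(email)


@dataclass
class WelcomeMailer:
    """Reason to change: what the welcome message looks like."""
    outbox: List[str] = field(default_factory=list)

    def send_welcome(self, customer: Customer) -> None:
        self.outbox.append(f"To {customer.email}: Welcome, {customer.name}!")


@dataclass
class CustomerRegistration:
    """Reason to change: the steps of the registration process."""
    repository: CustomerRepository
    mailer: WelcomeMailer

    def register(self, email: str, name: str) -> Customer:
        customer = Customer(email=email.strip().lower(), name=name)
        self.repository.add(customer)
        self.mailer.send_welcome(customer)
        return customer


registration = CustomerRegistration(CustomerRepository(), WelcomeMailer())
ada = registration.register(" Ada@Example.org ", "Ada")
print(registration.mailer.outbox)  # ['To ada@example.org: Welcome, Ada!']
```

A new requirement about the mail text touches only `WelcomeMailer`; a new storage backend touches only `CustomerRepository`. The registration class itself changes only when the registration process changes.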

Exercise: The author gets bored because almost all rules boil down to “One thing – Well”. Grab yourself a copy of a Clean Code book and see to which other rules the universal rule applies. It’ll be many …

Conclusion

I tried to give my personal point of view on an old but still very relevant fundamental law of computing.

Now it’s your turn (Okay, it already was in the exercises 😉 ). Put a sticky note on your monitor: “Do one thing and do it well!” Keep an eye on it when you declare a variable, method, function, module, class, test, script or anything else.

Ruthlessly refactor code that doesn’t adhere to this rule. It’s really old wisdom and it wouldn’t have survived decades if it weren’t so important!


1 thought on “The Universal Law Of Good Software Engineering”

  1. Thank you for your pioneering effort to bring this blog to life with the first blog post.

    Some criticism, if you like.
    It would have been reasonable to divide this article into several topics and cover them in more detail, as well as having a less gigantic statement in the title of the article. If you describe your personal experience, it would be great to read it as a story which has a beginning: how you first heard about the principle and understood it, how the understanding of the principle grew with you and which conclusions you have reached so far applying this principle.
    Applying the principle to writing an article: cover one topic and cover it well.
    It would also be great to have consistency in typography and not change it every paragraph.

    If I put my criticism aside, I also tend to see this principle being violated every now and then, and the difficulty is to find the atomic thing that a program should do and what counts as “doing it well” in the context.

    In OO, objects are first-class citizens and building blocks, but what is an object?
    From Wikipedia (https://en.wikipedia.org/wiki/Object_(computer_science)):
    – In computer science, an object can be a variable, a data structure, or a function or a method, and as such, is a location in memory having a value and possibly referenced by an identifier.
    – In the class-based object-oriented programming paradigm, “object” refers to a particular instance of a class where the object can be a combination of variables, functions, and data structures.

    According to these definitions, an object is itself a combination of things and not an atom, and this, I think, is the source of the “creativity” to put anything you wish inside without any overarching purpose that the object should represent.

    Functions, on the other hand, have a clearer purpose. In programming, a function is a part of the code that performs a specific task. From Wikipedia (https://en.wikipedia.org/wiki/Function_(mathematics)): “In mathematics, a function is a relation between a set of inputs and a set of permissible outputs with the property that each input is related to exactly one output.” For me, it makes more sense to use functions as building blocks instead of objects.

    There is also a good quote from Alan Perlis’ Epigrams on Programming, published in 1982: “It is better to have 100 functions operate on one data structure than 10 functions on 10 data structures.” The quote has a modern interpretation found on Stack Overflow (http://stackoverflow.com/questions/6016271/why-is-it-better-to-have-100-functions-operate-on-one-data-structure-than-10-fun):

    “Those 100 functions on your one data structure can be composed together in lots of unique ways, since they all operate on the same data structure, but you can’t really mix the 10 functions on 10 data structures as well, since they were defined only to work on their particular data structure.

    A more modern and simpler variation of this is thinking in terms of abstractions. If we were coding in Java, would you rather write a 100 functions on the List interface, or the same set of ten functions, once for ArrayList, once for LinkedList, once for….”

    Anyway, I think there isn’t a universal law; there are laws that people believe to be universal because they explain a lot of their experience. So it is better to share the experience the laws explain, and maybe they will become a law for someone else.
