I really enjoy reading Yegor’s blog. He brings quite interesting ideas regarding programming and business. But sometimes, the thoughts he shares there look a bit strange to me. I’m talking about one of the latest posts titled “Any Program Has an Unlimited Number of Bugs”.

I’m writing this in reply to Yegor’s points mentioned there. Please read them first. To focus on the key notes, let me quote the most sensitive part:

Let’s take this simple Java method that calculates a sum of two integers as an example:

int sum(int a, int b) {
  return a + b;
}

How about these bugs:

  • It doesn’t handle overflows
  • It doesn’t have any user documentation
  • Its design is not object-oriented
  • It doesn’t sum three or more numbers
  • It doesn’t sum double numbers
  • It doesn’t cast long to int automatically
  • It doesn’t skip execution if one argument is zero
  • It doesn’t cache results of previous calculations
  • There is no logging
  • Checkstyle would complain since arguments are not final

Well, the points highlighted above make me really uncertain about what really Yegor meant. And since he didn’t answer me in the commentary section, I will put my thoughts on that in my own blog.

Let’s look closer at some of them.

It doesn’t have any user documentation

The question is, do you really think we need documentation for sum function? What will you put there? I cannot guess nothing other than It adds two integers. So what the reason for having it in our code?

In a previous project, I had a teammate who was a fan of docstrings. Every module, a class or a method should have a docstring, he said. The code he wrote looked like a school composition. You could not ship your code without his comments something like “add docs here and there plz”.

Since our product was developing rapidly, we had to change the code all the time. Surely, we used to forget to update those docstrings because it took extra time. It was even worse because the docstrings didn’t tell the truth anymore. Nobody believed them as a result.

What we need sum documentation for? How should it look like? Who will read this documentation? A user? I doubt it. A programmer? Even programmers need not a machinery-generated HTML made of docstrings, but human-written manuals with examples and snippets.

When you open a particular GitHub project, what do you do first? Reading the code? No, you scroll down for README or click on Wiki link. Digging the source code is the last thing you would make if something works not well.

Let’s recall what does one say on DOOM III source code and ID Software code standards in general:

Code should be self-documenting. Comments should be avoided whenever possible. Comments duplicate work when both writing and reading code. If you need to comment something to make it understandable it should probably be rewritten.

Its design is not object-oriented

We all know Yegor is passionate by OOP. But really, naming a function buggy just because it’s non-OOP designed is not fair.

What would happen if we had an OOP version of sum function? I can imagine something messy like this:

Adder adder = new Adder();
adder.setA(1);
adder.setB(2);
Integer result = adder.getResult() // 3

We’ve got lot more problems here. The number of lines grew up from 1 to 4 (compare to add(1, 2)). In addition, now adder object also keeps it’s own state. How may I track whether .setA was called or not? What will happen if I forget to put .setB in my code? Will it get a Null value or an exception raised?

What will happen if I share adder object across multiple threads so they call setA, setB and getResult simultaneously? Nobody would predict the result.

Functional programming teaches us to avoid having state and keep the things simple. Having a small function without any state rather than manipulating with objects will pay off for sure.

It doesn’t sum three or more numbers

Surely, if you a Java programmer, you’ve got a problem. Because there is nothing you can do without rewriting the code. But if you are aware of FP-style and you have your add function, you simply do:

(reduce #'add (list 1 2 3) :initial-value 0) ;; common lisp
(reduce add 0 [1 2 3])                       ;; clojure

or, in Python:

reduce(add, [1, 2, 3], 0)

This is definitely the right way with reduce also known as fold. You don’t need to write a function that adds million of integers. Instead, you add just two! And reduce will care of how to spread it across a sequence of data.

That’s normal to focus on simplest cases and primitive operations first. Then you build abstractions with high-order functions based on primitives. It makes the code easier to understand and debug.

It doesn’t sum double numbers and It doesn’t cast long to int automatically

I don’t know if it is really a bug. On the one hand, it would be great to have a function that adds everything. Again, within any Lisp dialect or Python I may have it easily. Instead, with static typing you have to write more code.

But on the other hand, having exact integer types might be some kind of specific requirement made by design. It could be an utility function that adds to years or any numbers that do not exceed a couple of thousands. So it’s difficult to judge without a full context (I’ll say more on that below).

It doesn’t skip execution if one argument is zero

I really doubt it will increase the performance. Adding zero to an integer costs really nothing on modern hardware. The compilers are smart enough to deal with such cases. But putting an extra if clause will definitely increase the number of instructions because now your program makes a comparison step.

This is exactly what did Knuth say about preliminary optimization. You don’t even have any metrics about what runs faster: adding zero of extra if statement. You even don’t now how frequently your function will be called with zero argument. I might be 1 time per 100k calls so your trick is made in vain in that case.

It does not worth it, after all.

It doesn’t cache results of previous calculations

I’m sure any cache system brings as many problems as it solves. First, where would you store that cache? In application’s memory? In Memcache? User’s cookies?

Then, you program could sum million of numbers per second. Every cache call needs al least O(C) or O(logN) time to lookup a value. You will slow down your program dramatically with such approach when caching small functions.

There is no logging

The same claims are here: what exactly are you going to log? Who will read that logs and how often? For what purpose you want to store such kind of logs? Where would you store such amount of logs? Did you estimate the slowdown that you will bring to your system by adding logging? What if some of Ops team would misconfigure logging system and add email handler to that logger? Or sending SMS message?

Finally, this function is taken completely out of the scope so it’s impossible to judge on its quality. We should never examine a single function but only their composition instead. Every function has its own weaknesses and strengths. Linked together properly they build a system. In such a system, weakness of any specific function is supported by another function that validates data, filters input params, cleans up system symbols and so forth.

What I want exactly to say is there is nothing criminal in such a function that adds to integers. If it’s surrounded with another function that ensures that both numbers are really integers, there is no need to do all that validations again when adding them. In our projects, we usually separate our application on several levels. On the top level of the application, threre is one that validates the input data. If they are fine, functions on the next level operate on them as well.

In conclusion, please avoid using word “bug” when you talk about lack of logging or caching. Probably you don’t need that stuff. Keep functions free of having state and producing side effects. When you afraid of the data you use may be incorrect (Nulls, overflows), don’t rush into adding nested ifs. You’d rather update your validation logic first.