Software quality notes: nested "for" loops considered harmful
While working on Exomars as a software engineer, I was notified an issue to fix. I thought it was an interesting one, and spent some moments to think about how it could have been prevented in the first place. I care about code quality, and always try to build reliable software. For this purpose, I do not believe in rockstar developers (we are all fallible humans), but rather in (automatically enforced) processes and tools. In this article, I want to share my thoughts about this bug in particular, but more articles about quality in general will probably follow.
The bug was located in the thermal control algorithms. It is a low criticality part (thermal is slow to react and change), and less tested area than GNC.
Here is the code, slightly reworked to remove unnecessary fluff and keep only the relevant portion.
Even with sharp eyes, the bug is very difficult to see.
Moreover, the compiler and advanced static analysis tools are probably unable to help us with these arrays (the fact that the parameter
uint32_t *heaters is dynamic makes it even worse).
Here is the answer: the bug is line 21, where indexes
j are mixed up.
Actually, this is made up, it is not even the bug which was reported to me.
The bug I had was similar (issues with array indexes):
for (j=0; i<TCS_MAX_HEATERS; j++).
Again, indexes are mixed up, in an even worse way this time.
The particular bug does not matter much, as long as it involves two
I wanted to talk about this kind of bugs because double
for loops are quite common, and there are several ways to avoid entirely this class of bugs.
The easiest and most obvious measure is to use clearer variable identifiers.
On one hand, using unambiguous names such as
i_heater would probably have prevented the bug.
On the other hand, the bug is still possible because this measure is very primitive and depends entirely on the (sometimes stupid) human brain.
To fully avoid the bug, the measure needs to be enforced at all times, by a static analysis tool or, even better, the compiler.
A first attempt to use the compiler to prevent us from using the incorrect variable can be achieved with scope: one can declare the variables in the smallest scope.
In our example, this would prevent
j from being used in the outer loop.
Sadly, this is imperfect, because the other way around, using
i in the inner loop, is still possible.
Many (old) people will advise against this and require variables to be declared at the beginning of the function. This probably comes from old C standards, such as ANSI C (C90), where variables must be declared immediately after an opening brace. I believe this is wrong, because it is an outdated custom which is not relevant anymore. Indeed, since C99 and onwards, variables can be declared anywhere.
In any case, even if this measure is not very effective, it is still better than nothing. Good protection is defense in depth, and is made of several layers.
Function length and purpose
The next measure uses the scope again, but more effectively.
Functions can be made shorter, with a more limited role.
For example, a
searchHeater() function would allow splitting the two
for loops in two different functions, preventing the use of the wrong index.
An advantage of shorter functions is that it reduces the complexity, by having less variables and less branches. But too many small functions (here, 5-10 lines) could also be a disadvantage, as it could make the code more complicated with more indirections and layers of abstractions. Everything is a trade-off.
The last and, in my opinion, best protection measure is to not use indexes at all. Yes, I kinda cheat, I just remove the issue entirely.
I rewrote the above code in Rust, using Rust’s Iterator feature.
The Python version, which supports iterators as well, would look similar, with
index() instead of
It can be observed that the indexes have disappeared.
for loop now uses a reference (a pointer) to the heater and the second loop has simply been removed.
Iterators are powerful.
This kind of high level constructions avoid entire classes of bugs.
In this case, no mistakes can be made with indexes, like in C.
It can also be noted that the default value of
heaterId is not -1 (which is error prone: what if it is a
It is similar to Python’s
None, except that Rust is strongly typed, therefore
None can not be implicitly casted to 0.
Iterators might be possible to implement in C, but it would be very complex (function pointers?), and definitely not as safe as when built in the language, like in Rust.
Years after years and projects after projects, it has been demonstrated that manual reviews and inspections are fragile and unreliable. Moreover, this kind of bugs are hard to find by tools (compilers or static analyzers).
Therefore, another method must be used. The best one is to remove the issue altogether by using higher-level constructions such as Iterators. This often has a runtime cost, but in Rust’s case, it is small or even negligible when compared to C.