A few days ago, Murray blogged about Glom’s switch to non-recursive Automake. The topic attracted more feedback than I had expected, and I have been asked if I could give a more detailed account of the speed improvements mentioned by Murray.
For the background, I highly recommend to read the original paper by Peter Miller, Recursive Make Considered Harmful. Here’s an appetizer:
For large UNIX projects, the traditional method of building the project is to use recursive make. On some projects, this results in build times which are unacceptably large, when all you want to do is change one file. In examining the source of the overly long build times, it became evident that a number of apparently unrelated problems combine to produce the delay, but on analysis all have the same root cause.
First off, here are the results for the Glom build, measured before and after the switchover:
These results were obtained on a Lenovo desktop PC with an Intel Core 2 Duo E7200 at 2.53 GHz and 3 GiB RAM. The operating system is Ubuntu 9.04 (jaunty) for AMD64, installed on an ext4 volume of an encrypted LVM partition. At the time of the measurements, the full GNOME desktop interface was up and running, but otherwise idle. The entire measurement was repeated once with nearly identical results, and the timings presented here are the averages of the two runs.
As can be seen, the switch to non-recursive make reduced the time of a full Glom build by about 19 % on this machine, or more than one minute, which I think is quite decent. However, before you start to convert your own module, be aware that the performance gain you’ll see depends heavily on the specifics of your project. It’s practically guaranteed that you’ll see some improvement, but it could be a meager 0.1 % speedup. However, 40 % are just as possible. To help people gauge the possible effect on their project, I’ll try to outline the factors which I found to matter most in my experience.
Essentially, performance is improved because of two things: First, the time for make to get going and resolve dependencies includes a constant part, which introduces a significant delay even if there is nothing to rebuild. The second problem is that even with parallel make, subdirectories are normally entered in sequence, in order to prevent the number of parallel processes from increasing exponentially. A subdirectory cannot be left until all its targets have finished building. Unless the execution times of all parallel build processes overlap perfectly, one or more processor cores will idle. This effect will be more pronounced if the average number of source files per subdirectory is small. In the pathological case of one file per directory, there will be no parallelism at all. A classic example is the directory hierarchy of source code examples found in many library modules.
The programming language used also plays a role. On average, the idle periods due to imperfect overlap of parallel compilation processes will be longer if compilation is slow. Typically, a C++ source file will require substantially more time to compile than a C source file. The same holds for large source files versus small source files. The optimization level and other compiler settings also influence the compile time per object. With non-recursive make, the stalls due to imperfect overlap will only occur for targets which depend on multiple intermediate targets, such as when linking object files into a library. Even when make needs to wait for a dependency before it can continue with a target, it will still continue building any independent targets. Thus, the builds of multiple independent executables or libraries overlap, thereby hiding the stalls.
With all that mind, I think the 19 % speedup in the case of Glom should be considered above average. The directory hierarchy is deeply nested, with a small number of source files per subdirectory, although not as small as to be pathological. One important factor is that Glom used to build an intermediate static library in each subdirectory, which were later linked to produce the actual target binary. The new non-recursive build system omits these intermediate libraries and simply links all object files from all subdirectories in one go to produce the final target binary.
Automake has included support for non-recursive make for quite a while now, and it turns out to be surprisingly straightforward and easy to use. All you need to get going is
AM_PROG_CC_C_O in your
configure.ac for C compiler support, and
AUTOMAKE_OPTIONS = subdir-objects in your
Makefile.am. To refer to files in a subdirectory, just include the relative directory prefix. Automake variable names also include the directory prefix of the file names they are derived from. The
*.am files in the Glom repository may be helpful as real-world examples.
make -j∞ !