Chapter 3: Configuring your project with Autoconf

We should all be very grateful to David MacKenzie for having the foresight to--metaphorically speaking--stop and sharpen the ax. Otherwise we'd still be writing (copying) and maintaining long, complex hand-coded configure scripts today.

Before Automake, Autoconf was used alone, and many legacy open source projects have never really made the transition to the full Autotools suite. As a result, it would not be uncommon to find an open source project containing a file called configure.in (the older naming convention used by Autoconf) and hand-written Makefile.in templates.

Configure scripts, the Autoconf way

It's instructive, for this and other reasons that will become clear shortly, to spend some time focusing on the use of Autoconf alone. Exploring in this manner can provide a fair amount of insight into the operation of Autoconf by exposing aspects of this tool that are often hidden by Automake and other add-on tools.

The input to Autoconf is ... (drum roll please) ... shell script. Man, what an anti-climax! Okay, so it's not pure shell script. That is, it's shell script with macros, plus a bunch of macro definition files--both those that ship with an Autoconf distribution, as well as those that you or I write. The macro language used is called M4. ("M-what?!", you ask?) The M4 utility is a general-purpose macro language processor originally written by none other than Brian Kernighan and Dennis Ritchie in 1977. (The name M4 means "m plus 4 more letters", or the word "Macro"--cute, huh? As a point of interest, this naming convention is a fairly common practice in some software engineering domains. For example, the term internationalization is often abbreviated i18n, and the term localization is sometimes replaced with l10n, for the sake of brevity. The use of the term m4 here is no doubt a play on this concept.)

Some form of the M4 macro language processor is found on every Unix and Linux variant (as well as other systems) in use today. In fact, this ubiquity is the primary reason for its use in Autoconf. A primary design goal of Autoconf was that it should run on all systems without the addition of complex tool chains and utility sets. Autoconf depends on the existence of relatively few tools: m4, sed, and, as of version 2.62, the awk utility. Most of the Autotools (Autoconf being the exception) rely on the existence of a Perl interpreter, as well.

NOTE: Do not confuse the requirements of the Autotools with the requirements of the scripts and makefiles generated by them. The Autotools are maintainer tools, while the resulting scripts and makefiles are end-user tools. We can reasonably expect a higher level of installed functionality on development systems than we can on end-user systems. Nevertheless, the Autotools design goals still include a reliance only on a minimal set of pre-installed functionality, much of which is part of a default installation.

While it's true that configure.ac is written in shell script sprinkled with M4 syntax, the proper use of the M4 macro processor is the subject of Chapter 7. Because I want to stick to Autoconf in this chapter, I'll gloss over some key concepts related to M4, which I'll cover in more detail in Chapter 7. This chapter is designed to help you understand Autoconf concepts, however, so I will cover minor aspects of M4 as it makes sense to do so.

The smallest configure.ac file

The simplest possible configure.ac file has just two lines:

$ cat configure.ac
AC_INIT([jupiter], [1.0])
AC_OUTPUT
$

NOTE: This chapter builds on the Jupiter project begun in Chapter 2.

To those new to Autoconf, these two lines appear to be a couple of function calls, perhaps in the syntax of some obscure computer language. Don't let this appearance throw you--these are M4 macro expansions. The macros are defined in files distributed with Autoconf. The definition of AC_INIT, for example, is found in $PREFIX/share/autoconf/autoconf/general.m4, while AC_OUTPUT is defined in status.m4, in the same directory.

M4 macros are similar in many ways to macros defined in C language source files for the C preprocessor, which is also a text replacement tool. This isn't surprising, given that both M4 and cpp were originally designed by Kernighan and Ritchie.

The square brackets around the parameters are used by Autoconf as a quoting mechanism. Such quotes are only really necessary in cases where the context of the macro call could cause an ambiguity that the macro processor may resolve incorrectly (usually without telling you). We'll discuss M4 quoting in much more detail in Chapter 7. For now, just use Autoconf quotes ([ and ]) around every argument to ensure that the expected macro expansions are generated.

As with cpp macros, M4 macros may or may not take parameters, and (also as with cpp) when they do, a set of parentheses must be used when passing the arguments. In both M4 and cpp, the opening parenthesis must immediately follow the macro name, with no intervening white space. When a macro doesn't accept parameters, the parentheses are simply omitted. Unlike cpp, M4 allows a macro to have optional parameters, in which case you may omit the parentheses entirely if you choose not to pass a parameter.

The result of passing this configure.ac file through Autoconf is essentially the same file (now called configure), only with these two macros fully expanded.

Now, if you've been programming in C for many years, as I have, then you've no doubt run across a few C preprocessor macros from the dark regions of the lower realm. I'm talking about those truly evil cpp macros that expand into one or two pages of C code! You know the ones I'm talking about--they should really have been written as C functions, but the author was overly worried about performance!

Well baby, you ain't seen nothin' yet! These two M4 macros expand into a file containing over 2200 lines of Bourne shell script that's over 60K bytes in size! Interestingly, you wouldn't really know this by looking at their definitions. They're both fairly short--only a dozen or two lines each. The reason for this apparent disparity is simple--they're written in a modular fashion, each macro expanding several others, which in turn expand several others, and so on.

Executing Autoconf

Running Autoconf couldn't be simpler. Just execute autoconf in the same directory as your configure.ac file. While I could do this for each example in this chapter, I'm going to use the autoREconf (capitalization added for emphasis) command instead of the autoconf command. The reason for this is that running autoreconf has exactly the same effect as running autoconf, except that autoreconf will also do "the right thing" when you start adding Automake and Libtool functionality to your build system. autoreconf is the recommended method for executing the Autotools tool chain, and it's smart enough to only execute the tools that you need, in the order that you need them, and with the options that you need (with one exception that I'll mention here shortly).

$ autoreconf
$ ls -lp
autom4te.cache/
configure
configure.ac
$

First, notice that autoreconf operates at exactly the same level of verbosity as the tools it runs. By default, zero. If you want to see something happening, use the -v or --verbose option. If you want autoreconf to run the other Autotools in verbose mode, add -vv to the command line. (You may also pass --verbose --verbose, but this syntax seems a bit... verbose to me--sorry, I couldn't resist!)

Next, notice that Autoconf creates a directory called autom4te.cache. This is the autom4te (pronounced "automate") cache directory. This cache is used to speed up access to configure.ac by successive executions of utilities in the Autotools tool chain. I'll cover autom4te in greater detail in Chapter 9, where I'll show you how to write your own Autoconf macros that are "environmentally friendly".

Executing configure

If you recall from the last section of Chapter 2, the GNU Coding Standards document indicates that configure should generate a script called config.status, whose job it is to generate files from templates. Well, this is exactly the sort of functionality found in an Autoconf-generated configure script. An Autoconf-generated configure script has two primary tasks:

  • perform requested checks
  • generate, and then call config.status

The results of all of the checks performed by the configure script are written, as environment variable settings, to the top of config.status, which uses the values of these environment variables as replacement text for the Autoconf substitution variables it finds in template files (Makefile.in, config.h.in, etc.).
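A rough sketch of the kind of replacement config.status performs can be had with sed alone (the template file name and the substituted values here are my own stand-ins; config.status derives the real values from the checks configure performs):

```shell
# A template containing Autoconf-style substitution variables
cat > demo.in << 'EOF'
package = @PACKAGE_NAME@
version = @PACKAGE_VERSION@
EOF
# Replace each @VARIABLE@ with its configured value
sed -e 's|@PACKAGE_NAME@|jupiter|' \
    -e 's|@PACKAGE_VERSION@|1.0|' demo.in > demo.out
cat demo.out
# → package = jupiter
# → version = 1.0
```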

When you execute configure, it tells you that it's creating the config.status file. In fact, it also creates a log file called config.log that has several important attributes:

$ ./configure
configure: creating ./config.status
$
$ ls -lp
autom4te.cache/
config.log
config.status
configure
configure.ac
$

The config.log file contains the following information:

  • the command line used to invoke configure (very handy!)
  • information about the platform on which configure was executed
  • information about the core tests executed by configure
  • the line number in configure at which config.status is generated and then called

At this point in the log file, config.status takes over generating log information--it adds the command line used to invoke config.status. After config.status generates all of the files from their templates, it then exits, returning control to configure, which then adds the following information to the log:

  • the cache variables used by config.status to perform its tasks
  • the list of output variables that may be replaced in templates
  • the exit code returned by configure to the shell

This information is invaluable when debugging a configure script and its associated configure.ac file.

Executing config.status

Now that you know how configure works, you can probably see that there might be times when you'd be tempted to simply execute config.status yourself, rather than going to all the trouble of having configure perform all those time-consuming checks first. And right you'd be. This was exactly the intent of the Autoconf designers--and the authors of the GNU Coding Standards, by whom these design goals were originally conceived.

There are, in fact, times when you'd just like to manually regenerate all of your output files from their corresponding templates. But, far more importantly, config.status can be used by your makefiles to regenerate themselves individually from their templates, when make determines that something in a template file has changed.

Rather than call configure to perform needless checks (your environment hasn't changed, has it? Just your template files), your makefiles should be written in a way that ensures that output files are dependent on their templates. If a template file changes (because, for example, you modified one of your Makefile.in templates), then make calls config.status to regenerate this file. Once the Makefile is regenerated, then make re-executes the original make command line--basically, it restarts itself. This is actually a feature of the make utility.

Let's take a look at the relevant portion of just such a Makefile.in template:

Makefile: Makefile.in config.status
        ./config.status Makefile

Another interesting bit of make functionality is that it always looks for a rule with a target named "Makefile". Such a rule allows make to regenerate the source makefile from its template, in the event that the template changes. It does this before executing either the user's specified targets, or the default target, if none was given.
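You can watch this remake-and-restart behavior of GNU make in a tiny self-contained experiment (the directory, file contents, and echo messages are purely illustrative, not from the Jupiter project):

```shell
mkdir -p regen
echo 'template' > regen/Makefile.in
# A makefile that knows how to regenerate itself from its template
printf 'Makefile: Makefile.in\n\t@echo regenerating\n\t@touch Makefile\nall:\n\t@echo building\n' > regen/Makefile
sleep 1                    # ensure distinct timestamps
touch regen/Makefile.in    # the template is now newer than Makefile
(cd regen && make all)
# → regenerating
# → building
```

make first brings Makefile up to date using the "Makefile:" rule, then restarts itself and runs the requested target.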

This example indicates that Makefile is dependent on Makefile.in. Note that Makefile is also dependent on config.status. After all, if config.status is regenerated by the configure script, then it may generate a makefile differently--perhaps something in the compilation environment changed, such as when a new package is added to the system, so that configure can now find libraries and headers not previously found. In this case, Autoconf substitution variables may have different values. Thus, Makefile should be regenerated if either Makefile.in or config.status changes.

Since config.status is itself a generated file, it stands to reason that this line of thinking can be carried to the configure script as well. Expanding on the previous example:

Makefile: Makefile.in config.status
        ./config.status $@

config.status: configure
        ./config.status --recheck

Since config.status is a dependency of the Makefile rule, then make will check for a rule whose target is config.status and run its commands if the dependencies of config.status (configure) are newer than config.status.

Adding some real functionality

Well, it's about time we move forward and put some true functionality into this configure.ac file. I've danced around the topic of having config.status generate a makefile up to this point. Here's the code to actually make this happen in configure.ac. It constitutes a single additional macro expansion between the original two lines:

$ cat configure.ac
AC_INIT([jupiter], [1.0])
AC_CONFIG_FILES([Makefile
                 src/Makefile])
AC_OUTPUT
$

This code assumes that I have templates for Makefile and src/Makefile, called Makefile.in and src/Makefile.in, respectively. These files look exactly like their Makefile counterparts, with one exception: Any text that I want Autoconf to replace should be marked as Autoconf substitution variables, using the @VARIABLE@ syntax.

To create these files, I've merely renamed the existing makefiles to Makefile.in within the top-level and src directories. By the way, this is a common practice when "autoconfiscating" a project. Next, I added a few Autoconf substitution variables to replace my original default values. In fact, at the top of this file, I've added the special Autoconf substitution variable, @configure_input@, after a makefile comment hash mark. This comment line will become the following text line in the generated Makefile:

# "Makefile.  Generated from Makefile.in by conf...

I've also added the makefile regeneration rules (from the examples above) to each of these templates, with slight file path differences in each file to account for their different positions relative to config.status and configure:

Makefile.in

# @configure_input@

# Package-related substitution variables
package        = @PACKAGE_NAME@
version        = @PACKAGE_VERSION@
tarname        = @PACKAGE_TARNAME@
distdir        = $(tarname)-$(version)

# Prefix-related substitution variables
prefix         = @prefix@
exec_prefix    = @exec_prefix@
bindir         = @bindir@
...
$(distdir):
        mkdir -p $(distdir)/src
        cp configure $(distdir)
        cp Makefile.in $(distdir)
        cp src/Makefile.in $(distdir)/src
        cp src/main.c $(distdir)/src

distcheck: $(distdir).tar.gz
        gzip -cd $+ | tar xvf -
        cd $(distdir); ./configure
        $(MAKE) -C $(distdir) all check
        $(MAKE) -C $(distdir) \
         DESTDIR=$${PWD}/$(distdir)/_inst \
         install uninstall
        $(MAKE) -C $(distdir) clean
        rm -rf $(distdir)
        @echo "*** Package $(distdir).tar.gz is\
         ready for distribution."

Makefile: Makefile.in config.status
        ./config.status $@

config.status: configure
        ./config.status --recheck
...

src/Makefile.in

# @configure_input@

# Package-related substitution variables
package        = @PACKAGE_NAME@
version        = @PACKAGE_VERSION@
tarname        = @PACKAGE_TARNAME@
distdir        = $(tarname)-$(version)

# Prefix-related substitution variables
prefix         = @prefix@
exec_prefix    = @exec_prefix@
bindir         = @bindir@
...
Makefile: Makefile.in ../config.status
        cd .. && ./config.status $@

../config.status: ../configure
        cd .. && ./config.status --recheck
...

I've removed the export statement in the top-level Makefile.in, and added a copy of all of the substitution variables into src/Makefile.in. Since config.status is generating both of these files, I can reap excellent benefits by substituting everything into both files. The primary advantage of doing this is that I can now run make in any sub-directory, and not be concerned about environment variables that would have been passed down by a higher-level makefile.

Finally, I've changed the distribution targets a bit. Rather than distribute the makefiles, I now want to distribute the Makefile.in templates, as well as the configure script. In addition, the distcheck target needed to be enhanced such that it runs the configure script before attempting to run make.

Generating files from templates

I'm now generating makefiles from Makefile.in templates. The fact is, however, that any (white space delimited) file listed in AC_CONFIG_FILES will be generated from a file of the same name with a ".in" extension, found in the same directory. The ".in" extension is the default template naming pattern for AC_CONFIG_FILES, but this default behavior may be overridden, if you wish. I'll get into the details shortly.

Autoconf generates sed or awk expressions into the resulting configure script, which then copies them into the config.status script. The config.status script uses these tools to perform this simple string replacement.

Both sed and awk are text processing tools that operate on file streams. The advantage of a stream editor (the name "sed" is actually a contraction of the phrase "stream editor") is that it replaces text patterns in a byte stream. Thus, both sed and awk can operate on huge files, because they don't need to load the entire input file into memory in order to process it. The expression list passed to sed or awk by config.status is built by Autoconf from a list of variables defined by various macros, many of which I'll cover in greater detail in this chapter.

The important thing to notice here is that the Autoconf variables are the only items replaced in Makefile.in while generating the makefile. The reason this is important to understand is that it helps you to realize the flexibility you have when allowing Autoconf to generate a file from a template. This flexibility will become more apparent as I get into various use cases for the pre-defined Autoconf macros, and later in Chapter 9 when I delve into the topic of writing your own Autoconf macros.

At this point, I've created a basic configure.ac file, and I can indeed run autoreconf, followed by the generated configure script, and then make to build the Jupiter project.

The idea that I want to promote at this point is that this simple three-line configure.ac file generates a configure script that is fully functional, according to the definition of a configure script given in Chapter 7 of the GNU Coding Standards document.

The resulting configure script runs various system checks and generates a config.status file, which can replace a fair number of substitution variables in a set of specified template files in a build system. That's a lot of stuff for three lines of code. (You'll recall my comments in the introduction to this book about C++ doing a lot for you with just a few lines of code?)

Adding VPATH build functionality

Okay, you may recall at the end of Chapter 2, I mentioned that I hadn't yet covered a key concept--that of VPATH builds. A VPATH build is a way of using a particular makefile construct (VPATH) to configure and build a project in a directory other than the source directory. Why is this important? Well, for several reasons. You may need to:

  1. maintain a separate debug configuration,
  2. test different configurations, side by side,
  3. keep a clean source directory for patch diffs after local modifications,
  4. or build from a read-only source directory.

These are all great reasons, but won't I have to change my entire build system to support this type of remote build? As it turns out, it's quite simple using the make utility's VPATH statement. VPATH is short for "virtual path", meaning "virtual search path". A VPATH statement contains a colon-separated list of places to look for dependencies, when they can't be found relative to the current directory:

VPATH = some/path:some/other/path:yet/another/path

jupiter : main.c
        gcc ...

In this (contrived) example, if make can't find main.c in the current directory while processing the rule, it will look for some/path/main.c, and then for some/other/path/main.c, and finally for yet/another/path/main.c, before finally giving up in despair--okay, perhaps only with an error message about not knowing how to make main.c.
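Here's a self-contained sketch of VPATH resolution that you can actually run (the directory and file names are invented for the demonstration). The build directory holds only a makefile; the "source" lives elsewhere, and VPATH lets make find it:

```shell
mkdir -p vdemo/srcdir vdemo/build
echo 'hello' > vdemo/srcdir/input.txt
# input.txt is not in the build directory; VPATH supplies its location,
# and $< expands to the path make actually found (../srcdir/input.txt)
printf 'VPATH = ../srcdir\noutput.txt: input.txt\n\tcp $< output.txt\n' > vdemo/build/Makefile
(cd vdemo/build && make output.txt)
cat vdemo/build/output.txt
# → hello
```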

"Nice feature!", you say? Nicer than you think, because with just a few simple modifications, I can now completely support remote builds in my jupiter project build system:

Makefile.in

...
# VPATH-related substitution variables
srcdir         = @srcdir@
VPATH          = @srcdir@
...
$(distdir):
        mkdir -p $(distdir)/src
        cp $(srcdir)/configure $(distdir)
        cp $(srcdir)/Makefile.in $(distdir)
        cp $(srcdir)/src/Makefile.in $(distdir)/src
        cp $(srcdir)/src/main.c $(distdir)/src
...

src/Makefile.in

...
# VPATH-related substitution variables
srcdir         = @srcdir@
VPATH          = @srcdir@
...
jupiter: main.c
        gcc -g -O0 -o $@ $(srcdir)/main.c
...

That's it. Really. When config.status generates a file, it replaces an Autoconf substitution variable called @srcdir@ with the relative path to the template's source directory. Each makefile will get a different value for @srcdir@, depending on the relative location of its template.

The rules then for supporting VPATH builds in your make system are as follows:

  1. Set a make variable, srcdir to the @srcdir@ substitution variable.
  2. Set VPATH to @srcdir@ also--don't use $(srcdir) because some older versions of make don't do variable substitution within the value of VPATH.
  3. Prefix all file dependencies used in commands with $(srcdir)/.

If the source directory is the same as the build directory, then the @srcdir@ substitution variable degenerates to ".", so all of these "$(srcdir)/" prefixes degenerate to "./", which is just so much harmless baggage.

A quick example is the easiest way to show you how this works. Now that Jupiter is fully functional with respect to VPATH builds, let's just give it a try. Start in the jupiter project directory, create a subdirectory called "build", and then change into that directory. Now run configure using a relative path, and then list the current directory contents:

$ mkdir build
$ cd build
$ ../configure
configure: creating ./config.status
config.status: creating Makefile
config.status: creating src/Makefile
$ ls -1p
config.log
config.status
Makefile
src/
...

The entire build system seems to have been constructed by configure and config.status within the build sub-directory, just as it should be. What's more, it actually works:

...
$ make
make -C src all
make[1]: Entering directory `../prj/jupiter/bui...
gcc -g -O0 -o jupiter ../../src/main.c
make[1]: Leaving directory `../prj/jupiter/bui...
$ ls -1p src
jupiter
Makefile

VPATH builds work, not just from sub-directories of the project directory, but from anywhere you can access the project directory, using either a relative or an absolute path. This is just one more thing that Autoconf does for you in Autoconf-generated configure scripts. Just imagine managing proper relative paths to source directories in your own hand-coded configure scripts!

Let's take a breather

At this point, I'd like you to stop and consider what you've seen so far: I've shown you a mostly complete build system that includes most of the features outlined in the GNU Coding Standards document. The features of the Jupiter project's make system are all fairly self-contained, and reasonably simple to grasp. The most difficult feature to implement by hand is the configure script. In fact, writing a configure script by hand is so labor intensive relative to the simplicity of the Autoconf version that I just skipped over the hand-coded version entirely in Chapter 2.

If you've been one to complain about Autoconf in the past, I'd like you to consider what you have to complain about now. You now know how to get very feature-rich configuration functionality in just three lines of code. Given what you know now about how configure scripts are meant to work, can you see the value in Autoconf?

Most people never have trouble with that portion of Autoconf that I've covered up to this point. The trouble is that most people don't create their build systems in the manner I've just shown you. They try to copy the build system of another project, and then tweak it to make it work in their own project. Later when they start a new project, they do the same thing again. Are they going to run into problems? Sure--the "stuff" they're copying was often never meant to be used the way they're trying to use it.

I've seen projects in my experience whose configure.ac file contained junk that had nothing to do with the project to which it belonged. These left-over bits came from the previous project, from which configure.ac was copied. But the maintainer didn't know enough about Autoconf to remove the cruft. With the Autotools, it's better to start small, and add what you need, than to start with a full-featured build system, and try to pare it down to size.

Well, I'm sure you're feeling like there's a lot more to learn about Autoconf. And you're right, but what additional Autoconf macros are appropriate for the Jupiter project?

An even quicker start with autoscan

The simplest way to create a (mostly) complete configure.ac file is to run the autoscan utility, which, if you remember from Chapter 1, is part of the Autoconf package.

First, I'll clean up the droppings from my earlier experiments, and then run the autoscan utility in the jupiter directory. Note here that I'm NOT deleting my original configure.ac file - I'll just let autoscan tell me what's wrong with it. In less than a second I'm left with a couple of new files in the top-level directory:

$ rm config.* Makefile src/Makefile ...
$ ls -1p
configure.ac
Makefile.in
src/
$ autoscan
configure.ac: warning: missing AC_CHECK_HEADERS
   ([stdlib.h]) wanted by: src/main.c:2
configure.ac: warning: missing AC_HEADER_STDC
   wanted by: src/main.c:2
configure.ac: warning: missing AC_PROG_CC
   wanted by: src/main.c
configure.ac: warning: missing AC_PROG_INSTALL
   wanted by: Makefile.in:11
$ ls -1p
autom4te.cache/
autoscan.log
configure.ac
configure.scan
Makefile.in
src/

NOTE: I've wrapped some of the output lines for the sake of column width during publication.

autoscan creates two files, configure.scan and autoscan.log, from a project directory hierarchy. The project may already be instrumented for the Autotools, or not. It doesn't really matter, because autoscan is decidedly non-destructive: it will never alter any existing files in a project.

autoscan generates a warning message for each issue discovered in an existing configure.ac file. In this example, autoscan noticed that configure.ac really should be using the AC_CHECK_HEADERS, AC_HEADER_STDC, AC_PROG_CC and AC_PROG_INSTALL macros. It made these assumptions based on scanning my existing Makefile.in templates and C source files, as you can see by the comments after each warning statement. You can always see these messages (in even greater detail, in fact) by examining the autoscan.log file.

Now let's take a look at the generated configure.scan file. autoscan has added more text to configure.scan than was originally in my configure.ac file, so it's probably easier for me to just overwrite configure.ac with configure.scan and then change the few bits of information that are specific to Jupiter:

$ mv configure.scan configure.ac
$ cat configure.ac
#                -*- Autoconf -*-
# Process this file with autoconf to produce ...

AC_PREREQ(2.61)
AC_INIT(FULL-PACKAGE-NAME, VERSION,
        BUG-REPORT-ADDRESS)
AC_CONFIG_SRCDIR([src/main.c])
AC_CONFIG_HEADERS([config.h])

# Checks for programs.
AC_PROG_CC
AC_PROG_INSTALL

# Checks for libraries.

# Checks for header files.
AC_HEADER_STDC
AC_CHECK_HEADERS([stdlib.h])

# Checks for typedefs, structures, and compiler ...

# Checks for library functions.

AC_CONFIG_FILES([Makefile
                 src/Makefile])
AC_OUTPUT

NOTE: The contents of your configure.ac file may differ slightly from mine, depending on the version of Autoconf that you have installed. I have version 2.62 of GNU Autoconf installed on my system (the latest, as of this writing), but if your version of autoscan is older (or newer), you may see some minor differences.

I'll then edit the file and change the AC_INIT macro to reflect the Jupiter project parameters:

$ head configure.ac
#                -*- Autoconf -*-
# Process this file with autoconf to produce ...

AC_PREREQ([2.61])
AC_INIT([jupiter], [1.0], [bugs@jupiter.org])
AC_CONFIG_SRCDIR([src/main.c])
AC_CONFIG_HEADERS([config.h])
$

The autoscan utility really does a lot of the work for you. The GNU Autoconf manual states that you should hand-tailor this file to your project before using it. This is true, but there are only a few key issues to worry about (besides those related to the AC_INIT macro). I'll cover each of these issues in turn, starting at the top of the file.

Trying out configure

I like to experiment, so the first thing I'd do at this point would be to run autoreconf on this new configure.ac, and then try to run the generated configure script to see what happens. If autoscan is all it's cracked up to be, then the resulting configure script should generate some makefiles for me:

$ autoreconf
$ ./configure
checking for gcc... gcc
checking for C compiler default output file name...
checking whether the C compiler works... yes
checking whether we are cross compiling... no
checking for suffix of executables...
checking for suffix of object files... o
checking whether we are using the GNU C compiler...
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89...
configure: error: cannot find install-sh or 
   install.sh in "." "./.." "./../.."
$

Well, we didn't get too far. I mentioned the install utility in Chapter 1, and you may have already been aware of it. It appears here that Autoconf is looking for a shell script called install-sh or install.sh.

Autoconf is all about portability, and unfortunately, the install utility is not as portable as we'd like it to be. From one platform to another, critical bits of installation functionality are just different enough to cause problems, so the Autotools provide a shell script called install-sh (deprecated name: install.sh) that acts as a wrapper around the platform install utility. This wrapper script masks important differences between various versions of install.
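The interface that install-sh standardizes looks like this (the file names, mode, and staging path are purely illustrative); it's options like -m and -d, whose behavior varies subtly from one platform's install to another's, that the wrapper papers over:

```shell
mkdir -p staging/bin
printf '#!/bin/sh\necho hello\n' > hello.sh
# Copy the script into place, setting its permissions explicitly
install -m 755 hello.sh staging/bin/hello
ls -l staging/bin/hello
```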

autoscan noticed that I used the install program in my src/Makefile.in template, and generated an expansion of the AC_PROG_INSTALL macro into the configure.scan file based on this observation. The problem is that the generated configure script couldn't find the install-sh wrapper script.

This seems to be a minor defect in Autoconf--if Autoconf expects install-sh to be in my project directory, then it should have just put it there, right? Well, autoreconf has a command line option, --install, which is supposed to install missing files like this for me. I'll give it a try. Here's a before-and-after picture of my directory structure:

$ ls -1p
autoscan.log
configure.ac
Makefile.in
src/
$ autoreconf --install
$ ls -1p
autom4te.cache/
autoscan.log
config.h.in
configure
configure.ac
Makefile.in
src/

Hmmm. It didn't seem to work, as there's no install-sh file in the directory after running autoreconf --install. This is, in my opinion, a defect in both autoreconf and autoconf. You see, when autoreconf is used with the --install command-line option, it should install all auxiliary files required by all Autoconf macros used in configure.ac. The trouble is, this auxiliary-file-installation functionality is actually a part of Automake, not Autoconf. So when you use --install on the autoreconf command line, it passes tool-specific install-missing-files options down to each of the tools that it calls. This technique would have worked just fine, except that Autoconf doesn't provide an option to install any missing files.

Worse still, the GNU Autoconf manual tells you in Section 5.2.1, under AC_PROG_INSTALL, that "Autoconf comes with a copy of install-sh that you can use." But this is a lie. In fact, it's Automake and Libtool that come with copies of install-sh, not Autoconf.

I could just copy install-sh from the Automake installation directory (PREFIX/share/automake...), but I'll just try running automake --add-missing --copy instead. The Automake --add-missing option copies in the missing required utility scripts, and the --copy option indicates that true copies should be made. Without the --copy option, automake would actually just create links to these files where they're installed (usually /usr/(local/)share/automake-1.10):

$ automake --add-missing --copy
configure.ac: no proper invocation of AM_INIT_...
configure.ac: You should verify that configure...
configure.ac: that aclocal.m4 is present in th...
configure.ac: and that aclocal.m4 was recently...
configure.ac:11: installing `./install-sh'
automake: no `Makefile.am' found for any confi...

Ignoring the warnings indicating that I've not yet configured my project properly for Automake, I can now see that install-sh was copied into my project root directory:

$ ls -1p
autom4te.cache/
autoscan.log
configure.ac
configure.scan
install-sh
Makefile.in
src/

So why didn't autoreconf --install do this for me? Isn't it supposed to run all the programs that it needs to, based on my configure.ac script? As it happens, it was precisely because my project was not configured for Automake that autoreconf failed to run automake --add-missing --copy. Autoreconf saw no reason to run automake, because configure.ac doesn't contain the requisite macros for initializing Automake.

And therein lies the defect. First, Autoconf should ship with install-sh, since it provides a macro that requires it, and because autoscan adds that macro based on the contents of a Makefile.in template. In addition, Autoconf should provide an "add-missing" command-line option, and autoreconf should use it when called with the --install option. This is most likely an example of the "work-in-progress" nature of the Autotools.

But let's take a step backward for a moment. There is another obvious solution to this problem. The install-sh script is not really required by any code generated by Autoconf. How could it be? Autoconf doesn't generate any makefile constructs; it only substitutes variables into your Makefile.in templates. Thus, there's really no reason for Autoconf to complain about a missing install-sh script. When I presented this problem on the Autoconf mailing list, I was told several times that autoconf has no business copying install-sh into a project directory, and thus there is no such functionality accessible from the Autoconf command line. If that is indeed the case, then Autoconf has no business complaining about the missing file. Regardless, something needs to be fixed...

The proverbial autogen.sh script

Before autoreconf came along, maintainers used a shell script, often called autogen.sh, to run all of the Autotools required for their projects in the proper order. The autogen.sh script is often fairly sophisticated, but to solve this problem temporarily, I'll just add a simple temporary autogen.sh script to the project root directory:

$ echo "automake --add-missing --copy
> autoreconf --install" > autogen.sh
$ chmod 0755 autogen.sh

If you don't want to see all the error messages from automake, just redirect the stderr and stdout output to /dev/null.

Eventually, we'll be able to get rid of the autogen.sh file and just run autoreconf --install, but for now, this will solve our missing-files problem. Hopefully, you read this section before scratching your head too much over the missing install-sh script. I can now run my newly generated configure script without errors. I'll cover the details of properly using the AC_PROG_INSTALL macro shortly, and I'll cover Automake in much greater detail in Chapter 4.

Updating Makefile.in

Okay, so how do the additional macros added by autoscan affect my build system? Well, I have some new files to consider. For one, the config.h.in file is generated for me now by autoheader. I can assume that autoreconf now executes autoheader for me when I run it. Additionally, I have a new file in my project called install-sh.

Anything provided by, or generated by, the Autotools should be copied into the archive directory so that it can be shipped with my release tarballs. So I should add these two files to the $(distdir) target in the top-level Makefile.in template. Note that I don't need to distribute autogen.sh, as it's purely a maintainer tool--my users shouldn't ever need to execute it from a tarball distribution:

Makefile.in

...
$(distdir):
        mkdir -p $(distdir)/src
        cp $(srcdir)/configure $(distdir)
        cp $(srcdir)/config.h.in $(distdir)
        cp $(srcdir)/install-sh $(distdir)
        cp $(srcdir)/Makefile.in $(distdir)
        cp $(srcdir)/src/Makefile.in $(distdir)/src
        cp $(srcdir)/src/main.c $(distdir)/src
...

If you're beginning to think that this could become a maintenance nightmare, you're right. I warned you in Chapter 2 that the $(distdir) target was painful to maintain. Luckily, the distcheck target still exists and still works as designed. It would have caught this problem, because the distribution build will not work without these additional files--and certainly the check target wouldn't work if the build didn't. When I discuss Automake in Chapter 4, much of this mess will be cleared up.

Initialization and package information

The first section in my new configure.ac file (copied from configure.scan) contains Autoconf initialization macros. These are required for all projects. Let's consider each of these macros individually, as they're all pretty important.

AC_PREREQ

The AC_PREREQ macro simply defines the lowest version of Autoconf that may be used to successfully process the configure.ac script. The manual indicates that AC_PREREQ is the only macro that may be used before AC_INIT. The reason for this should be obvious--you'd like to be able to ensure you're using a late enough version of Autoconf before you begin processing any other macros, which may be version dependent. As it turns out, AC_INIT is not version dependent anyway, so you may place it first, if you're so inclined. I happen to prefer the way autoscan generates the file, so I'll leave it alone.
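
For example, a project that relies on features introduced in Autoconf version 2.59 (the version number here is purely illustrative) would begin its configure.ac like this:

```
AC_PREREQ([2.59])
AC_INIT([jupiter], [1.0])
```

Processing this script with an older Autoconf fails immediately with a clear version error, rather than with a confusing macro failure somewhere further down.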

AC_INIT

The AC_INIT macro, as its name implies, initializes the Autoconf system. It accepts up to four arguments (autoscan only generates a call with the first three), PACKAGE, VERSION, and optional BUG-REPORT and TARNAME arguments. The PACKAGE argument is intended to be the name of the package. It will end up (in a canonicalized form) as the first part of the name of an Automake-generated release distribution tarball when you run "make dist".

In fact, by default, Automake-generated tarballs will be named TARNAME-VERSION.tar.gz, but TARNAME is set to a canonicalized form of the PACKAGE string (lower-cased, with all punctuation converted to underscores), unless you specify TARNAME manually, so bear this in mind when you choose your package name and version string. Incidentally, M4 macro arguments, including PACKAGE and VERSION, are just strings. M4 doesn't attempt to interpret any of the text that it processes.

The optional BUG-REPORT argument is usually set to an email address, but it can be any text really. An Autoconf substitution variable called PACKAGE_BUGREPORT will be created for it, and that variable will be added to a config.h.in template as a C preprocessor string, as well. The intent is that you use the variable in your code (or in template text files anywhere in your project) to present an email address for bug reports at appropriate places--possibly when the user requests help or version information from your application.

While the VERSION argument can be anything you like, there are a few free software conventions that will make life a little easier for you if you follow them. The widely used convention is to pass in major.minor (e.g., 1.2). However, there's nothing that says you can't use major.minor.revision if you want, and there's nothing wrong with this approach. None of the resulting VERSION macros (Autoconf, shell, or make) are parsed or analyzed anywhere--they're only used in various places as replacement text--so if you wish, you may even add non-numeric text to this macro, such as 0.15.alpha1, which is useful occasionally.

Note that the RPM package manager does indeed care what you put in the version string. For the sake of RPM, you may wish to limit the version string text to only alpha-numerics and periods--no dashes or underscores, unfortunately.

Autoconf will generate the substitution variables PACKAGE_NAME, PACKAGE_VERSION, PACKAGE_TARNAME, PACKAGE_STRING (a stylized concatenation of the package name and version information), and PACKAGE_BUGREPORT from arguments to AC_INIT.
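
For example (the package name, version, and bug-report address here are made up):

```
AC_INIT([Jupiter], [1.0], [jupiter-bugs@example.org])
```

Given this call, PACKAGE_NAME becomes "Jupiter", PACKAGE_VERSION becomes "1.0", PACKAGE_STRING becomes "Jupiter 1.0", PACKAGE_BUGREPORT becomes "jupiter-bugs@example.org", and PACKAGE_TARNAME becomes the canonicalized "jupiter".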

AC_CONFIG_SRCDIR

The AC_CONFIG_SRCDIR macro is just a sanity check. Its purpose is to ensure that the generated configure script knows that the directory on which it is being executed is in fact the correct project directory. The argument can be a relative path to any source file you like - I try to pick one that sort of defines the project. That way, in case I ever decide to reorganize source code, I'm not likely to lose it in a file rename. But it doesn't really matter, because if you do rename the file or move it to some other location some time down the road, you can always change the argument passed to AC_CONFIG_SRCDIR. Autoconf will tell you immediately if it can't find this file--after all, that's the purpose of this macro in the first place!
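
In the jupiter project, for instance, a reasonable choice would be:

```
AC_CONFIG_SRCDIR([src/main.c])
```

If src/main.c is ever renamed or moved, configure will fail immediately with a clear message, and the argument can simply be updated to match.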

The instantiating macros

Before we dive into the details of AC_CONFIG_HEADERS, I'd like to spend a little time on the framework provided by Autoconf. From a high-level perspective, there are four major things happening in configure.ac:

  1. Initialization
  2. File instantiation
  3. Check requests
  4. Generation of the configure script

We've pretty much covered initialization--there's not much to it, although there are a few more macros you should be aware of. (Check out the GNU Autoconf manual to see what these are--look up AC_COPYRIGHT, for an example.) Now, let's move on to file instantiation.

There are actually four so-called "instantiating macros", which include AC_CONFIG_FILES, AC_CONFIG_HEADERS, AC_CONFIG_COMMANDS and AC_CONFIG_LINKS. An instantiating macro is one which defines one or more tags, usually referring to files that are to be translated by the generated configure scripts, from a template containing Autoconf substitution variables.

NOTE: You might need to change the name of AC_CONFIG_HEADER (singular) to AC_CONFIG_HEADERS (plural) in your version of configure.scan. This was a defect in autoscan that had not been fixed yet in Autoconf version 2.61. I reported the defect and a patch was committed. Version 2.62 works correctly. If your configure.scan is generated with a call to AC_CONFIG_HEADER, just change it manually. Both macros will work, as the singular version was the older name of this macro, but the older macro is less functional than the newer one.

These four instantiating macros have an interesting signature in common:

AC_CONFIG_xxxS([tag ...], [commands], [init-cmds])

For each of these four macros, the tag argument has the form, OUT[:INLIST] where INLIST has the form, IN0[:IN1:...:INn]. Often, you'll see a call to one of these macros with only a single simple argument, like this:

AC_CONFIG_HEADERS([config.h])

In this case, config.h is the OUT portion of the above specification. The default INLIST is the OUT portion with ".in" appended to it. So the above call is exactly equivalent to:

AC_CONFIG_HEADERS([config.h:config.h.in])

What this means is that config.status will contain shell code that will generate config.h from config.h.in, substituting all Autoconf variables in the process. You may also provide a list of input files to be concatenated, like this:

AC_CONFIG_HEADERS([config.h:cfg0:cfg1:cfg2])

In this example, config.status will generate config.h by concatenating cfg0, cfg1 and cfg2, after substituting all Autoconf variables. The GNU Autoconf manual calls this entire "OUT:INLIST" thing a "tag".

So, what's that all about, anyway? Why not call it a file? Well, the fact is, this parameter's primary purpose is to provide a sort of command-line target name--much like Makefile targets. It also happens to be used as a file system name, if the associated macro happens to generate file system entries, as is the case when calling AC_CONFIG_HEADERS, AC_CONFIG_FILES and AC_CONFIG_LINKS.

But AC_CONFIG_COMMANDS doesn't actually generate any files. Rather, it runs arbitrary shell code, as specified by the user in the macro. Thus, rather than name this first parameter after a secondary function (the generation of files), the manual refers to it by its primary purpose - as a command line tag-name that may be specified on the config.status command line. Here's an example:

./config.status config.h

This config.status command line will regenerate the config.h file based on the macro call to AC_CONFIG_HEADERS in configure.ac. It will only regenerate config.h. Now, if you're curious like me, you've already been playing around a little, and have tried typing ./config.status --help to see what options are available when executing config.status. You may have noticed that config.status has a help signature like this:

$ ./config.status --help
`config.status' instantiates files from templates
according to the current configuration.

Usage: ./config.status [OPTIONS] [FILE]...

  -h, --help       print this help, then exit
...
  --file=FILE[:TEMPLATE]
...
Configuration files:
 Makefile src/Makefile

Configuration headers:
 config.h

NOTE: I left out portions of the help display irrelevant to this discussion.

I'd like you to notice a couple of interesting things about this help display. First, config.status is designed to give you custom help about this particular project's config.status file. It lists "Configuration files" and "Configuration headers" that you may use as tags. Oddly, given the "tag" nomenclature used in the manual so rigorously, the help line still refers to such tags as [FILE]s in the "Usage:" line. Regardless, where the usage specifies [FILE]s you may use one or more of the listed configuration files, headers, links, or commands displayed below it. In this case, config.status will only instantiate those objects. In the case of commands, it will execute the commands specified by the tag passed in the associated expansion of the AC_CONFIG_COMMANDS macro.

Each of these macros may be used multiple times in a configure.ac script. The results are cumulative. This means that I can use AC_CONFIG_FILES as many times as I need to in my configure.ac file. Reasons why I may want to use it more than once are not obvious right now, but I'll get to them eventually.
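
For instance, these two calls (the docs directory here is hypothetical) accumulate into a single list of templates for config.status to process:

```
AC_CONFIG_FILES([Makefile
                 src/Makefile])
dnl ...later in configure.ac, perhaps under a conditional...
AC_CONFIG_FILES([docs/Makefile])
```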

Another noteworthy item here is that there is a --file option. Now why would config.status allow us to specify files either with or without the --file= in front of them? Well, these are actually different usages of the [FILE] option, which is why it would make more sense for the usage text to read:

$ ./config.status --help
...
Usage: ./config.status [OPTIONS] [TAG]...

When config.status is called with tag names on the command line, only those tags listed in the help text as available configuration files, headers, links, and commands may be used as tags. When you execute config.status with the --file= option, you're really telling config.status to generate a new file not already associated with any of the calls to instantiating macros in your configure.ac script. The file is generated from a template using configuration options and check results determined by the last execution of the configure script. For example, I could execute config.status like this:

./config.status --file=extra:extra.in

NOTE: The default template name is the file name with a ".in" suffix, so this call could have been made without using the ":extra.in" portion of the option.

Let's get back to the instantiating macro signature. The tag argument has a complex format, but it also represents multiple tags. Take another look:

AC_CONFIG_xxxS([tag ...], [commands], [init-cmds])

The ellipsis after tag indicates there may be more than one, and in fact, this is true. The tag argument accepts multiple tag specifications, separated by whitespace or newline characters. Often you'll see a call like this:

configure.ac

...
AC_CONFIG_FILES([Makefile
                 src/Makefile
                 lib/Makefile
                 etc/project.cfg])
...

Each entry here is one tag specification, which if fully specified would look like this:

configure.ac

...
AC_CONFIG_FILES([Makefile:Makefile.in
                 src/Makefile:src/Makefile.in
                 lib/Makefile:lib/Makefile.in
                 etc/proj.cfg:etc/proj.cfg.in])
...

There's still one more point to cover. There are two optional arguments that you'll not often see used in the instantiating macros: commands and init-cmds. The commands argument may be used to specify some arbitrary shell code that should be executed by config.status just before the files associated with the tags are generated. You'll not often see this used with the file-generating instantiating macros, but in the case of AC_CONFIG_COMMANDS, which generates no files by default, you'll almost always see this argument used, because a call to this macro is basically useless without it! In this case, the tag argument becomes a way of telling config.status to execute a set of shell commands.

The init-cmds argument is used to initialize shell variables at the top of config.status with values available in configure.ac and configure. It's important to remember that all calls to instantiating macros share a common namespace along with config.status, so choose shell variable names carefully.

The old adage about the relative value of a picture vs. an explanation holds true here, so let's try a little experiment. Create a test version of your configure.ac file containing only the following lines:

configure.ac

AC_INIT(test, 1.0)
AC_CONFIG_COMMANDS([abc],
                   [echo "Testing $mypkgname"],
                   [mypkgname=$PACKAGE_NAME])
AC_OUTPUT

Then execute autoreconf, configure, and config.status in various ways to see what happens:

$ autoreconf
$ ./configure
configure: creating ./config.status
config.status: executing abc commands
Testing test
$ ./config.status
config.status: executing abc commands
Testing test
$ ./config.status --help
`config.status' instantiates files from templates
according to the current configuration.

Usage: ./config.status [OPTIONS] [FILE]...
...
Configuration commands:
 abc

Report bugs to <bug-autoconf@gnu.org>.
$ ./config.status abc
config.status: executing abc commands
Testing test
$

As you can see here, executing configure caused config.status to be executed with no command-line options. There are no checks specified in configure.ac, so executing config.status by hand has nearly the same effect. Querying config.status for help indicates that "abc" is a valid tag, and executing config.status with that tag simply runs the associated commands.

Okay, enough fooling around. The important points to remember here are:

  1. Both configure and config.status may be called individually to perform their individual tasks.
  2. The config.status script generates all files from templates.
  3. The configure script performs all checks and then executes config.status.
  4. config.status generates files based on the last set of check results.
  5. config.status may be called to execute file generation or command sets specified by any of the tag names given in any of the instantiating macro calls.
  6. config.status may generate files not associated with any tags specified in configure.ac.
  7. config.status can be used to call configure with the same set of command line options used in the last execution of configure.

AC_CONFIG_HEADERS

As you've no doubt concluded by now, the AC_CONFIG_HEADERS macro allows you to specify one or more header files to be generated from template files. You may write multiple template header files yourself, if you wish. The format of a configuration header template is very specific:

/* Define as 1 if you have unistd.h. */
#undef HAVE_UNISTD_H

Multiple such statements may be placed in your header template. The comments are optional, of course. Let's try another experiment. Create a new configure.ac file with the following contents:

configure.ac

AC_INIT([test], [1.0])
AC_CONFIG_HEADERS([config.h])
AC_CHECK_HEADERS([unistd.h foobar.h])
AC_OUTPUT

Now create a configuration header template file called config.h.in, which contains the following two lines:

config.h.in

#undef HAVE_UNISTD_H
#undef HAVE_FOOBAR_H

Finally, execute the following commands:

$ autoconf
$ ./configure
checking for gcc... gcc
...
checking for unistd.h... yes
checking for unistd.h... (cached) yes
checking foobar.h usability... no
checking foobar.h presence... no
checking for foobar.h... no
configure: creating ./config.status
config.status: creating config.h
$
$ cat config.h
/* config.h.  Generated from ...  */
#define HAVE_UNISTD_H 1
/* #undef HAVE_FOOBAR_H */

You can see that config.status generated a config.h file from your config.h.in template file. The contents of this header file are based on the checks executed by the configure script. Since the shell code generated by AC_CHECK_HEADERS([unistd.h foobar.h]) was able to locate a unistd.h header file in the standard include directory, the corresponding #undef statement was converted into a #define statement. Of course, no foobar.h header was found in the system include directories, as you can also see from the output of configure, so its definition was left commented out in the template.

Thus, you may add this sort of code to appropriate C source files in your project:

#if HAVE_CONFIG_H
# include <config.h>
#endif

#if HAVE_UNISTD_H
# include <unistd.h>
#endif

Using Autoheader to generate an include file template

Maintaining your config.h.in template is more pain than necessary. After all, most of the information you need is already encapsulated in your configure.ac script, and the format of config.h.in is very strict. For example, you may not have any leading or trailing white space on the #undef lines.

Fortunately, the autoheader utility will generate an include header template for you based on your configure.ac file contents. Back to the command prompt for another quick experiment. This one is easy--just delete your config.h.in template before you run autoheader and autoconf, like this:

$ rm config.h.in
$ autoheader
$ autoconf
$ ./configure
checking for gcc... gcc
...
checking for unistd.h... yes
checking for unistd.h... (cached) yes
checking foobar.h usability... no
checking foobar.h presence... no
checking for foobar.h... no
configure: creating ./config.status
config.status: creating config.h
$ cat config.h
/* config.h. Generated from config.h.in...  */
/* config.h.in. Generated from configure.ac... */
...
/* Define to 1 if you have... */
/* #undef HAVE_FOOBAR_H */

/* Define to 1 if you have... */
#define HAVE_UNISTD_H 1

/* Define to the address where bug... */ 
#define PACKAGE_BUGREPORT ""

/* Define to the full name of this package. */
#define PACKAGE_NAME "test"

/* Define to the full name and version... */
#define PACKAGE_STRING "test 1.0"

/* Define to the one symbol short name... */
#define PACKAGE_TARNAME "test"

/* Define to the version... */
#define PACKAGE_VERSION "1.0"

/* Define to 1 if you have the ANSI C... */
#define STDC_HEADERS 1

NOTE: Here again, I encourage you to use autoreconf, which will automatically run autoheader for you if it notices an expansion of the AC_CONFIG_HEADERS macro in your configure.ac script.

You may also want to take a peek at the config.h.in template file generated by autoheader. In the meantime, here's a much more realistic example of using a generated config.h file for the sake of portability of project source code.

configure.ac

AC_INIT([test], [1.0])
AC_CONFIG_HEADERS([config.h])
AC_CHECK_HEADERS([dlfcn.h])
AC_OUTPUT

The config.h file is obviously intended to be included in your source code in locations where you might wish to test a configured option in the code itself using the C preprocessor. Using this configure.ac script, Autoconf will generate a config.h header file with appropriate definitions for determining, at compile time, if the current system provides the dlfcn interface. To complete the portability check, you can add the following code to a source file that uses dynamic loader functionality in your project:

#if HAVE_CONFIG_H
# include <config.h>
#endif

#if HAVE_DLFCN_H
# include <dlfcn.h>
#else
# error Sorry, this code requires dlfcn.h.
#endif
...
#if HAVE_DLFCN_H
   handle = dlopen(
      "/usr/lib/libwhatever.so", RTLD_NOW);
#endif
...

If you already had code that included dlfcn.h, then autoscan would have generated a call to AC_CHECK_HEADERS containing dlfcn.h as one of the header files to be checked. Your job as the maintainer is to add the conditional to your source code around the existing use of the dlfcn.h header inclusion and the libdl.so API calls. This is the crux of Autoconf-provided portability.

Your project may be able to get along at compile time without the dynamic loader functionality if it must, but it would be nice to have it. Perhaps your project will function in a limited manner without it. Sometimes you just have to bail out with a compiler error (as this code does) if the key functionality is missing. Often this is an acceptable first-attempt solution, until someone comes along and adds support to the code base for some other dynamic loader service that is perhaps available on non-dlfcn-oriented systems.

NOTE: If you have to bail out with an error, it's best to do so at configuration time, rather than at compile time. The general rule of thumb is to bail out as early as possible. I'll cover examples of this sort of activity shortly.

One obvious flaw in this source code is that config.h is only included if HAVE_CONFIG_H is defined in your compilation environment. But wait...doesn't that definition happen in config.h?! Well, no, not in the case of this particular definition. HAVE_CONFIG_H must be either defined by you manually, if you're writing your own makefiles, or automatically by Automake-generated makefiles on the compiler command line. (Are you beginning to get the feeling that Autoconf really shines when used in conjunction with Automake?)

HAVE_CONFIG_H is part of a string of definitions passed on the compiler command line in the Autoconf substitution variable @DEFS@. Before Autoheader and AC_CONFIG_HEADERS, all of the compiler configuration macros were added to the @DEFS@ variable. You can still use this method if you don't use AC_CONFIG_HEADERS in configure.ac, but it's not the recommended method nowadays, mainly because a large number of definitions makes for a very long compiler command line.
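
For example, a hand-written Makefile.in can pass these definitions to the compiler by consuming the @DEFS@ substitution variable--a sketch:

```
DEFS = @DEFS@
...
jupiter: main.c
        $(CC) $(DEFS) $(CFLAGS) $(CPPFLAGS) \
         -o $@ $(srcdir)/main.c
```

With AC_CONFIG_HEADERS in configure.ac, @DEFS@ expands to little more than -DHAVE_CONFIG_H; without it, @DEFS@ carries every definition individually.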

Back to VPATH builds for a moment

Regarding VPATH builds, I haven't yet covered how to get the preprocessor to properly locate my generated config.h file. This file, being a generated file, will be found in the same relative position in the build directory structure as its counterpart template file, config.h.in. The template is located in the top-level source directory (unless you choose to put it somewhere else), so the generated file will be in the top-level build directory. Well, that's easy enough--it's always one level up from the generated src/Makefile.

Consider where I might have include files in this project. I might add an internal header file to the current source directory. I obviously now have a config.h file in my top-level build directory. I might also create a top-level source include directory for library interface header files. In which order should I care about these files?

The order in which I place include (-I<path>) options on the compiler command line is the order in which the directories will be searched. The proper preprocessor include paths should include the current build directory (.), the source directory ($(srcdir)), and the top-level build directory (..), in that order:

...
jupiter: main.c
        gcc -g -O0 -I. -I$(srcdir) -I..\
         -o $@ $(srcdir)/main.c
...

It appears that I now need an additional rule of thumb for VPATH builds:

  1. Add preprocessor include directives for the current build directory, the associated source directory, and the top-level build directory, in that order.

Checks for compilers

The AC_PROG_CC macro ensures that I have a working C language compiler. This call was added to configure.scan when autoscan noticed that I had C source files in my project directory. If I'd had files suffixed with ".cxx" or ".C" (an upper-case ".C" extension indicates a C++ source file), it would have inserted a call to the AC_PROG_CXX macro, as well as a call to AC_LANG([C++]).

This macro looks for gcc and then cc in the system search path. If neither of these is found, it looks for other C compilers. When a compatible compiler is found, it sets the well-known variable, $CC, to the full path of the program, with options for portability, if necessary.

AC_PROG_CC accepts an optional parameter containing an ordered list of compiler names. For example, if you used AC_PROG_CC([cc cl gcc]), then the macro would expand into shell code that searched for cc, cl and gcc, in that order.

The AC_PROG_CC macro also defines the following Autoconf substitution variables:

  • @CC@ (full path of compiler)
  • @CFLAGS@ (e.g., -g -O2 for gcc)
  • @CPPFLAGS@ (empty by default)
  • @EXEEXT@ (e.g., .exe)
  • @OBJEXT@ (e.g., .o)

AC_PROG_CC configures these substitution variables, but unless I use them in my Makefile.in templates, I'm just wasting time running configure. I'll add a few of these as make variables to my src/Makefile.in template, and then consume them, like this:

# Tool-related substitution variables
CC             = @CC@
CFLAGS         = @CFLAGS@
CPPFLAGS       = @CPPFLAGS@
...
jupiter: main.c
        $(CC) $(CFLAGS) $(CPPFLAGS)\
         -I. -I$(srcdir) -I..\
         -o $@ $(srcdir)/main.c

Checking for other programs

Now, let's return to the AC_PROG_INSTALL macro. As with the AC_PROG_CC macro, the other AC_PROG_* macros set and then substitute (using AC_SUBST) various environment variables that point to the located utility. To make use of this check, you need to use these Autoconf substitution variables in your Makefile.in templates, just as I did with CC, CFLAGS, and CPPFLAGS above:

...
# Tool-related substitution variables
CC             = @CC@
CFLAGS         = @CFLAGS@
CPPFLAGS       = @CPPFLAGS@
INSTALL        = @INSTALL@
INSTALL_DATA   = @INSTALL_DATA@
INSTALL_PROGRAM= @INSTALL_PROGRAM@
INSTALL_SCRIPT = @INSTALL_SCRIPT@
...
install:
        $(INSTALL) -d $(DESTDIR)$(bindir)/jupiter
        $(INSTALL_PROGRAM) -m 0755 jupiter \
         $(DESTDIR)$(bindir)/jupiter
...

The value of @INSTALL@ is obviously the path of the located install script. The value of @INSTALL_DATA@ is ${INSTALL} -m 0644. Now, you'd think that the values of @INSTALL_PROGRAM@ and @INSTALL_SCRIPT@ would be ${INSTALL} -m 0755, but they're not. These are just set to ${INSTALL}. Oversight? I don't know.

Other important utility programs you might need to check for are lex, yacc, sed, awk, etc. If so, you can add calls to AC_PROG_LEX, AC_PROG_YACC, AC_PROG_SED, or AC_PROG_AWK yourself. There are about a dozen different programs you can check for using these more specialized macros. If such a program check fails, then the resulting configure script will fail with a message indicating that the required utility could not be found, and that the build may not continue until it's been properly installed.

As with the other program and compiler checks, you should use the make variables $(LEX) and $(YACC) in your Makefile.in templates to invoke these tools (note that Automake does this for you), as these Autoconf macros set the values of these variables according to the tools they find installed on your system, if the variables are not already set in your environment.

Now, this is a key aspect of configure scripts generated by Autoconf--you may always override anything configure will do to your environment by exporting or setting an appropriate output variable before you execute configure.

For example, perhaps you would like to build with a very specific version of bison that you've installed in your own home directory:

$ cd jupiter
$ YACC="$HOME/bin/bison -y" ./configure
$ ...

This will ensure that YACC is set the way you want for your makefiles, and that AC_PROG_YACC does essentially nothing in your configure script.
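
The "do nothing if already set" behavior behind this override mechanism boils down to a simple guard in the generated shell code. Here's a simplified sketch of the pattern (the values are hypothetical; the real checks are more elaborate):

```shell
# Sketch of the guard each AC_PROG_* check effectively uses: if the
# variable is already set in the environment, the search is skipped.
YACC="$HOME/bin/bison -y"     # pretend the user exported this first
if test -z "$YACC"; then
  YACC="yacc"                 # the tool search would only happen here
fi
echo "YACC is: $YACC"
```

Because the user's value survives untouched, the makefiles see exactly what was passed on the configure command line.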

If you need to check for the existence of a program not covered by these more specialized macros, you can call the generic AC_CHECK_PROG macro, or you can write your own special purpose macro (I'll cover writing macros in Chapter 9).

Key points to take away:

  1. AC_PROG_* macros check for the existence of programs.
  2. If a program is found, a substitution variable is created.
  3. Use these variables in your Makefile.in templates to execute the program.

A common problem with Autoconf

Here's a common problem that developers new to the Autotools consistently encounter. Take a look at the formal definition of AC_CHECK_PROG found in the GNU Autoconf manual. (NOTE: In this case, the square brackets represent optional parameters, not Autoconf quotes.)

AC_CHECK_PROG(variable, prog-to-check-for, value-if-found, [value-if-not-found], [path], [reject])

Check whether program prog-to-check-for exists in PATH. If it is found, set variable to value-if-found, otherwise to value-if-not-found, if given. Always pass over reject (an absolute file name) even if it is the first found in the search path; in that case, set variable using the absolute file name of the prog-to-check-for found that is not reject. If variable was already set, do nothing. Calls AC_SUBST for variable.

I can extract the following clearly defined functionality from this description:

  1. If prog-to-check-for is found in the system search path, then variable is set to value-if-found, otherwise it's set to value-if-not-found.
  2. If reject is specified (as a full path), then skip it if it's found first, and continue to the next matching program in the system search path.
  3. If reject is found first in the path, and then another match is found besides reject, set variable to the absolute path name of the second (non-reject) match.
  4. If variable is already set by the user in the environment, then variable is left untouched (thereby allowing the user to override the check by setting variable before running autoconf).
  5. AC_SUBST is called on variable to make it an Autoconf substitution variable.

At first read, there appears to be a terrible conflict of interest here: We can see in point 1 that variable will be set to one or the other of two specified values, based on whether or not prog-to-check-for is found in the system search path. But then in point 3 it states that variable will be set to the full path of some program, but only if reject is found first and skipped. Clearly the documentation needs a little work.

Discovering the real functionality of AC_CHECK_PROG is as easy as reading a little shell script. While you could spend your time looking at the definition of AC_CHECK_PROG in /usr/share/autoconf/autoconf/programs.m4, the problem with this approach is that you're one level removed from the actual shell code performing the check. Wouldn't it be better to just look at the resulting shell script generated by AC_CHECK_PROG? Okay, then modify your new configure.ac file in this manner:

...
AC_PREREQ(2.59)
AC_INIT([jupiter], [1.0], 
   [jupiter-devel@lists.example.com])
AC_CONFIG_SRCDIR([src/main.c])
AC_CONFIG_HEADER([config.h])

# Checks for programs.
AC_PROG_CC
AC_CHECK_PROG([bash_var], [bash], [yes], 
   [no],, [/usr/sbin/bash])
...

Now just execute autoconf and then open the resulting configure script and search for something specific to the definition of AC_CHECK_PROG. I used the string "ac_cv_prog_bash_var", a shell variable generated by the macro call. You may have to glance at the definition of a macro to find reasonable search text:

$ autoconf
$ vi -c /ac_cv_prog_bash_var configure
...
# Extract the first word of "bash", so it can be
#   a program name with args.
set dummy bash; ac_word=$2
echo "$as_me:$LINENO: checking for $ac_word" >&5
echo $ECHO_N "checking for $ac_word... $ECHO_C"\
 >&6
if test "${ac_cv_prog_bash_var+set}" = set; then
  echo $ECHO_N "(cached) $ECHO_C" >&6
else
  if test -n "$bash_var"; then
  # Let the user override the test.
  ac_cv_prog_bash_var="$bash_var"
else
  ac_prog_rejected=no
as_save_IFS=$IFS; IFS=$PATH_SEPARATOR
for as_dir in $PATH
do
  IFS=$as_save_IFS
  test -z "$as_dir" && as_dir=.
  for ac_exec_ext in ''\
 $ac_executable_extensions;
  do
  if $as_executable_p\
 "$as_dir/$ac_word$ac_exec_ext"; then
    if test "$as_dir/$ac_word$ac_exec_ext" =\
 "/usr/sbin/bash"; then
       ac_prog_rejected=yes
       continue
     fi
    ac_cv_prog_bash_var="yes"
    echo "$as_me:$LINENO: found\
 $as_dir/$ac_word$ac_exec_ext" >&5
    break 2
  fi
done
done

if test $ac_prog_rejected = yes; then
  # We found a bogon in the path, so make sure
  # we never use it.
  set dummy $ac_cv_prog_bash_var
  shift
  if test $# != 0; then
    # We chose a different compiler from the
    # bogus one. However, it has the same
    # basename, so the bogon will be chosen
    # first if we set bash_var to just the
    # basename; use the full file name.
    shift
    ac_cv_prog_bash_var=\
 "$as_dir/$ac_word${1+' '}$@"
  fi
fi
  test -z "$ac_cv_prog_bash_var" &&\
 ac_cv_prog_bash_var="no"
fi
fi
bash_var=$ac_cv_prog_bash_var
if test -n "$bash_var"; then
  echo "$as_me:$LINENO: result: $bash_var" >&5
echo "${ECHO_T}$bash_var" >&6
else
  echo "$as_me:$LINENO: result: no" >&5
echo "${ECHO_T}no" >&6
fi
...

Wow! You can immediately see from the opening comment that AC_CHECK_PROG has some undocumented functionality: you can pass in arguments with the program name if you wish. But why would you want to? Well, look further. You can fairly accurately deduce that the reject parameter was added to allow your configure script to search for a particular version of a tool. (Could it possibly be that someone might really rather use the GNU C compiler instead of the Solaris C compiler?)

In fact, it appears that variable really is set based on a tri-state condition. If reject is not used, then variable can only be either value-if-found or value-if-not-found. But if reject is used, then variable can also be the full path of the first program found that is not reject! Well, that is exactly what the documentation stated, but examining the generated code yields insight into the authors' intended use of this macro. We probably should have called AC_CHECK_PROG this way, instead:

AC_CHECK_PROG([bash_shell],[bash -x],[bash -x],,,
              [/usr/sbin/bash])

Now it makes more sense, and you can see by this example that the manual is in fact accurate, if not clear. If reject is not specified, and bash is found in the system path, then bash_shell will be set to bash -x. If it's not found in the system path, then bash_shell will be set to the empty string. If, on the other hand, reject is specified, and the undesired version of bash is found first in the path, then bash_shell will be set to the full path of the next version found in the path, along with the originally specified arguments (-x). The bash_shell variable may now be used by the rest of our script to run the desired bash shell, if it doesn't test out as empty. Wow! No wonder it was hard to document in a way that's easy to understand! But quite frankly, a good example of the intended use of this macro, along with a couple of sentences of explanation would have made all the difference.
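
Since checks only check, the rest of configure.ac still has to act on the variable. Here's a sketch of how later shell code in the script might consume the result; the value assigned to bash_shell is hypothetical, standing in for whatever AC_CHECK_PROG found:

```shell
# Sketch: acting on the variable set by the AC_CHECK_PROG call above.
bash_shell="/bin/bash -x"     # hypothetical result of the check
if test -n "$bash_shell"; then
  result="using: $bash_shell"
else
  result="bash not found; falling back to /bin/sh"
fi
echo "$result"
```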

Checks for libraries and header files

Does your project rely on external libraries? Most non-trivial projects do. If you're lucky, your project relies only on libraries that are already widely available and ported to most platforms.

The choice to use an external library or not is a tough one. On the one hand, you'll want to reuse code that provides functionality--perhaps significant functionality that you need and don't really have the time or expertise to write yourself. Reuse is one of the hallmarks of the free software world.

On the other hand, you don't want to depend on functionality that may not exist on all of the platforms you wish to target, or that requires significant porting effort on your part to make these libraries available on all of your target platforms.

Occasionally, library-based functionality can exist in slightly different forms on different platforms. These different forms may be functionally compatible, but have different API signatures. For example, POSIX threads (pthreads) versus a native threading library. For basic multi-threading functionality, many threading libraries are similar enough to be almost drop-in replacements of each other.

To illustrate this concept, I'll add some trivial multi-threading capabilities to the Jupiter project. I want jupiter to print its message using a background thread. To do this, I'll need to add the pthreads library to my project build system. If I weren't using the Autotools, I'd just add it to my linker command line in the makefile:

jupiter: main.c
        $(CC) ... -lpthread ...

But what if a system doesn't support pthreads? I might want to support native threads on a non-pthreads system--say Solaris native threads, using the libthreads library.

To do this, I'll first modify my main.c file such that the printing happens in a secondary thread, like this:

src/main.c

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>

static void * print_it(void * data)
{
   printf("Hello from %s!\n", (char *)data);
   return 0;
}

int main(int argc, char * argv[])
{
   pthread_t tid;
   pthread_create(&tid, 0, print_it, argv[0]);
   pthread_join(tid, 0);
   return 0;
}

Now, this is clearly a ridiculous use of a thread. Nonetheless, it is the prototypical form of thread usage. Consider the case where print_it did some long calculation, and main had other things to do while print_it performed this calculation. On a multi-processor machine, this could literally double the throughput of such a program.

What we now need is a way of determining which libraries should be added to the compiler command line. Enter Autoconf and the AC_CHECK_* macros. The AC_SEARCH_LIBS macro allows us to check for key functionality within a list of libraries. If the function exists within one of the specified libraries, then an appropriate command line option is added to the @LIBS@ substitution variable. The @LIBS@ variable should be used in a Makefile.in template on the compiler (linker) command line. Here is the formal definition of AC_SEARCH_LIBS, again from the manual:

AC_SEARCH_LIBS(function, search-libs, [action-if-found], [action-if-not-found], [other-libraries]) Search for a library defining function if it's not already available. This equates to calling AC_LINK_IFELSE([AC_LANG_CALL([], [function])]) first with no libraries, then for each library listed in search-libs. Add -llibrary to LIBS for the first library found to contain function, and run action-if-found. If function is not found, run action-if-not-found. If linking with the library results in unresolved symbols that would be resolved by linking with additional libraries, give those libraries as the other-libraries argument, separated by spaces: e.g., -lXt -lX11. Otherwise, this macro fails to detect that function is present, because linking the test program always fails with unresolved symbols.

Wow, that's a lot of stuff for one macro. Are you beginning to see why the generated configure script is so large? Essentially, what you get by calling AC_SEARCH_LIBS for a particular function is that the proper linker command line argument (e.g., -lpthread) for linking with a library containing the desired function is added to a substitution variable called @LIBS@. Here's how I'll use AC_SEARCH_LIBS in my configure.ac file:

configure.ac

...
# Checks for libraries.
AC_SEARCH_LIBS([pthread_create], [pthread])
...

Of course, I'll have to modify src/Makefile.in again to make proper use of the now populated LIBS variable:

...
# Tool-related substitution variables
CC             = @CC@
LIBS           = @LIBS@
CFLAGS         = @CFLAGS@
CPPFLAGS       = @CPPFLAGS@
...
jupiter: main.c
        $(CC) $(CFLAGS) $(CPPFLAGS)\
         -I. -I$(srcdir) -I..\
         -o $@ $(srcdir)/main.c $(LIBS)
...

Note that I added $(LIBS) after the source file on the compiler command line. Generally, the linker cares about object file order, and searches them for required functions in the order they are specified on the command line. Since I want main.c to be the primary source of object code for jupiter, I'll continue to add additional objects, including libraries, after this file on the command line.

Right or just good enough?

I could just stop at this point. I've done enough to make this build system properly use pthreads on most systems. If a library is needed, it'll be added to the @LIBS@ variable, and subsequently used on my compiler command line. In fact, this is the point at which many maintainers would stop. The problem is that stopping here is just about the build-system equivalent of not checking the return value of malloc in a C program (and there are many developers out there who don't give this process the credit it deserves either). It usually works fine. It's just during those few cases where it fails that you have a real problem.

Well, I want to provide a good user experience, so I'll take Jupiter's build system to the "next level". However, in order to do this, I need to make a design decision: In case configure fails to locate a pthread library on a user's system, should I fail the build process, or build a jupiter program without multi-threading? If I fail the build, it will generally be obvious to the user, because the build has stopped with an error message--although, perhaps not a very user-friendly one. At this point, either the compile process or the link process will fail with a cryptic error message about a missing header file or an undefined symbol. If I choose to build a single-threaded version of jupiter, I should probably display some clear message that I'm moving forward without threads, and why.

There's another potential problem also. Some users' systems may have a pthread library installed, but not have the pthread.h header file installed properly. This can happen for a variety of reasons, but the most common is that the executable package was installed, but not the developer package. Executable binaries are often packaged independently of static libraries and header files. Executables are installed as part of a dependency chain for a higher level consuming application, while developer packages are often only installed directly by a user. For this reason, Autoconf provides checks for both libraries and header files. The AC_CHECK_HEADERS macro is used to ensure the existence of a particular header file.

Autoconf checks are very thorough. They generally not only ensure the existence of a file, but also that the file is in fact the one you're looking for. They do this by allowing you to make some assertions about the file, which are then verified by the macro. Additionally, the AC_CHECK_HEADERS macro doesn't just scan the file system for the requested header. It actually builds a short test program in the appropriate language, and then compiles it to ensure that the compiler can both find the file, and use it. Similarly, AC_SEARCH_LIBS is built around an attempt to link to the specified libraries, and import the requested symbols.
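
To make the link-test idea concrete, here's a simplified sketch of what AC_SEARCH_LIBS does under the covers. It writes a tiny test program that declares and calls the function, then tries to link it, first with no extra library, then with each library in the list (the exact conftest contents vary by Autoconf version):

```shell
# Generate a minimal test program like the one configure writes.
# The dummy prototype is enough for the linker; only symbol
# resolution matters, not the real signature.
cat > conftest.c <<'EOF'
char pthread_create ();
int main () { return pthread_create (); }
EOF
# configure would now attempt, roughly:
#   $CC -o conftest conftest.c              (no library)
#   $CC -o conftest conftest.c -lpthread
# and add -lpthread to LIBS only if the second link succeeds
# where the first failed.
echo "test program written"
```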

Here is the formal definition of AC_CHECK_HEADERS, as found in the GNU Autoconf manual:

AC_CHECK_HEADERS(header-file..., [action-if-found], [action-if-not-found], [includes = 'default-includes']) For each given system header file header-file in the blank-separated argument list that exists, define HAVE_header-file (in all capitals). If action-if-found is given, it is additional shell code to execute when one of the header files is found. You can give it a value of break to break out of the loop on the first match. If action-if-not-found is given, it is executed when one of the header files is not found.

Normally, this macro is called only with a list of desired header files in the first argument. The remaining arguments are optional, and not often used, because the macro works well without them. I'll add a check for the pthread.h header file using AC_CHECK_HEADERS to my configure.ac file.

If you're the jump-right-in type, then you've noticed by now that configure.ac already calls AC_CHECK_HEADERS for stdlib.h. No problem--I'll just add pthread.h to the list, using a space to separate the file names, like this:

...
# Checks for header files.
AC_HEADER_STDC
AC_CHECK_HEADERS([stdlib.h pthread.h])
...

I like to make my packages available to as many people as possible, so I'll go ahead and use the dual-mode build approach, where I can at least provide some form of jupiter program to users without pthreads. To accomplish this, I'll need to add some conditional compilation preprocessor code to my src/main.c file:

src/main.c

#include <stdio.h>
#include <stdlib.h>

#if HAVE_PTHREAD_H
# include <pthread.h>
#endif

static void * print_it(void * data)
{
   printf("Hello from %s!\n", (char *)data);
   return 0;
}

int main(int argc, char * argv[])
{
#if HAVE_PTHREAD_H
   pthread_t tid;
   pthread_create(&tid, 0, print_it, argv[0]);
   pthread_join(tid, 0);
#else
   print_it(argv[0]);
#endif
   return 0;
}

In this version of main.c, I've added a couple of conditional checks for the existence of the header file. The HAVE_PTHREAD_H macro will be defined to the value 1 in the config.h.in template, if the AC_CHECK_HEADERS macro locates the pthread.h header file, otherwise the definition will be added as a comment in the template. Thus, I'll need to include the config.h file at the top of my main.c file:

#if HAVE_CONFIG_H
# include <config.h>
#endif
...

Recall that HAVE_CONFIG_H must be defined on the compiler command line, and that Autoconf populates the @DEFS@ substitution variable with this definition if config.h is available. If you choose not to use the AC_CONFIG_HEADERS macro in your configure.ac, then @DEFS@ will contain all of the definitions generated by the various check macros you do use. In this example, I've used AC_CONFIG_HEADERS, so my config.h.in template will contain most of these definitions, and @DEFS@ will only contain HAVE_CONFIG_H. Again, this is a nice way to go, because it significantly shortens the compiler command line. An additional benefit is that it becomes very simple to take a snapshot of the template and modify it by hand for non-Autotools platforms, such as Microsoft Windows, which doesn't require as dynamic a configuration process as Unix or Linux. I'll go ahead and make the required changes to my src/Makefile.in template, like this:

src/Makefile.in

...
# Tool-related substitution variables
CC             = @CC@
DEFS           = @DEFS@
LIBS           = @LIBS@
CFLAGS         = @CFLAGS@
CPPFLAGS       = @CPPFLAGS@
...
jupiter: main.c
        $(CC) $(CFLAGS) $(DEFS) $(CPPFLAGS)\
         -I. -I$(srcdir) -I..\
         -o $@ $(srcdir)/main.c $(LIBS)
...

Now, I have everything I need to conditionally build the jupiter program. If the end-user's system has pthread functionality, she'll get a version of jupiter that uses multiple threads of execution, otherwise, she'll have to settle for serialized execution. The only thing left is to add some code to the configure.ac script that displays a message during configuration, indicating that it's defaulting to serialized execution if the library is not found.

Another point to consider here is what it means to have the header file installed, but no library. This is very unlikely, but it can happen. However, this is easily remedied by simply skipping the header file check entirely if the library isn't found. We'll reorganize things a bit to handle this case also:

configure.ac

...
# Checks for libraries.
have_pthreads=no
AC_SEARCH_LIBS([pthread_create], [pthread],
  [have_pthreads=yes])

# Checks for header files.
AC_HEADER_STDC
AC_CHECK_HEADERS([stdlib.h])

if test "x${have_pthreads}" = xyes; then
  AC_CHECK_HEADERS([pthread.h], [],
    [have_pthreads=no])
fi

if test "x${have_pthreads}" = xno; then
  echo "------------------------------------------"
  echo " Unable to find pthreads on this system.  "
  echo " Building a single-threaded version.      "
  echo "------------------------------------------"
fi
...

I'll run autoreconf and configure and see what additional output I get now:

$ autoreconf
$ ./configure
checking for gcc... gcc
...
checking for library containing pthread_create... -lpthread
...
checking pthread.h usability... yes
checking pthread.h presence... yes
checking for pthread.h... yes
configure: creating ./config.status
config.status: creating Makefile
...

Of course, if your system doesn't have pthreads, you'll get something a little different. To emulate this, I'll rename my pthreads libraries (both shared and static), and then rerun configure:

$ su
Password:
# mv /usr/lib/libpthread.so ...
# mv /usr/lib/libpthread.a ...
# exit
exit
$ ./configure
checking for gcc... gcc
...
checking for library containing pthread_create... no
...
checking for stdint.h... yes
checking for unistd.h... yes
checking for stdlib.h... (cached) yes
-----------------------------------------
 Unable to find pthreads on this system.
   Building a single-threaded version.
-----------------------------------------
configure: creating ./config.status
config.status: creating Makefile
config.status: creating src/Makefile
config.status: creating config.h

Of course, if I had chosen to fail the build if I couldn't find the pthread.h header file or the pthreads libraries, then my source code would have been simpler--no need for conditional compilation. I could change my configure.ac file to look like this, instead:

configure.ac

...
# Checks for libraries.
have_pthreads=no
AC_SEARCH_LIBS([pthread_create], [pthread],
  [have_pthreads=yes])

# Checks for header files.
AC_HEADER_STDC
AC_CHECK_HEADERS([stdlib.h])

if test "x${have_pthreads}" = xyes; then
  AC_CHECK_HEADERS([pthread.h], [],
    [have_pthreads=no])
fi

if test "x${have_pthreads}" = xno; then
  echo "------------------------------------------"
  echo " The pthread library and header file is   "
  echo " required to build jupiter. Stopping...   "
  echo " Check 'config.log' for more information. "
  echo "------------------------------------------"
  (exit 1); exit 1;
fi
...

I could have used a couple of macros provided by Autoconf for printing messages to the console: AC_MSG_WARN and AC_MSG_ERROR. But I don't really care for these macros, because they tend to be single-line-oriented. This is especially a problem for the warning message, which merely indicates that configuration is continuing, but with a single-threaded version of jupiter. Such a single-line message could zip right by in a large configuration process without ever being noticed by the user.

In the case where I decide to terminate with an error, this is less of a problem, because--well, I terminated. But, for the sake of consistency, I like all of my messages to look the same. There is a note in the GNU Autoconf manual indicating that some shells are not able to properly pass the value of the exit parameter to the parent shell, and that AC_MSG_ERROR has a work-around for this problem. Well, the funny code after the echo statements in this last example is this very work-around, copied right out of a test configure script that I created using AC_MSG_ERROR.

This last topic brings to light a general lesson regarding Autoconf checks. Checks do just that--they check. It's up to the maintainer to add code to do something based on the results of the check. This isn't strictly true, as AC_SEARCH_LIBS adds a library to the @LIBS@ variable, and AC_CHECK_HEADERS adds a preprocessor definition to the config.h.in template. However, regarding the flow of control within the configure process, all such decisions are left to the developer. Keep this in mind while you're designing your configure.ac script, and life will be simpler for you.

Supporting optional features and packages

Alright, I've covered the cases in Jupiter where a pthreads library exists, and where it doesn't exist. I'm satisfied, at this point, that I've done just about all I can to manage both of these cases very well. But what about the case where the user wants to deliberately build a single-threaded version of jupiter, even in the face of an existing pthreads library? Do I add a note to Jupiter's README file, indicating that the user should rename her pthreads libraries in this case? I don't think so.

Autoconf provides for both optional features and optional sub-packages with two new macros: AC_ARG_ENABLE and AC_ARG_WITH. These macros are designed to do two things: first, to add help text to the output generated when you enter "configure --help", and second, to check the configure command line for the specified options, "--enable-feature[=yes|no]" and "--with-package[=arg]", and then set appropriate environment variables within the script. The values of these variables may be used later in the script to set or clear various preprocessor definitions or substitution variables.

AC_ARG_WITH is used to control the use of optional sub-packages which may be consumed by your package. AC_ARG_ENABLE is used to control the inclusion or exclusion of optional features in your package. The choice to use one or the other is often a matter of perspective and sometimes simply a matter of preference, as they provide somewhat overlapping sets of functionality. For instance, in the Jupiter package, it could be justifiably argued that Jupiter's use of pthreads constitutes the use of an external package. However, it could just as well be said that asynchronous processing is a feature that might be enabled.

In fact, both of these statements are true, and which type of option you use should be dictated by a high-level architectural perspective on the software in question. For example, the pthreads library supplies more than just thread creation functions. It also provides mutexes and condition variables, both of which may be used by a library package that doesn't create threads. If a project provides a library that needs to act in a thread-safe manner within a multi-threaded process, then it will probably use one or more mutex objects. But it may never create a thread. Thus, a user may choose to disable asynchronous execution within this library package at configuration time, but the package may still need to link the pthread library in order to access the mutex functionality from an unrelated portion of the code.

From this perspective, it makes more sense to specify "--enable-async-exec" than "--with-pthreads". Indeed, from a purist's perspective, this rationale is always sound, even in cases where a project only uses pthreads to create threads. When writing software, you won't often go wrong by siding with the purist. While some of their choices may seem arbitrary--even ridiculous--they're almost always vindicated at some point in the future.

So, when do you use AC_ARG_WITH? Generally, when a choice should be made between implementing functionality one way or another. That is, when there is a choice to use one package or another, or to use an external package, or an internal implementation. For instance, if jupiter had some reason to encrypt a file, it might be written to use either an internal encryption algorithm, or an external package, such as openssl. When it comes to encryption, the use of a widely understood package can be a great boon toward gaining community adoption of your package. However, it can also be a hindrance to those who don't have access to a required external package. Giving your users a choice can make all the difference between them having a good or bad experience with your package.

These two macros have very similar signatures, so I'll just list them here together:

AC_ARG_WITH(package, help-string, [action-if-given], [action-if-not-given])

AC_ARG_ENABLE(feature, help-string, [action-if-given], [action-if-not-given])

As with many Autoconf macros, these may be used in a very simple form, where the check merely sets environment variables:

  • ${withval} and ${with_package}
  • ${enableval} and ${enable_feature}

They can also be used in a more complex form, where these environment variables are used by shell code provided in the optional action arguments. In either case, as usual, the resulting variable must be used in order to act on the results of the check--otherwise performing the check is pointless.
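
The variable names follow directly from the option names: configure replaces dashes with underscores to form a legal shell variable name. Here's a sketch of that derivation for a hypothetical "--enable-async-exec" option (the real option parsing in configure is more involved):

```shell
# Sketch: deriving the shell variable name from the feature name.
# "--enable-async-exec" yields the variable enable_async_exec.
feature="async-exec"
var="enable_$(echo "$feature" | tr '-' '_')"
eval "$var=yes"                 # what configure does when the option is given
echo "${enable_async_exec}"
```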

Coding up the feature option

Okay, I've now decided that I should use AC_ARG_ENABLE. Do I enable or disable the "async-exec" feature by default? The difference in how these two cases are encoded is limited to the help text and to the shell script that I put into the action-if-not-given argument. The help text describes the available options and the default value, and the shell script indicates what I want to have happen if the option is NOT specified. Of course, if it is specified, I don't need to assume anything.

Say I decide that asynchronous execution is a risky feature. In this case, I want to disable it by default, so I might add code like this to my configure.ac script:

configure.ac

...
AC_ARG_ENABLE([async-exec],
  [  --enable-async-exec     enable async exec],
  [async_exec=${enableval}],
  [async_exec=yes])
...

On the other hand, if I decide that asynchronous execution is a fairly fundamental part of Jupiter, then I'd like it to be enabled by default. In this case I'd use code like this:

configure.ac

...
AC_ARG_ENABLE([async-exec],
  [  --disable-async-exec    disable async exec],
  [async_exec=${enableval}],
  [async_exec=no])
...

There are a couple of really neat features of this macro that I'd like to point out:

  • Regardless of the help text, the user may always use either of the standard option formats, "--enable-option[=yes|no]" or "--disable-option[=yes|no]". In either case, the "[=yes|no]" portion is optional.

  • Inverse logic is handled transparently--that is, the value of ${enableval} always represents the user's answer to the question, "Should it be enabled?". For instance, even if the user enters something like "--disable-option=no", the value of ${enableval} will still be set to yes.

These features of AC_ARG_ENABLE and AC_ARG_WITH make a maintainer's life a lot simpler.

Now, the only remaining question is: do I check for the library and header file regardless of the user's desire for this feature, or only if the user indicates that the "async-exec" feature should be enabled? In this case, it's purely a matter of preference, as I'm using the pthreads library only for this feature. Again, if I were also using the pthreads library for non-feature-specific reasons, the question would be answered for me--I'd have to check for it.

In cases where I need the library even if the feature is disabled, I add the AC_ARG_ENABLE macro, as in the example above, and then an additional AC_DEFINE macro to define a config.h definition specifically for this feature. Since I don't really want to enable the feature if the library or header file is missing--even if the user specifically requested it--I also need to add some shell code to turn the feature off if either of these are missing:

configure.ac

...
# Checks for headers.
AC_HEADER_STDC

# Checks for command line options
AC_ARG_ENABLE([async-exec],
  [  --disable-async-exec    disable async exec],
  [async_exec=${enableval}],
  [async_exec=yes])

have_pthreads=no
AC_SEARCH_LIBS([pthread_create], [pthread],
  [have_pthreads=yes])

if test "x${have_pthreads}" = xyes; then
  AC_CHECK_HEADERS([pthread.h], [],
    [have_pthreads=no])
fi

if test "x${have_pthreads}" = xno; then
  if test "x${async_exec}" = xyes; then
    echo "---------------------------------------"
    echo "Unable to find pthreads on this system."
    echo "Building a single-threaded version.    "
    echo "---------------------------------------"
  fi
  async_exec=no
fi

if test "x${async_exec}" = xyes; then
  AC_DEFINE([ASYNC_EXEC], 1, [async exec enabled])
fi

# Checks for headers.
AC_CHECK_HEADERS([stdlib.h])
...

I've also added a test for a "yes" value in async_exec around the echo statements within the last have_pthreads test. The reason is that this text really belongs to the feature, not to the pthreads library test. Remember, I'm trying to maintain a logical separation between testing for pthreads and testing for the requirements of the feature.

Of course, now I also have to modify src/main.c such that it uses this new definition, as follows:

src/main.c

...
#if HAVE_PTHREAD_H
# include <pthread.h>
#endif

static void * print_it(void * data)
{
   printf("Hello from %s!\n", (char *)data);
   return 0;
}

int main(int argc, char * argv[])
{
#if ASYNC_EXEC
   pthread_t tid;
   pthread_create(&tid, 0, print_it, argv[0]);
   pthread_join(tid, 0);
#else
   print_it(argv[0]);
#endif
   return 0;
}

Notice that I left the HAVE_PTHREAD_H check around the inclusion of the header file. This facilitates using pthread.h in other ways besides this feature.

In order to check for the library and header file only if the feature is enabled, I merely have to wrap the original check code in a test of async_exec, like this:

configure.ac

...
if test "x${async_exec}" = xyes; then
  have_pthreads=no
  AC_SEARCH_LIBS([pthread_create], [pthread],
    [have_pthreads=yes])

  if test "x${have_pthreads}" = xyes; then
    AC_CHECK_HEADERS([pthread.h], [],
      [have_pthreads=no])
  fi

  if test "x${have_pthreads}" = xno; then
    echo "---------------------------------------"
    echo "Unable to find pthreads on this system."
    echo "Building a single-threaded version.    "
    echo "---------------------------------------"
    async_exec=no
  fi
fi
...

This time, I've removed the test for async_exec from around the echo statements--or, more accurately, I've moved the original check from around the echo statements to around the entire set of checks.

Checks for typedefs and structures

I've spent a fair amount of time during my career writing cross-platform networking software. One key aspect of networking software is that the data sent in network packets from one machine to another needs to be formatted in an architecture-independent manner. If you're trying to use C-language structures to format network messages, one of the first roadblocks you generally come to is the complete lack of basic C-language types that have the same size from one platform to another. The C language was purposely designed such that the sizes of its basic integer types are implementation-defined. The designers did this to allow an implementation to use sizes for char, short, int and long that are optimal for the platform. Well, this is great for optimizing software for one platform, but it entirely discounts the need for sized types when moving data between platforms.

In an attempt to remedy this shortcoming in the language, the C99 standard provides just such sized types, in the form of the intX_t and uintX_t types, where X may be one of 8, 16, 32 or 64. While many compilers provide these types today, some are still lagging behind. GNU C, of course, has been at the forefront for some time now, providing the C99 sized types along with the stdint.h header file in which these types are supposed to be defined. As time goes by, more and more compilers will support the C99 types completely, but for now, it's still rather painful to write portable code that uses these and other more recently defined integer-based types.

To alleviate the pain somewhat, Autoconf provides macros for determining whether such integer-based types exist on a user's platform, defining them appropriately if they don't exist. To ensure, for example, that uint16_t exists on your target platforms, you may use the following macro expansion in your configure.ac file:

AC_TYPE_UINT16_T

This macro will ensure that either uint16_t is defined in the appropriate header files (stdint.h, or inttypes.h), or that uint16_t is defined in config.h to an appropriate basic integer type that actually is 16 bits in size and unsigned in nature.

The compiler test for such an integer-based type is done almost universally by a generated configure script using a bit of C code that looks like this:

...
int main() 
{
   static int test_array 
      [1 - 2 * !((uint16_t) -1 >> (16 - 1) == 1)];
   test_array[0] = 1;
   return 0;
}

Now, if you study this code carefully, you'll notice that the important line is the one on which test_array is declared. (Note that I've wrapped this line for publication purposes.) Autoconf is relying on the fact that all C compilers will generate an error if you attempt to define an array with a negative size. A more thorough examination of the bracketed expression will prove to you that it really is a compile-time constant expression. I don't know if this could have been done with simpler syntax, but it's a fact proven over the last several years that this code does the trick on all compilers currently supported by Autoconf--which is most of them. The array is defined with a non-negative size if (and only if) the following two conditions are met:

  1. uint16_t is in fact defined in one of the included header files.
  2. the actual size of uint16_t really is 16 bits; no more, no less.

Code that relies on the use of this macro might contain the following construct:

#if HAVE_CONFIG_H
# include <config.h>
#endif
#if HAVE_STDINT_H
# include <stdint.h>
#endif
...
#if defined UINT16_MAX || defined uint16_t
// code using uint16_t
#else
// complicated alternative using >16-bit unsigned
#endif

There are a few dozen such type checks available in Autoconf. You should familiarize yourself with Section 5.9 of the GNU Autoconf manual, so that you have a working knowledge of what's available. I recommend you don't commit such checks to memory, but rather just know about them, so that they'll come to mind when you need them. Then go look up the exact syntax when you do need them.

In addition to these type-specific checks, there is also a generic type-check macro, AC_CHECK_TYPES, which allows you to specify a comma-separated list of questionable types that your project needs. Note that this list is comma-separated, not space-separated as most of these sorts of check lists are, because type definitions (like struct fooble) may have embedded spaces. Since the list is comma-delimited, you will always need to use the square bracket quotes around this parameter if you list more than one type.

AC_CHECK_TYPES(types, [action-if-found], [action-if-not-found], [includes = 'default-includes'])

If you don't specify a list of include files in the last parameter, then the default includes are used in the compiler test. The default includes are provided by the macro AC_INCLUDES_DEFAULT, which is defined as follows (in version 2.62 of Autoconf):

#include <stdio.h>
#ifdef HAVE_SYS_TYPES_H
# include <sys/types.h>
#endif
#ifdef HAVE_SYS_STAT_H
# include <sys/stat.h>
#endif
#ifdef STDC_HEADERS
# include <stdlib.h>
# include <stddef.h>
#else
# ifdef HAVE_STDLIB_H
#  include <stdlib.h>
# endif
#endif
#ifdef HAVE_STRING_H
# if !defined STDC_HEADERS && defined HAVE_MEMORY_H
#  include <memory.h>
# endif
# include <string.h>
#endif
#ifdef HAVE_STRINGS_H
# include <strings.h>
#endif
#ifdef HAVE_INTTYPES_H
# include <inttypes.h>
#endif
#ifdef HAVE_STDINT_H
# include <stdint.h>
#endif
#ifdef HAVE_UNISTD_H
# include <unistd.h>
#endif

If you know that your type is not defined in one of these header files, then you should specify one or more include files to be included in the test, like this:

AC_CHECK_TYPES([struct doodah], [], [], [
#include <doodah.h>
#include <doodahday.h>])

The interesting thing to note here is the way I wrapped the last parameter of the macro over three lines in configure.ac, with no indentation. This time I didn't do it for publication reasons: this text is included verbatim in the test source file. Since some compilers have a problem with a pound sign (#) placed anywhere but column one, it's a good idea to tell Autoconf to start each include line in column one, in this manner.

Admittedly, these are the sorts of things that developers complain about regarding Autoconf. When you do have problems with such syntax, your best friend is the config.log file, which contains the exact source code for all failed tests. You can simply look at this log file to see how Autoconf formatted the test--possibly incorrectly--and then fix your check in configure.ac accordingly.

The AC_OUTPUT macro

The AC_OUTPUT macro expands into the shell code that generates the config.status script, based on the data specified in all of the previous macro expansions. The important thing to note here is that all other macros must be used before AC_OUTPUT is expanded, or they will be of little value to your configure script.

Additional shell script may be placed in configure.ac after AC_OUTPUT is expanded, but this additional code will not affect the configuration or the file generation performed by config.status.

I like to add some echo statements after AC_OUTPUT to indicate to the user how the system is configured, based on their specified command line options, and perhaps additional useful targets for make. For example, one of my projects has the following text after AC_OUTPUT in configure.ac:

...
echo \
"-------------------------------------------------

 ${PACKAGE_NAME} Version ${PACKAGE_VERSION}

 Prefix: '${prefix}'.
 Compiler: '${CC} ${CFLAGS} ${CPPFLAGS}'

 Package features:
   Async Execution: ${async_exec}

 Now type 'make @<:@<target>@:>@'
   where the optional <target> is:
     all                - build all binaries
     install            - install everything

--------------------------------------------------"

This is a really handy configure script feature, as it tells the user at a glance just what happened during configuration. Since variables such as async_exec are set based on the specified command line options, users can see whether the configuration they asked for actually took place.

By the way, in case you're wondering what those funny character sequences are around the word <target>, they're called quadrigraph sequences, or simply quadrigraphs, and they serve the same purpose as escape sequences. Quadrigraphs are a little more reliable than escaped characters because they're never subject to ambiguity: they're converted to the proper characters at a very late stage by M4, and so are never misinterpreted.

The sequence @<:@ is the quadrigraph for the open square bracket ([) character, while @:>@ is the quadrigraph for the close square bracket (]) character. These quadrigraphs are always output by Autoconf (M4) as literal bracket characters, which keeps Autoconf from interpreting them as its own quote characters.

There are a few other quadrigraphs. I'll show you some of them in Chapter 9 when I begin to discuss the process of writing your own Autoconf macros. If you're interested, check out section 8.1.5 of the GNU Autoconf manual.

NOTE: Version 2.62 of Autoconf does a much better job of deciphering the user's intent with respect to square brackets than previous versions did. Where you might have needed a quadrigraph in the past to force Autoconf to display a square bracket, you may now use the character itself. Most of the problems that do occur are the result of improperly quoted arguments.

Does (project) size matter?

An issue that might have occurred to you by now is the size of my toy project. I mean, c'mon! One source file?! But I've used autoscan to autoconfiscate projects containing several hundred C++ source files and some pretty complex build steps. It takes a few seconds longer to run autoscan on a project of that size, but it works just as well. For a basic build, the generated configure.ac only needed to be touched up a bit--project name, version, and so on.

To add in compiler optimization options for multiple target tool sets, it took a bit more work. I'll cover these sorts of issues in Chapter 6 where I'll show you how to autoconfiscate a real project.

Summary

In this chapter, I've covered about a tenth of the information in the GNU Autoconf manual, but in much greater analytical detail than the manual provides. For the developer hoping to bootstrap quickly into Autoconf, I believe I've covered one of the more important tenths. But this statement in no way absolves a responsible software engineer of the duty to study the other nine tenths--as time permits, of course.

For example, I didn't go into detail about the differences between searching for a function and searching for a library. In general, AC_SEARCH_LIBS should be used to check for a function you need, but expect in one or more libraries. The AC_FUNC_* macros are available to check for very specific portability-related functionality, such as AC_FUNC_ALLOCA, which exists on some platforms but not others. The AC_CHECK_FUNC macro should be used if a particular function is not covered by one of the more specific AC_FUNC_* macros. I recommend reading through Section 5.5 of the GNU Autoconf manual to familiarize yourself with what's available within these special function checks.

Another topic on which I didn't spend much time was that of checking for compiler characteristics. Section 5.10 of the GNU Autoconf manual covers these issues completely. Given what you've learned after reading this chapter, reading these sections of the manual should be pretty straightforward.

In fact, once you're comfortable with the material in this and the preceding chapters of this book, I'd highly recommend spending a fair amount of time in Chapter 5 of the GNU Autoconf manual. Doing so will make you the Autoconf expert you never thought you could be, by filling in all of the missing details.

The next chapter takes us away from Autoconf for a while, as we get into Automake, an add-on to the Autotools tool chain that enhances the make utility.

Source archive

Download the attached source archive for the original sources associated with this chapter.


Author information


Biography

John Calcote has worked in the software industry for over 25 years, the last 17 of which were at Novell. He's currently a Sr. Software Engineer with the LDS Church working on open source projects. He's the project maintainer of the openslp project, the openxdas project, and the dnx project on sourceforge.net. He blogs on open source, programming and software engineering issues in general at http://jcalcote.wordpress.com.