In the C programming language you can ask the macro preprocessor to
keep or remove part of a source file. This is done with #ifdef. The
equivalent in Scheme is called cond-expand. R7RS Scheme has two
different instances of cond-expand, while R6RS Scheme does not have
it at all. What does R6RS do instead, and is cond-expand a bad idea?
Use cases
What is #ifdef, and its cousins #if and #ifndef, used for? And why
might we want its equivalent in Scheme? There are two major use cases
for #ifdef: build-time configuration and portability. The build
system will usually have some sort of configuration script that
detects what system it’s running on. Usually these scripts also let
the user enable or disable features.
Chez Scheme runs on top of a chunk of fairly portable C and uses
#ifdef as described:
$ grep -h '^#ifdef' ChezScheme/c/*.[ch] |awk '{print $2}' \
| sort -u | xargs
ARCHYPERBOLIC ARMV6 BSDI CHAFF CHECK_FOR_ROSETTA CLOCK_HIGHRES
CLOCK_MONOTONIC CLOCK_MONOTONIC_HR CLOCK_PROCESS_CPUTIME_ID
CLOCK_REALTIME CLOCK_REALTIME_HR CLOCK_THREAD_CPUTIME_ID DEBUG
DEFINE_MATHERR DISABLE_CURSES EINTR ENABLE_OBJECT_COUNTS
FEATURE_EXPEDITOR FEATURE_ICONV FEATURE_PTHREADS FEATURE_WINDOWS FLOCK
FLUSHCACHE FunCRepl GETWD HANDLE_SIGWINCH HPUX I386 IEEE_DOUBLE ITEST
KEEPSMALLPUPPIES LIBX11 LITTLE_ENDIAN_IEEE_DOUBLE LOAD_SHARED_OBJECT
LOCKF LOG1P LOOKUP_DYNAMIC MACOSX MAP_32BIT __MINGW32__ MMAP_HEAP
NAN_INCLUDE NO_DIRTY_NEWSPACE_POINTERS NOISY
NO_LOCKED_OLDSPACE_OBJECTS PPC32 PROMPT PTHREADS SA_INTERRUPT
SA_RESTART SAVEDHEAPS segment_t2_bits segment_t3_bits SIGBUS SIGQUIT
SOLARIS SPARC SPARC64 TIOCGWINSZ USE_MBRTOWC_L WIN32 _WIN64 WIPECLEAN
X86_64
Macros like FEATURE_EXPEDITOR turn functionality on and off, while
macros like HPUX and I386 are used for portability. So,
configuration and portability.
Portability?
Does #ifdef truly help with portability? It can certainly seem this
way, but there’s a different way to think about this issue. This is
what Rob Pike had to say on the TUHS mailing list:
C with #ifdefs is not portable, it is a collection of 2^n overlaid programs, where n is the number of distinct #if[n]def tags. It’s too bad the problems of that approach were not appreciated by the C standard committee, who mandated the #ifndef guard approach that I’m sure could count as a provable billion dollar mistake, probably much more. The cost of building #ifdef’ed code, especially with C++, which decided to be more fine-grained about it, is unfathomable.
For each C file with #ifdef you need to understand what happens if
the condition is true versus if it’s false. It’s simple with just one
#ifdef, but the problem grows exponentially.
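To make the arithmetic concrete, here is a minimal sketch in C with two made-up tags, TAG_A and TAG_B (neither comes from Chez Scheme or any real project). The single function below is really four different functions, one per combination of tags, and any one build only ever compiles and tests one of them.

```c
/* TAG_A and TAG_B are hypothetical tags invented for this sketch.
   A build system would define them with -DTAG_A and -DTAG_B. */
static int feature_mask(void)
{
    int mask = 0;
#ifdef TAG_A
    mask |= 1;          /* only present in builds with TAG_A */
#endif
#ifdef TAG_B
    mask |= 2;          /* only present in builds with TAG_B */
#endif
    /* Depending on the build this is 0, 1, 2 or 3: four overlaid
       programs from n = 2 tags. */
    return mask;
}
```

With neither tag defined, feature_mask() returns 0. The other three variants stay unverified until someone actually builds with those flags.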
Configuration?
You can use #ifdef for conditional compilation, to turn features on
and off. But this can also create a mess like the one described by
Pike above. The GNU Coding Standards have this to say:
When supporting configuration options already known when building your program we prefer using if (... ) over conditional compilation, as in the former case the compiler is able to perform more extensive checking of all possible code paths.
The same sentiment is echoed by Douglas McIlroy in the message that preceded Pike’s message above.
This approach is generally a good idea. All those conditionals give
you a lot of code paths. Checking that all of them even compile is
difficult to do by hand, and you need all the help you can get. When
you use #ifdef you hide the code from the compiler. The compiler
can’t check code that it can’t see. Using if (... ) when the
expression is constant at compile time should give the same result as
conditional inclusion, at least if you have an optimizing compiler.
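As a rough sketch of what the GNU advice looks like in practice, the function below picks between two bit-counting strategies with an ordinary if. The option name CONFIG_USE_KERNIGHAN is an assumption made up for this illustration; the only requirement is that the build always defines it as 0 or 1.

```c
/* CONFIG_USE_KERNIGHAN is a hypothetical build option. A configure
   script would pass -DCONFIG_USE_KERNIGHAN=1; default to 0 otherwise. */
#ifndef CONFIG_USE_KERNIGHAN
#define CONFIG_USE_KERNIGHAN 0
#endif

/* Count the set bits in x. Both branches are parsed and type-checked
   in every build, whichever one ends up being used. */
static unsigned count_bits(unsigned x)
{
    unsigned n = 0;

    if (CONFIG_USE_KERNIGHAN) {
        /* Clear the lowest set bit until none remain. */
        while (x != 0) {
            x &= x - 1;
            n++;
        }
    } else {
        /* Test every bit in turn. */
        while (x != 0) {
            n += x & 1u;
            x >>= 1;
        }
    }
    return n;
}
```

Because the condition is a constant, an optimizing compiler is free to drop the branch that can never run. A typo in the unused branch still stops the build, which is exactly the checking that #ifdef would have skipped.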
Also Considered Harmful
It’s not just people on the Internet saying these things about
#ifdef, and the complaints are not new either.
We believe that a C programmer’s impulse to use #ifdef in an attempt at portability is usually a mistake. Portability is generally the result of advance planning rather than trench warfare involving #ifdef. In the course of developing C News on different systems, we evolved various tactics for dealing with differences among systems without producing a welter of #ifdefs at points of difference. We discuss the alternatives to, and occasional proper use of, #ifdef.
Source: SPENCER, Henry; COLLYER, Geoff. #ifdef considered harmful, or portability experience with C News. In: USENIX Summer 1992 Technical Conference. 1992.
You can use Google Scholar to find papers that cite this paper, if you’re interested in more reading.
cond-expand is not as bad…
The first standardization of cond-expand that I know of is Marc
Feeley’s SRFI-0. It is also part of R7RS Scheme, where I believe it
has seen wider adoption than plain SRFI-0.
There is one major difference between #ifdef and cond-expand. The
former is handled by a preprocessor that does not understand the
lexical syntax of the language it is working with. You can even use
cpp with languages other than C, e.g. assembly. This means you can
easily introduce latent syntax errors with #ifdef.
There is a cond-expand available from inside define-library and
another one available as syntax in (scheme base). Both of these are
handled after the source file has been parsed. This means that
cond-expand cannot create an unbalanced syntax tree. What I mean by
this is that you can’t somehow use cond-expand wrong in such a way
that the parentheses become unbalanced. To make this mistake with
#ifdef is trivial; simply place } before #endif when it should
have been after, or vice versa.
You get bonus points, so to speak, if you do this near code that handles portability to operating systems that you can’t test your changes on.
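Here is a small sketch of that mistake. The macro ON_TESTED_OS and the function are invented for this illustration; the #define at the top stands in for a flag that a configure script would normally pass on the platform you develop on.

```c
/* Hypothetical stand-in for -DON_TESTED_OS from a configure script. */
#define ON_TESTED_OS 1

static int clamp_count = 0;

/* Clamp negative values to zero. */
static int clamp_to_zero(int x)
{
    if (x < 0) {
        x = 0;
#ifdef ON_TESTED_OS
        clamp_count++;  /* bookkeeping only done on this platform */
    }   /* BUG: this brace belongs after the #endif */
#endif
    return x;
}
```

This compiles and behaves correctly as long as ON_TESTED_OS is defined. On a platform where it is not, the preprocessor removes the closing brace along with the bookkeeping line, and the compiler sees a function whose braces no longer match.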
… but not really better
Apart from the differences in how the compiler handles them, they do
actually express the same thing. One can look at cond-expand as
morally equivalent to a series of #if, #else and #endif
directives. Therefore the very same problems that happen with #ifdef
also happen with cond-expand.
I’m not optimistic about the future landscape of R7RS code if
cond-expand is not recognized for the problems it brings. It may be
that each R7RS library will become a jungle of 2^n overlaid
libraries. I have seen some indication of this process already
beginning when looking at the packages in Snow Fort.
Fortunately I have not seen any examples where cond-expand is used
to change which identifiers are exported from a library, but that day
may yet come.
Back to R6RS
So in the beginning of this article I wrote that R6RS Scheme does not
have cond-expand. Does that mean it has another way to handle these problems?
No, but in practice: yes. In the R6RS report there are only libraries as a suggested way to handle this. And there isn’t really a way to conditionally import libraries at compile time.
This situation has given rise to some creative solutions. The configuration and portability problems do not disappear just like that, so people have tried to solve them within the restrictions of the language.
Portability between R6RS implementations
I believe that all R6RS Scheme implementations implement the de facto
standard of importing libraries by first looking for them in files
that end with the .<impl>.sls suffix before they try .sls. If Chez
Scheme sees (import (foo)) then it first tries foo.chezscheme.sls
before it tries foo.sls. This mechanism, even though it’s not in
R6RS, is widely implemented.
This mechanism is used to create compatibility libraries. One striking
example is the (xitomatl common) library from Derick Eddington’s
xitomatl. It contains a few procedures that traditionally appear in
Scheme implementations, but which are not in the reports. Here is
common.chezscheme.sls:
;; Copyright 2009 Derick Eddington. My MIT-style license is in the file named
;; LICENSE from the original collection this file is distributed with.
(library (xitomatl common)
  (export
    add1 sub1
    format printf fprintf pretty-print
    gensym
    time
    with-input-from-string with-output-to-string
    system
    ;; TODO: add to as needed/appropriate
    )
  (import
    (chezscheme))
)
There are matching libraries for Guile, Ikarus, Larceny, Racket, Mosh and Ypsilon. They export the same identifiers, but they all have some tweaks to adapt them to the various implementations.
This is the “Plan 9” approach to portability, as briefly described in the mailing list thread referenced above. Define APIs and let the implementation of the API hide the portability problems from the rest of the program.
Configuration for R6RS code
What is to be done for configuration? When I need this in my own code, I create a library that exports the configuration as identifier syntax. Here’s an abbreviated example from Loko Scheme:
(library (loko arch amd64 config)
  (export
    ; ...
    use-popcnt)
  (import
    (rnrs))
  (define-syntax define-const
    (syntax-rules ()
      ((_ name v)
       (define-syntax name
         (identifier-syntax v)))))
  ; ...
  (define-const use-popcnt #f))
I can then use this identifier syntax as a regular variable, like
(if use-popcnt <do-this> <do-that>). But thanks to define-const it
becomes inlined at the place where it is used. So if it’s set to #f
then the expanded conditional is actually (if #f <do-this> <do-that>),
which is trivial to optimize.
Akku supports this stuff
I have included support in Akku for both the R6RS and R7RS approaches
to portability. Akku will keep track of which Scheme implementation
an R6RS library is meant for and adapt the way it installs the
library. It will use the .<impl>.sls extension and even escape the
file name correctly for that implementation.
With R7RS libraries it merely needs to install the .sld file to the
right location. This is simple enough to do. But Akku also translates
R7RS libraries into R6RS libraries. Akku has to do some interesting
juggling when cond-expand appears at the define-library level.
Akku checks the list of features that appear in cond-expand and looks
to see if it recognizes any implementation names. For each
implementation it then creates a copy of the library that is specific
to that implementation. For each such copy it expands all
cond-expand expressions at the define-library level, as best as it
can, and installs .<impl>.sls files. Kind of dirty, but it mostly works.
For example, this library will be installed as hello.chezscheme.sls,
hello.loko.sls, hello.sld and hello.sls:
(define-library (hello)
  (export hello)
  (cond-expand
    ((library (rnrs))
     (import (rnrs)))
    (else
     (import (scheme base))))
  (cond-expand
    (chezscheme
     (begin
       (define (hello)
         (display "Hello Chez!\n"))))
    (loko
     (begin
       (define (hello)
         (display "Hello Loko!\n"))))
    (else
     (begin
       (define (hello)
         (display "Hello world!\n"))))))
(The hello.loko.sls file it creates is actually a symlink to
hello.sld because Akku knows that Loko supports R7RS.)
Maybe a way forward
The picture I’ve painted seems quite damning for cond-expand. But
the problem is not really cond-expand itself. The problem is when
the misuse of cond-expand leads to a mess. I’m not advocating for
its removal, but I would like it to be better understood for what it
is. Some widely distributed guidelines for how to use it would go far
in reducing the damage.
The “Plan 9” approach of making APIs can be done with cond-expand
just as well as with the R6RS approach of .<impl>.sls. It just
requires that you know what you’re doing: that it’s a bad idea to
sprinkle all your code with cond-expand and that you should keep
this code in isolated libraries.
Finally, I’d like to say that cond-expand is actually more powerful
than the .<impl>.sls approach. This power unfortunately makes static
analysis of library declarations more difficult, but it also gives you
access to feature identifiers that are more interesting than just the
name of the Scheme implementation, such as x86-64 and clr. But
please do hide your use of these behind an appropriate API.