This chapter describes the m4 macro preprocessor, a front-end filter that lets you define macros by placing m4 macro definitions at the beginning of your source files. You can use the m4 preprocessor with either program source files or document source files.
Macros ease your programming or writing tasks by allowing you to substitute a simple word or two for a great amount of material. Macro calls in a source file have the following form:
name[(arg1[, arg2])]
For example, suppose you
have a C program in which you want to print the same message at several points.
You could code a series of
printf
statements like the
following:
printf("\nThese %d files are in %s:\n",cnt,dir);
If you later decide to change the wording as your program evolves, you must edit each instance of the message. Defining a macro like the following will save you a great deal of work:
define(filmsg,`printf("\nThese %d files are in %s:\n",$1,$2)')
Then, everywhere you want to output this message, you use the macro this way:
filmsg(cnt,dir);
With this implementation, you need only edit the message in one place.
A macro definition consists of a symbolic name (called a token) and the character string that is to replace it. A token is any string of alphanumeric characters (letters, digits, and underscores) beginning with a letter or an underscore and delimited by nonalphanumeric characters (punctuation or white space). For example, N12 and N are both tokens, but A+B is not a token.
When you process
your file through
m4
, each occurrence of a recognized macro
is replaced by its definition.
In addition to replacing symbolic names with
text,
m4
can also perform the following operations:
Arithmetic calculation
File manipulation
Conditional macro expansion
String and substring functions
System command execution
The m4 program reads each token in the file and determines if the token is a macro name. Macro names that are embedded in other tokens are not recognized; for example, m4 does not interpret N12 as containing an occurrence of the token N. If the token is a macro name, m4 replaces it with its defining text and pushes the resulting string back onto the input to be rescanned.
Macro expansion is thus recursive; macro definitions can include nested occurrences of other macros to any depth of nesting. You can call macros with arguments, in which case the arguments are collected and substituted into the right places in the defining text before the defining text is rescanned.
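The rescanning step can be illustrated with a minimal sketch. It assumes the default quote characters; the macro names greeting and name are hypothetical and chosen only for this illustration.

```m4
dnl greeting is defined first and mentions name, which does not exist yet.
define(`greeting', `Hello, name')dnl
define(`name', `world')dnl
dnl Expanding greeting pushes its text back onto the input, where name is found.
greeting
```

This input is expected to produce the single line Hello, world, because the text that replaces greeting is itself scanned for macro names.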
The
m4
preprocessor is a standard UNIX filter.
It
accepts input from standard input or from a list of input files and writes
its output to standard output.
The following lines illustrate correct
m4
usage:
% grep -v '#include' file1 file2 | m4 > outfile
% m4 file1 file2 | cc
The
m4
program processes each argument in order.
If there
are no arguments, or if an argument is a minus sign ( -
),
m4
reads standard input as its input
file.
You create a macro definition with the define command, one of the built-in macros provided by m4. For example:
define(N,100)
The open parenthesis must follow the word
define
with no intervening space.
Given this macro definition, the token
N
will be
replaced by
100
wherever it appears in the file being processed.
The defining text can be any text, except that if the text contains parentheses,
the number of open (left) parentheses must match the number of close (right)
parentheses unless you protect an unmatched parenthesis by quoting it.
See
Section 5.2.1
for an explanation of quoting.
Built-in and user-defined macros work the same way except that some of the built-in macros change the state of the process. Refer to Section 5.3 for a list of the built-in macros.
You can define macros in terms of other macros. For example:
define(N,100)
define(M,N)
This example defines both M and N to be 100. If you later change the definition of N and assign it a new value, M retains the value of 100, not the new value you give N. The value of M does not track that of N because the m4 preprocessor expands macro names into their defining text as soon as possible. The overall result, as far as M is concerned, is the same as using the following input in the first place:
define(M,100)
If you want the value of M to track the value of N, you can reverse the order of the definitions, as follows:
define(M,N)
define(N,100)
Now M is defined to be the string N. When the value of M is requested later, the M is replaced by N, which is then rescanned and replaced by whatever value N has at that time.
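A short sketch of the tracking behavior follows. It assumes the default quote characters and quotes the name N when redefining it, so that the existing definition is not substituted into the second define call.

```m4
dnl M is defined while N is still undefined, so M holds the string N.
define(M,N)dnl
define(N,100)dnl
M
dnl Redefine N; the quotes keep N from being expanded to 100 here.
define(`N',200)dnl
M
```

This input is expected to produce 100 on the first output line and 200 on the second, because M is rescanned each time it is used.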
Macro definitions made with the
define
command do
not delete characters following the close parenthesis.
For example:
Now is the time for all good persons.
define(N,100)
Testing N definition.
This example produces the following result:
Now is the time for all good persons.

Testing 100 definition.
The blank
line results from the presence of a newline character at the end of the line
containing the
define
macro.
The built-in
dnl
macro deletes all characters that follow it, up to and including
the next newline character.
Use this macro to delete empty lines.
For example:
Now is the time for all good persons.
define(N,100)dnl
Testing N definition.
This example produces the following result:
Now is the time for all good persons.
Testing 100 definition.
To delay the expansion of a define macro's arguments, enclose them in a matched pair of quote characters. The default quote characters are left and right single quotation marks (` and '), but you can use the built-in changequote macro to specify different characters. (See Section 5.3.) Any text surrounded by quote characters is not expanded immediately, but the quote characters are removed. The value of a quoted string is the string with the quote characters removed.
Consider the following example:
define(N,100)
define(M,`N')
The quote characters around the
N
are removed as the argument is being collected.
The result of
using quote characters is to define
M
as the string
N
, not
100
.
This example makes the value of
M
track that of
N
, and it is thus another way
to accomplish the effect of the following definitions, shown in
Section 5.2:
define(M,N)
define(N,100)
The general rule is that
m4
always strips off one level of quote characters whenever it evaluates something.
This is true even outside macros.
For example, to make the word "define"
appear in the output, enter the word in quote characters, as follows:
`define' = 1
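The one-level rule can be seen by writing the same name with zero, one, and two levels of quoting. This is a small sketch assuming the default quote characters.

```m4
define(`N',100)dnl
N
`N'
``N''
```

This input is expected to produce three lines: 100 (the unquoted name is expanded), N (one level of quotes is stripped and the result is not expanded), and `N' (only the outer level is stripped).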
Because of the way
m4
handles quoted strings, you must be careful about nesting macros.
For example:
define(dog,canine)
define(cat,animal chased by `dog')
define(mouse,animal chased by cat)
When the definition of cat is processed, dog is not replaced with canine because it is quoted. But when mouse is processed, the definition of cat (animal chased by dog) is used; this time, dog is not quoted, and the definition of mouse becomes animal chased by animal chased by canine.
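If the definition of mouse should instead follow later changes to cat and dog, one option is to quote its defining text as well. The sketch below assumes the default quote characters; the replacement value hound is hypothetical.

```m4
define(dog,canine)dnl
define(cat,animal chased by `dog')dnl
dnl Quoting the whole defining text stores the word cat, not its expansion.
define(mouse,`animal chased by cat')dnl
define(`dog',hound)dnl
mouse
```

This input is expected to produce animal chased by animal chased by hound, because cat and dog are looked up only when mouse is finally expanded.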
When you redefine an existing macro, you must quote the first argument (the macro name), as follows:
define(N,100)
.
.
.
define(`N',200)
Without the quote characters, the second
define
macro sees
N
, recognizes it, and substitutes
its value, producing the following result:
define(100,200)
The
m4
program ignores this statement
because it can only define names, not numbers.
The simplest form of macro processing
is replacing one string with another (fixed) string as illustrated in the
previous sections.
However, macros can also have arguments, so that you can
use a given macro in different places with different results.
To indicate where an argument is to be used within the replacement text for a macro (the second argument of its definition), use the symbol $n to indicate the nth argument.
For example, the symbol
$1
refers to the first
argument of a macro.
When the macro is used,
m4
replaces
the symbol with the value of the indicated argument.
For example:
define(bump,$1=$1+1)
.
.
.
bump(x);
In this example,
m4
will replace
the
bump(x)
statement with
x=x+1
.
A macro can have as many arguments
as needed.
However, you can access only nine arguments by using the $n symbols ($1 through $9). To access arguments past the ninth argument, use the shift macro, which drops the first argument and reassigns the remaining arguments to the $n symbols (second argument to $1, third to $2, and so on).
Using
the
shift
macro more than once allows access to all arguments
used with the macro.
The symbol
$0
returns
the name of the macro.
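The following sketch shows $0 and shift in use. It assumes the default quote characters; whoami and second are hypothetical macro names. The extra quotes around $0 are an added precaution: without them, the macro's own name would appear unquoted in the rescanned text and be expanded again.

```m4
dnl The inner quotes survive the define and protect the name during rescanning.
define(`whoami', ``$0' was called with $1')dnl
define(`second', `$2')dnl
whoami(x)
dnl shift drops a, so second sees b as its first argument and c as its second.
second(shift(a,b,c))
```

This input is expected to produce whoami was called with x on the first line and c on the second.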
Arguments that are not supplied are replaced by null
strings, so that you can define a macro that concatenates its arguments as
follows:
define(cat,$1$2$3$4$5$6$7$8$9)
.
.
.
cat(x,y,z)
This example replaces the
cat(x,y,z)
statement with
xyz
.
Arguments
$4
through
$9
in this example are null because corresponding
arguments were not provided.
When scanning a macro, the
m4
program discards leading unquoted blanks, tabs, or newline characters
in arguments, but keeps all other white space.
For example:
define(a, "$1 $2$3")
.
.
.
a(b, c, d)
This example expands the a macro to be "b cd".
In the
define
macro, however,
newline characters are meaningful.
For example:
define(a,$1
$2$3)
.
.
.
a(b,c,d)
This latter example expands the a macro as follows:
b
cd
Macro arguments are separated by commas. Use parentheses to enclose arguments containing commas, so that the commas are not misinterpreted as ending the arguments containing them. For example, the following statement has only two arguments:
define(a, (b,c))
The first argument is
a
, and the second is
(b,c)
.
To use a single parenthesis
in an argument, enclose it in quote characters:
define(a,b`)'c)
In this example,
b)c
is
the second argument.
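As a brief sketch of how commas are protected, the hypothetical macro first below returns only its first argument. Quoting an argument is a second way to protect a comma; unlike parentheses, the quote characters do not appear in the result.

```m4
define(`first', `$1')dnl
first((b,c),d)
first(`b,c',d)
```

With the default quote characters, this input is expected to produce (b,c) on the first line and b,c on the second.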
The
m4
program provides a set of macros that are
already defined (built-in macros).
Table 5-1
lists
all of these macros and describes them briefly.
The following sections further
explain many of the macros and how to use them.
Macro | Description |
changecom(l, r) | Changes the left and right comment characters to the characters represented by l and r. The two characters must be different. |
changequote(l, r) | Changes the left and right quote characters to the characters represented by l and r. The two characters must be different. |
decr(n) | Returns the value of n-1. |
define(name, replacement) | Defines a new macro, named name, with a value of replacement. |
defn(name) | Returns the quoted definition of name. |
divert(n) | Changes the output stream to the temporary file number n. |
divnum | Returns the number of the currently active temporary file. |
dnl | Deletes text up to a newline character. |
dumpdef(`name'[, `name'...]) | Prints the names and current definitions of the named macros. |
errprint(str) | Prints str to the standard error file. |
eval(expr) | Evaluates expr as a 32-bit arithmetic expression. |
ifdef(`name', arg1, arg2) | If macro name is defined, returns arg1; otherwise, returns arg2. |
ifelse(str1, str2, arg1, arg2) | Compares the strings str1 and str2. If they match, ifelse returns the value of arg1; otherwise, it returns the value of arg2. |
include(file), sinclude(file) | Returns the contents of file. The sinclude macro does not report an error if it cannot access the file. |
incr(n) | Returns the value of n+1. |
index(str1, str2) | Returns the character position in string str1 where str2 starts, or -1 if str1 does not contain str2. |
len(str), dlen(str) | Returns the number of characters in str. The dlen macro operates on strings containing 2-byte representations of international characters. |
m4exit(code) | Exits m4 with a return code of code. |
m4wrap(name) | Runs macro name before exiting, after completing all other processing. |
maketemp(strXXXXX str) | Creates a unique file name by replacing the literal string XXXXX in the argument string with the current process ID. |
popdef(name) | Replaces the current definition of name with the previous definition, saved with the pushdef macro. |
pushdef(name, replacement) | Saves the current definition of name and then defines name to be replacement in the same way as define. |
shift(param_list) | Shifts the parameter list leftward one position, destroying the original first element of the list. |
substr(string, pos, len) | Returns the substring of string that begins at character position pos and is len characters long. |
syscmd(command) | Executes the specified system command with no return value. |
sysval | Gets the return code from the last use of the syscmd macro. |
traceoff(macro_list) | Turns off trace for any macro in the list. If macro_list is null, turns off all tracing. |
traceon(name) | Turns on trace for the named macro. If name is null, turns trace on for all macros. |
translit(string, set1, set2) | Replaces any characters from set1 that appear in string with the corresponding characters from set2. |
undefine(`name') | Removes the definition of the named macro. |
undivert(n, n[, n...]) | Appends the contents of the indicated temporary files to the current temporary file. |
To include comments in your m4 programs, delimit the comment lines with the comment characters. The default left comment character is the number sign (#); the default right comment character is the newline character. If these characters are not convenient, use the built-in changecom macro.
For example:
changecom({,})
This example makes the left and
right braces the new comment characters.
To restore the original comment
characters, use
changecom
as follows:
changecom(#, )
Using
changecom
with no arguments disables
commenting.
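The sketch below illustrates how comment characters affect expansion. It assumes the default quote characters and the behavior of common implementations such as GNU m4, in which comment text is copied to the output unchanged rather than removed.

```m4
define(`N',100)dnl
N # N is copied unchanged here
changecom({,})dnl
N {N stays as typed} # N
```

Under those assumptions, the first output line is expected to be 100 # N is copied unchanged here, and the second 100 {N stays as typed} # 100, because the number sign no longer starts a comment after the changecom call.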
The
default quote characters are the left and right single quotation marks (`
and
'
).
If these characters are not convenient,
change the quote characters with the built-in
changequote
macro.
For example:
changequote([,])
This example makes the left and right brackets the new quote characters.
To restore the original quote characters, use
changequote
without arguments, as follows:
changequote
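A minimal sketch of switching quote characters and switching back:

```m4
changequote([,])dnl
define([N],100)dnl
[N] expands to N
changequote
`N' expands to N
```

Both text lines are expected to produce N expands to 100. The bare changequote line expands to nothing but leaves its newline behind, so a blank line appears between the two output lines.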
The
undefine
macro removes macro definitions.
For example:
undefine(`N')
This example removes the definition
of
N
.
You must quote the name of the macro to be undefined.
You can use
undefine
to remove built-in macros, but once
you remove a built-in macro, you cannot recover that macro for later use.
The built-in
ifdef
macro determines if a macro is currently defined.
The
ifdef
macro accepts three arguments.
If the first argument is defined, the value
of
ifdef
is the second argument.
If the first argument
is not defined, the value of
ifdef
is the third argument.
If there is no third argument, the value of
ifdef
is null.
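For instance, a sketch with the hypothetical macro names DEBUG and VERBOSE, assuming the default quote characters:

```m4
define(`DEBUG',1)dnl
dnl DEBUG is defined, so the second argument is returned.
ifdef(`DEBUG',`debugging on',`debugging off')
dnl VERBOSE was never defined, so the third argument is returned.
ifdef(`VERBOSE',`verbose on',`verbose off')
```

This input is expected to produce debugging on followed by verbose off. The name is quoted in each call so that a defined macro is not expanded before ifdef can test it.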
The
m4
program provides the following built-in
functions for doing arithmetic on integers only:
incr |
Increments its numeric argument by 1 |
decr |
Decrements its numeric argument by 1 |
eval |
Evaluates an arithmetic expression |
For example, you can create a variable
N1
such
that its value will always be one greater than
N
, as follows:
define(N,100)
define(N1,`incr(N)')
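The following sketch checks that N1 follows N when N changes. It assumes the default quote characters; the second value, 41, is arbitrary.

```m4
define(N,100)dnl
define(N1,`incr(N)')dnl
N1
dnl Redefine N; N1 is recomputed from the new value the next time it is used.
define(`N',41)dnl
N1
```

This input is expected to produce 101 and then 42.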
The
eval
function can evaluate expressions containing
the following operators (listed in decreasing order of precedence):
unary + (plus), unary - (minus)
** or ^ (exponentiation)
*, /, % (modulo)
+, -
==, !=, <, <=, >, >=
! (NOT)
& or && (logical AND)
| or || (logical OR)
Use parentheses to group operations where needed.
All operands of an expression must be numeric.
The numeric value of a true
relation such as
1>0
is 1, and false is 0 (zero).
The
precision in
eval
is 32 bits.
For example, to define
M
as
2==N+1
, use
eval
as follows:
define(N,3)
define(M,`eval(2==N+1)')
Use quote characters around the text that defines a macro, unless the text is simple and contains no instances of macro names.
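A short sketch of eval with a quoted definition, using the default quote characters. The last line is an extra illustration of grouping with parentheses.

```m4
define(N,3)dnl
define(M,`eval(2==N+1)')dnl
M
dnl With N set to 1, the relation 2==1+1 becomes true.
define(`N',1)dnl
M
eval((N+2)*5)
```

This input is expected to produce 0, then 1, then 15. Because the definition of M is quoted, eval runs each time M is used rather than once when M is defined.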
To
merge a new file in the input, use the built-in
include
macro as follows:
include(myfile)
This example inserts the contents of
myfile
in place of
the
include
command.
As the included file is read,
m4
scans it for macros as if it were part of the primary input.
With the
include
macro, a fatal error occurs if the
named file cannot be accessed.
To avoid an error, use the alternative form,
sinclude
(silent include).
The
sinclude
macro
continues without error if the named file cannot be accessed.
You can redirect the output
of
m4
to temporary files during processing, and the collected
material can be output upon command.
The
m4
program can
maintain up to nine temporary files, numbered 1 through 9.
To redirect output,
use the
divert
macro as in the following example:
divert(4)
When this command is encountered, m4 begins writing its output to the end of temporary file 4.
The
m4
program discards the output if you redirect the output to a temporary
file other than 1 through 9; you can use this feature to make
m4
omit a portion of the input file.
Use
divert(0)
or
divert
with no argument to return the output to the
standard output stream.
At the end of its processing,
m4
writes all redirected
output to the standard output stream, reading from the temporary files in
numeric order and then destroying the temporary files.
To retrieve the information from all temporary files in numeric order
at any time before processing is completed, use the built-in
undivert
macro with no arguments.
To retrieve selected temporary files in a specified
order, use
undivert
with arguments.
When using
undivert
,
m4
discards the temporary files that
are recovered and does not search the recovered information for macros.
The value of
undivert
is not the diverted text.
The built-in
divnum
macro returns the number of
the currently active temporary file.
If you do not change the output file
with the
divert
macro,
m4
puts all output
in temporary file 0 (zero).
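The sketch below combines the three uses of divert described above: discarding a block of definitions, holding text in a temporary file, and retrieving it. It assumes the default quote characters; the labels held and first are arbitrary text.

```m4
divert(-1)
define(`N',100)
divert(1)dnl
held: N
divert(0)dnl
first: N
undivert(1)dnl
```

This input is expected to produce first: 100 followed by held: 100. The newlines that follow the first two lines are discarded along with everything else sent to temporary file -1, so no dnl is needed there.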
You can run any operating system command from within your m4 input by using the built-in syscmd macro.
If the system
command returns information, that information is the value of the
syscmd
macro; otherwise, the macro's value is null.
For example:
syscmd(date)
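The return code of a command can be read back with sysval. This sketch assumes a system that provides the standard true and false utilities.

```m4
syscmd(`true')dnl
sysval
syscmd(`false')dnl
sysval
```

The first sysval is expected to yield 0; the second yields a nonzero value, typically 1.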
Use the built-in
maketemp
macro to make a unique
file name from a program.
If the literal string
XXXXX
is present in the macro's argument,
m4
replaces the
XXXXX
with the process ID of the current process.
For example:
maketemp(myfileXXXXX)
If the current process ID
is 23498, this example returns
myfile23498
.
You can use
this string to name a temporary file.
The built-in
ifelse
macro performs conditional testing.
The simplest form is the following:
ifelse(a,b,c,d)
This example compares the two
strings
a
and
b
.
If they are identical,
ifelse
returns string
c
.
If they are not identical,
it returns string
d
.
For example, you can define a macro
called
compare
to compare two strings and return
yes
if they are the same or
no
if they are different,
as follows:
define(compare, `ifelse($1,$2,yes,no)')
The quote characters prevent the evaluation of
ifelse
from occurring too early.
If the fourth argument is missing,
it is treated as empty.
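Called with sample arguments, the compare macro behaves as in this sketch:

```m4
define(compare, `ifelse($1,$2,yes,no)')dnl
compare(abc,abc)
compare(abc,abd)
```

This input is expected to produce yes and then no.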
The
ifelse
macro can have any number of arguments,
and it therefore provides a limited form of multiple path decision capability.
For example:
ifelse(a,b,c,d,e,f,g)
This statement is logically the same as the following fragment:
if (a == b)
    x = c;
else if (d == e)
    x = f;
else
    x = g;
return(x);
If the final argument is omitted, the result is null.
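A sketch of the multiple-path form, using a hypothetical macro named kind that classifies a file name suffix. It assumes the default quote characters.

```m4
dnl Compare $1 with c, then with h, and fall through to a default.
define(`kind', `ifelse($1,c,`C source',$1,h,`C header',`unknown')')dnl
kind(c)
kind(h)
kind(txt)
```

This input is expected to produce C source, C header, and unknown, one per line.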
The built-in
len
macro
returns the byte length of the string that makes up its argument.
For example,
len(abcdef)
is 6, and
len((a,b))
is 5.
The built-in
dlen
macro returns the length of the displayable
characters in a string.
In certain international usages, 2-byte codes are
displayed as one character.
Thus, if the string contains any 2-byte international
character codes, the result of
dlen
will differ from the
result of
len
.
The built-in
substr
macro returns the substring
(beginning at the character position specified by the second argument) from
a specified string (first argument).
The third argument specifies the length
in bytes of the returned substring.
For example:
substr(Krazy Kat,6,5)
This example returns "Kat", which is the 3-character substring beginning at character position 6 of the string "Krazy Kat". The first character in the string is at position 0 (zero). If the third argument is omitted or if the string is not long enough to satisfy the third argument, as in this example, the rest of the string is returned.
The built-in
index
macro returns the byte position,
or index, in a string (first argument) where a substring (second argument)
begins.
If the substring is not present,
index
returns -1.
As with
substr
, the origin for strings is 0 (zero).
For
example:
index(Krazy Kat,Kat)
This example returns 6.
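The following sketch gathers the string macros described above in one place. The first argument is quoted here as a precaution, so that it is passed through exactly as written.

```m4
len(abcdef)
substr(`Krazy Kat',6)
substr(`Krazy Kat',0,5)
index(`Krazy Kat',Kat)
index(`Krazy Kat',dog)
```

This input is expected to produce 6, Kat, Krazy, 6, and -1, one per line.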
The built-in
translit
macro performs one-for-one
character substitution, or transliteration.
The first argument is a string
to be processed.
The second and third arguments are lists of characters.
Each instance of a character from the second argument that is found in the
string is replaced by the corresponding character from the third argument.
For example:
translit(the quick brown fox jumps over the lazy dog,aeiou,AEIOU)
This example returns the following:
thE qUIck brOwn fOx jUmps OvEr thE lAzy dOg
If the third argument is shorter than the second argument, characters from the second argument that are not in the third argument are deleted. If the third argument is missing, all characters present in the second argument are deleted.
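The three cases (full replacement list, shorter list, and missing list) can be compared side by side in this sketch, which assumes the default quote characters:

```m4
translit(`hello world',lo,01)
translit(`hello world',lo,0)
translit(`hello world',lo)
```

This input is expected to produce he001 w1r0d, then he00 wr0d (each o is deleted because it has no counterpart), and finally he wrd (every l and o is deleted).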
Note
The substr, index, and translit macros do not differentiate between 1- and 2-byte displayable characters and can return unexpected results in some international usages.
The built-in
errprint
macro
writes its arguments to the standard error file.
For example:
errprint(`error')
The built-in
dumpdef
macro dumps the current names and definitions of items named as arguments.
Names must be quoted.
If you supply no arguments,
dumpdef
prints all current names and definitions.
The
dumpdef
macro writes to the standard error file.