Macros
Not surprisingly, given Paul Graham's enthusiasm for macros, Arc makes heavy use of macros. A few idioms appear commonly in the code."Decorated body"
The most common use of macros in Arc is something I call the "decorated body", which wraps a chunk of body code with something to extend it, analogous to the decorator design pattern. Typically the decorator will set something up before executing the body code, as in w/outstring, w/table, and fromstring. (Note: the link for each operation goes to the documentation. Clicking on the code icon in the documentation will show the code for the operation.) The pattern is also used to set something up before execution and then clean up afterwards. This is especially useful for file operations, as in w/infile, w/outfile, and w/instring. These macros often have names starting with "w/", short for "with/", indicating what they are providing to the body. The HTML library makes heavy use of this idiom, for example in tag, form, and cdata, which wrap the body in various HTML opening and closing tags.To examine the implementation of this idiom, consider let, which is defined as:
(mac let (var val . body) `(with (,var ,val) ,@body))The typical features of this idiom are a macro definition with argument
. body
to pick up the remainder of the arguments, a macro definition starting with quasiquote `
, and unquote-splicing ,@body
to substitute the body code into the macro. The macro performs simple substitution:
arc> (macex1 '(let foo 42 body1 body2)) (with (foo 42) body1 body2)
Macro for control of evaluation
Unlike functions, macros receive their arguments unevaluated. This allows macros to be used for conditional execution of code, or to execute code multiple times iteratively. Some examples are when, unless, while, and for.(You might expect if to be a macro. It is actually a built-in special form, since conditional macros need to be based on some underlying conditional form.)
The code for repeat illustrates that this idiom also uses the ,@body
structure as "decorated body":
(mac repeat (n . body) `(for ,(uniq) 1 ,n ,@body))
Macro as lambda adapter
A variant of the body decorator is to use a macro to convert body code into a lambda function, which is similar to an adapter design pattern. This idiom is used when a function expects to receive a lambda function, but passing in body code is more convenient, especially when the base function is a Scheme primitive. Examples are thread, atomic, w/stdin, and w/stdout:(mac w/stdout (str . body) `(call-w/stdout ,str (fn () ,@body)))
arc> (macex '(w/stdout foo body1 body2)) (call-w/stdout foo (fn nil body1 body2))
Macro to create functions with parameters
Another common macro idiom is to take a list of parameters and body code, and build a function out of these. This idiom is typically used to define a function that has special properties, and is often given a name that starts with "def". Some examples are rfn, afn, defmemo, defop, defopr, linkf, and rlinkf.
The code for defmemo
shows the key component of this idiom: (fn ,parms ,@body)
to build the function out of the parameters and body.
(mac defmemo (name parms . body) `(safeset ,name (memo (fn ,parms ,@body))))
Complex macros
One irony is that even though though macros are extremely powerful, most of the macros in Arc perform simple substitutions. There are, however, some macros that perform complex manipulation of their arguments. Assignment is built from the setform macro, which implements generalized variables through intricate macro processing of the arguments. Slightly less complex is and, which implements short-circuit evaluation by turning its arguments into nestedif
statements. Another interesting example is with; the macro splits up the arguments into variables and values:
arc> (macex '(with (a 1 b 2) (prn a b))) ((fn (a b) (prn a b)) 1 2)The litmatch macro performs string matching by generating code to perform a sequence of character comparisons.
arc> (macex '(litmatch "abc" foo)) ((fn (gs1442 gs1443) (unless (> (+ gs1443 3) (len gs1442)) (and (is #\a (gs1442 (+ gs1443 0))) (is #\b (gs1442 (+ gs1443 1))) (is #\c (gs1442 (+ gs1443 2)))))) foo 0)
Recursion
Tail recursion is a standard Scheme technique for efficient recursion. If the last operation of a function is a recursive call, the compiler will use tail-call optimization to turn the recursion into iteration, avoiding creation of a stack frame on each call.Iteration in Arc is built on top of tail recursion. Basic loop operations such as while, loop, and whilet use tail recursion on a function built from rfn:
(mac while (test . body) (w/uniq (gf gp) `((rfn ,gf (,gp) (when ,gp ,@body (,gf ,test))) ,test)))
cdr-scanning
A typical idiom for scanning a list is to tail-recursively call the function on the cdr
of the list. Some examples are
reclist,
nthcdr,
last,
and rev.
The code for
assoc
illustrates this idiom. If there is no match, assoc
is called on the cdr
, until the whole list has been scanned:
(def assoc (key al) (if (atom al) nil (and (acons (car al)) (is (caar al) key)) (car al) (assoc key (cdr al))))
Tail-recursive string functions
One idiom in Arc for using tail recursion with strings is to add an index parameter to the function. This index parameter indexes into the string, and by incrementing it in each call, the function can be made tail-recursive and doesn't require modifying the string. The index parameter is often made an optional parameter so "normal" uses of the function don't need to specify the index. Some examples of this idiom are recstring, findsubseq, indented-code, parabreak, and urlend.The code for headmatch illustrates this idiom; in this case the tail-recursion uses an anaphoric function:
(def headmatch (pat seq (o start 0)) (let p (len pat) ((afn (i) (or (is i p) (and (is (pat i) (seq (+ i start))) (self (+ i 1))))) 0)))
Recursion with cons
Many operations in Arc need to perform operations on a sequence and create a result list. The common Arc idiom is to recurse through a sequence, building up the result withcons
.
Specifically, a macro (op seq)
will return the cons of something done to (car seq)
and (op (cdr seq))
. Thus, op
recursively steps through seq
, processing an element at a time and building up the result.
This idiom provides an interesting contrast to the previous recursion examples because this idiom is not tail-recursive. Because the cons
is applied to the result of the recursive function call, tail-call optimization cannot be applied.
Several examples of this idiom are
join,
rem,
reinsert-sorted,
pair,
firstn,
firstn-that,
tuples,
and range.
The code for map1 illustrates this idiom:
(def map1 (f xs) (if (no xs) nil (cons (f (car xs)) (map1 f (cdr xs)))))Note that
f
is applied to (car xs)
, map1
is called recursively on (cdr xs)
, and the two are cons'd together.
Miscellaneous
The majority of the idioms I discuss are based on macros or recursion. This secion discusses some miscellaneous idioms that don't fit those categories.Push and reverse
The "push and reverse" idiom is an efficient, but somewhat surprising way to generate a list. In this idiom, successive elements are pushed onto the front of the list, and when the entire list is generated, it is reversed to obtain the desired order. This is a well-known Lisp idiom and is more efficient than the obvious method of appending elements to the end of the list. Several examples of "push and reverse" are drain, dedup, parse-format, litmatch, endmatch, and handle-post.
The code for n-of illustrates the idiom. Items are added with
push
to an accumulator (ga
in this case), and the accumulator is reversed with
rev
at the end:
(mac n-of (n expr) (w/uniq ga `(let ,ga nil (repeat ,n (push ,expr ,ga)) (rev ,ga))))
String stdout
Arc makes extensive use of an idiom I'll call "string stdout" because a string is used as stdout. The code writes multiple chunks to stdout, and the results are gathered into a string. This idiom avoids having to manually concatenate the components into the result string. The tostring macro is used to implement this idiom. Arc has a surprisingly large number of macros that use this idiom, including readline, code-block, unmarkdown, varfield, markdown, eschtml, striptags, parafy, subst, multisubst, and num.
Typically, code using this idiom will use produce output with pr
or writec
, and use tostring
to collect the output.
The code for multisubst illustrates this:
(def multisubst (pairs seq) (tostring (forlen i seq (iflet (old new) (find [begins seq (car _) i] pairs) (do (++ i (- (len old) 1)) (pr new)) (pr (seq i))))))
The flip side of this idiom is the "string stdin" idiom, where a string is used as stdin. The "string stdin" idiom is rarely used; one example is parse-format.
Summary
The Arc code illustrates several common Arc idioms, some of which are well-known Scheme or Lisp idioms. Since Paul Graham is a Lisp expert, carefully studying his Arc code is a rewarding way to learn more about programming Lisp-like languages.Undoubtedly I've missed some important idioms. Please let me know if your favorite idiom is missing, or if I've mangled the descriptions of any idioms.
2 comments:
Excellent post, there aren't enough blog posts dissecting someone else's style, I like it.
When I first visited this post, it sounded like a piece of official documentation to me. Surprisingly insightful, thank you!
Post a Comment