Recursion & Macros
Table of Contents
How can macros help with recursive function definitions? For the short version, skip ahead to here, here, and here. The code blocks combine Racket's macro facilities and functional programming foundation to form recursion-and-macros.rkt, starting with the following preamble.
#lang racket/base (require racket/match) (require racket/vector)
Warmup
A recursive function — like the factorial function — is one that applies itself.
(define (factorial n) (if (= n 0) 1 (* n (factorial ; recur (- n 1))))) (println (factorial 5)) ; 120
A lambda abstraction is inherently anonymous, which makes the
"applying itself" part of recursion cumbersome (though
not impossible).
A macro can succinctly capture a name to facilitate the recursive
applications.
(define-syntax-rule (lambda/rec name arguments body ...) (letrec ((name (lambda arguments body ...))) name)) (println ((lambda/rec factorial~ (n) (if (= n 0) 1 (* n (factorial~ ; recur, using the captured name (- n 1))))) 5)) ; 120
Compared to lambda, the extra name that comes before the
arguments can appear in recursive applications in the body
expressions. The resulting function remains anonymous in the
enclosing scope.
Note that lambda/rec is "just" syntax, and the letrec form that it
expands to could instead be written out by hand — similarly to how
let can be
rewritten
in terms of lambda. The goal is to find macros that provide
useful syntactic abstractions.
Substance: Memoization
Memoization can be used to avoid repeated application of a function to the same argument.
Some utilities that simulate resource usage help while experimenting with memoization.
(define util.counter (make-parameter (void))) (define (util.increment!) (util.counter (+ 1 (util.counter)))) (define-syntax-rule (util.profile x) (parameterize ((util.counter 0)) (let ((result x)) (printf "ran: ~a\ncost: ~a\nresult: ~v\n" (quote x) (util.counter) result)))) (define-syntax-rule (util.repeat n x) (for ((_ (in-range n))) x)) (util.profile (util.repeat 99 (util.increment!))) ; cost: 99
(These macros, per se, have nothing to do with recursion.)
Simple
id is an identity function that increments a counter each time it's
applied. Applying it repeatedly to the same argument increments the
counter multiple times.
(define (id x) (util.increment!) x) (util.profile (util.repeat 99 (id 0))) ; cost: 99
The repeated computation can be avoided by memoizing id.
(define (memoize f) (let ((cache (make-hash))) ; "let... (lambda (x) ; ...over lambda": a closure (hash-ref cache x (lambda () ; what to do if `x` was not already in `cache` (let ((y (f x))) (hash-set! cache x y) y)))))) (define id/memo (memoize id)) (util.profile (util.repeat 99 (id/memo 0))) ; success! ; cost: 1
Recursive
Disappointingly, memoize doesn't work very well on a recursive
function, like
the Fibonacci function.
(define (fib n) (util.increment!) (cond ((< n 2) n) (else (+ (fib (- n 1)) (fib (- n 2)))))) (util.profile (fib 7)) ; cost: 41 ; result: 13 (util.profile ((memoize fib) 7)) ; failure! ; cost: 41 ; result: 13
It memoizes the overall application of the recursive function, but doesn't affect the recursive applications, which involve the unmemoized function.
(util.profile (let ((f (memoize fib))) (util.repeat 99 (f 7)))) ; cost: 41
To memoize a recursive function, each recursive application needs to use the result of the memoization.
(define fib~ (memoize (lambda (n) (util.increment!) (cond ((< n 2) n) (else (+ (fib~ (- n 1)) (fib~ (- n 2)))))))) (util.profile (fib~ 7)) ; redemption! ; cost: 8 ; result: 13
The point of abstracting over memoization in the form of a function
like memoize is for it to be composable, in the sense that it can
be applied to a function that was defined separately. That doesn't
work with a recursive function, so it has to appear in the definition
of fib~; at that point, the boilerplate with memoize and lambda
might as well be generated via a macro, without any further loss of
composability.
(define-syntax-rule (define/memoized (f x) body ...) (define f (memoize (lambda (x) body ...)))) (define/memoized (fib/memo n) ; this looks like `fib`, but acts like `fib~` (util.increment!) (cond ((< n 2) n) (else (+ (fib/memo (- n 1)) (fib/memo (- n 2)))))) (util.profile (fib/memo 7)) ; cost: 8 ; result: 13
In define/memoized, each f that appears in one of the body
expressions refers to the f being defined, which is the memoized
function. The memoize appears outside the lambda, so combining
that memoization with the (define (f x) ...) shorthand form of a
function definition calls for a macro.
A similar effect can be achieved in
Haskell,
OCaml,
or
Python,
for example. The Haskell version is like the
define-memoize-lambda construction used here, stopping short of
the macro. The OCaml version that uses lazy_memo_rec is like the
Haskell version, but with some extra machinery to "tie the recursive
knot". The Python version is like the macro, using Python's
"decorator" functionality.
Style: Recursion Schemes
Recursion schemes can be used to separate the specific part of a recursive function from its general recursive structure.
Separation
In double, the specific part is to multiply a number by 2, and the
general structure is to recursively apply double to each element of
a list.
(define (double t) (match t ((? number?) ; a number (* 2 t)) ((list) ; an empty list t) ((cons u v) ; a non-empty list (cons (double u) (double v))))) (define one--two-three (list 1 (list 2 3))) (println one--two-three) ; '(1 (2 3)) (println (double one--two-three)) ; '(2 (4 6))
Those two parts can be separated
(define (double/norec t) (match t ((? number?) (* 2 t)) (_ t))) (define (recursively f/norec) (lambda/rec f (t) (f/norec (match t ((cons u v) (cons (f u) (f v))) (_ t)))))
and then recombined
(println ((recursively double/norec) one--two-three)) ; '(2 (4 6))
to achieve the same effect. recursively can also be used with other
specific operations — for example, to reverse a tree.
(define (reverse-lists/norec t) (match t ((cons u v) (append ; "snoc" --- it's inefficient, but that's not the point here v (list u))) (_ t))) (println ((recursively reverse-lists/norec) one--two-three)) ; '((3 2) 1)
Multiple operations can be composed in a single traversal.
(define double-reverse (recursively (compose reverse-lists/norec double/norec))) (println (double-reverse one--two-three)) ; '((6 4) 2)
Abstraction
recursively works, but it's specific to the "trees as nested lists"
data type; it can be improved further by abstracting over the data
type.
(define (recursively/generic step f/norec) (lambda/rec f (x) (f/norec (step f x)))) (define (lists.step f x) (match x ; compare this to the `match` in `recursively` ((cons y z) (cons (f y) (f z))) (_ x))) (define (lists->vectors/norec x) (match x ((list) (vector)) ((cons y z) (vector-append (vector y) z)) (_ x))) (define lists->vectors (recursively/generic lists.step lists->vectors/norec)) (define (vectors.step f x) (if (vector? x) (vector-map f x) x)) (define vectors.double (recursively/generic vectors.step double/norec)) (println (vectors.double (lists->vectors one--two-three))) ; '#(2 #(4 6))
Note that neither lists.step nor vectors.step — nor any of the
/norec functions — is recursive. Even recursively/generic is
not recursive; instead, it constructs a recursive function from
those simpler pieces, using the lambda/rec macro from before.
Direction
recursively/generic abstracts over
- the specific operation being applied (via
f/norec) and - the data type being recurred over (via
step),
but it encodes a particular recursion pattern. That pattern worked on the preceding examples, but it doesn't work when converting a Boolean expression to negation normal form (NNF).
(struct :boolean () #:transparent) ; a poor man's algebraic data type (struct :var :boolean (name) #:transparent) (struct :not :boolean (arg) #:transparent) (struct :or :boolean (arg1 arg2) #:transparent) (struct :and :boolean (arg1 arg2) #:transparent) (define v (:var "v")) (define tautology (:or v (:not v))) (define (:boolean.step f x) (match x ((:var _) x) ((:not y) (:not (f y))) ((:or y z) (:or (f y) (f z))) ((:and y z) (:and (f y) (f z))))) (define (nnf/norec x) (match x ((:not (:not y)) y) ((:not (:or y z)) (:and (:not y) (:not z))) ((:not (:and y z)) (:or (:not y) (:not z))) (_ x))) (println ((recursively/generic :boolean.step nnf/norec) (:not tautology))) ; no! ; (:and (:not (:var "v")) (:not (:not (:var "v"))))
The first part looks fine, but an uncanceled double :not survived in
the second part. The issue is that NNF conversion needs to be done
"top down", so that a :not that was introduced by one of the rules
will later be "pushed down" through the remainder of the expression.
(define (top-down step f/norec) (lambda/rec f (x) (step f (f/norec x)))) (define nnf (top-down :boolean.step nnf/norec)) (println (nnf (:not tautology))) ; yes! ; (:and (:not (:var "v")) (:var "v"))
At this point, it becomes clear that recursively/generic is really
"bottom up": compare its (f/norec (step f x)) — which first recurs
via f, then operates non-recursively on the result via f/norec —
to top-down's (step f (f/norec x)) — which first applies
f/norec, then f.
(define bottom-up recursively/generic)
Aside: Dynamism
The following example involves some subtle [ab]use of dynamic typing.
(define (size/norec x) (match x ((:var _) 1) ((:not y) (+ 1 y)) ((or (:or y z) (:and y z)) (+ 1 y z)))) (define size (bottom-up :boolean.step size/norec)) (let ((contradiction (:not tautology))) (println (size contradiction)) (println (size (nnf contradiction)))) ; 5 ; 4
Apparently, the number 1 is being added to the argument[s] of the
Boolean operators :not, :or, and :and! In reality, the
:boolean variants are being used more like generic containers "on
the way up" to hold numbers — the sizes of their subformulas —
that will be used later in the computation. Racket's dynamic typing
facilitates this; for details on how to describe types like these
statically, refer to
https://blog.sumtypeofway.com/posts/recursion-schemes-part-2.html#cb9.
Efficiency
Memoization can be included in a recursion scheme.
(define (:boolean.step! f x) (util.increment!) (:boolean.step f x)) (define size/nomemo (bottom-up :boolean.step! size/norec)) (util.profile (size/nomemo (:or tautology (:not tautology)))) ; cost: 10 ; result: 10 (define-syntax-rule (lambda/memoized f (x) body ...) (letrec ((f (memoize (lambda (x) body ...)))) f)) (define (bottom-up/memoized step f/norec) (lambda/memoized f (x) (f/norec (step f x)))) (define size/memo (bottom-up/memoized :boolean.step! size/norec)) (util.profile (size/memo (:or tautology (:not tautology)))) ; cost: 5 ; result: 10
The memoization appears in the recursion scheme, and doesn't interfere
with its step or f/norec. Note the introduction of
lambda/memoized — yet another macro, related to lambda/rec and
define/memoized; if you're still here, it's worth thinking about
whether that should instead be named something like
lambda/rec/memoized, to leave space for an anonymous
lambda/memoized.
Discussion
Clarity
Without the macros, bottom-up/memoized could be implemented as
follows.
(define (bottom-up/memoized/expanded step f/norec) (letrec ((f (memoize (lambda (x) (f/norec (step f x)))))) f)) (util.profile ((bottom-up/memoized/expanded :boolean.step! size/norec) (:or tautology (:not tautology)))) ; cost: 5 ; result: 10
The body from lambda/memoized is buried under layers of
boilerplate code; the macro serves to hide that boilerplate and shift
the focus to the relevant details, and how they differ from those in
bottom-up, top-down, etc.
Composability
The conventional wisdom is to use functions where possible, and to
leave macros for "actual" syntax transformations. To that end,
memoize, top-down, the various .step and /norec definitions,
etc. are functions (not macros), which makes it possible to pass them
around as terms. That level of composability is missing from
lambda/rec and lambda/memoized, for example: factoring the
repeated body out of bottom-up and bottom-up/memoized would
require more macro definitions.
Types
This was all developed in a dynamically-typed setting. Recursion
schemes are well-trodden territory in the world of statically-typed
functional programming, as in Haskell. In that setting, :boolean
might be an actual algebraic data type, with :boolean.step coming
from it being an instance of the Functor type class, etc. Static
type checking could also catch mistakes like using a type-mismatched
step and f/norec.