Here’s a quote from a computer scientist living in what was clearly simpler times:

[…]. Hence I have written the following simple routine, which may separate the man-compilers from the boy-compilers: […]

– Donald Knuth

Here is Knuth’s program in ALGOL*60*:

```
begin real procedure A(k, x1, x2, x3, x4, x5);
value k; integer k;
begin real procedure B;
begin k := k - 1;
B := A := A(k, B, x1, x2, x3, x4)
end;
if k ≤ then A : = x4 + x5 else B
end;
outreal(A(10, 1, -1, -1, 1, 0))
end;
```

This seemingly innocent program uses a lot of resources to compute its
result. It takes a parameter *k* that is used to scale up the
difficulty of the problem. It’s an interesting program to use for
testing language implementations, since it’s easy to verify the result
and also to compare the performance with other language
implementations. People have translated the program into many
languages over at Rosetta code.

But what if you have a groundbreaking compiler that manages to compute
the solution to a larger *k*-value than you can find in the tables?
You would want to know if you got the right answer. The “last word”
from Knuth in ALGOL bulletin #19, page 8, gives us a way to
easily compute the answer much faster than running the actual program:

Since then my right hand has observed that the value of A(k, x1, x2, x3, x4, x5) is equal to c1 × x1 + c2 × x2 + c3 × x3 + c4 × x4 + c5 × x5 where the coefficients are given in the following table:

k c1(k) c2(k) c3(k) c4(k) c5(k) ≤0 0 0 0 1 1 1 0 0 1 1 0 2 0 1 1 0 0 3 1 1 0 0 0 4 2 1 0 0 0 5 3 2 1 0 0 6 5 3 3 2 0 7 8 6 9 6 0 8 14 15 22 13 0 9 29 37 48 26 0 10 66 85 102 54 0 When k ≥ 5, these values may be obtained by the relations

c1(k + 1) = c1(k) + c2(k)

c2(k + 1) = c2(k) + c3(k)

c3(k + 1) = c3(k) + c4(k + 1)

c4(k + 1) = c1(k) + c4(k) - 1

c5(k) = 0

By using Knuth’s table and relations it’s possible to compute the man or boy function quite quickly. I wrote a program that does this. A walk-through follows.

The first part of the program is a gratuitous use of macros. The
`define-coefficients`

macro defines one of the coefficients mentioned
in Knuth’s last word letter (*c1* to *c5*). Each coefficient is
parameterized by the *k*-value, which means that we should define a
procedure that takes *k* as an argument and returns *cn(k)*. Knuth’s
table gives us the first values it should return (*consts*). For the
rest of the values we’ll use the corresponding relation. As a bonus
the macro also creates a hashtable where it stores previously computed
values, which is key to speeding up the algorithm.

```
(define-syntax define-coefficients
(syntax-rules ()
((_ (name k) consts equation)
(define name
(let ((h (make-eqv-hashtable)))
(lambda (k)
(let ((v consts))
(cond ((< k 0)
(vector-ref v 0))
((< k (vector-length v))
(vector-ref v k))
((hashtable-ref h k #f))
(else
(let ((t equation))
(hashtable-set! h k t) ;memoize
t))))))))))
```

Next the macro is used to define the coefficients.

```
(define-coefficients (c1 k) '#(0 0 0 1 2 3)
(+ (c1 (- k 1))
(c2 (- k 1))))
(define-coefficients (c2 k) '#(0 0 1 1 1 2)
(+ (c2 (- k 1))
(c3 (- k 1))))
(define-coefficients (c3 k) '#(0 1 1 0 0 1)
(+ (c3 (- k 1))
(c4 k)))
(define-coefficients (c4 k) '#(1 1 0 0 0 0)
(+ (c1 (- k 1))
(c4 (- k 1))
-1))
(define-coefficients (c5 k) '#(1)
0)
```

The equations given as the last argument to the macro are the
relations adjusted to *cn(k)* instead of *cn(k+1)*. Now that the
coefficients have been defined it is easy to define *A*, just as Knuth observed:

```
(define (A k x1 x2 x3 x4 x5)
(+ (* (c1 k) x1)
(* (c2 k) x2)
(* (c3 k) x3)
(* (c4 k) x4)
(* (c5 k) x5)))
```

Computing what is commonly referred to as *A(k)* is now achieved by
calling `(A k 1 -1 -1 1 0)`

. In Chez Scheme the computation takes no
time at all, even for very large *k* that no existing machine could
possibly handle using the original algorithm:

```
> (time (A 11 1 -1 -1 1 0))
(time (A 11 ...))
no collections
0.000004543s elapsed cpu time
0.000004149s elapsed real time
96 bytes allocated
-138
> (time (A 128 1 -1 -1 1 0))
(time (A 128 ...))
no collections
0.000007392s elapsed cpu time
0.000007114s elapsed real time
304 bytes allocated
-4635937302118988398753530618599286738522250
> (time (A 1024 1 -1 -1 1 0))
(time (A 1024 ...))
no collections
0.000744346s elapsed cpu time
0.000743677s elapsed real time
635472 bytes allocated
-134525946204897748012677078309699584745858677996302504232754322006054514469295899995329805478290190232956296857851394596551319734862436102284482400522332522964879376131124504760141628006776332968808079671234906412428151832076596502192653046509414077174915139024920128557185075083924034657098210492453447406606370793777979291350811243549339280709645840417
```

Only one question is left unanswered. Whether or not a sufficiently smart compiler can (or indeed should) transform Knuth’s original algorithm into the fast algorithm presented here.

*This was written as the first part of a series of articles on stuff
that has been lying around on this website for a long time without any
real commentary. The code it talks about was written in 2012.*