Among the many pseudo-magical features of Haskell is
something called lazy evaluation, which means that the values of variables
aren’t computed until they’re actually needed.
Humour aside, though, this can be quite useful for data processing. It allows
a programmer to define theoretically infinite lists, then take only as many
elements as they need, or even stream the results to an output device (which can
be modelled as an infinite buffer). The canonical example of a lazily-evaluated
list in Haskell is this ‘infinite’ list of the Fibonacci numbers:
fibs :: [Integer]
fibs = 1 : 1 : zipWith (+) fibs (tail fibs)
The first line, for those unfamiliar with Haskell, declares fibs to be a list
of integers. The second line is its definition. The : operator, called ‘cons’,
joins an element to the beginning of a (singly-linked) list. zipWith is a
function that takes a binary function and two lists, evaluates the function on
each pair of corresponding elements, and returns a list of the results. Finally,
the tail function returns everything but the first element of a list.
If this seems difficult to understand, the best way to wrap your head around
it is to observe the construction of this list in action. We start with just
two elements:
fibs: 1 1 ...
tail fibs: 1 ...
zipped:
Now do the first iteration of zipWith:
fibs: 1 1 ...
tail fibs: 1 ...
zipped: 2
However, although we are imagining these three lists as separate, they’re
actually all part of the same list! So when elements are added to zipped,
they’re added to fibs (and tail fibs) at the same time. The last operation
actually did this:
fibs: 1 1 2 ...
tail fibs: 1 2 ...
zipped: 2
Repeat the operation ad infinitum, and you have an ‘infinite’ list of
Fibonacci numbers:
fibs: 1 1 2 3 5 8 13 ...
tail fibs: 1 2 3 5 8 13 ...
zipped: 2 3 5 8 13 ...
There are lots of real-world problems with this same fundamental structure,
where such features could be useful. Part of the reason Haskell is so rarely
used in production, I suspect, is that the kinds of programmers who work on such
problems don’t like magic—and really don’t like linked lists.
Haskell is a compiled language; you can look at the
assembly the Haskell compiler generates from
this code. I can’t make heads or tails of it, but I do know it’s not an optimal
implementation of a Fibonacci generator, because I create something much closer
to that later on in this post.
We’ll return to a (probably easier, and definitely more efficient)
implementation of this concept in C. First, however, let’s take a look at some
more conventional C implementations of the Fibonacci sequence.
unsigned fib (unsigned n)
{
    if (n < 2) return 1;
    else return fib(n - 1) + fib(n - 2);
}
This is the canonical recursive algorithm, in fact the mathematical definition
of the series, often used to
teach the concepts of recursion to introductory computer science students. The
problem is that it’s slow. Specifically, it takes exponential time to compute,
because each call makes two further recursive calls. Almost all of this
computation is redundant, but the compiler is not (yet) smart enough to
recognise that fact:
fib: # @fib
pushq %rbp
pushq %rbx
pushq %rax
movl $1, %ebp
cmpl $2, %edi
jb .LBB0_3
movl %edi, %ebx
movl $1, %ebp
.LBB0_2: # =>This Inner Loop Header: Depth=1
leal -1(%rbx), %edi
callq fib
addl $-2, %ebx
addl %eax, %ebp
cmpl $1, %ebx
ja .LBB0_2
.LBB0_3:
movl %ebp, %eax
addq $8, %rsp
popq %rbx
popq %rbp
retq
This code will take far longer than it needs to for any significant value of
n; its call tree roughly doubles in size every time n grows by one. The trick to
optimising it is realising that fib(n - 2) has already been computed by the
call to fib(n - 1), which had to call fib((n - 1) - 1). If you can cache
this value, you can get an algorithm that runs in linear time, albeit a much
messier one:
unsigned fib_linear (unsigned n)
{
    unsigned a = 1;
    unsigned b = 0;
    while (n--)
    {
        unsigned temp = a;
        a += b;
        b = temp;
    }
    return a;
}
No longer can you see a simple definition of the series just from looking at the
code. It’s also almost four times longer, in terms of the number of statements.
But the assembly looks much better:
fib_linear: # @fib_linear
testl %edi, %edi
je .LBB1_1
movl $1, %ecx
xorl %edx, %edx
.LBB1_3: # =>This Inner Loop Header: Depth=1
movl %edx, %eax
addl %ecx, %eax
movl %ecx, %edx
movl %eax, %ecx
decl %edi
jne .LBB1_3
retq
.LBB1_1:
movl $1, %eax
retq
No recursive calls, no stack overflows, no exponential time. But why did we have
to break our clean, concise recursive function to get this? Haskell provides the
best of both worlds, despite the overhead.
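As a sanity check that the iterative rewrite really computes the same series, here is a small harness of my own (the checking function and its bound are my additions; both definitions are repeated from above so the snippet stands alone):

```cpp
#include <cassert>

// The recursive and iterative definitions, repeated from the post.
unsigned fib (unsigned n)
{
    if (n < 2) return 1;
    else return fib(n - 1) + fib(n - 2);
}

unsigned fib_linear (unsigned n)
{
    unsigned a = 1;
    unsigned b = 0;
    while (n--)
    {
        unsigned temp = a;
        a += b;
        b = temp;
    }
    return a;
}

// Check that the two versions agree wherever the recursive
// one is still feasible to run (it is exponential, so keep n small).
void check_fib_agreement ()
{
    for (unsigned n = 0; n < 25; n++)
        assert(fib(n) == fib_linear(n));
}
```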
If we’re trying to generate all Fibonacci numbers from 1 to fib(n), as we
were in the Haskell program, we can actually do something in C which is even
more concise and readable than the Haskell program. The key pattern we need is
something I’ve termed a generator. A generator is a pure (stateless) function
which takes a pointer to an item in an array (as well as, potentially, other
parameters), and writes a value to that address.
A generator for the Fibonacci numbers would look like this:
void fib_gen (unsigned *a)
{
    a[0] = a[-1] + a[-2];
}
(We can only write to a[0]. If we were to write to, say, a[-1], it would be
legal C, but would no longer be functionally pure and thus break the pattern.)
The assembly looks like this; each iteration ought to run in constant time,
making the entire algorithm linear:
fib_gen: # @fib_gen
movl -8(%rdi), %eax
addl -4(%rdi), %eax
movl %eax, (%rdi)
retq
Unfortunately, the generator cannot encode the base case; we must do that at the
call site, which might look something like this:
void fib_gen_test ()
{
    unsigned fibs[100] = {1, 1};
    for (unsigned *it = fibs + 2; it < fibs + 100; it++)
        fib_gen(it);
}
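Since that test function throws its results away, here is a variant of mine (the function name is my own) that keeps the array and spot-checks a couple of entries:

```cpp
#include <cassert>

// The generator from the post.
void fib_gen (unsigned *a)
{
    a[0] = a[-1] + a[-2];
}

// Fill the array and verify the generator really produces the series.
void fib_gen_test_checked ()
{
    unsigned fibs[100] = {1, 1};    // base cases at the call site
    for (unsigned *it = fibs + 2; it < fibs + 100; it++)
        fib_gen(it);
    assert(fibs[2] == 2);
    assert(fibs[9] == 55);          // tenth number of the 1, 1, 2, ... series
}
```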
The syntax is a bit unwieldy compared with Haskell’s, but that’s the nature of
C; a C++ template function with a lambda could clean it up quite nicely, so we
could do something like this:
unsigned fibs[100];
generate(fibs, 100, {1, 1}, [](unsigned *a){ a[0] = a[-1] + a[-2]; });
(The implementation of generate is left as an exercise for the reader.)
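For the curious, one possible shape for generate is sketched below. This is my own design, not a standard facility: copy the seed (the base cases) to the front of the array, then invoke the generator on each remaining slot in order.

```cpp
#include <cstddef>
#include <initializer_list>

// A sketch of generate: seed the front of the buffer with the base
// cases, then call gen on each remaining slot, left to right, so gen
// may read the elements already written before it.
template <typename T, typename Gen>
void generate (T *buf, std::size_t n, std::initializer_list<T> seed, Gen gen)
{
    std::size_t i = 0;
    for (T s : seed)
        buf[i++] = s;       // base cases at the front
    for (; i < n; i++)
        gen(buf + i);       // gen may read buf[i - 1], buf[i - 2], ...
}
```

One wrinkle: with this signature the seed in the call above must be spelled {1u, 1u} rather than {1, 1}, since template deduction would otherwise see conflicting int and unsigned candidates for T.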
More importantly, however, this is done with near-zero overhead—certainly no
linked lists. Fibonacci numbers are a fairly trivial example, but in theory, any
task for which Haskell’s lazy lists, JavaScript’s or Python’s yield, or Rust’s
iterators are a good fit can be done this way with less overhead. For example,
here’s a (unsafe, not production-worthy) generator for prime numbers using the
Sieve of Eratosthenes:
// Requires initial input {1, 2} (yes, I know 1 isn't a prime number,
// but it might as well be: it doubles as a sentinel that stops the
// backward scan).
void primes_gen (unsigned *a)
{
    for (unsigned trial = a[-1] + 1; ; trial++)
    {
        bool success = true;
        // Count backwards through the list of already tested primes,
        // and test if the trial number is divisible by any of the
        // primes less than trial.
        for (int i = -2; a[i] > 1; i--)
        {
            if (trial % a[i] == 0)
            {
                // Is composite, try again.
                success = false;
                break;
            }
        }
        if (success)
        {
            a[0] = trial;
            return;
        }
    }
}
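As with fib_gen, the base cases sit at the call site. A self-contained harness of mine (the test function is my addition) fills a small array and spot-checks it; note the sentinel 1 at index 0, which is what terminates the generator's backward scan:

```cpp
#include <cassert>

// The sieve-style generator, repeated so this block stands alone.
void primes_gen (unsigned *a)
{
    for (unsigned trial = a[-1] + 1; ; trial++)
    {
        bool success = true;
        for (int i = -2; a[i] > 1; i--)   // stop at the sentinel 1
        {
            if (trial % a[i] == 0)
            {
                success = false;          // composite, try the next number
                break;
            }
        }
        if (success)
        {
            a[0] = trial;
            return;
        }
    }
}

// Expect the array to end up as 1 (sentinel), 2, 3, 5, 7, ..., 23.
void primes_gen_test ()
{
    unsigned primes[10] = {1, 2};
    for (unsigned *it = primes + 2; it < primes + 10; it++)
        primes_gen(it);
    assert(primes[2] == 3);
    assert(primes[9] == 23);
}
```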
This, in fact, cannot be expressed directly in any iterator implementation
(JavaScript, Python, Rust, or any other) that I’ve ever heard of: an iterator
yields one value at a time, whereas this generator needs random access to
everything it has produced so far. I believe Haskell lazy lists can do it,
albeit with more complexity (Haskell, being a pure functional language, cannot
iterate well).
Functional programming, and in particular functional array programming, is a
powerful tool for data processing (and all programming is fundamentally data
processing), but the languages that work well for high-efficiency computing
aren’t designed for such abstractions. Patterns like the generator can help C
and C++ programmers get the best of functional programming without sacrificing
the efficiency and control that those languages provide.
Godbolt link for all C code
Reddit discussion