This showed up in my local bookstore

So I bought it. And there was at least one more face out copy, besides the one you see here.

And this is where I bought it:

Penn Books on the LIRR Level of Penn Station. Small store, but many shelves of SF, particularly if you know to look under the display table across from the main set of shelves. The staff are outstanding. I got the first recommendation for Vinge’s Rainbows (sic) End there. They also have an excellent classics and literature section. As in, when I mentioned that ‘March Up Country’ was (sort of) based on ‘The Anabasis’, they knew where it should be shelved, wondered why it wasn’t there, and offered to order it. Which might just be good salesmanship, except, after picking up my copy, I was looking in the same section for a copy of the Eddas, which they had, and spotted another copy of the Anabasis.

Buy books there, if you can.

Using the Standard C++ Library in a Functional style.

The standard C++ library offers a number of algorithms that have nearly exact analogues in functional languages, and are used in almost the same way. In particular std::accumulate and std::transform are powerful, and very general, algorithms. In functional programming literature, accumulate is usually referred to as fold, or foldl, and transform is known as map, or zipWith, depending on whether transform is the unary or binary function form.

To quickly review, accumulate takes an iterator range, and either sums the range, or applies a functor to each value together with the result of the last application of the functor. This can be thought of as putting the operation between each pair of elements in the range. When the operation is plus or times, the result is the sum or product of all of the elements in the list. This is useful, but not really exciting to most programmers. However, when you consider carefully what can be used as the functor argument, you realize that fold is a an extremely powerful algorithm.

A quick example:

std::list& cons(std::list& xs,int x)



return xs;


int init[] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10};

vector v(init, init + 10);

std::list empty;

std::list v2 = std::accumulate(v.begin(), v.end(), empty, cons);

> foldr (:) [] [1..10]

(Note, the lines of code starting with > are Haskell equivalents to the C++ code, and SHOULD work at a ghci Prelude> prompt. Since that’s where I’m cutting and pasting from. Although it’s sort of like literate haskell, the whole article probably won’t compile since early definitions may conflict with later ones.)

Cons is a list constructor function. It pushes its argument onto the end of the linked list, and then returns a reference to the list that the element was appended to. The fact that this function takes and returns a reference to a list is a compromise I’ve made between the C++ library and functional purity. Really, I should be taking and returning lists by value. Except that the overhead is unconscionable, turning an O(n) operation into an O(n^2) operation. Since there is no way of accessing the intermediate state, in this case, I’m safe, but you need to be careful in this analysis. Using cons() in your own code will modify the value, and you lose the referential transparency we’re trying to maintain.

So the application of cons() using accumulate against an empty list copies the list into a new list. If we had supplied a non-empty list, we would have appended the lists together. Interesting, for an algorithm that’s usually considered a purely numerical one.

Functional languages usually work with singly linked lists. A singly linked list has an append at the head only, and therefore the tail remains unchanged, and any references to the tail can not tell that there is a new list that uses it. In fact, a list might be the tail of several other lists, and there is no way of telling the difference. It would be possible to create a version of list that is singly linked, where cons returned by value, and in that case, two lists could share a tail, but have distinct heads. So long as the list held only immutable values (and properly, all values are immutable), you would be completely safe.

Transform, or map, as it’s know in FP circles, is another fundamental algorithm. Transform takes an iterator range and applies a functor to each element in the range and copies the result into a destination output iterator, or, in its second form, takes two ranges and a binary operator, and outputs the result of pair -wise evaluation of the elements in each range, through the output iterator.

int func(int x)


return x*x + x + 1;


// …

std::transform(v.begin(), v.end(), v2.begin(), func);

> let func x = x*x + x + 1

> map func [1 .. 10]

Functions and function objects as first-class entities

As you can see, the core STL algorithms are fairly weak by themselves. That is, they don’t really do anything by themselves. What they do is capture conventional patterns of function applications to collections. The real power is in the functions themselves. However, C++ functions are fairly limited. Let’s say you wanted to write a function that added a constant K to its argument. One way is to write int adder(int x, int K); Unfortunately, it’s rather difficult to apply this function with the same constant to every element in a collection.

{Actually, in a few moments we’ll see how we can do exactly that almost trivially, but let’s hold on to that thought.}

Function objects, aka functors in the OO community (the name means something _entirely_ different in functional programming) , give us a way of extending what functions can do by letting us associate some additional information with the function call. We do this by creating an object with an operator()() method. That lets the object be used where a function is called for, and also to supply the additional information as part of the object the operator is part of.

class adder


int K;


adder(int k) : K(k)


int operator()(int i) const

{return K+i;}


Then we can do something like:

std::transform(v.begin(), v.end(), v3.begin(), adder(5));

> let adder x = (x +)

> let adder5 = adder 5

> map adder5 [1 .. 10]

But, as you can see, that’s a lot of work for a small function that we’re probably only going to use in one place.

Still, it’s nice to see that we now have an entity that can be used just like a function, but is also a first class entity in the C++ language.

Lambda abstraction and Partial Application

So, how can the adder example be made more convenient. Something that you might actually use?
How about

std::transform(v.begin(), v.end(), v4.begin(), _1 + 5);

> map (x -> x + 5) [1 .. 10]

> — or, even cooler

> map ( + 5) [1 .. 10]

This is using the Boost Lambda library to constructed a function object in place. One that takes a single parameter, and returns that plus 5. That _1 is what as known as a placeholder. Its type is boost::lambda::placeholder1_type. (And, yes, the leading underscore followed by a number is a perfectly legal variable name.) The functor acts in this context as though it has an operator() with a signature like int operator()(int arg1), and when transform applies it, the output is the same as our earlier example using adder. Except we didn’t have create a class just for this single use.

The reason I say that it acts as though it has that signature in this context, is that the actual signature is much more complicated, involving quite a bit of template metaprogramming in order to deduce the correct signature given the actual types of the arguments. If, v, for example, was a vector of doubles, the lambda expression would need to calculate that the return type is also a double, since the type of (double + int) is double. And for non-builtin types, it can be much more complicated, particularly if the lambda expression has more than one free variable.

For example:

k = (_1 * _2)(i,j);

> let k = x y -> x * y

That lambda expression can be applied to any pair of arguments that has an operator*() defined for it, and then as long as k has an assignment operator that can accept the result of that operator*(), everything will just work. However, that flexibility comes with a cost. The actual type of (_1 *_2) is insanely complicated. With the version of boost that I have, with gcc 4.0.2, the type is:

const boost::lambda::lambda_functor, boost::tuples::tuple >, boost::lambda::lambda_functor >, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type> > >

Not a variable or member you would like to declare. Fortunately, it works together with boost::function. You can assign that functor to a boost::function for later use. e.g.

boost::function a = (_1 * _2);

boost::function b = (_1 * _2);

This is also one of the powerful aspects of ML and Haskell, and most modern functional languages. They are strictly typed, but the types can be inferred by the compiler. So the programmer is, for the most part, freed from having to annotate the types of everything. But the compiler will still give you an error if you say something that is really impossible. You get most of the benefits of duck-typing, without having to worry that there is some case that you haven’t covered in tests that will result in ‘Does Not Understand’.

The boost::lambda library even supports currying, although with a syntax that is a little hard to work with:

boost::function c = bind(protect(_1 * _2), i, _1);

The protect call protects the evaluation of the lambda expression from the bind operation. The bind operation attaches an argument to an expression to be used when the expression is evaluated later. In this case, one of the arguments is a lambda placeholder, so that the result is equivalent to the lambda expression (i * _1).

The documentation for boost::lambda is quite good, and is worth extensive study. So I won’t repeat it. Instead, let’s focus on ways of using it that you might not have thought of otherwise.

Pipes and Filters Architecture

Pipes and Filters is one of the fundamental architectural patterns identified by Mary Shaw. On a Unix command line, this is expressed with | (pipe), < (redirect stdin), > (redirect stdout), and tools like tee and xargs. In a functional programing system, piping the output of one function to the input of another is via function composition, that is with functions f and g, you compute f(g(x)). This is also sometimes written as f . g. read as f compose g.

Doing this can also be very useful in complex template programs, where the intermediate type is not known, or difficult to express, but when passed directly to another templated function, the compiler computes the type for us.

The goal here is to take a sequence of functions, compose the entire sequence, and then apply that composed function to a sequence of values. Where we end up is












The interesting part is the final stanza. What are foldl1 and composer? Lets work our way outward. The composer is a functor that takes two boost::functions and returns a boost::function that composes them. It assumes that each of the functions has the same signature, which in this case is reasonable, since they’ve come out of a single sequence. It binds together the two functions and a lambda placeholder, then converts that to a new boost::function.

class composer





operator()(boost::function f,

boost::function g)


using boost::lambda::_1;

return bind(f, bind(g, _1));



Constructing a version of composer that takes functions with different signatures and generates the correct return type is a little more complicated. Take a look at the example code. The trick is to use a nested struct that does the type computation based on the template arguments of the member function. A conventional template metaprogramming approach, but quite strange the first time you come across it.

The next part is the function foldl1. The unusual name is borrowed from Haskell. Once you’ve learned C++, Java, and Ruby, Haskell is the next language you should learn. Foldl1 is an abbreviation for fold left with one argument. It’s very similar to std::accumulate, with one key difference. Instead of starting with a base value, the first value in the sequence is used. This means that it can not be applied to an empty list, but it does avoid having to come up with an identity value, such as 0 for addition, or 1 for multiplication. We could use the function identity(x){return x;}, but that’s an extra function call we don’t need, and an extra argument to constructing the composed function. And, not all sets have an identity.


typename BinOp>

typename Iter::value_type

foldl1(Iter first,

Iter last,

BinOp oper)


typename Iter::value_type n

= *first++;

return std::accumulate(first,





So, assuming that fns is, say, a std::vector >, then it’s just a matter of pushing the elements of the sequence v through the composed function, and streaming them out through the ostream_iterator.

What is currying? an aha! moment

Since it really is an aha! moment, the best I can do is tell you what led me to it, and hope that helps.

There is a difference between partial application and currying that most experts ignore, because they already understand, but they lie in wait ready to tell you that you have it all wrong.

And you won’t understand because the concepts are >thatI’ll start with Haskell syntax for the type of a function that takes two integers and returns one.

Integer -> Integer -> Integer

in C that would be

int (*T)(int , int )

Now. what about a function that takes an int and returns a function
that takes an int and returns an int?

(using cdecl syntax)
declare T as function (int) returning pointer to function (int) returning int

int (*T(int ))(int )

or, in Haskell
Integer -> Integer -> Integer


Shouldn’t those have different types in Haskell? They look really much more different in C !

I can’t tell the difference in Haskell between a function that takes two ints and returns an int, and a function that takes one int and returns a function that takes one int and returns an int.




a -> b -> c

a function that takes an a and returns a function that maps b to c, or a function that takes an a and a b and returns a c. They are entirely equivalent.

Functional Programming in C++ Part 1


C++ is known to be a multi-paradigmed language. This is often construed
to mean you can program in both a procedural and and object oriented
style. This is too limiting a view. C++, particularly with modern
library support, is more than capable of supporting programming in
the functional style. Even without modern libraries like boost, or
std::tr1, the Standard Template Library embodies many functional

It also turns out that the template system in C++ defines a
metaprogramming language that is a pure, non-strict, lazy, functional
language. Of course, since this was not by design, it is not a
particularly elegant language, as anyone who has had to do template
metaprogramming can attest. For right now, though, I’m going to stay
away from metaprogramming, and look just at what can be done in the
language with the facilities provided.

What is functional programming

At the base, it is performing computations by the pure application of
functions, with no side-effects. Of course C has allowed this since
its inception, and it is never considered a functional language.
There is a lot of debate, and at least a little acrimony, in
determining if a language is functional. There are certainly many
claims of ‘my language is more functional that yours.’

Languages that are widely regarded as pure functional languages are Haskell and
ML. Haskell is also getting a lot of alpha geek buzz, for a number of
reasons, and has finally hit the stage where people are learning it
in order to work on something other than a Haskell compiler.

The flavor of ML most commonly encountered is O’caml. The optimizing
compilers for the language are quite powerful, and programs often
compete, and beat, equivalent C programs in efficiency and

The LISP family has deep functional roots, with Scheme being the version
that focuses the most on functional purity, as well as simplicity and
ease of teaching. Common Lisp has a much broader range of libraries
and facilties defined for it and standardised for it.

JavaScript, Ruby and Python also all have a strong functional component to them.

Examining these languages leads to the observation that the real keys seem to

  1. Functions as first class objects
  2. Referential transparency of expressions
  3. Partial application, a.k.a currying.
  4. Unnamed functions, a.k.a lambda expressions

All of these features have been implemented in C++, in the language
itself, in the Standard Library, in TR1, in Boost, or in custom
libraries such as FC++. This article will examine these in more
depth, and show how they can be profitably incorporated into your
existing practice.

First let’s examine what the four cornerstones of functional programming
really mean.

Functions as first class objects means that functions can be used in the same
way as all other entities in the language. In particular, that they
can be passed as values to functions, returned as values from
functions, and new ones can be created at runtime. In C, function
pointers can be passed and returned as values, but creating new
functions is not (portably) possible. In C++, however, we can create
objects that overload operator(), that satisfy all of the desired
properties. Function pointers can also be treated much more generally
with std::tr1::function objects.

Referential transparency of expressions means that an expression, or any of its
sub-expressions can be replaced by the value of that expression or
sub-expression without changing the behavior of the program as a
whole. A computation will always return the same result, and will
only depend on the value of its arguments. This implies that there
are no side-effects to evaluating an expression. No globals that can
be changed, producing non-local changes elsewhere. This property
gives the programmer, and the compiler and runtime, strong
capabilities to reason about the program, optimize it, and
demonstrate its correctness.

Partial application is one of the most unusual, and most powerful, features
of functional programming. This is also known as currying, after
Haskell Curry, a mathematician in the mid 20th Century who
did some of the fundamental work in combinatory logic and the lambda
calculus. Currying takes a function of N parameters and turns it into
a function of N-1 parameters by fixing the value of one of the
formers arguments. Or, another way of looking at it, is to say that a
function that takes N arguments is equivalent to a chain of N
functions, each taking 1 argument and returning a function that
consumes the next argument. Haskell Curry invented it as a way of
talking about multi-variate functions in a theory of functions of one
variable. Well, Shoenfinkel invented it first, but shoenfinkling
never really caught on.

Unnamed functions don’t sound very useful. However, combined with
higher-order functions, that is functions that take and return
functions as parameters, they become very powerful, and quite useful.
If you’ve ever wanted to supply a simple function inline in a one of
the STL algorithms, but were frustrated with the syntax:





you’ve wanted a better way of writing lambda expressions. Lambda calculus is one of the fundamental theories of computation, and while I’m temptedto start a lengthy exposition, I think a few of quick examples willgive you the flavor.


\x -> x*x + x + 1


(lambda (k) (+(+(* k k) k) 1) )

(Boost.Lambda (c++))

_1 * _1 + _1 + 1

The key is that you can provide a definition of a function and then bind
it to a variable, rather than a fixed name. Ideally with nearly the
same syntax as you would use if you did name the function. Functional
languages are often defined in terms of lambda expressions and the
lambda calculus, with syntactic sugar sprinkled on top to make it
more palatable. I’m using syntactic sugar in the techincal sense of a
well defined transformation from one syntactic form to a more
primitive syntactic form. The thing to keep in mind is that too much
syntactic sugar causes truth decay.

Objects vs Values

Just like an int” has been a mainstay of C++ programming since the very
beginning. The phrase indicates that you can create a new type with
all of the capabilities of a built-in type. What many programmers
don’t realize is that the behavior of ints is a polar opposite of the
behavior expected of objects. That is because ints are quintessential
value types. And whereas an object is an entity with state and
identity, values have neither. This 7 here is indistinguishable from
that 7 over there. And 7 never changes (unless you;re using an old
version of FORTRAN) into 8. You evaluate an expression such as 7+1,
and produce an 8, but the original 7 is unaffected. Objects are
different. This object is always distinct from that object, even if
they have the same state. You can change Bob, and everyone who has a
reference to Bob sees the change also. Unfortunately, the mechanism
by which you produce new value types in C++ is exactly the same as
how you produce new object types. You declare a new class.

What you permit and accept by default with that class determines whether
instances of the class behave like values or like objects. The key
pieces of machinery in the language are familiar to all C++
programmers: default construction, copy construction, copy assignment
and destruction. The compiler will provide ones that by default act
(mostly) like a value. Mostly, because of the shallow vs deep issue,
and the fact that pointers are values, even if what they point to
isn’t. So what all the compiler instantiated methods do is treat each
member of the class as a value, and then create, copy, assign or
destroy it, in the order they are declared. Unfortunately, for the
programmer concerned with efficiency, and almost all C++ programmers
are, these copies are, or can be very expensive.

Objects, on the other hand, typically either do not accept the default
implementations, or they avoid invoking them. Usually when you say
object2 = object1, you want them to continue to refer to the same
entity. In fact, assignment is fraught with danger, with potential
slicing when assigning a derived to a base. Programmers will either
use a Handle/Body or pimpl_ pattern to separate the implementation
side of the class from the reference side of it, or use a smart
pointer such as std::tr1::shared_ptr. In the evil past, some
programmers would attempt to use raw pointers, and manage the
allocated resources by hand. In practice, it turns out that no one is
smart enough to do so successfully in any program of interesting
size. Or at the very least, if the original programmer is that smart,
the next person to attempt to enhance the system, or fix a bug, is
not that smart.

Lazy functional programming languages, such as Haskell, avoid the price of
copying values by only evaluating an expression when it is needed.
This is guaranteed to be correct because of referential transparency.
Although it can lead to puzzling runtime behavior. I’ve coded some
benchmarks which seem to take practically no time, because it turns
out that the calculation is never fully evaluated. This takes not
paying for what you don’t use to the n-th degree. This call-by-need
behavior is why they are characterized as lazy.

Getting this behavior in C++ is tricky, but far from unheard of. One
technique, used in numerical libraries such as Blitz++, and pioneered
by Todd Veldhuizen, is known as expression templates.

The problem expression templates were introduced to solve is the creation
of unneeded temporaries. Take for example the code

Array A,B,C,D;

… stuff

D = A * B + C

Where each of A, B, C and D are two-d arrays, with the usual definitions
for multiplication and addition. The straightforward implementation
of operators *() and +() lead to the creation and destruction of
temporaries, with associated memory allocation and destruction. e.g.

Array operator *(const Array& lhs, const Array& rhs);

has to create and return a new array, which might be quite large, only to
have it used temporarily in Array operator +(const Array&, const
Array&) and then discarded. And even once the expression on the
right hand side of the assignment statement is evaluated, it has to
be passed into the copy assignment operator and very likely discarded
afterwards. (note that this is not an initialization, so that
permitted optimization does not apply).

Template expressions change the operators to return something else, an
expression that encapsulates the arguments, and evaluates them only
when necessary.

ArrayExpression operator +(const Array&, const Array&amp;);

Where ArrayExpression looks something like

ArrayExpression {

const Array& lhs;

const Array& rhs;

more stuff

double operator(int,int); //compute the value at the point


Now, for lazy languages, like Haskell, this can be built into the language
itself, and in this case the concept is called a ‘thunk’. Because by
the time you need to think about it, the thought has already been
thunk. Yes, it’s a really awful joke. You have to remember, we work
in an industry where an extremely popular language is named after an
obscure, although influential, comedy troupe. Other industries do not
consider understanding why Norwegian Blue Parrots are funny a job
qualification. Nor do most of our managers, unless they came up
through the ranks themselves.

Referential transparency makes thunks extremely powerful because not only can
they be passed around instead of fully evaluated values, once they
are fully evaluated, the system can remember that value and use it
from then on. This is very much like common sub-expression
elimination, only it is a runtime feature, as well as a compile time

Monads, REST and C++ Template Metaprogramming

OK, with that title, I’m sure to please almost no one. If you want to know how to do REST-ful programming in Haskell, or tie REST to C++, move along, there’s nothing to see here. The connection isn’t at the implementation level. Which is the whole point.
So what do REST, Monads and (successful) Template Metaprogramming have in common?
Strict maintenance of levels of abstraction, and extremely narrow interfaces between those levels.
What does that get you?
Extremely (and possibly maximally) reusable abstractions (code).
What triggered the realization for me was a paper on Arrows by Hughes, which asked the question ‘Why are Monads so reusable when the have such a small interface’, although not in those exact words. It dawned on me that they are so reusable exactly because they have such a narrow, but still very well defined, interface. They can do that because Monads are all about abstracting out features. A particular Monad will delagate all the real work down to the type it is being run over. Its real job is constraining the ways the values of the types it is a Monad over can be combined.
The analogue I have in mind for C++ MP is the swap() method. In a C++ template, we can assume that we can call swap on two values, that we otherwise know nothing about, and that the two values will be exchanged in a way that does not throw any exceptions, and is the best way of exchanging two values of that type. The abstract notion of swapping two values is delegated to the concrete implementation of the type being templated over.
A REST-ful web application exposes PUT, GET, POST, and DELETE on resources named by a URI. Nominally not much to have a really rich experience. OTOH, Google Map mashups contradict that.
It’s not the richness of the interface of the metalevel, it’s the richness allowed at level below. The narrow, but well defined, interface, where that interface delegates to a lower level abstraction, is exactly what enables composeability.
Monads, being firmly grounded in Functors in Category Theory, take this to an extreme level of rigor. There are laws that must be satisfied for a thing to be called a Monad, and those laws ensure that other Monads can interoperate, and achieve the Principle of Least Surprise.
I’m not intending this to be Yet Another Monad Tutorial. So I won’t go too deep here. But monad transformers show a degree of generic reusability that is awfully hard to achieve with C++ templates, even though C++ templates have similar expressive power (with a horribly clumsy syntax).
Monads are functions (functors) that take types to types, and so correspond to mpl metafunctions. But they preserve the underlying structure on the types they operate over. In particular, composition is preserved. So if I have an function of type a -> M b, and a function b -> c, I get a working a -> M c for free, just by using fmap. In C++ it would be like
template foo(A)
C bar(B)
getting you to
template foo(A) by saying template fmap(a,b), or even better just fmap(a,b) and letting the compiler figure out types.
(where foo is something like
template foo(A a);)
That looks so impossible in C++, that most C++ programmers have no intuition about it.
But every time you wish that you didn’t have to write some type that the compiler already knows, like the iterator or value_type or return_value, particularly when they get really messy, you’re looking for what Monads, or at least what Hindley-Milner type inference, provide.

Types and Programming Languages: Chapter 4

I’m working through Types and Programming Languages, by Benjamin Pierce.

I’m up to somewhere around chapter 13, References, but it’s starting not to make sense. Which means it’s time to back up and do more of the work, instead of just nodding as though I really understand it.

One of the things he does is build typecheckers for the languages he describes, in the language ocaml, or Objective Caml, a version of the language ML. For a variety of reasons, I’m trying to implement in Haskell. Mostly to give me something concrete to work on that isn’t too large in scope to learn the language a little bit. aSomething more than ‘Hello, World!’, strip_spaces, or traverse a linked list, but less than a ‘real application’.

For those not familiar, a typechecker is somewhere between a compiler and a grqammar driven parser. A type checker inspects what you give it for type correctness and consitency. It makes sure that you don’t assign a Foo to a Bar, unless there’s a rule in the typesystem that says you can. It may, in the process of typechecking, do some steps that a compiler also does, like reduce expressions, but it does this in the interest of determining what the type, not the value, is. Of course, if I were writing a compiler, it would make sense not to throw away that information, and do a bit of both at once.

That does lead me to a bit of a sub-rant. The first step in the process I’m working on is parsing the textual expression. Which means using a parsing library. (Pierce does, so it isn’t cheating) Haskell has two; happy, a yacc analogue, and parsec, a ‘monadic parser combinator’ library. Since the point of doing this in Haskell is to get a better idea what phrases like ‘monadic parser combinator’ libraries mean, I was a bit biased towards Parsec. I already know and loathe yacc.

So I start in on the documentation. At least it has some. That’s a nice benfit of the fact that Haskell grew up in the acadamic community. They need to publish or perish, and the publication serves as documentation. Somewhat, sort of. Although Parsec doesn’t really suffer in that respect. The docs are pretty clear. They just suffer from the same problem that all parser lib docs do. They want to show that you can implement an entire compiler inside the parser.

And that’s usually a bad idea.

The docs show you how you can attach interesting semantic actions to events in the parser, like evaluating the expression that’s currently being parsed. However, in practice, that’s hardly ever what you want to do. You want the parse to return something that abstracts the textual expression into a data structure, usually some kind of abstract syntax tree, where the nodes in the tree reflect elements in the grammar. Then you can write other stuff that accepts the AST and processes it in interesting ways, like compiling or typechecking it.

That’s certainly what Pierce’s code does in ML. And I’m trying to avoid being too inventive.

In any case, it turned out to be pretty trivial to return an AST from a Parsec parser, and in fact, all the examples of real parsers that come with Parsec take that approach. Which gave me some comfort about being on the right track.

Now, the arith language that we’re starting with is pretty primitive. It has booleans and non-negative integers, aka natural numbers . And the latter are all of the form successor 0′ or ‘successor successor 0’, meaning 1 and 2, respectively. Not a real language, but a place to start. Complicated enough that a few theorems could be proved non-trivially, but not so complicated you couldn’t easily work everything by hand.

The language’s syntax can be described with the following grammar

t ::=
if t then t else t
succ t
pred t
isZero t

This tranlates to the Haskell datatype ArithExpr:

data ArithExpr
= TmTrue
| TmFalse
| TmIfExpr ArithExpr ArithExpr ArithExpr
| TmZero
| TmSucc ArithExpr
| TmPred ArithExpr
| TmIsZero ArithExpr
deriving (Show, Eq)

Note that’s almost a mechanical transliteration, and that’s exactly what I’m aiming for. I’m a firm believer in the rule that there are two kinds of source code, that which obviously has no defects, and that which has no obvious defects.

So now we need a parser that will return those terms. In Parsec, that looks like this:

arithExpr :: Parser ArithExpr
arithExpr =
parens arithExpr

So the arithExpr is a parser of ArithExpr, and the parser returns either a trueExpression, falseExpression, ifExpression, etc.

Those look like:

trueExpression =
do{ reserved "true"
; return TmTrue

falseExpression =
do{ reserved "false"
; return TmFalse

ifExpression :: Parser ArithExpr
ifExpression =
do{ reserved "if"
; condition <- arithExpr
; reserved "then"
; thenClause <- arithExpr
; reserved "else"
; elseClause <- arithExpr
; return (TmIfExpr condition thenClause elseClause)

Hopefully, even if the Haskell syntax is unusual and unfamiliar, the intent is pretty clear. true, false, if, then, else are parsed as reserved words, and true and false just return a TmTrue and TmFalse respectively. The ifExpression is a bit more interesting. It parses if, then it looks for an arithExpression, a then followed by another arithExpr, and finally an else followed by a third arithExpr, and returns the three of them wrapped up in a TmIfExpr.

So with those out of the way, we can look at how to evaluate the expression returned by the parser, testing them against a set of evaluation rules. For this small language, there are only a few, in two sets.

For Booleans we have

t ::=
if t then t else t


v ::=

evaluation rules

if true then t2 else t3 -> t2

if false then t2 else t3 -> t3

t1 -> t1'
if t1 then t2 else t3 -> if t1' then t2 else t3

Then arithmetic is added (the terms from above are elided)

Arithmetic Expressions

new terms

t ::= ...
succ t
pred t
iszero t

new values

v ::= ...

nv ::=
succ nv

new evaluation rules

t1 -> t1'
succ t1 -> succ t1'

pred 0 -> 0

pred (succ nv1) -> nv1

t1 -> t1'
pred t1 -> pred t1'

iszero 0 -> true

iszero (succ nv1) -> false

t1 -> t1'
iszero t1 -> iszero t1'

So now we want to transliterate those evaluation rules into a single step evaluator. Now, Pierce writes his in such a way that it throws an exception when no rule matches. I haven’t figured out exceptions in Haskell, and in any case he does note in a footnote that they really aren’t considered good style in ML in the way that he uses them, to terminate a recursive functions.

Instead, I’m choosing to represent the eval function as possibly returning a value, and in Haskell, that’s returning a Maybe. So the signature of my eval1 is

eval1 :: ArithExpr -> Maybe ArithExpr

and the evaluation rules can be written like so

eval1 (TmIfExpr TmTrue t _) = Just t

eval1 (TmIfExpr TmFalse _ t) = Just t

eval1 (TmIfExpr t1 t2 t3) =
let t1' = eval1 t1
in case t1' of
{ Just t1'' -> Just $ TmIfExpr t1'' t2 t3
; Nothing -> Nothing

That is, if we’re eval’ing an if expression, and the first clause is true or false, we can reduce it to either the then clause or the else clause immediately. On the other hand, if it doesn’t match those, we can instead evaluate the first term, and return an if expression with the first term evaluated. If the first term doesn’t evaluate, then we return nothing. I’m taking advantage of pattern matching in Haskell to select the right function based on the details of the argument supplied. They all take a single argument, but I’m asking to distinguish what constructor was used to create that argument. The ‘_’ character means that I don’t are what the type or value of that part of the argument is, match anything.

This is the way I wrote it at first, at least. I ran this by the haskell-cafe mailling list, and recieved some suggestions that I wasn’t taking good advantage of Maybe being a monad, and that it might make more sense to do the sub-eval in a do block. In particular the advice in Nomaware’s All About Monads tutorial is exactly on point. Those Nothings and Justs don’t need to be there.

I’ll be reworking the code to adopt those suggestions before going much further.

To do the full evaluation, we run the eval1 until we can’t anymore. That’s

eval t =
let t' = eval1 t
in case t' of
{ Just t'' -> eval t''
; Nothing -> t

Here's all the code so far:

module ArithParser ( parseArith
, parseArithFromFile
, arithExpr
, ParseError
) where

import Char
import Monad
import Arith

-- Parsec
import Text.ParserCombinators.Parsec
import Text.ParserCombinators.Parsec.Expr
import qualified Text.ParserCombinators.Parsec.Token as P
import Text.ParserCombinators.Parsec.Language (haskellStyle)

parseArithFromFile :: String -> IO (Either ParseError ArithExpr)
parseArithFromFile fname =
parseFromFile arithExpr fname

parseArith sourceName source =
parse arithExpr sourceName source

arithExpr :: Parser ArithExpr
arithExpr =
<|> ifExpression
<|> zeroExpression
<|> succExpression
<|> predExpression
<|> isZeroExpression
<|> parens arithExpr

trueExpression =
do{ reserved "true"
; return TmTrue

falseExpression =
do{ reserved "false"
; return TmFalse

zeroExpression :: Parser ArithExpr
zeroExpression =
do{ reserved "0"
; return TmZero

ifExpression :: Parser ArithExpr
ifExpression =
do{ reserved "if"
; condition <- arithExpr
; reserved "then"
; thenClause <- arithExpr
; reserved "else"
; elseClause <- arithExpr
; return (TmIfExpr condition thenClause elseClause)

succExpression =
do{ reserved "succ"
; expr <- arithExpr
; return (TmSucc expr)

predExpression =
do{ reserved "pred"
; expr <- arithExpr
; return (TmPred expr)

isZeroExpression =
do{ reserved "iszero"
; expr <- arithExpr
; return (TmIsZero expr)

-- Tokens
-- Use qualified import to have token parsers on toplevel
tokenParser = P.makeTokenParser haskellStyle

parens = P.parens tokenParser
braces = P.braces tokenParser
semiSep1 = P.semiSep1 tokenParser
whiteSpace = P.whiteSpace tokenParser
symbol = P.symbol tokenParser
identifier = P.identifier tokenParser
reserved = P.reserved tokenParser
natural = P.natural tokenParser
charLiteral = P.charLiteral tokenParser
stringLiteral = P.stringLiteral tokenParser


module Arith where

data ArithExpr
= TmTrue
| TmFalse
| TmIfExpr ArithExpr ArithExpr ArithExpr
| TmZero
| TmSucc ArithExpr
| TmPred ArithExpr
| TmIsZero ArithExpr
deriving (Show, Eq)

isNumericalVal :: ArithExpr -> Bool
isNumericalVal TmZero = True
isNumericalVal (TmSucc t) = isNumericalVal t
isNumericalVal (TmPred t) = isNumericalVal t
isNumericalVal _ = False

isVal :: ArithExpr -> Bool
isVal TmTrue = True
isVal TmFalse = True
isVal t
| isNumericalVal t = True
| not (isNumericalVal t) = False
isVal _ = False

eval1 :: ArithExpr -> Maybe ArithExpr

eval1 (TmIfExpr TmTrue t _) = Just t

eval1 (TmIfExpr TmFalse _ t) = Just t

eval1 (TmIfExpr t1 t2 t3) =
let t1' = eval1 t1
in case t1' of
{ Just t1'' -> Just $ TmIfExpr t1'' t2 t3
; Nothing -> Nothing --Just $ TmIfExpr t1 t2 t3

eval1 (TmSucc t) =
let t' = eval1 t
in case t' of
{ Just t'' -> Just $ TmSucc t''
; Nothing -> Nothing --Just $ TmSucc t

eval1 (TmPred TmZero) = Just TmZero

eval1 (TmPred (TmSucc t))
| isNumericalVal t = Just t

eval1 (TmPred t) =
let t' = eval1 t
in case t' of
{ Just t'' -> Just $ TmPred t''
; Nothing -> Nothing -- Just $ TmPred t

eval1 (TmIsZero TmZero) = Just TmTrue

eval1 (TmIsZero (TmSucc t))
| isNumericalVal t = Just TmFalse

eval1 (TmIsZero t) =
let t' = eval1 t
in case t' of
{ Just t'' -> Just $ TmIsZero t''
; Nothing -> Nothing -- Just $ TmIsZero t

eval1 _ = Nothing

eval t =
let t' = eval1 t
in case t' of
{ Just t'' -> eval t'' --if (t /= t'') then eval t'' else t
; Nothing -> t

Solaris network install using Linux DHCP server

This weekend’s tech project was getting an old Sun Ultra 5 up and running with a new version of Solaris, in this case Solaris Nevada b33, so I can play with toys like opensolaris, dtrace, zfs,etc.

This particular machine doesn’t have a cdrom, so in order to get things working I had to do a network install. Or I could have installed a cdrom, since it’s an IDE based machine, but that wouldn’t have been nearly as much fun.

I hosted the install on a Netra T1, so most of the installation instructions were just cut-and-paste from the Sun documentation. Solaris 10 Installation Guide: Network-Based Installations

(Note: The T1 will eventually be providing network services, and live in the basement. It’s a little loud sitting on my desk. That’s why it’s not going to be the Sparc play machine.)

The complicated part was the DHCP server. I already have one on my network, on a Debian Linux box, and didn’t want to try having two of them. That would probably be bad.

In order to supply all of the information for the install, I needed to add a new name space to the dhcp.conf, a class and subclass for the particular hardware type, and some information specific to the machine.

Here’s the relevant pieces from dhcpd.conf:
First the SUNW option namespace, used by the Sun net installation:

option space SUNW;
option SUNW.SrootOpt code 1 = text;
option SUNW.SrootIP4 code 2 = ip-address;
option SUNW.SrootNM code 3 = text;
option SUNW.SrootPTH code 4 = text;
option SUNW.SswapIP4 code 5 = ip-address;
option SUNW.SswapPTH code 6 = text;
option SUNW.SbootFIL code 7 = text;
option SUNW.Stz code 8 = text;
option SUNW.SbootRS code 9 = integer 16;
option SUNW.SinstIP4 code 10 = ip-address;
option SUNW.SinstNM code 11 = text;
option SUNW.SinstPTH code 12 = text;
option SUNW.SsysidCF code 13 = text;
option SUNW.SjumpsCF code 14 = text;
option SUNW.Sterm code 15 = text;
option SUNW.SbootURI code 16 = text;
option SUNW.SHHTPProxy code 17 = text;

Then the class and subclass based on the vendor-class-identifier, which is sent out by the Ultra 5 when it’s trying to DHCP an address.

class "vendor-classes" {
match option vendor-class-identifier;

subclass "vendor-classes" "SUNW.Ultra-5_10" {
vendor-option-space SUNW;
option SUNW.SbootURI = "tftp:// ";
option SUNW.SinstIP4;
option SUNW.SinstNM = "heimdall";
option SUNW.SinstPTH = "/export/solaris11/install";
option SUNW.SrootIP4;
option SUNW.SrootNM = "heimdall";
option SUNW.SrootPTH = "/export/solaris11/install/Solaris_11/Tools/Boot" ;

Then, the particular information for the machine I’m trying to boot and install from the net:

host chimera {
hardware ethernet 08:00:20:a2:22:66;
option domain-name "";
option host-name "chimera";

The machine I’m installing onto is chimera, which has the MAC address 08:00:20:a2:22:66. It will get the address The install and boot server are both heimdall, which had IP addresses respectively. The ‘next-server’ tells chimera to netboot from heimdall. I’m calling that out in particular because I wasted about an hour figuring out that I needed that.

Once all that was done, it was a ‘simple’ matter of running boot net:dhcp – install from the openboot OK prompt.

The machine isn’t exactly a screamer by today’s standards, it has a 333Mhz UltraSparc IIi chip in it, but it does have 512Mb of RAM, which covers a multitude of sins. I think I may start over with a larger HD, since the 7G drive that’s in there now doesn’t leave much room for experimentation. I’ll probably go ahead and put a DVDRW drive in there too, even though I don’t need it now.