Sunday, May 24, 2015


I implemented another pass but now have doubts about whether I am tackling the problem right. These doubt stem from switching from a declarative description to implementing the compiler impurely in C++ and the transform I implemented.

For the bootstrap compiler, I was forced to reason within a declarative pure framework. Because a compiler is bookkeeping plus term rewriting, I more or less naturally decided to shift as much of the work to term rewriting as I could because of the lack of imperative manners of bookkeeping. The compiler rewrites the term, hoists a number of local definitions to the top level, and the only minimal bookkeeping it does is keeping an environment table around to look up the definitions of applied identifiers, it furthermore uses an impure counter to create fresh variables.

This is a fine and robust manner when working declaratively. All information is neatly represented in the tree and each pass can be visually inspected; the environment can be, and is, constructed on the fly when needed.

But I wonder whether I will end up with a rather odd beast when implemented impurely. I could also implement it more imperatively and shift work back to bookkeeping. People from the C++ world will probably find it a very odd compiler that it doesn't do the bookkeeping directly by impurely creating tables with references around to the various constructs in the tree and rewriting the tree in-situ.

The second doubt is whether I got the current transform right. I really appreciate minimalism but in a flurry of code I shifted from a push model to a pull model as basis for rewriting the tree. I.e., in the bootstrap compiler a typical transform looks like this:

    [ decl_space po c id aa dd -> 
        (rr0, e) = transform_child transform_metintro bindings.empty e;
        (rr0, e) = add_bindings rr0 e;
    | part po c id im dd -> 
        (rr0, e) = transform_child transform_metintro bindings.empty e;
        (rr0, e) = add_bindings rr0 e;
    | decl_interface po c id aa t k -> 
        rr0 = reduce_method_constraint id t bindings.empty;
            (bindings.union rr rr0, e)
    | e -> (rr,e) ]

It handles some special cases and 'pushes' the rewriting process further by 'asking' the transform to rewrite a node and its children.

In C++ I now have rewrites like this:

class RewriteBind: public Rewrite {
    RewriteBind(): _state(AstEmpty().clone()) {

    // state manipulation
    void set_state(const AstPtr& a) {
        _state = a;

    // rewrites
    AstPtr rewrite_decl_type(const Position& p, const AstPtr& n, const AstPtr& t) override {
        auto t0 = rewrite(t);
        return AstDeclType(p, n, t0).clone();

    // cuts
    AstPtr rewrite_decl_definition(const Position& p, const AstPtr& n, const AstPtr& t, const AstPtr& e) override {
        return AstDeclDefinition(p, n, t, e).clone(); // cut

    AstPtr _state;

The transform rewrites everything and it is manually directed with various cut-offs where not to continue rewriting. The drawback is a lot of cuts, I somewhere prefer the original solution. I guess nobody looks at the source code of a compiler so I'll guess I'll stick to it. But doubts.

No comments:

Post a Comment