Thursday, May 7, 2015

Log 050715

So combinator parsing in C++ turns out to be a bad idea, not only is it near impossible to get right most C++ implementations need several seconds for large files whereas the 'dumb' method usually finishes in milliseconds, needed to unlearn the functional style. Back to straight-forward recursive descent with switches on the current token.

I think I got how to write a far more trivial type checker than what I once started on. Straightforward Hindley-Milner with a subtyping relation on types implementing interfaces. It shouldn't be hard, I am also dropping all other ideas I had. Bottom-up HM+subtyping is good enough.

I've got a start of an AST description now going. Going to reference count everything through the use of C++ smart pointers; that's a bit overkill but it beats thinking about copying.

For downcasting it's going to be tag based switches followed by a static cast.

Everything is of course far more verbose than the bootstrap which I am using as a reference implementation.

So far so good. Making progress but this is probably still the easy part.

ADDENDUM: I thought I would share the AST solution I chose for people who might have ended up here by chance.

1) Define tags for your expressions:

typedef enum {
     VAR,
     ABSTRACTION,
     APPLICATION,
} expr_t;

2) Define a base object:

class Expr;
typedef std::shared_ptr<Expr> ExprPtr;

class Expr {
public:
     Expr(expr_t t): _tag(t) {};

     expr_t  tag() { return _tag; }

     virtual ExprPtr clone() = 0;
private:
     expr_t _tag;
}

3) Derive variants, for instance

class ExprApplication: public Expr {
public:
      ExprApplication(ExprPtr e0, ExprPtr e1): Expr(APPLICATION), _left(e0), _right(e1) {}
      
      ExprApplication(const ExprApplication& e) {
          _left = e._left;
          _right = e._right;
      }

      ExprPtr clone() {
          return ExprPtr(new ExprApplication(*this));
      }

      // etc.

private:
    ExprPtr _left;
    ExprPtr _right;
}


It took a lot of googling but the idiom for using shared pointers where you mostly only reference the base class is aptly described here.

Well, that's it. Don't create cycles, as long as your term is a DAG, most likely it will always be a tree, this scheme should work with RAII and might even allow for memory-compact solutions far better than simply copying everything.

No comments:

Post a Comment