Friday, May 22, 2015

From Toy to Tool

The first transform I implemented is also the simplest. I've no idea how I got here but, after all the fooling around, I now find myself implementing a formal spec: the bootstrap compiler.

A compiler is bookkeeping plus term rewriting. Everything boils down to how you're going to treat the information flow while rewriting a tree.


              ↓:T | ↑:T
             APPLICATION
        ↙:T / ↗:T   ↘:T \ ↖:T
       TERM0             TERM1
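
A minimal sketch of that flow, in OCaml rather than in the bootstrap's own language: on an application node an inherited attribute travels down into both subterms and a synthesized attribute travels back up alongside the rewritten term. The term type, the environment, and the set of names are all made up here for illustration.

    (* inherited attribute: a renaming environment flowing down;
       synthesized attribute: the set of names seen, flowing up *)
    type term =
      | Var of string
      | App of term * term

    module Env   = Map.Make (String)
    module Names = Set.Make (String)

    let rec rewrite (inh : string Env.t) (t : term) : term * Names.t =
      match t with
      | Var x ->
          (* bookkeeping: consult the inherited environment *)
          let t' = match Env.find_opt x inh with
            | Some y -> Var y
            | None   -> Var x
          in
          (t', Names.singleton x)
      | App (t0, t1) ->
          (* the same inherited attribute flows into both subterms,
             and the synthesized attributes are merged on the way up *)
          let t0', syn0 = rewrite inh t0 in
          let t1', syn1 = rewrite inh t1 in
          (App (t0', t1'), Names.union syn0 syn1)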


Most, if not all, passes in the bootstrap are implemented as functional transforms that pass information around in a pure fashion, using functional data structures. Functional data structures are great when working declaratively: there's no shared state, so every invariant you care about stays in place as you pass them around.
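
A tiny illustration of that point, again in OCaml and not taken from the bootstrap: adding to a persistent map yields a new map and leaves the old one untouched, so whatever invariant held before a pass still holds afterwards.

    module Env = Map.Make (String)

    let () =
      let env0 = Env.add "x" 1 Env.empty in
      let env1 = Env.add "y" 2 env0 in
      (* env0 is unchanged; env1 is a new structure sharing most of it *)
      assert (Env.cardinal env0 = 1);
      assert (Env.cardinal env1 = 2)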

Despite everything, functional data structures often aren't as fast as their proponents claim. It's nice that people can write a Lisp compiler in a nanosecond but, usually, everything breaks down when you throw large files at a compiler written in that manner.

A scenario as I imagine the lifetime of a Lisp compiler: it starts off representing all declarations in a program with a list. At more than 5K declarations it hits the first performance problem, so the list is replaced with a red-black tree. At more than 40K declarations it hits the next bottleneck, so that becomes a hash map. At more than 150K declarations there are new issues, and that becomes a data structure which pushes and pops ranges of hash maps. At more than 500K declarations that breaks down too, and it ends with a fast imperative data structure where memory is allocated by hand.
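
For what it's worth, here is one way the "pushes and pops ranges of hash maps" stage could look, sketched in OCaml with names and sizes of my own choosing: a scoped symbol table built on imperative hash tables.

    (* one hash table per range of declarations, innermost first *)
    type 'a scoped_table = { mutable scopes : (string, 'a) Hashtbl.t list }

    let create () = { scopes = [ Hashtbl.create 64 ] }

    (* entering a range pushes a fresh table ... *)
    let push t = t.scopes <- Hashtbl.create 64 :: t.scopes

    (* ... and leaving it pops the whole range at once *)
    let pop t =
      match t.scopes with
      | _ :: rest -> t.scopes <- rest
      | []        -> invalid_arg "pop on empty table"

    let declare t name v =
      match t.scopes with
      | top :: _ -> Hashtbl.replace top name v
      | []       -> invalid_arg "declare on empty table"

    (* lookup walks outward through the enclosing ranges *)
    let lookup t name =
      let rec go = function
        | []          -> None
        | top :: rest ->
            (match Hashtbl.find_opt top name with
             | Some v -> Some v
             | None   -> go rest)
      in
      go t.scopes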

Even though I fully appreciate that the interpreter will probably never be used, I want fast imperative data structures. Ah, a man can dream, right?

Therefore, before writing the next passes, the next step is to think about how to get various imperative data structures right.
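
As a strawman for what that could mean, here's a small growable array in OCaml with explicit capacity management and doubling on overflow. OCaml won't let me allocate memory truly by hand, so take this as the flavour rather than the real thing; everything here is made up for illustration.

    type 'a vec = { mutable data : 'a array; mutable len : int }

    (* a dummy element is needed to initialise the backing array *)
    let make dummy = { data = Array.make 16 dummy; len = 0 }

    let push v x =
      if v.len = Array.length v.data then begin
        (* full: allocate a bigger backing array and copy explicitly *)
        let bigger = Array.make (2 * Array.length v.data) x in
        Array.blit v.data 0 bigger 0 v.len;
        v.data <- bigger
      end;
      v.data.(v.len) <- x;
      v.len <- v.len + 1

    let get v i =
      if i < 0 || i >= v.len then invalid_arg "vec: index out of bounds";
      v.data.(i)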
