Friday, December 23, 2011

Lessons learned

Going old-skool and bootstrapping a language is a great way to learn about compiler technology, since most books touch on the what of compiler construction and hardly on the why; one usually learns next to nothing from just reading a book or copying a design. Having said that, implementing a language from scratch does answer a lot of the why questions, but since you inevitably get some of the hows wrong, you end up implementing the whole enchilada again.

So the lessons learned:
  • Old-skool bootstrapping is a lousy idea, given that writing an interpreter is a lot easier and leaves you a lot more time to experiment. Hugs (or was it Gofer?) probably existed for a decade before people began building performant Haskell compilers, and I should have done the same. The bootstrap leap is just too big unless you copy a design - which I find an uninteresting exercise.
  • Lisp days are over. On 64 bit machines you want a packed text representation, since a list-of-characters representation consumes an awful lot of memory (see the string sketch after this list). This is a problem both of the compiler - had it been implemented the same way in Haskell, it would have the same 2.4GB memory consumption - and of the naive unpacked 64 bit representation of the memory model. I'll change this.
  • The FFI scheme is dynamic, based on libffi. It is reasonably portable but can't handle corner cases; moreover, it adds a dependency, and vanilla C is just so much nicer (see the FFI sketch after this list). I'll change this.
  • The combinator scheme used is about on par with an (old) JavaScript or Lisp compiler/interpreter. It's going to be at least one order of magnitude slower than C, possibly two - which is still an order faster than Python, so I am not sure how to count that. There are three main reasons for the creeping performance:
    1. No stack: all thunks are allocated in the heap (see the thunk sketch after this list). A stack-based runtime will always beat that, but its programs will segfault from time to time because people will keep writing recursive functions that exhaust the C stack. I won't change this.
    2. Lack of 'lets', i.e. ANF form. At the moment all local computations are lifted out, leading to an explosion of combinators, of compile time, and of thunks allocated at runtime (see the ANF sketch after this list). I am not sure how much this costs, but I am assuming that I'll need to implement it. I'll change this.
    3. The compiler generates an unhealthily sized binary executable. I have no idea why, except for some hunches: a) it generates 64 bit assembly from C code, b) native assembly is wordier than a concise minimal VM (32 or 16 bit) instruction set, and c) compiling to a 'concrete' machine is more costly than compiling to a SECD machine or another lambda-calculus-inspired machine. Having said that, Haskell binaries - despite the VM assembly - seem to be just as bloated, so it might be unavoidable when compiling functional languages. I won't change this.
  • The compiler uses the concrete C bit words internally for primitive values. This leads to a very small compiler, since there is essentially one primitive value type (to optimize on), but I actually grew tired of all the a priori word manipulations. Moreover, it leads to a dependency on C primitive value representations and, consequently, on the machine architecture. Nice for toy compilers, but actually just a lousy idea, so the alternative of elaborating all primitive types appeals to me now (see the tagged-value sketch after this list). I'll change this.
  • The compiler uses texts internally a lot for identification, which is lousy given how they are represented. Even after switching over to native strings, that is probably about as costly as integers in memory consumption but very costly in runtime performance. I won't change this.
  • The compiler has no explicit symbol table, which is again novel; it doesn't seem to matter much. What is far worse is probably the performance of the lookup. I won't change this now (see the point on purity below).
  • The compiler employs novel AST data traversal combinators, which are good for implementing concise algorithms since they sit somewhere between attribute-grammar-based compilation and direct traversal techniques, but they probably cost a factor of two in compilation speed. I won't change this.
  • The compiler uses a novel bottom-up constraint gatherer and a separate type inferencer, an idea copied from the Helium Haskell compiler. That part was rushed; I am not sure of all the inference rules I use. The bottom-up approach is academically nice, but even with unfinished inference rules I can see that no production compiler will ever want to take the associated performance hit. So, old-skool ML-like direct typing of the AST it is. I'll change this.
  • I ended up with a pure compiler (except for C calls), which I think is a bad idea. It doesn't use a symbol table at the moment, but a symbol table will probably come in handy when more information is needed for a more elaborate target VM. I personally loathe monads, so I should make the language impure. Hello ML? I'll change this.
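
To make the string-representation point concrete, here is a rough C sketch of the two layouts on a 64 bit machine; the struct and field names are mine for illustration, not the actual runtime's. A cons cell per character costs roughly 24 bytes where a packed text costs roughly one byte per character.

    #include <stdint.h>

    /* A cons-list of characters, Lisp style: every character drags
       along a header and a tail pointer. With 8-byte alignment that
       is roughly 24 bytes per character, which is how a modest
       source text balloons into gigabytes. */
    struct char_cell {
        uint64_t          header;  /* tag, GC bits */
        uint32_t          ch;      /* the character */
        struct char_cell *tail;    /* rest of the string */
    };

    /* A packed text: one header, one length, then the bytes inline.
       Roughly one byte per character plus a constant overhead. */
    struct packed_text {
        uint64_t header;   /* tag, GC bits */
        uint64_t length;   /* number of bytes */
        char     bytes[];  /* payload, flexible array member */
    };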
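
On the FFI point: with libffi the call descriptor is constructed at runtime, which is flexible but opaque and misses corner cases (structs by value, varargs, and the like); generating a vanilla C stub per foreign function instead gives the C compiler something it can check and optimize. A minimal sketch, with a hypothetical stub name:

    #include <ffi.h>
    #include <stdio.h>

    /* Dynamic call through libffi: the signature is described at
       runtime. */
    void call_puts_dynamically(const char *msg) {
        ffi_cif   cif;
        ffi_type *args[1]   = { &ffi_type_pointer };
        void     *values[1] = { &msg };
        ffi_arg   result;

        if (ffi_prep_cif(&cif, FFI_DEFAULT_ABI, 1,
                         &ffi_type_sint, args) == FFI_OK)
            ffi_call(&cif, FFI_FN(puts), &result, values);
    }

    /* The vanilla C alternative: emit one trivial stub per foreign
       function and let the C compiler do the rest. */
    int hi_ffi_puts(const char *msg) { return puts(msg); }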
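
For the no-stack point, a hypothetical picture of what a heap-allocated thunk looks like: every application node costs an allocation plus later GC work, where a stack frame would cost little more than a pointer bump.

    #include <stdint.h>
    #include <stdlib.h>

    /* A hypothetical heap-allocated thunk; layout is illustrative. */
    struct thunk {
        uint64_t      header;   /* tag, arity, GC bits */
        void         *code;     /* combinator that evaluates this node */
        struct thunk *args[];   /* pointers to argument thunks */
    };

    struct thunk *mk_thunk(void *code, int argc) {
        struct thunk *t =
            malloc(sizeof *t + argc * sizeof(struct thunk *));
        t->header = (uint64_t)argc; /* simplified header */
        t->code   = code;
        return t;
    }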
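
To illustrate the lack of lets with a made-up example, building on the thunk sketch above (f_code, g_code, and h_code are hypothetical combinator entry points): without local bindings every subexpression of k x = f (g x) (h x) becomes a lifted combinator and a runtime thunk, while ANF would let the compiler name the intermediates and evaluate them in place.

    /* Builds on the thunk sketch above; the combinator entry points
       are hypothetical. */
    extern void *f_code, *g_code, *h_code;

    /* Without lets: one thunk per subexpression of
       k x = f (g x) (h x). */
    struct thunk *k_lifted(struct thunk *x) {
        struct thunk *t1 = mk_thunk(g_code, 1);
        t1->args[0] = x;
        struct thunk *t2 = mk_thunk(h_code, 1);
        t2->args[0] = x;
        struct thunk *t3 = mk_thunk(f_code, 2);
        t3->args[0] = t1;
        t3->args[1] = t2;
        return t3;
    }

    /* With lets (ANF: let a = g x in let b = h x in f a b) the
       intermediates get names, so they can be evaluated into locals
       directly instead of being lifted out as separate combinators. */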
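
And the primitive-value point, again as a hypothetical sketch: with one bare 64 bit word type the compiler stays tiny but must know from context what every word means, and it inherits C's representations; elaborated primitive types make values self-describing at the cost of a tag per value.

    #include <stdint.h>

    /* One-primitive scheme: everything is a bare 64 bit word, and
       the compiler must know by context whether it holds an int,
       a char, a float, or a pointer. Small, fast, unportable. */
    typedef uint64_t value;

    /* Elaborated scheme: each primitive carries its own tag. */
    enum prim_tag { TAG_INT, TAG_CHAR, TAG_FLOAT };

    struct prim {
        uint64_t tag;      /* one of prim_tag */
        union {
            int64_t  i;
            uint32_t c;
            double   f;
        } as;
    };
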
I'll start with a new runtime and leave the question of how to target it for a while. I blathered a bit with some C programmers about it, and by nature they'll always target an OCaml-like runtime or a similarly minimal design - I don't like that particular VM much - but my nature is different. The language should be robust, but not low-level, so I'll go in the direction of a Java-esque design, which means 64 bit headers for everything (a rough sketch below).
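
A first guess at what '64 bit headers for everything' could look like in C; the bit layout below is an assumption for illustration, not a settled design.

    #include <stdint.h>

    /* Every heap record starts with one 64 bit header word, in the
       Java spirit: uniform, self-describing, friendly to the GC.
       The split is a guess: 8 bits of tag, 24 bits of size,
       32 bits of GC and type information. */
    typedef uint64_t header;

    #define HDR(tag, size, info) \
        ( ((uint64_t)(tag)  << 56) \
        | ((uint64_t)(size) << 32) \
        | ((uint64_t)(info)) )

    #define HDR_TAG(h)  ((unsigned)((h) >> 56))
    #define HDR_SIZE(h) ((unsigned)(((h) >> 32) & 0xFFFFFF))
    #define HDR_INFO(h) ((uint32_t)(h))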

It should be fun, and I don't feel like working on it full time. So it'll take some time, again.
