So the lessons learned:
- Old-skool bootstrapping is a lousy idea given that writing an interpreter is a lot easier and you have a lot more time to experiment. Probably Hugs, or was it Gopher, existed for a decade before people began building performing Haskell compilers, and I should have done the same. The bootstrap leap is just too big unless you copy a design - which I find an uninteresting exercise.
- Lisp days are over. On 64 bit machines you want to use a packed text representation, since a list of characters representation consumes an awful lot of memory. This is both a problem of the compiler, if it would have been implemented similarly in Haskell it would have the same memory consumption (2.4GB), as well as of the naive unpacked 64 bit representation of the memory model. I'll change this.
- The FFI scheme is dynamic, based on libffi, reasonably portable, but can't handle corner cases. Moreover, it adds a dependency and vanilla C is just so much nicer. I'll change this.
- No stack, all thunks are allocated in the heap. A stack based runtime will always beat that, but programs will segfault from time to time because people will continue writing recursive functions which will consume the C stack. I won't change this.
- Lack of 'lets' or ANF form. Now all local computations are lifted out, leading to an explosion of combinators, compile time, and thunks allocated runtime. I am not sure how much this costs, but I am assuming that I'll need to implement this. I'll change this.
- The compiler generates an unhealthily sized binary executable. I have no idea why, except for some hunches: a) it generates 64 bit assembly from C code, b) native assembly is more wordly than a concise minimal VM (32 or 16 bit) instruction set, and c) compiling to a 'concrete' machine is more costly than compiling to a SECD, or other lambda calculus inspired, machine. Having said that, Haskell binaries -despite the VM assembly- seem to be as bloated, so it might be unavoidable in compiling functional languages. I won't change this.
I'll start with a new runtime and leave the thought on how to target the new runtime for a while. I blathered a bit with some C programmers on it, and by nature they'll always target an Ocaml-like runtime or minimal design -I don't like that particular VM much,- but my nature is different. The language should be robust, but not low-level, so I'll go in the direction of a Java-esque design, which means 64 bit headers for everything.
It should be fun, and I don't feel like working on it full time. So it'll take some time, again.