Sunday, August 1, 2010


The compiler of stage two is now running on its own source...

Two hours later: It took two hours for the Hi to C compiler to compile itself, split evenly between compiling to objects and linking those objects. So, it's one hour objects and one hour linking, where the majority of the time is spent printing diagnostic output. The compiler itself is slow in some parts due to ill-chosen data structures. Guess I need to get around a factor 50-100 speedup from somewhere to get to reasonable performance. Not nice, but doable in the current setting.

Of course, compiling with -O3 alone already makes it twice as fast. Another 50% or more comes off by not printing diagnostic information, 20% by using simpler tests on record structures, and 20%-30% by using incremental garbage collection. Memoization of FFI. That's about 1/(0.5*0.5*0.8*0.8*0.8) = 8 times faster, which leaves me about a factor ten off target. So, I would need about three extra optimizations which each take 50% off. Real texts instead of character lists, inlining and partial evaluation, specializing libffi, better internal data structures & faster algorithms in the compiler?
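The back-of-the-envelope estimate can be redone in a few lines. This is only a sketch of the arithmetic: the fractions are the ones quoted above, and the 20% for FFI memoization is an assumption read off the third 0.8 in the formula, since no figure is given for it.

```python
import math

# Fraction of the run time left after each planned optimization.
remaining = {
    "compile with -O3": 0.5,
    "no diagnostic output": 0.5,
    "simpler record tests": 0.8,
    "incremental GC": 0.8,
    "memoized FFI": 0.8,  # assumption: no figure is given for this one
}

def combined_speedup(fractions):
    """Multiply the remaining fractions and invert to get the overall speedup."""
    left = 1.0
    for fraction in fractions:
        left *= fraction
    return 1.0 / left

speedup = combined_speedup(remaining.values())
print(f"combined speedup: {speedup:.1f}x")

for target in (50, 100):
    gap = target / speedup
    # Each extra optimization that takes 50% off closes a factor of two.
    print(f"target {target}x: a factor {gap:.1f} short, "
          f"about {math.log2(gap):.1f} further halvings")
```

Depending on whether the target is 50x or 100x, that works out to roughly three or four further halvings, in line with the guess of about three.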

It generated C which compiled; guess I now need to run it again to get a proper fixed point out of it.
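The fixed-point check itself is simple: the C emitted by one generation of the compiler should be byte-for-byte identical to the C emitted by the next. A minimal sketch, assuming each generation's output is held as a mapping from file name to bytes; the file names and contents below are made up for illustration.

```python
def differing_files(gen_a, gen_b):
    """List the files that are missing from one generation or differ in content."""
    names = sorted(gen_a.keys() | gen_b.keys())
    return [name for name in names if gen_a.get(name) != gen_b.get(name)]

def is_fixed_point(gen_a, gen_b):
    """True when both generations emitted exactly the same files and bytes."""
    return not differing_files(gen_a, gen_b)

# Hypothetical output of two successive self-compilations.
generation_1 = {"lexer.c": b"int lex(void);\n", "main.c": b"int main(void);\n"}
generation_2 = {"lexer.c": b"int lex(void);\n", "main.c": b"int main(void);\n"}
generation_3 = {"lexer.c": b"int lex(void) ;\n", "main.c": b"int main(void);\n"}

print(is_fixed_point(generation_1, generation_2))
print(differing_files(generation_2, generation_3))
```

Listing the differing files, rather than only answering yes or no, narrows down which module of the compiler to look at first.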

Later: Ouch, that failed immediately. It doesn't lex right; looks like I still have an escape bug somewhere. Confirmed that. I guess I fixed something somewhere which wasn't broken in the first place. It's a puzzle, actually.

So, everything is set except for that escaping bug, which is a nasty one since it may have been introduced as early as stage zero, give wrong results there, and infect every subsequent compiler from that point on. No other choice than to unit test from stage zero up to stage two to see whether each stage handles all characters right.
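Such a stage-by-stage check could look roughly like the sketch below. Everything in it is a stand-in: each stage is modelled as a plain decoding function instead of a real compiler binary, the escape set is an assumed C-like one, and the bug in the last stage is planted on purpose to show what the harness reports.

```python
# Assumed escape set; the real set of Hi escapes may differ.
ESCAPES = {"\n": "\\n", "\t": "\\t", "\\": "\\\\", "'": "\\'", '"': '\\"'}

def escape(ch):
    """Write one character the way it would appear inside a literal."""
    return ESCAPES.get(ch, ch)

def make_decoder(table):
    """Build a stand-in for one stage's lexer from its escape table."""
    def decode(body):
        return table.get(body, body)
    return decode

def first_failing_stage(stages, alphabet):
    """Return (stage name, characters that fail to round-trip), or None."""
    for name, decode in stages:
        bad = [ch for ch in alphabet if decode(escape(ch)) != ch]
        if bad:
            return name, bad
    return None

GOOD = {text: ch for ch, text in ESCAPES.items()}
# Deliberately planted bug: this table has lost the tab escape.
BROKEN = {text: ch for text, ch in GOOD.items() if text != "\\t"}

stages = [
    ("stage zero", make_decoder(GOOD)),
    ("stage one", make_decoder(GOOD)),
    ("stage two", make_decoder(BROKEN)),
]
alphabet = [chr(code) for code in range(128)]
print(first_failing_stage(stages, alphabet))
```

Checking the stages in order matters: the first stage that fails is the earliest place the bug can have entered, and everything built by it is suspect.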

Gawd, now even I am confused, lemme see:
  1. Stage zero, a Hi compiler written in ML producing ML.
  2. Stage one, a Hi compiler written in Hi producing C, linking to an ML runtime. (stage zero, rewritten)
  3. Stage two, a Hi compiler written in Hi producing C, linking to a C runtime. (stage one, different native bindings)
  4. Hi version 0.1, not even wrong. A Hi to C compiler (stage two bootstrapped).
Still computes, but not what they taught me in school.

Later: Don't feel like debugging again. Stage one passes some unit tests, stage two doesn't? Back to C&C, I'll see tonight what went wrong.

Later: So, \' becomes ', but \" remains \" where it should become just " in stage two. Can't figure out who does what wrong. Is it stage one not matching correctly, or a bug in stage two? The code looks correct.
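For reference, the expected behaviour can be pinned down with a tiny decoder. This is not the Hi lexer, just a sketch of what a correct one should do with both quote escapes; the escape set beyond \' and \" is an assumption.

```python
# Assumed escape set; only \' and \" are confirmed by the observations above.
SIMPLE_ESCAPES = {"n": "\n", "t": "\t", "\\": "\\", "'": "'", '"': '"'}

def unescape(body):
    """Decode backslash escapes in the body of a string or character literal."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\":
            if i + 1 >= len(body):
                raise ValueError("dangling backslash")
            nxt = body[i + 1]
            if nxt not in SIMPLE_ESCAPES:
                raise ValueError(f"unknown escape \\{nxt}")
            out.append(SIMPLE_ESCAPES[nxt])
            i += 2  # skip the backslash and the escaped character
        else:
            out.append(ch)
            i += 1
    return "".join(out)

print(unescape("\\'"))  # backslash + single quote
print(unescape('\\"'))  # backslash + double quote
```

Both quote escapes go through the same table lookup here, so a decoder of this shape cannot treat them differently. If one stage does, the two cases are probably handled by separate code paths, or the table itself was built from literals that an earlier stage already decoded wrongly.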
