Sunday, February 28, 2010

12h, and one bug later

A full compile by the slow ML-interpreted to C compiler takes a whopping twelve hours - after, in the last phase, it ended with a happy bug... Which is annoying, since I want to bootstrap the Grafwegen Hi to C Compiler, called GHC, uh, I mean Hic.

Meanwhile I am documenting it.

030110: Ran it again for stats. Compiled it down to 4139 basic combinators, which is okay. Most of the time is spend producing expensive debugging output. The bug is in the code-generation, guess I'll print that part.
030210: After a fix, it stalls on the last phase, emitting code/calculating a large list of lines of C. I stopped evaluation on 1.3Gig memory, probably ocaml/unix is wasting more time on garbage collecting/swapping than anything else. Removed the semantics code for an intermediate compiler - should save around 25% memory - next try - emit each line directly?

What now?

Monday, February 22, 2010

Forks and Threads

I was looking at what I wanted for concurrency again, and decided against almost everything I so smartly thought of before. Given how chips are build these days, with support for processes, threads, and some interprocess communication the only plausible thing to do is:
  1. Support fork, create a copy of the whole process, and make use of the fact that modern chips have copy-on-write triggers.
  2. Support light thread creation, which should be mapped to some eval f a0 ... an construct, i.e., evaluation of a distinct function.
  3. Possibly support very light threads, for example by just having simple round-robin primitives in the language.
Not the academically nicest solution, but probably the best I can -easily- do.

Bye Erlang

Sunday, February 21, 2010

C Interpreter

I am thoroughly regretting not having started of with, or half-way having switched to, an interpreter in C/C++. Even if writing an interpreter in C is probably not the best way of exploring a design.

Spoiled Milk

Saturday, February 20, 2010

Widening and Narrowing

As stated, I want to be able to put values of different types, but known interfaces into a list. First, I thought I would need something elaborate for that, which turned out to be true, but not as elaborate as I first expected.

I just need to be able to erase a type. Which is pretty natural, since I can already do the opposite with type matching.

An example, a list of integers and strings:

def islist: ::Text t => list t =
    cons (5:t) (cons ("Hi!":t) nil)

It's as easy as that. It'll need some work on the prover and the type checker though, since now type variables are used in the terms of definitions, and that's not even close to nice.

022110: Oh cool, another nice date. The compiler I have know inferences a type in a simple bottom-up pass and I don't really feel like exposing the type of the definition into the term. Can I get away with just introducing unknowns?

Great, more work...

Thursday, February 18, 2010

On Comparison

I pretty printed the whole source code and found a few bugs. One most noteworthy is how I handle comparison on complex objects, I consistently miss a case.

I am not entirely happy about that simple mistake, and there doesn't seem to be a trivial fix I like. There are a few options:

  1. Keep it as it is, and handle all comparisons explicitly through the Ord interface, but live with potential bugs.
  2. Use some tagging scheme, for instance, unwind all structures to unique lazy lists of integers. Live with less potential bugs, and also, look at whether an inverse function for that would exist. I.e. serialization for free?
  3. Build a fast but simple comparison routine into the language runtime, which means the least code, a minimal chance of bugs, but is something I don't like since I'ld rather express everything in the language directly.
  4. Implement deriving as in Haskell.
  5. Anything I didn't think off....
No easy way out.

021810: I Like the second option the most, but, given time, I am going for the third.
021910: Wow, bad call. There turned out to be no point in doing anything general, since most of the equalities I employ are either a) very basic, there's no need for something elaborate, b) not basic but also not based on structural equality. I.e., in the compiler, the most difficult values are those of the ast, and these are all parameterized with the start-position of the expression, so none of the above schemes really work. It turned out to be easy with another scheme, though, so I am happy for the time being.

Stairway to Heaven...

Short Introduction To Hi

I was fiddling around with LaTeX and wrote a short introduction. Anyway, here it is, complete with all spelling and punctuation mistakes.

Meanwhile, I noticed two things. One, a mistake in how I handle some comparisons. And two, how convenient it would be to be able to build lists which holds values of different types but which are constrained to the same interfaces.

021910: New link with some small changes after some hints on LtU.

Why am I still not done?

Monday, February 15, 2010

Embarrassingly Trivial

I took a few involuntary days off and took another fresh look at the code. Again, I have to observe that in the change after the introduction of alpha-conversion, I messed up, and missed a few opportunities to straighten out and simplify the code. I did a recount, and am at 9-10k undocumented KLoc. It seems I have taken the longest path to the most embarrassing trivial code.

Red cheeks.

Friday, February 12, 2010


Sometimes, the rather silly observation is that you're just working around the fact that you don't own a printer.

021410: Sure as hell. Bought a printer, and found the bug.

Mom's typewriter.

Sunday, February 7, 2010

Log 020710

Guess I am almost there for a self-compile except for a hard-to-trace bug which occurs in complex expressions. I need unit tests to trace them. Also, I managed to speed up the stage1 compiler by just expressing the 'concat' operation on lists directly instead of with a fold, I might have used the wrong fold on it - I am wondering whether it would pay of to express more functions directly.

Anyway, unit tests on the operational semantics it is.

020810: Slow again. Narrowed it down to an example, but the example I found to hard to analyze. I gambled the problem is in the LC part, and decided to write a simple evaluator for LC terms to detect which transformation breaks the operational semantics.

021210: Narrowed it down to a simple example. In need of a printer.

When in doubt, let the machine do the hard work.

Monday, February 1, 2010


I got a small bug somewhere which is hard to trace, not sure what gave, guess its again the LC rewriter going wrong.

Meanwhile, I needed to implement coercions also since some unsafe terms couldn't be compiled by the more restrictive Hi-to-C compiler. (I've got three compilers, a compiler written in ML which translates Hi to ML called stage zero. A compiler written in Hi, with bindings to ML called stage one, which compiles to C. And, lastly, the same compiler but with dynamic bindings to C called stage two, only the system file differs. Essentially one and two are the same, I am making small patches to one to get it to accept two, almost itself.)

Anyway, coercions work in the following way

    [ x: T -> f x
    | y: S -> g y
    | z    -> z   ]

A pattern may query a type (constructor), after which the variable is treated as having that type. Type coercions may occur anywhere in a pattern, if I implemented it correctly. The pattern instantiates a type which is used in the rewrite, but gives an unknown upward to be unified agains the other matches. The return of the block is the intersection of types of the rewrites, which pragmatically seems correct, but aesthetically, I wonder whether you'ld want to be able to unify unknowns there such that it becomes abstract in the return type also.

Its a straightforward feature I haven't seen in functional languages, so far. I needed it for handling exceptions since these are not type-checked, so to get at the value, the value needs to be coerced.

020510: Implemented and works. It needed some work on which types I query, once I need to factor that out too and build me a decent type handling library. Unmarshalling code depended on type-casts too, that mostly works now, there are some runtime/C bugs which shouldn't be hard to fix.