It spends 40% of the time reference counting while printing which is a good indication of the overhead while rewriting terms. If a pointer comes into scope it needs to increase the reference count, if it leaves a scope it needs to decrease it; there's nothing to be done about it, and it will be like this for all rewriting functions.
After the parser recognized a small subset of the language, definitions and expressions, I now am expanding to the full language. The AST explodes, every alternative becomes about 50 lines instead of one. It's inconvenient and a lot of work.
The real hurdle will be whether I can express transforms concisely in C++. Well, it won't be very concise, pattern matching must be replaced by tedious downcasting and picking a record apart through accessor methods. I'll probably write some macro's for that, even if I find programming with macros bad practice. But I need some generic solution.
I need a transform which supports this:
↓:T | ↑:T
APPLICATION
↙:T / ↗:T ↘:T \ ↖:T
TERM0 TERM1
The arrows denote the information flow around the terms during a transform where a term may also be rewritten. By fiddling with the information flow locally in each rewrite step it's not very difficult to implement semantic analysis and by fiddling with the expressions returned it is not difficult to implement rewrite passes. It more or less generalizes attribute grammar approaches to compiler construction.
The transform should be a generic function which leaves information flow and terms intact, unless you override certain cases. In that manner, without spelling out all functionality for each leave, you end up with concise definitions of passes, albeit that the transform may not be very fast.
The transform should be a generic function which leaves information flow and terms intact, unless you override certain cases. In that manner, without spelling out all functionality for each leave, you end up with concise definitions of passes, albeit that the transform may not be very fast.
Well, that's it. There's progress but still two to three years of work left. I fully understand why people implement untyped scripting languages, it's just so much easier.
ADDENDUM: I am nearly finished with the parser. I severely restricted the input grammar for types; it used to have a liberal free-flow grammar where pretty absurd types would be accepted since I was rushing the compiler to a bootstrap. This time, since I know I have a compilation scheme, the type system will go first. So, no explicit forall constructs, no predicates on type synonyms, etc. Dumbing down continues.
ADDENDUM: I am nearly finished with the parser. I severely restricted the input grammar for types; it used to have a liberal free-flow grammar where pretty absurd types would be accepted since I was rushing the compiler to a bootstrap. This time, since I know I have a compilation scheme, the type system will go first. So, no explicit forall constructs, no predicates on type synonyms, etc. Dumbing down continues.
No comments:
Post a Comment