r0 = alloc(3); r0 = ... r0 = ... r0 = ... r1 = alloc(2); r1 = ... r1 = &r0;
Entirely unnecessary. It is a stack, not a heap, I should treat it like a stack, and get rid of all the alloc calls and just push values into it. For the example, that would mean, two allocs less, half of the arguments gone (since movs are replaced by push), but at the expense of some stack calculations.
I could, of course, make a double stack machine, where one stack is reserved for pushing closures and closures are just popped in order, and the second for pushing data. It would, of course, mean that sometimes closures should be copied to the data stack. I'll end up with something akin to Clean ABC code?
Problem, this only affects performance, number of instructions generated. So, it might be faster (if I can map to native push instructions), but it won't affect the space consumption of the generated code when running. Which is entirely a case of making the data-representation more compact, generating more compact representations for data-structures, and -for compiler performance- changing a number of routines in the compiler to make generated code for that less memory intensive.
For the sake of simplicity, I think I'ld rather go in the direction of a one-stack machine with an incremental garbage collector to clear away generations of thunks and data pushed.