Friday, June 11, 2010

Evaluation (Untested)

Below, some calls and combinators which are used in the runtime. It is rewritten, untested at the moment, since I use a slightly different thunk layout than before.

The design is new, I didn't look too much at other implementations. All work is done in one heap, no stack since I didn't want to make the runtime too complex, and certainly didn't want to use the C stack since it is too easy to run out of stack space with a functional language.

The calls are just those primitives I needed in the language, everything else -including integer addition- is handled through FFI. For some applications, that will mean a big performance hit, but that's the price you pay.

The runtime has three sorts of values: Constants, size and type tagged series of bits. Records, size, type and record typed tagged series of pointers. And Thunks, the result of combinator translating lambda expressions. A thunk is translated in a somewhat novel way; I tried not to implement a CPS transform but go for a direct translation which should have enough performance.

The translation is straightforward when you think of it. Everything is compiled down to untyped lambda terms, local computations are 'letified,' but I still think of local computations as if they just push a series of thunks into the heap. A thunk just knows two things: 'Where to store the result of a calculation' which may be a thunk, it is stored into an RT pointer. And, 'What to do next' which is the continuation, it is stored into a K pointer. Since the runtime doesn't allow for pointers into arrays, the RT pointer is split into two parts: A pointer which points to the start of a record or thunk, and an index into that thunk.

An example, suppose you have a computation '\x -> f (g x) 3'. This is letified to '\x -> let a is 3 in let b is g x in f b a' which translates to a combinator S which pushes three thunks into the heap. In DOT form, 'S x = [f . .] [g x] [3]', each respective calculation is stored into each dot reading from right to left. During the translation the bookkeeping for pushing local computations and constants is resolved and instructions which store the pointers to respective thunks and records are generated. For proper constants, actually nothing is done except for placing the constant in the RT and returning K. For thunks, the K and RT fields are copied into the thunk, and new calculations are set up which will evaluate that thunk first after their own evaluation is done.

In pseudocode, a program for the above example would read something like:
[S, k, rt, args] ->
    if args is [] then rt = [S, k, rt, args]; return k
    else x = expand([f, k, rt], args, 1) ; y = [g, x, x[3], args[0] ]; x[4] = 3; return y
The first line is an obligatory arity check. If not enough arguments are presents, the thunk is treated as the result and stored for currying purposes. Otherwise, the thunk for the body of the function x is pushed which is evaluated last according to eager semantics, it also inherits any extra arguments to S; next, another thunk y is pushed, corresponding to an argument of the body thunk, which when evaluated places its results in the x thunk; finally, a constant 3 is stored into the body thunk, and the thunk y is returned as the thunk to be evaluated first.

During translation, all 'arrays' are checked whether they are constant and those are compiled to static data, anything else is allocated on the heap. A main combinator, which calls -erm- main, is wrapped around the whole program together with an exception handler.

I like it a lot more than a CPS transform, though it is similar. Another way of thinking about is it like a G-machine, except that we already reversed some pointers. (The G-machine rewrites graphs, this translation just evaluates rewrites.) There is a straightforward translation for lazy functional languages too, since lazy evaluation corresponds to another evaluation order, it means just a simple swap in pushing K pointers, i.e., the order of 'What to do next' is just different.

Since combinators just push thunks, there is no stack, and they are usually not referenced anymore after they are evaluated, it is unnecessary to treat recursion different than any other function call. And hence, there is no need for tail call optimizations.

Comparison of speed with a CPS transform and translation to native code or a G-machine is difficult. I don't have a stack, so the heap will be used a lot but the translation is so direct that I think I removed the need for what would be a lot of stack pushes in other settings. It should -in theory- be faster than a naive G-machine evaluator since a lot of what the G-machine does is already statically determined. I translate to C, trampoline thunks, use FFI for almost everything, and use 64 bits for integers and pointers. It should be sluggish, but give decent performance with a generational garbage collector.

(But, I don't care too much about performance since it is easy to call C and I guess I am gonna be on par with something like Java and faster than Python. If you need performance, or don't like wasting bits, then you should call C. For example, a StringBuffer class could be implemented through libffi.)

Enough blabbering, below some untested and unfinished evaluation routines.


//** start of file:  eval/eval.h
/**
 *  COPYRIGHT 2010, M.C.A. Devillers
 *
 *  THIS MATERIAL IS PROVIDED AS IS, WITH ABSOLUTELY NO WARRANTY
 *  EXPRESSED OR IMPLIED.  ANY USE IS AT YOUR OWN RISK.
 *
 *  The runtime trampolines thunks - each thunk points to a
 *  C routine which evaluates it.
 *
 *  Layout of a constant:
 *     c[0]     size
 *     c[1]     0
 *     c[2]     type tag
 *     c[3..]   integers
 *
 *  Layout of a record:
 *     r[0]     size
 *     r[1]     1
 *     r[2]     pointer to type tag
 *     r[3]     pointer to record tag
 *     r[4..]   pointers to values
 *
 *  Layout of a thunk:
 *     t[0]     size
 *     t[1]     pointer to C evaluator
 *     t[2]     pointer to continuation thunk
 *     t[3]     pointer to result
 *     t[4]     pointer to integer (index in result)
 *     t[5..]   pointers to arguments
 */

#include <stdlib.h>
#include <stdio.h>
#include <stdarg.h>
#include <stdint.h>
#include <string.h>
#include <dlfcn.h>
#include "../types/types.h"

hi_int* __expand(void* v, hi_int *t0, hi_int* t1, hi_int* n);
hi_int* __alloc(void* v, hi_int* n);

hi_int* ___apply(void* v, hi_int *n);
hi_int* ___embed(void* v, hi_int *n);
hi_int* ___shift(void* v, hi_int *n);
hi_int* ___reset(void* v, hi_int *n);

hi_int* __exception(void* v, hi_int *n);
hi_int* __exit(void* v, hi_int *n);

hi_int* __system__reserve(void* v, hi_int *n);
hi_int* __enter_symbol(void* v, hi_int* s);

hi_int* __environment(void* v, hi_int* exc, hi_int* n);
hi_int* __environment_arg(void* v, int argc, char** argv);

//hi_int* __to_pointer(void* v, hi_int* exc, hi_int *n);
//hi_int* __from_pointer(void* v, hi_int* exc, hi_int *n);

hi_int* eval(void* v, hi_int* n);

int streq(char* s0, char* s1);
int strneq(char* s0, char* s1);

#define MAIN \
    int main(int argc, char** argv) { \
        __enter_symbols(env, (hi_int*) 0); \
        environment_arg(env, argc, argv); \
        hi_int* t = heap_alloc(env->heap, 5); \
        t[0] = 5; \
        t[1] = (hi_int) ___main; \
        t[2] = (hi_int) 0; \
        t[3] = (hi_int) 0; \
        t[4] = (hi_int) 0; \
        while (t) { \
            t = eval((void*) env, t); \
            t = heap_try_gc(env->heap, t); \
        } \
        return 0; \
    };
//** end of file:  eval/eval.h

//** start of file:  eval/eval.c
/** Runtime for the Hi Language (stage2).
 *
 *  COPYRIGHT 2008, M.C.A. Devillers
 *
 *  THIS MATERIAL IS PROVIDED AS IS, WITH ABSOLUTELY NO WARRANTY
 *  EXPRESSED OR IMPLIED.  ANY USE IS AT YOUR OWN RISK.
 *
 *  This runtime consists of a number of parts which are all included
 *  in one file for convenience.
 */

#include <stdlib.h>
#include <stdio.h>
#include <stdarg.h>
#include <stdint.h>
#include <string.h>
#include <dlfcn.h>
#include "ffi.h"
#include "eval.h"
#include "../types/types.h"
#include "../heap/heap.h"
#include "../dynamic/dynamic.h"
#include "../ffi/hffi.h"

#define SIZE_CONSTANT       4
#define SIZE_RECORD         4
#define SIZE_THUNK          5

#define INDEX_SIZE          0

#define INDEX_FIELD0        4
#define INDEX_FIELD1        5
#define INDEX_FIELD2        6
#define INDEX_FIELD3        7

#define INDEX_TYPE_TAG      1
#define INDEX_RECORD_TAG    2

#define INDEX_F             1
#define INDEX_K             2
#define INDEX_RT            3
#define INDEX_RTI           4

#define INDEX_ARG0          5
#define INDEX_ARG1          6
#define INDEX_ARG2          7
#define INDEX_ARG3          8

hi_int* __expand(void* v, hi_int *t0, hi_int* t1, hi_int* n);
hi_int* __alloc(void* v, hi_int* n);

/** A combinator which takes one handle and sets up the thunk.
 *
 *  This is a combinator instead of a call since it introduces
 *  a new thunk.
 *
 *  It handles cases where the first argument of a thunk is
 *  a translated variable, therefore postprocessing is needed
 *  to inherit arguments.
 *
 *  /param v    the environment
 *  /param n    the apply thunk holding one handle
 *  /return     the thunk with the handle
 */
hi_int* ___apply(void* v, hi_int *n) {
    heap_t* hp = (heap_t*) v;
    if (n == 0) return heap_string(hp, "___apply");
    if (n[INDEX_SIZE] < 1+SIZE_THUNK) {
        hi_int* k  = (hi_int*) n[INDEX_K];
        hi_int* rt = (hi_int*) n[INDEX_RT];
        hi_int* ri = (hi_int*) n[INDEX_RTI];
        rt[convert_to_int(ri)] = (hi_int) n;
        return k;
    }
    hi_int* f = (hi_int*) n[INDEX_ARG0];
    f = __expand(v, f, n, (hi_int*) 1);
    f[INDEX_K]   = n[INDEX_K];
    f[INDEX_RT]  = n[INDEX_RT];
    f[INDEX_RTI] = n[INDEX_RTI];
    return f;
}

/** A combinator which makes an FFI call.
 *
 *  This is a combinator instead of a call since it takes,
 *  and inspects an arbitrary number of the arguments.
 *
 *  /param v    the environment
 *  /param n    the ffi thunk holding an exception pointer, a dl
 *              record, an ffi record, and a number of arguments
 *  /return     the continuation after the call
 */
hi_int* ___embed(void* v, hi_int *n) {
    heap_t* hp = (heap_t*) v;
    if (n == 0) return heap_string(hp, "___embed");
    if (n[INDEX_SIZE] < 3 + SIZE_THUNK) {
        hi_int* k  = (hi_int*) n[INDEX_K];
        hi_int* rt = (hi_int*) n[INDEX_RT];
        hi_int* ri = (hi_int*) n[INDEX_RTI];
        rt[convert_to_int(ri)] = (hi_int) n;
        return k;
    }

    hi_int* dl  = (hi_int*) n[INDEX_ARG1];
    hi_int* ffi = (hi_int*) n[INDEX_ARG2];
    hi_int  ffi_argc = (hi_int) ffi[INDEX_SIZE] - SIZE_RECORD;

    hi_int  argc = n[INDEX_SIZE] - SIZE_THUNK - 3;
    hi_int* argv = &(n[INDEX_ARG3]);

    hi_int* k  = (hi_int*) n[INDEX_K];
    hi_int* rt = (hi_int*) n[INDEX_RT];
    hi_int* ri = (hi_int*) n[INDEX_RTI];

    hi_int* result;
    if (argc < ffi_argc) {
        result = n;
    } else {
        result = ffi_hi_call(hp, dl, ffi, argc, argv);
    }

    rt[convert_to_int(ri)] = (hi_int) result;
    // ignore the expansion check/extra args

    return k;
}

/** A combinator 'shift'.
 *
 *  This is a combinator instead of a call since it refers the calling
 *  sequence.
 *
 *  /param v    the environment
 *  /param n    the shift thunk holding the exception handler and an
 *              exception
 *  /return     the new size of the heap
 */
hi_int* ___shift(void* v, hi_int *n) {
    heap_t* hp = (heap_t*) v;
    if (n == 0) return heap_string(hp, "___shift");

    if (n[4] < 4) {
        hi_int* k  = (hi_int*) n[INDEX_K];
        hi_int* rt = (hi_int*) n[INDEX_RT];
        hi_int* ri = (hi_int*) n[INDEX_RTI];
        rt[convert_to_int(ri)] = (hi_int) n;
        return k;
    }

    hi_int* k  = (hi_int*)  n[INDEX_ARG0];
    hi_int* rt = (hi_int*)  n[INDEX_ARG1];
    hi_int* ri = (hi_int*)  n[INDEX_ARG2];
    hi_int* e  = (hi_int*)  n[INDEX_ARG3];

    rt[convert_to_int(ri)] = (hi_int) e;
    return k;
}

/** A combinator 'reset' which introduces a 'shift' combinator.
 *
 *  References to Reset/Shift may be bollocks. At the moment, I forgot
 *  why most of this works in this manner. I hacked it.
 *
 *  The point is, that reset saves the 'current contex', k and rt, and
 *  a function to be called into a combinator S, which, when used will
 *  restore/continue into that context with that function.
 *
 *  Save/restore would be better names. It looks like restore is
 *  unnecessary/could be simplified.
 *
 *  This is a combinator instead of a call since it refers the calling
 *  sequence.
 *
 *  /param v    the environment
 *  /param n    a function which is
 *  /return     the new size of the heap
 */
hi_int* ___reset(void* v, hi_int *n) {
    heap_t* hp = (heap_t*) v;

    if (n == 0) return heap_string(hp, "___reset");

    hi_int* k  = (hi_int*) n[INDEX_K];
    hi_int* rt = (hi_int*) n[INDEX_RT];
    hi_int* ri = (hi_int*) n[INDEX_RTI];

    if (n[INDEX_SIZE] < SIZE_THUNK + 1) {
        rt[convert_to_int(ri)] = (hi_int) n;
        return k;
    }

    hi_int* f = (hi_int*) n[INDEX_ARG0];

    hi_int* a = (hi_int*) __alloc(v, (hi_int*) 7);
    a[INDEX_SIZE]   = 7;
    a[INDEX_F]      = (hi_int) ___apply;
    a[INDEX_K]      = (hi_int) k;
    a[INDEX_RT]     = (hi_int) rt;
    a[INDEX_RTI]    = (hi_int) ri;
    a[INDEX_ARG0]   = (hi_int) f;
    a[INDEX_ARG1]   = (hi_int) 0;

    if (((hi_int) n[4]) != ((hi_int) 1)) {
        a = __expand(v, a, n, (hi_int*) 1);
    }

    hi_int* s = (hi_int*) __alloc(v, (hi_int*) (SIZE_THUNK + 3));
    s[INDEX_SIZE]   = (hi_int) (SIZE_THUNK + 3);
    s[INDEX_F]      = (hi_int) ___shift;
    s[INDEX_K]      = (hi_int) a;
    s[INDEX_RT]     = (hi_int) a;
    s[INDEX_RTI]    = (hi_int) heap_to_int(hp, 6);
    s[INDEX_ARG0]   = (hi_int) k;
    s[INDEX_ARG1]   = (hi_int) rt;
    s[INDEX_ARG2]   = (hi_int) ri;

    return s;

}

// depreciated ;) : should be called through ffi
/** A call which reserves N free cells.
 *
 *  This is a combinator instead of a call since it calls the heap.
 *
 *  /param v    the environment
 *  /param n    the free thunk holding the exception and one integer
 *  /return     the new size of the heap
 */
hi_int* __system__reserve(void* v, hi_int *n) {
    heap_t* hp = (heap_t*) v;

    hi_int  sz = convert_to_int(n);
//    n = heap_reserve(hp, sz);

    return n;
}

// depreciated ;) : symbols can be inspected at runtime too through dl
/** The enter_symbol call inserts a symbol into the symbol table.
 *
 *  This is used for serialization code, sometimes it is necessary to
 *  translate names back to pointers.
 *
 *  /param s    the symbol being entered
 *  /return     s
 */
hi_int* __enter_symbol(void* v, hi_int* s) {
    // this is a stub for the moment, it is not used atm and libdl
    // actually could solve it too.
    /*
    heap_t* hp = (heap_t*) v;
    env_t* env = (env_t*) v;
    hi_int*(*f)(void *, hi_int*) =
            (hi_int*(*)(void*, hi_int*)) s;
    char*  sym = (char*) f(0,0);
    syms_enter(env->syms, sym, s);
    */
    return s;
}

/** The exit call prints its argument and exits.
 *  the result.
 *
 *  /param v    the environment
 *  /param n    any value
 *  /return     nothing, it calls exit(0)
 */
hi_int* __exit(void* v, hi_int *n) {
    heap_t* hp = (heap_t*) v;
    if ((n[0] == 1) && (strncmp((char*) n[1], "system.nop", 16) == 0)) {
        fprintf(stdout, "\n");
    } else {
        print_hi(stdout, n);
        fprintf(stdout, "\n");
    };
    exit(0);
    return 0;
}

/** The exception call exits the main program and prints
 *  the exception.
 *
 *  /param v    the environment
 *  /param n    the exception combinator
 *  /return     nothing, it calls exit(1)
 */
hi_int* __exception(void* v, hi_int *n) {
    heap_t* hp = (heap_t*) v;
    fprintf(stderr, "exception(");
    print_hi(stderr, n);
    fprintf(stderr, ")\n");
    exit(1);
    return 0;
}

/** The environment call returns a pointer to the environment.
 *
 *  This is used such that through libffi the environment can be
 *  manipulated by the application.
 *
 *  /param v    the environment
 *  /param n    the exception combinator
 *  /return     a Hi pointer to the environment.
 */
hi_int* __environment(void* v, hi_int* exc, hi_int* n) {
    heap_t* hp = (heap_t*) v;
    hi_int* result = heap_pointer(hp, (void*) v);
    return result;
}

/** Store the application arguments in the heap.
 *
 *  /param v    the environment
 *  /param argc number of arguments
 *  /param argv arguments
 *  /return     a Hi pointer to the environment.
 */
hi_int* __environment_arg(void* v, int argc, char** argv) {
    // XXX: for later
    heap_t* hp = (heap_t*) v;
    hi_int* result = heap_pointer(hp, (void*) v);
    return result;
}

/** Allocate n cells in the heap.
 *
 *  /param n    the nr of cells
 *  /return     a memory region
 */
hi_int* __alloc(void* v, hi_int* n) {
    heap_t* hp = (heap_t*) v;
    hi_int* r = heap_allocate(hp, (hi_int) n);
    r[0] = (hi_int) n;
    r[1] = 0;
    return r;
}

/** Expand the first thunk with the second thunk except for the first
 *  n arguments of the second thunk
 *
 *  /param t0   the first thunk
 *  /param t1   the second thunk
 *  /param n    the nr of arguments which should be ommited
 *  /return     the expanded thunk
 */
hi_int* __expand(void* v, hi_int *t0, hi_int* t1, hi_int* n) {
    heap_t* hp = (heap_t*) v;
    hi_int m = (hi_int) n;
    hi_int argc0 = t0[0] - SIZE_THUNK;
    hi_int argc1 = t1[0] - SIZE_THUNK;
    hi_int new_size = argc0 + argc1 - m + SIZE_THUNK;
    hi_int* new_thunk = heap_alloc(hp, new_size);
    new_thunk[INDEX_SIZE] = new_size;
    new_thunk[INDEX_F]    = t0[INDEX_F];
    new_thunk[INDEX_K]    = t0[INDEX_K];
    new_thunk[INDEX_RT]   = t0[INDEX_RT];
    new_thunk[INDEX_RTI]  = t0[INDEX_RTI];
    hi_int i;
    for (i = 0; i < argc0; i++)
        new_thunk[INDEX_ARG0+i] = t0[INDEX_ARG0+i];
    for (i = 0; i < argc1 - m; i++)
        new_thunk[INDEX_ARG0+argc0+i] = t1[INDEX_ARG0+m+i];
    return new_thunk;
}


//////////////////////////////////////////////////////////////////////

/** Evaluate one thunk.
 *
 *  /param v    an environment pointer
 *  /param n    the starting thunk
 *  /return     the result of the call
 */
inline hi_int* eval(void* v, hi_int* n) {
    heap_t* hp = (heap_t*) v;
//    fprintf(stdout, "\nevaluating : ");
//    term_pretty(stdout, n);
//    fprintf(stdout, "\n");
    if (((hi_int) n[0]) < 2) {
        fprintf(stderr, "cannot evaluate a constant : ");
        term_debug(stderr, n);
        exit(1);
    }
    hi_int*(*f)(void*, hi_int*) =
        (hi_int*(*)(void*, hi_int*)) n[0];
    return f(v, n);
}

int streq(char* s0, char* s1) {
    if (s0 == s1) return (0==0);
    return (strncmp(s0, s1, 128) == 0);
}

int strneq(char* s0, char* s1) {
    if (s0 == s1) return (0==1);
    return (strncmp(s0, s1, 128) != 0);
}

//** end of file:  eval/eval.c


No comments:

Post a Comment