Friday, September 02, 2005

SoC: finished!

C++ is fantastic. It was very easy to integrate a garbage collector into the compiler: I only needed to add some simple code to a single header file, and it worked. I've also made grateful use of template specializations in redesigning the tuple implementation. In addition to these larger features, I also implemented a huge amount of smaller features over the last two weeks. Apart from several theoretical niceties, such as generation of complex generics, the compiler can now actually handle several relatively large programs as well. In all, I think I have reached all SoC milestones, and much more.

There are now 6 larger programs in the unit test set, written without any static compilation in mind (by friends, or by me, before I wanted to build a python compiler,) that the compiler now handles fine. In total they weigh in at 879 lines of Python code, which the compiler transforms in total into 1749 lines of C++ code (excluding header files, macro expansion, builtin functions, ..) These programs are:

-a simple satisfiability solver (110 lines, complicated list comprehensions)
-a japanese puzzle solver (180 lines, uses some real polymorphism)
-a more advanced satisfiability solver (170 lines)
-a not-too-bad othello player (123 lines, does much with binary tuples)
-a neural network simulator (95 lines, uses classes, badly polluted first iteration)
-a sudoku solver (177 lines)

The compiler generally takes less than 20 seconds to analyze each program on my old athlon XP 1800+. The speedup is in general larger than a factor of 10, often much more.

In fact, I would like to invite any adventurous soul at this point to try out the compiler, and send in anything that doesn't work but should (please, as small as possible), or anything non-trivial that does work (so I can add it to the test set.)

Note that there are currently several important limitations to the type of Python code that is supported:

-Of course, don't use reflection, eval or weird run-time polymorphism that cannot be easily mapped to C++.
-Do not import libraries. Only Set is currently supported. I am working on a simple system that will allow users to add C++ versions of libraries to the compiler.
-Do not mix None and integer types. The compiler will bail out with an error. Use zero or any other special value instead. Do not mix scalar with pointer types. Try to keep things uniform. This is usually more elegant anyway. However, it is often useful to mix types in binary tuples, and the compiler supports this.
-But do not use tuples with different element types with a length greater than 2. Binary tuples with differently types elements can also not be used in addition, etc. 'Zip' only supports two arguments.
-Declaring things before use may sometimes be necessary at this point. For example, classes. I hope to fix this soon.
-Using varargs and kwargs will probably end in tragedy.
-In general, program elegantly, like you would in C++, but without all the type declarations.
-The compiler has not been tested on programs larger than 250 lines.

I still need to fix a non-termination problem I encountered, but this won't occur for most programs under 250 lines. I also need to solve many smaller issues. However, the compiler should work pretty well for many programs. If you're interested, by all means, please try it out and let me know how it went. For any larger program, there are probably a few minor issues or missing builtin functions that I would gladly fix, if I have a use case.

The compiler is currently only available via CVS. I hope to release a packaged version in about a month. Please see the README with instructions on how to use it. In any case, it won't work under Windows. Let me know if you would like to debug this.. :-)

No comments: