Wednesday, May 17, 2006

pystone/richards benchmarks, upcoming 0.0.9

I have been working over the past few days to get the pystone and richards benchmarks to compile well. now that they do, I have added them to the test set. both are very simple from a type inference perspective, but they did help me uncover and fix several minor and a few major issues.

the speedups on my computer are about 10 times for pystone and 185 times (!) for richards. the latter is probably due to the fact that richards is heavily OO, and C++ compilers know how to efficiently implement that! :-)
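to give an idea of what 'heavily OO' means here, below is a small hypothetical sketch in the spirit of richards. it is not taken from the benchmark itself, and the names `Task`, `IdleTask`, `WorkTask` and `schedule` are made up for illustration. the inner loop does little more than call a method through a base class, which is the kind of dispatch a C++ compiler can typically turn into a cheap virtual call.

```python
class Task:
    """Hypothetical base class; each subclass overrides run()."""

    def __init__(self, name):
        self.name = name
        self.count = 0

    def run(self, packet):
        raise NotImplementedError


class IdleTask(Task):
    def run(self, packet):
        # Passes the packet on unchanged, only counting the call.
        self.count += 1
        return packet


class WorkTask(Task):
    def run(self, packet):
        # Does a trivial amount of "work" on the packet.
        self.count += 1
        return packet + 1


def schedule(tasks, rounds):
    # Each task.run() call is dispatched on the object's class at run time.
    packet = 0
    for _ in range(rounds):
        for task in tasks:
            packet = task.run(packet)
    return packet


tasks = [IdleTask("idle"), WorkTask("work")]
result = schedule(tasks, 1000)
```

since every attribute and argument here has a single obvious type, a program like this is also simple from a type inference perspective, even though it leans on classes throughout.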

I hope to release a 0.0.9 version within a few weeks, with these and some other changes. there is one problem that I have observed a few times now, and that I would like to fix before then: sometimes, type information is lost during inference, so that the results are incomplete. I think I know what causes the problem.

Wednesday, May 03, 2006

Shed Skin 0.0.8, new website, Google SoC/Thesis?

I have just released Shed Skin 0.0.8. for this version, I removed about 1000 lines (mostly memory optimizations - so the compiler is now less than 6000 lines!), cleaned up stuff a bit (it's still a monolithic file though), added/completed more string methods, applied many minor bugfixes and added several more error messages, based on Bearophile's list of known bugs. thanks man! :-)

I also created a simple Shed Skin 'homepage' and modified the README, to better introduce Shed Skin to people. please update any links to my blog or the sourceforge site so that they point to this page - see the link on the right. please let me know if you think I should change something.

now that the source code is becoming pretty clean, and there are many largish test programs that run well (see Section 5 of my thesis!), the time seems right to invite other people to join the project, and look into some important aspects I don't have enough time for/interest in. there are three important things that can be investigated relatively separately:

-I removed my simple memory optimizations (turning heap allocation into stack- and static preallocation). this is a fascinating subject, with a lot of existing techniques coming from the Java community. as can be seen from my thesis, it can really help performance as well. I just never had the time to properly investigate it.

-SS currently uses the bloody C++ STL string type, which makes it really slow for string-intensive programs. it would be really nice to have a more efficient (preferably OO) string type, possibly using Psyco-like techniques. since I never really use strings much, I do not have enough interest in this myself, but I recognize the importance.

-integration of Python code and compiled code remains a hassle. currently, a lot of manual work is needed to provide 'bindings'. it would be great to somehow have a (semi-)automated process, to enable compiled code to at least use the standard library, and to be able to easily call compiled code from Python programs.
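as a rough illustration of the first topic, here is a hypothetical sketch of the distinction that matters when turning heap allocation into stack allocation. it assumes an escape-analysis style of reasoning, a well-known technique from the Java world, and is not a description of the optimizations that were removed from SS. the names `Vec`, `dot`, `length_squared` and `make_vec` are made up for the example.

```python
class Vec:
    """Hypothetical small value-like class."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


def dot(a, b):
    # Only reads its arguments; it does not store them anywhere.
    return a.x * b.x + a.y * b.y


def length_squared(x, y):
    # 'tmp' is never returned or stored in a longer-lived object, so it
    # does not escape this function: a compiler could, in principle,
    # allocate it on the stack instead of the heap.
    tmp = Vec(x, y)
    return dot(tmp, tmp)


def make_vec(x, y):
    # The new object escapes through the return value, so it has to
    # outlive this call and needs heap allocation.
    return Vec(x, y)
```

in the generated C++, the first case could become a plain local object, while the second still needs a heap allocation; telling the two apart automatically is the interesting part.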

if you are interested in any of these three topics, note that the deadline for the Google Summer of Code 2006 is in about a week. since SS got accepted last year, and there will probably be more slots for Python this year, this might be worth a try! let me know, and we can cook up a proposal together.

the first topic (memory optimization) is also a great topic for doing a Master's/PhD Thesis. unfortunately, Robert could not find a mentor for this. please let me know if you are interested, or you know of a compiler-savvy (Master/PhD) student that might be interested!