Sunday, August 08, 2010

Shed Skin 0.5

I have just released Shed Skin 0.5, an optimizing (restricted) Python-to-C++ compiler. Looking back at the release notes, I'm happy to see contributions from more people than ever. I'd also like to thank David Ripton again, for sending me his Project Euler solutions (see my previous post), and Jason Ye, for sending in so many issues.

The new release supports 64-bit integers (shedskin -l), which was necessary to get many of the Project Euler solutions to work. I'm wondering whether this should become the default, since by default we want to stay as close to Python as possible. But I don't think it's really useful that often, and it will probably be slower in many cases, so we'll see.
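To illustrate why 64-bit integers matter for this kind of program, here is a minimal sketch. It assumes the default integer type is 32 bits wide, which is typical for a C++ int but is an assumption here. The helper wrap_int32 is hypothetical: it only simulates two's-complement wrap-around in Python and is not part of Shed Skin.

```python
def factorial(n):
    """Return n! using plain integer arithmetic."""
    result = 1
    for k in range(2, n + 1):
        result *= k
    return result


def wrap_int32(x):
    """Hypothetical helper: reduce x to a signed 32-bit value."""
    x &= 0xFFFFFFFF
    if x >= 0x80000000:
        x -= 0x100000000
    return x


INT32_MAX = 2 ** 31 - 1
INT64_MAX = 2 ** 63 - 1

if __name__ == "__main__":
    value = factorial(20)
    print(value > INT32_MAX)           # is it too large for 32 bits?
    print(value <= INT64_MAX)          # does it still fit in 64 bits?
    print(wrap_int32(value) == value)  # would a 32-bit result be correct?
```

A value like 20! is the kind of intermediate result that shows up in Project Euler style code: too big for 32 bits, but comfortably within 64 bits, which is the case shedskin -l is meant to cover.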

In comparing the Project Euler solutions with Psyco, it became clear that Shed Skin did not perform very well for certain types of iteration. I found out that exceptions in C++ are extremely slow, greatly slowing down e.g. set iteration. The new release avoids throwing StopIteration most of the time, that is, inside the builtins.
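The idea of avoiding StopIteration can be sketched in Python itself. This is only an illustration of the general technique (checking for exhaustion explicitly instead of signalling it with an exception), not the actual code inside Shed Skin's builtins. The class ListIterator and its methods has_next and next_unchecked are hypothetical names made up for this example.

```python
class ListIterator(object):
    """Hypothetical iterator offering both styles of termination."""

    def __init__(self, items):
        self._items = list(items)
        self._pos = 0

    def __iter__(self):
        return self

    def __next__(self):
        # Standard protocol: exhaustion is signalled by an exception.
        if self._pos >= len(self._items):
            raise StopIteration
        return self.next_unchecked()

    next = __next__  # Python 2 spelling of the same method

    def has_next(self):
        # Explicit check, so the caller never needs an exception.
        return self._pos < len(self._items)

    def next_unchecked(self):
        # Caller must have verified has_next() first.
        item = self._items[self._pos]
        self._pos += 1
        return item


def sum_with_exception(it):
    """Loop that ends when StopIteration is raised and caught."""
    total = 0
    while True:
        try:
            total += next(it)
        except StopIteration:
            break
    return total


def sum_with_check(it):
    """Loop that ends on an explicit check; no exception is thrown."""
    total = 0
    while it.has_next():
        total += it.next_unchecked()
    return total
```

Both loops compute the same result. The second one never throws, which is what matters once the loop is translated to C++, where a thrown exception is expensive.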

After basing the set implementation on CPython before, FFAO saw my invitation to do the same for the dict implementation, and quickly implemented it. I think Shed Skin was able to beat Psyco on about 5 more Project Euler solutions with this new implementation.

Random number generation (without using shedskin -r) should also be much faster now.

Andy Miller sent in patches to add basic support for MSVC (shedskin -v), although of course there is still no recent Windows package for Shed Skin. If anyone would like to volunteer, it should be easy to base a 0.5 package on the 0.3 version for Windows, though it would be nice to use a more recent version of MinGW.

Douglas McNeil sent in patches to add support for __future__.print_function and generator methods (as opposed to non-method functions). Thomas Spura optimized printing somewhat, though there is still some room for optimization there. Finally, Michael Elkins sent in some improvements for the socket module (which he originally wrote).
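As a small illustration of the two features from Douglas McNeil's patches, the sketch below uses the print function from __future__ together with a generator method, that is, a method containing yield rather than a plain function. The class Countdown and its method steps are hypothetical names chosen for this example, and whether every detail compiles unchanged with Shed Skin 0.5 is an assumption.

```python
from __future__ import print_function


class Countdown(object):
    """Hypothetical class with a generator method."""

    def __init__(self, start):
        self.start = start

    def steps(self):
        # A generator *method*: it uses yield and has access to self.
        n = self.start
        while n > 0:
            yield n
            n -= 1


if __name__ == "__main__":
    for value in Countdown(3).steps():
        print(value)  # print used as a function, not a statement
```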

More details about the new release can be found in the release notes.

5 comments:

Arkanosis said...

Great, good job !

BTW, it's nice to see the growing number of contributions :-)

Michael Buckley said...

Is there any possibility of unicode support in Shed Skin? Or any idea of the amount of work required to implement it? I am working on a largish system (about 8,000 loc) which uses mainly unicode strings. It would be cool to see how much Shed Skin could speed it up, but no unicode is a bit of a show stopper for me.

srepmub said...

thanks for asking.

I don't think unicode support would be very useful to have, because string operations in Python are already roughly as fast as in C++, and probably a bit faster even in many cases, and programs requiring unicode support are typically of the string-processing kind.

if there's a part of a program that could be compiled for speed, I think it can typically be separated from the 'unicode part' and compiled as a separate extension module (shedskin -e). 8000 lines is also way too much to handle for shedskin at the moment.

fahhem said...

I've been using this compiler for a class on computational bio and I seriously can't thank you enough. I can write the code in Python and debug with Python's speed of debugging, and then compile to C++ and get C++ performance with only minimal tweaks and a few lines of type inferencing.

One thing I noticed that's sort of a bug is that type inference hints, such as "i=0" right before "for i in foo():", are actually converted to i=0 in C++, even though I don't really want to initialize i to 0. This can cause performance degradation in loops with a lot of type hints, especially when those hints are for lists or other classes with non-trivial __init__ methods.

Otherwise this is great and I look forward to shedding the skin of Python in the future.

srepmub said...

it's nice to hear shedskin is useful for you. please let me know if any of your assignments might be interesting to add to shedskin's example test set.

I'm not sure why you have to add type hints for "i" here, because shedskin should be able to figure out the type of such variables by itself..?