Wednesday, April 22, 2009

Shed Skin 0.1.1

I have just released version 0.1.1 of Shed Skin, an experimental (restricted) Python-to-C++ compiler. It comes with 5 new example programs (for a total of 35 example programs, at over 10,000 lines) and several important improvements/bug fixes. See here for the full changelog.

The most interesting new example is minilight, an elegant raytracer (more precisely, a global illumination renderer) that utilizes triangle primitives and an octree spatial index. As shown on the minilight homepage, it becomes up to 100 times faster when compiled with Shed Skin.

Other new examples are Peter Norvig's sudoku solver (which unfortunately doesn't become faster, but is a cool example anyway), yet another raytracer and a mastermind strategy evaluator by Raymond Hettinger.

The fifth new example, an interactive circle packing program, is especially nice, because it is, well, interactive, but also because it shows how easy it is to generate an extension module with Shed Skin and use this in a larger program (a Pygame application in this case).
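As a rough sketch of what such an extension module could look like (this is not the circle packing code itself; the module name fastcore.py and the function count_primes are made up for illustration), the part that needs a speedup lives in its own small module written in the restricted subset:

```python
# fastcore.py (hypothetical module name, used only for this sketch)

def count_primes(limit):
    """Count the primes below `limit` with a simple sieve."""
    if limit < 2:
        return 0
    sieve = [True] * limit
    sieve[0] = False
    sieve[1] = False
    i = 2
    while i * i < limit:
        if sieve[i]:
            # cross out every multiple of i, starting at i*i
            for j in range(i * i, limit, i):
                sieve[j] = False
        i += 1
    count = 0
    for is_prime in sieve:
        if is_prime:
            count += 1
    return count

if __name__ == '__main__':
    # Assumption: the function is called once here so that the compiler
    # can see which argument types it is used with.
    print(count_primes(100))
```

Assuming the module is compiled with shedskin -e and built as described in the tutorial, the larger program (a Pygame application, for instance) would simply do an ordinary import of fastcore and call fastcore.count_primes, while remaining free to use unrestricted Python and arbitrary libraries itself.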

The biggest improvement (if you can call it that) in this release was dropping support for generic types. This simply means that a function can no longer be called with objects of different types for a single argument (unless those types share a common base class, of course). Similarly, generic data structures are no longer supported. Dropping support for generic types made the compiler core a lot simpler (and 10% smaller!), and removed the need for the -i command-line option.
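To make the distinction concrete, here is a small illustrative sketch (all class and function names are invented for this example, and it has not been checked against the compiler itself). The identity function is used generically, which is the pattern described above as no longer supported; total_area only ever sees objects that share the Shape base class, which is the exception mentioned above:

```python
import math

class Shape:
    def area(self):
        return 0.0

class Circle(Shape):
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return math.pi * self.radius * self.radius

class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side * self.side

def total_area(shapes):
    # every element is a Shape, so the argument has a single type
    total = 0.0
    for shape in shapes:
        total += shape.area()
    return total

def identity(x):
    return x

# Generic use: the same argument receives an int and then a str.
# This runs fine in regular Python, but it is the kind of call
# pattern the post describes as no longer supported.
a = identity(1)
b = identity('one')

# Common base class: Circle and Square are both Shapes.
shapes = [Circle(1.0), Square(2.0)]
result = total_area(shapes)
```

Under this reading, a program would need to drop the second identity call (or split it into two separately typed functions), while total_area can stay as it is.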

Seeing the compiler core shrink so much, I was inspired to further refactor several messy areas of the compiler for readability (most notably in infer.py). This means the compiler core has also become a lot more readable/hackable with this release. I hope to continue this work of refactoring (and adding docstrings) with each release from now on, and invite anyone to help out here.

In general, I would really like to receive more help. For example, if someone could take over maintainership of the Windows version (and possibly upgrade the MinGW distribution that is packaged with it), that would be great. A suggestion on the coding side might be to add keyword support to generated extension modules (see extmod.py), or to optimize some builtins (for example, string/list slicing, see lib/builtin.?pp). But I would also be happy to receive more interesting test cases (most come from a single person at the moment) and/or more bug reports (which I get far too few of).

30 comments:

Poromenos said...

I don't actively use Shed Skin but I follow its development, because it sounds like it will come in handy some time. Even if it doesn't, I'm sure there will be many useful advances in speeding up Python stemming from your efforts.

So, thank you and keep up the good work!

asmodai said...

Sorry mate,

too busy with a ton of other programming, both at work and at home, to even spare some minutes.

fra said...

What kind of help do you need for maintaining the Windows version?

srepmub said...

it would be nice if someone else could do the testing and packaging of the Windows version before each new release.

another thing that would be very useful would be to upgrade the MinGW distribution that is packaged with the Windows version.

fra said...

I will try to set aside some time to test the MinGW distribution next Saturday.

I don't know if this helps, but I ran the test suite (python unit.py), with the following result:
*** tests failed: 5
[(168, 'frozenset'), (170, 'fixes for 0.0.18'), (173, 'fixes for 0.0.20'), (180, 'fixes for 0.0.27; re, time'), (185, 'fixes for 0.1.1')]

Python version is:
Python 2.6.2c1 (r262c1:71369, Apr 7 2009, 18:44:00) [MSC v.1500 32 bit (Intel)] on win32

srepmub said...

thanks! this looks correct - one test doesn't work because the 'signal' module currently fails under Windows, and the other tests fail because Shedskin still acts like Python 2.5, rather than 2.6, which causes some subtle differences.

q said...

in a large remote sensing data content-based image retrieval system, on osx, from 42.5 seconds in pure python to 17.2 in shedskin C++ for a core naive bayes classifier tight pixel loop

q said...

...and from 17.2 to 11.2 by avoiding access to matrices placed deep in dictionaries in inner loops - enough for the moment :-)
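A minimal sketch of the kind of change described here, in plain Python (the dictionary layout, keys and function names are invented for illustration and are not taken from the actual classifier): the nested lookup is done once, outside the loops, instead of on every iteration.

```python
def score_slow(pixels, model):
    total = 0.0
    for pixel in pixels:
        for band in range(len(pixel)):
            # three dictionary lookups on every inner iteration
            total += model['classes']['water']['weights'][band] * pixel[band]
    return total

def score_fast(pixels, model):
    # look the weights up once, then use a local name in the loops
    weights = model['classes']['water']['weights']
    total = 0.0
    for pixel in pixels:
        for band in range(len(pixel)):
            total += weights[band] * pixel[band]
    return total

model = {'classes': {'water': {'weights': [0.5, 2.0]}}}
pixels = [[1.0, 2.0], [3.0, 4.0]]
```

Both functions compute the same sum; the second just avoids repeating the same lookups in the hot loop.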

srepmub said...

perhaps you can win some more time by using -b and/or -w.. (and possibly -r, for random numbers)?

which GCC version and flags did you use? did you use profile-guided optimisation?

where is the current bottleneck in the generated C++ code? gprof2dot.py might be useful for locating this.

(see the tutorial for these and other performance tips).

q said...

wow: core loop gets down to about 6 seconds (from the original 42!) with -w -b. no random numbers, so no -r needed. with g++ 4.0.1, no big performance gain from -fprofile-generate/use. looking into gprof2dot right now!

int19h said...

The latest version of gcc seems to have caused an issue with Shedskin:

/usr/lib/shedskin-0.1/lib/builtin.cpp:684: error: ‘uint32_t’ does not name a type
/usr/lib/shedskin-0.1/lib/builtin.cpp: In member function ‘virtual int __shedskin__::str::__hash__()’:
/usr/lib/shedskin-0.1/lib/builtin.cpp:734: error: ‘SuperFastHash’ was not declared in this scope

srepmub said...

thanks for mentioning! I will fix this in SVN shortly.

srepmub said...

sorry for the delay, I've been a bit sick. does SVN work for you now..?

int19h said...

Thanks, it works great with GCC 4.4 now. :)

int19h said...

Shedskin (SVN) does compile nicely with GCC 4.4 now, but when it's used on python-scripts, the resulting C++ code produces errors.

srepmub said...

python-scripts..?

int19h said...

This comment has been removed by the author.

int19h said...

I was a bit diffuse.

What I meant to say, was that I have a Python script that I want to compile with Shedskin. However, with Shedskin (svn) and GCC 4.4 (default for Arch Linux), thousands of lines of errors appear when GCC tries to compile the output from Shedskin.

Here they are: http://pastebin.com/f57204619

srepmub said...

the first error indicates you may not have your C++ header files installed correctly, or GCC cannot find them ('new' is a standard C++ header file):

/usr/include/gc/gc_allocator.h:43:36: error: new: No such file or directory

mine is somewhere here:

/usr/include/c++/../new

int19h said...

Thanks for the reply. It's a standard installation of gcc 4.4, no modifications on my part, so I'm a bit puzzled by this. With gcc 3.4 Shedskin and Shedskin SVN worked beautifully.

I have programmed some C++, but not enough to know the nuances of the C++ standard, so I'm stumped.

I'll reinstall gcc and try again.

srepmub said...

you'll probably need to install libstdc++XXX-dev. on ubuntu at least, this is automatically installed with g++.

int19h said...

This comment has been removed by the author.

int19h said...

The issue was with the particular machine I was working on: some files could not be found where they should have been.

Everything works beautifully now. Thanks for the suggestions.

Shedskin is available both as "shedskin" (v.0.1) and "shedskin-svn" in Arch Linux now.

http://aur.archlinux.org/packages.php?ID=24600

http://aur.archlinux.org/packages.php?ID=24618

int19h said...

When's the next stable release coming? I see that svn is 0.1.2. :)

mine809 said...

Hey, great program but could you try to add GTK support???

srepmub said...

I'm hoping to improve a few more things before doing a new release.. please consider sending in new test cases or bug reports ;)

srepmub said...

GTK support won't ever happen, because it's probably too hard, and there's not much use in compiling GUI glue code anyway - it's typically fast enough already.

but note that you can already combine GUI libraries and Shedskin-compiled code, by generating an extension module (shedskin -e, see the tutorial). this way, you can use unrestricted Python and arbitrary libraries in the 'main' program, while speeding up some part that really needs a speedup.

mine809 said...

ok, did not know that but thanks and keep up the good work!!!

Kjell Kristian said...

Great work.

This program looks great. As soon as I'm finished with work, I will try it on my Python implementation of the Dijkstra algorithm. I have also written the same algorithm in C++, so I'm looking forward to testing whether it works and, if so, seeing how fast the Python-to-C++ translated code is compared to my C++ implementation.

If it works well, you can add the Dijkstra solving algorithm to your examples.

srepmub said...

thanks. note there is already a dijkstra algorithm in the example programs, but please do let me know what happens for your version.