I have just released version 0.1.1 of Shed Skin, an experimental (restricted) Python-to-C++ compiler. It comes with 5 new example programs (for a total of 35 example programs, at over 10,000 lines) and several important improvements/bug fixes. See here for the full changelog.
The most interesting new example is minilight, an elegant raytracer (more precisely, a global illumination renderer) that utilizes triangle primitives and an octree spatial index. As shown on the minilight homepage, it becomes up to 100 times faster.
Other new examples are Peter Norvig's sudoku solver (which unfortunately doesn't become faster, but is a cool example anyway), yet another raytracer and a mastermind strategy evaluator by Raymond Hettinger.
The fifth new example, an interactive circle packing program, is especially nice, because it is, well, interactive, but also because it shows how easy it is to generate an extension module with Shed Skin and use this in a larger program (a Pygame application in this case).
The biggest improvement (if you can call it that) for this release was dropping support for generic types. This simply means that a function can no longer be called with objects of different types for a single argument (unless these types have a common base class, of course). Similarly, generic data structures are no longer supported. Dropping support for generic types made the compiler core a lot simpler (and 10% smaller!), and removed the need for the -i command-line option.
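As a rough illustration of this restriction, here is a minimal sketch. All class and function names are made up for this post and are not taken from the Shed Skin examples or test suite. The whole file runs under plain Python; the comments mark which calls fit the rule described above and which do not.

```python
# Hypothetical example: names are invented for illustration only.

class Shape:
    def area(self):
        return 0.0

class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side * self.side

class Circle(Shape):
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return 3.14159 * self.radius * self.radius

def area_of(shape):
    # Called with a Square and a Circle below. Both share the base
    # class Shape, so this is the case the post says is still allowed.
    return shape.area()

def double(x):
    # Called with an int and a str below. These have no common base
    # class, so this is the kind of generic function that, as described
    # above, is no longer supported by the compiler.
    return x + x

if __name__ == '__main__':
    print(area_of(Square(2.0)))
    print(area_of(Circle(1.0)))
    print(double(2))
    print(double('ab'))
```

One way to stay within the restriction, assuming the two uses of `double` are really needed, would be to write two separate functions, one per argument type.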
Seeing the compiler core shrink so much, I was inspired to further refactor several messy areas of the compiler for readability (most notably in infer.py). This means the compiler core has also become a lot more readable/hackable with this release. I hope to continue this work of refactoring (and adding docstrings) with each release from now on, and invite anyone to help out here.
In general, I would really like to receive more help. For example, if someone could take over maintainership of the Windows version (and possibly upgrade the MinGW distribution that is packaged with it), that would be great. A suggestion on the coding side might be to add keyword support to generated extension modules (see extmod.py), or to optimize some builtins (for example, string/list slicing, see lib/builtin.?pp). But I would also be happy to receive more interesting test cases (most come from a single person at the moment) and/or more bug reports (of which I get far too few).
30 comments:
I don't actively use Shed Skin but I follow its development, because it sounds like it will come in handy some time. Even if it doesn't, I'm sure there will be many useful advances in speeding up Python stemming from your efforts.
So, thank you and keep up the good work!
Sorry mate,
too busy with a ton of other programming, both at work and at home, to even spare some minutes.
What kind of help do you need for maintaining the Windows version?
it would be nice if someone else could do the testing and packaging of the Windows version before each new release.
another thing that would be very useful would be to upgrade the MinGW distribution that is packaged with the Windows version.
I will try to allocate time for some tests of the MinGW distribution next Saturday.
I don't know if it can help you, I ran the testsuite (python unit.py)
with the following result:
*** tests failed: 5
[(168, 'frozenset'), (170, 'fixes for 0.0.18'), (173, 'fixes for 0.0.20'), (180, 'fixes for 0.0.27; re, time'), (185, 'fixes for 0.1.1')]
Python version is:
Python 2.6.2c1 (r262c1:71369, Apr 7 2009, 18:44:00) [MSC v.1500 32 bit (Intel)] on win32
thanks! this looks correct - one test doesn't work because the 'signal' module currently fails under Windows, and the other tests fail because Shedskin still acts like Python 2.5, rather than 2.6, which causes some subtle differences.
in a large remote sensing data content-based image retrieval system, on osx, from 42.5 seconds in pure python to 17.2 in shedskin C++ for a core naive bayes classifier tight pixel loop
...and from 17.2 to 11.2 by avoiding access to matrices placed deep in dictionaries in inner loops - enough for the moment :-)
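For readers wondering what this kind of change looks like, here is a small sketch. The `model` dictionary layout and both function names are hypothetical and are not the commenter's actual code. The nested lookup is done once, outside the loops, instead of once per pixel.

```python
# Hypothetical example: the data layout and names are invented.

def score_slow(model, pixels):
    total = 0.0
    for row in pixels:
        for value in row:
            # Three dictionary lookups are repeated for every pixel.
            total += model['classes']['water']['weights'][value]
    return total

def score_fast(model, pixels):
    # Fetch the deeply nested list once, before entering the loops.
    weights = model['classes']['water']['weights']
    total = 0.0
    for row in pixels:
        for value in row:
            total += weights[value]
    return total

if __name__ == '__main__':
    model = {'classes': {'water': {'weights': [0.5, 1.0, 1.5]}}}
    pixels = [[0, 1], [2, 2]]
    print(score_slow(model, pixels))
    print(score_fast(model, pixels))
```

Both functions compute the same sum; only the number of lookups in the inner loop differs.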
perhaps you can win some more time by using -b and/or -w.. (and possibly -r, for random numbers)?
which GCC version and flags did you use? did you use profile-guided optimisation?
where is the current bottleneck in the generated C++ code? gprof2dot.py might be useful for locating this.
(see the tutorial for these and other performance tips).
wow: core loop gets down to about 6 seconds (from the original 42!) with -w -b. no random numbers, so no -r needed. with g++ 4.0.1, no big performance gain from -fprofile-generate/use. looking into gprof2dot right now!
The latest version of gcc seems to have caused an issue with Shedskin:
/usr/lib/shedskin-0.1/lib/builtin.cpp:684: error: ‘uint32_t’ does not name a type
/usr/lib/shedskin-0.1/lib/builtin.cpp: In member function ‘virtual int __shedskin__::str::__hash__()’:
/usr/lib/shedskin-0.1/lib/builtin.cpp:734: error: ‘SuperFastHash’ was not declared in this scope
thanks for mentioning! I will fix this in SVN shortly.
sorry for the delay, I've been a bit sick. does SVN work for you now..?
Thanks, it works great with GCC 4.4 now. :)
Shedskin (SVN) does compile nicely with GCC 4.4 now, but when it's used on python-scripts, the resulting C++ code produces errors.
python-scripts..?
I was a bit vague.
What I meant to say, was that I have a Python script that I want to compile with Shedskin. However, with Shedskin (svn) and GCC 4.4 (default for Arch Linux), thousands of lines of errors appear when GCC tries to compile the output from Shedskin.
Here they are: http://pastebin.com/f57204619
the first error indicates you may not have your C++ header files installed correctly, or GCC cannot find them ('new' is a standard C++ header file):
/usr/include/gc/gc_allocator.h:43:36: error: new: No such file or directory
mine is somewhere here:
/usr/include/c++/../new
Thanks for the reply. It's a standard installation of gcc 4.4, no modifications on my part, so I'm a bit puzzled by this. With gcc 3.4 Shedskin and Shedskin SVN worked beautifully.
I have programmed some C++, but not enough to know the nuances of the C++ standard, so I'm stumped.
I'll reinstall gcc and try again.
you'll probably need to install libstdc++XXX-dev. on ubuntu at least, this is automatically installed with g++.
The issue was with the particular machine I worked on; some files could not be found where they should have been.
Everything works beautifully now. Thanks for the suggestions.
Shedskin is available both as "shedskin" (v.0.1) and "shedskin-svn" in Arch Linux now.
http://aur.archlinux.org/packages.php?ID=24600
http://aur.archlinux.org/packages.php?ID=24618
When's the next stable release coming? I see that svn is 0.1.2. :)
Hey, great program but could you try to add GTK support???
I'm hoping to improve a few more things before doing a new release.. please consider sending in new test cases or bug reports ;)
GTK support won't ever happen, because it's probably too hard, and there's not much use in compiling GUI glue code anyway - it's typically fast enough already.
but note that you can already combine GUI libraries and Shedskin-compiled code, by generating an extension module (shedskin -e, see the tutorial). this way, you can use unrestricted Python and arbitrary libraries in the 'main' program, while speeding up some part that really needs a speedup.
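as a sketch of that split, the part that needs a speedup could live in its own module, written in the restricted subset. the file name fastpart.py and the function count_primes are hypothetical, chosen only for this illustration. as I understand the tutorial, the module has to call its own functions somewhere (here under the __main__ guard), so that the argument types can be inferred.

```python
# fastpart.py (hypothetical module name, for illustration only)

def count_primes(limit):
    # Count the primes below 'limit' with a simple sieve.
    if limit < 2:
        return 0
    sieve = [True] * limit
    sieve[0] = False
    sieve[1] = False
    for i in range(2, limit):
        if sieve[i]:
            for j in range(i * i, limit, i):
                sieve[j] = False
    count = 0
    for flag in sieve:
        if flag:
            count += 1
    return count

if __name__ == '__main__':
    # This call is not executed on import; it is here so that the
    # type of 'limit' (an int) is known when the module is compiled.
    print(count_primes(100))
```

after running shedskin -e on this module and building the generated code as described in the tutorial, the unrestricted main program (GUI code included) would just import fastpart and call fastpart.count_primes like any other function.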
ok, did not know that but thanks and keep up the good work!!!
Great work.
This program looks great. As soon as I've finished work, I will try it on my Python implementation of the Dijkstra algorithm. I have also written the same algorithm in C++, so I'm looking forward to testing whether it works and, if so, seeing how fast the Python-to-C++ translated code is compared to my C++ implementation.
If it works well, you can add the Dijkstra solving algorithm to your examples.
thanks. note there is already a dijkstra algorithm in the example programs, but please do let me know what happens for your version.