Monday, March 23, 2009

Minilight Compiled

(Shed Skin is an experimental (restricted-)Python-to-C++ compiler.)

Minilight is an elegant minimal global illumination renderer, or raytracer, that uses triangle primitives and an octree spatial index. The original version consists of about 1,000 lines of C++, but there are several translations available at the homepage, such as ~500-line OCaml and Python versions. The Python version apparently runs about 173 times slower than the C++ version, while the OCaml version is only about 3 times slower. Wouldn't it be interesting if we could make the Python version run faster than the OCaml version?

After some small changes (splitting up some dynamic variables, replacing calls to 'map', 'type'..), and applying some minor fixes to Shedskin, I was able to compile the Python version to C++ using Shedskin SVN. For the basic Cornell box example, it runs about 35 times faster than the modified Python version (which should again be a bit faster than the original Python version). It's probably still somewhat slower than the OCaml version, but not that much. To try to compile it yourself, you'll probably want to have more than 512 MB of RAM, and use the 'shedskin -i' option. The modified version can be found in /examples/minilight in SVN.
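To give an idea of what such changes look like, here is a minimal sketch. It is not taken from the minilight source; the function names and data are made up for illustration. The "before" function reuses one name for an int and then a string and calls 'map', while the "after" function gives each name a single type and uses a list comprehension.

```python
# Before: 'result' holds an int and later a string (a "dynamic" variable),
# and map() is used to build the list.
def describe_before(values):
    result = sum(values)
    result = "total: %d" % result
    doubled = list(map(lambda v: v * 2, values))
    return result, doubled


# After: every name keeps one type, and a list comprehension replaces map().
def describe_after(values):
    total = sum(values)
    label = "total: %d" % total
    doubled = [v * 2 for v in values]
    return label, doubled
```

Both functions return the same result; only the second has a single static type per variable, which is what a type-inferring compiler needs.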

I'm guessing that the Python version can be made a bit faster in itself. What typically happens then is that the C++ version will get relatively even faster, because certain bottlenecks that don't really slow down Python will slow down C++ (such as allocating lots of objects). If you'd like to try and help beat OCaml, I'd be very interested in hearing about potential improvements (where readability doesn't suffer too much, obviously.)

Another program that I managed to compile, while much smaller, is one that solves a well-known chess puzzle. To compile knight2.py, I had to replace the 'key' argument to 'sorted' with a 'cmp' version (as 'key' is not yet supported), and fix a small bug in Shedskin ("'%*d' % (2, 7)" and such did not work yet.)
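As a rough illustration of this kind of rewrite (not the actual code from knight2.py; the data and the names `moves` and `by_degree` are hypothetical), a 'key' sort can be expressed with a comparison function instead. In Python 2 the function would be passed as `sorted(moves, cmp=by_degree)`; the sketch below wraps it with `functools.cmp_to_key` so that it also runs on Python 3.

```python
from functools import cmp_to_key

# Hypothetical data: (number of onward moves, square) pairs.
moves = [(3, "c3"), (2, "a1"), (4, "e5"), (2, "b6")]

# 'key' version: sort on the first element of each pair.
by_key = sorted(moves, key=lambda m: m[0])


# Comparison-function version of the same ordering.
def by_degree(a, b):
    # Returns a negative, zero, or positive number, like Python 2's cmp().
    return (a[0] > b[0]) - (a[0] < b[0])


by_cmp = sorted(moves, key=cmp_to_key(by_degree))

# The formatting expression mentioned above: '*' reads the field width
# from the tuple, so 7 is right-aligned in a field of width 2.
padded = "%*d" % (2, 7)
```

Because Python's sort is stable, both versions keep pairs with equal counts in their original order.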

I'm always interested in hearing about other interesting test cases for Shedskin, as they seem hard to come by. Please also consider sending in bug reports. Both types of feedback are essential for me to keep working on Shedskin!

Update: After some tweaking, it is now more than 60 times faster than the original here. See the comments for more information.

Update 2: I removed the -i option with Shedskin 0.1.1, so it is not necessary anymore.

13 comments:

xyproto said...

Hi, I just wanted to say that I packaged shedskin and shedskin-svn for ArchLinux (in the AUR) and that your work is appreciated.

Personally, I think good support for pygame and pyopengl for shedskin would be great, as it would allow for many games to be written in Python+Shedskin instead of C++.

Thanks

srepmub said...

nice, thanks! maybe I can add the install command under archlinux to the tutorial?

type inference will probably never work for large non-standard libraries. but shedskin already allows you to generate extension modules, which should help integrate things in many cases.

this should work also for other GUI-type libraries, such as Qt: compile one or more speed-critical parts as extension modules, and import them in some main program, that can make full use of Python dynamism and arbitrary libraries.

René Dudfield said...

hi,

very cool :)

I have an implementation of map in python if that helps? Then that could be compiled with shedskin hopefully - for shedskin speed map!

With psyco, it gives me a faster map than CPython's built-in map.

Details, and downloads here:
http://renesd.blogspot.com/2006/12/python-map-vs-c-map.html

It's probably not complete... but works well enough for me.

cu.

srepmub said...

thanks! the biggest hurdle in supporting map (and filter, apply, reduce, coerce and whatnot..) is probably psychological though: I really don't like them, and would prefer them to be completely removed from the Python language. another reason they don't have priority with me is that they can usually be rewritten more nicely using list comprehensions..

kanary said...

I like what you're doing with shedskin. My disappointment right now is that the urllib module hasn't been implemented yet (at least not in the version I have). I had to use py2exe to get an executable for a short little python script. That's not nearly as nice as using shedskin, which I've successfully used before.

I'd be interested in helping out on shedskin.

srepmub said...

how about adding support for urllib then? :) for information on how to do this, please see the section 'calling c/c++ code' in the shedskin tutorial. you could always start by adding support for the functionality that you need..

srepmub said...

for those interested in ray tracers, I added another one (that's 4 now) to the example set (examples/mao.py in SVN). it becomes about 70 times faster here:

http://lucille.atso-net.jp/aobench/

srepmub said...

to get the 60-fold speedup for minilight (see examples/minilight in SVN), use the following options:

shedskin -irbw minilight.py

a new option, r, causes shedskin to use C rand() instead of the Python-compatible random engine. this alone makes the compiled minilight 30% faster.

srepmub said...

oh, and I used these GCC flags:

-O3 -s -fprofile-use -msse2 -fomit-frame-pointer

profile-guided optimization: first compile with -fprofile-generate and run the program, then recompile with -fprofile-use. this can make a real difference.

xyproto said...

The SVN version of Shedskin fixes some problems I had with Shedskin v.0.1. Would it be possible to release the latest working version of Shedskin as "shedskin-latest", "shedskin-0.2", "shedskin-0.11" or something similar? That would make it easier to package the latest working version of Shedskin with the package name "shedskin". Thank you.

When it comes to makeflags, it would be really nifty to be able to supply "--small" or "--fast" that in turn used -Os or -O2, -march=native and other compiler options, depending on what would be quickest or smallest. Also, "--compile" to run make after shedskin was done would be helpful. Just an idea.

I've tried using Shedskin for a few small scripts, and it works great.

srepmub said...

which problems did SVN fix for you? please do let me know, next time you have problems. almost everything I add or fix is based on feedback (also, mostly from a single person/bear!)

does SVN still work for you now? I made a 'big' change to type inference yesterday (ie, revert a typo that caused several example programs to fail..)

about packaging, would it be possible to wait two weeks..? I'm planning on releasing 0.1.1 RSN.

I'm not convinced a --compile option would add much compared to 'make' (think occam's razor). also note that you can achieve your --small and --fast already with the -f option (to point shedskin to an alternative FLAGS file), which seems more flexible.

xyproto said...

I had forgotten to package the FLAGS file with Shedskin 0.1, which was why the SVN version worked better for some scripts I tried it on... It's fixed now.

SVN works for me right now, and if needed, it's possible to package a specific revision, so a new release is not a problem and there is no hurry.

I agree with your philosophy of keeping things simple, in terms of extra flags and parameters.

Thanks for being receptive to feedback!