harold

Proper GPU-based effect - finally


So we were recently given OpenCL, and NVIDIA was nice enough to provide both a 32-bit and a 64-bit version.

Since GPU programming is cool I made a nice wrapper for it (currently unfinished but (and this is the best part) workable) in C#, and did a couple of tests with it.

I found out that (on at least my GPU) there is a smallish limit on the output size, which is a shame, but we could of course come up with some clever tiling scheme.
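To make the tiling idea a bit more concrete, here is a minimal sketch in plain C. It is not code from the wrapper: the `Tile` struct, `tiles_along` and `make_tiles` are made-up names, and the maximum tile size is just a parameter, since the actual output limit depends on the GPU and was not measured here. It only shows how a big image could be cut into rectangles that each fit under such a limit, with the last tile in each row or column clipped to the image edge.

```c
#include <assert.h>
#include <stddef.h>

/* One rectangular tile of the output image, in pixels (hypothetical type). */
typedef struct { int x, y, w, h; } Tile;

/* Number of tiles needed to cover `extent` pixels with tiles of at most `tile` pixels. */
static int tiles_along(int extent, int tile) {
    assert(extent > 0 && tile > 0);
    return (extent + tile - 1) / tile;  /* ceiling division */
}

/* Writes up to `cap` tiles covering a width x height image into `out`.
 * Returns the number of tiles needed, which may exceed `cap`. */
static int make_tiles(int width, int height, int tile_w, int tile_h,
                      Tile *out, int cap) {
    int nx = tiles_along(width, tile_w);
    int ny = tiles_along(height, tile_h);
    int n = 0;
    for (int ty = 0; ty < ny; ty++) {
        for (int tx = 0; tx < nx; tx++) {
            if (n < cap) {
                Tile t;
                t.x = tx * tile_w;
                t.y = ty * tile_h;
                /* Clip the last tile in each direction to the image edge. */
                t.w = (t.x + tile_w <= width)  ? tile_w : width  - t.x;
                t.h = (t.y + tile_h <= height) ? tile_h : height - t.y;
                out[n] = t;
            }
            n++;
        }
    }
    return n;
}
```

Each tile would then get its own kernel launch and its own copy back to RAM, with the tile's `x` and `y` passed to the kernel as an offset.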

A 1024x1024 Mandelbrot render with 250 max iterations takes approximately 23 milliseconds. That depends of course on what part you are looking at, your GPU (GTX260 in my case) etc., but 23 milliseconds is so incredibly fast that none of the details really matter here. Copying the result back to normal RAM took another 20 milliseconds, which is pretty much instantaneous.
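For anyone who wants to see what each pixel has to compute, here is a plain C reference sketch of the usual escape-time loop. This is not the kernel from the test project; `mandel_iters` and `render` are illustrative names, and the mapping from pixels to the complex plane (`x_min`, `y_min`, `step`) is an assumption. On the GPU, the body of the double loop is roughly what a single work-item would do.

```c
#include <assert.h>
#include <stddef.h>

/* Escape-time count for the point c = cx + i*cy:
 * iterates z = z*z + c from z = 0 until |z| > 2 or max_iter is reached. */
static int mandel_iters(double cx, double cy, int max_iter) {
    double zx = 0.0, zy = 0.0;
    int i = 0;
    while (i < max_iter && zx * zx + zy * zy <= 4.0) {
        double t = zx * zx - zy * zy + cx;  /* real part of z*z + c */
        zy = 2.0 * zx * zy + cy;            /* imaginary part of z*z + c */
        zx = t;
        i++;
    }
    return i;
}

/* CPU reference: fills out[y * w + x] with the iteration count per pixel.
 * Pixel (x, y) maps to c = (x_min + x*step, y_min + y*step). */
static void render(int w, int h, double x_min, double y_min, double step,
                   int max_iter, int *out) {
    assert(out != NULL && w > 0 && h > 0);
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++)
            out[y * w + x] = mandel_iters(x_min + x * step,
                                          y_min + y * step, max_iter);
}
```

Points inside the set run all 250 iterations, which is why the render time depends on what part you are looking at.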

This kind of speed won't be beaten by your average CPU.

Compiling the OpenCL code takes a while, but this only needs to be done once per session.

The wrapper is written in pure C# and works in both 32-bit and 64-bit modes (although it currently only supports NVIDIA GPUs and requires their OpenCL-capable drivers). It will be released into the public domain as soon as I feel comfortable with it (after ironing out some bugs, adding some more functionality etc). For now you can link to the binary in your project (but you really shouldn't, as it is not finished) and edit the source as long as you don't make it available (but honestly, what's going to happen if you do it anyway?).

The reason I'm posting here is that this wrapper may be interesting to plugin writers, at least until Paint.NET exposes something like DirectCompute/OpenCL itself (any plans, Rick?)

In the meantime you can all get a copy of the temporary wrapper + test project (Mandelbrot) here (mediafire) (too big to upload to the forum)

And yes, most of the error codes are translated to throw new Exception(), which sucks; it is not finished.

ps: sorry if this is the wrong forum

Small status update: it looks to me like there is a bug in clEnqueueNDRangeKernel when called with a dimension of 2 (haven't tested 3). It throws a division by zero exception, seemingly for no reason (I gave it reasonable arguments), so for now I'm cheating and using code like

int id = get_global_id(0);  // flat index of this work-item in a 1D range
int ix = id % pxwidth;      // column
int iy = id / pxwidth;      // row

Which works fine but probably costs some extra time (that's an extra % and / you normally wouldn't have, and since pxwidth is not a constant they cannot be free)

It looks like a hack and that's exactly what it is, but until I find out where that odd division by zero is coming from you/I won't be able to use multidimensional kernels.
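Here is the same index trick as a small plain C sketch so it can be checked on the CPU. The helper names (`unflatten`, `flatten`, `padded_global_size`) are made up for illustration and are not part of OpenCL or the wrapper. The last helper covers one extra assumption: if the 1D global size gets rounded up to a multiple of the work-group size, the kernel would also need a guard like `if (id >= pxwidth * pxheight) return;` so the padding work-items don't write outside the image.

```c
#include <assert.h>

/* 2D pixel coordinate recovered from a flat work-item id (hypothetical type). */
typedef struct { int ix, iy; } Pixel;

/* Split a flat id into column and row, as in the workaround above. */
static Pixel unflatten(int id, int pxwidth) {
    assert(id >= 0 && pxwidth > 0);
    Pixel p;
    p.ix = id % pxwidth;  /* column */
    p.iy = id / pxwidth;  /* row */
    return p;
}

/* Inverse mapping: row-major flat index of pixel (ix, iy). */
static int flatten(int ix, int iy, int pxwidth) {
    return iy * pxwidth + ix;
}

/* Round `total` up to the next multiple of `group`. */
static int padded_global_size(int total, int group) {
    assert(total >= 0 && group > 0);
    return ((total + group - 1) / group) * group;
}
```

If the width happens to be a power of two, the `%` and `/` could be swapped for a mask and a shift, but that only works when the host knows this and passes the shift amount instead of the width.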


Of all the download hosts on the internet, you had to use RapidS***?!

I'd love to give this a try, but I'm an ATI guy. Also, even if I could download from RapidShare, I wouldn't on a matter of principle. OK, that last part's a lie. I would download it. But I can't. 'Cause RapidS*** earned its nickname for a reason.


You can't really release something that only works on NVIDIA GPUs though.

Effects written for something like HLSL need to work on NVIDIA, ATI, Intel, etc. as well as have a software fallback. There's a good reason that I haven't baked this into Paint.NET yet. It's not trivial.


:(

It can fall back to CPU though

I'm not sure what ATI is doing with OpenCL atm, do their drivers support it already? If I could get my hands on their OpenCL.dll then I could try to make it work

edit: I don't have an ATI GPU so I couldn't test it, oh well. By the way, making an actual effect from this is not very easy. Is there an easy way to get the entire render region instead of those small 2-pixel-high rectangles?


Almost every solution for GPU rendering has an Achilles' heel right now ...

1. CUDA only works on NVIDIA chips.

2. OpenCL ... haven't really investigated that. If it's a component you have to install, then I won't include it in Paint.NET. I have a policy of only relying on Microsoft-supplied prerequisites. (it has nothing to do with Microsoft per se, except that they also supply and service the OS ... if Paint.NET were on Mac, then I'd only take dependencies on Apple-supplied and supported components. It just makes things a lot saner for me.)

3. DirectCompute is not supported on XP, so a software fallback is still required there. On Vista/Win7, you can use the reference DX11 driver for software fallback but I do not know how well it performs (probably slowly).

4. WPF only supports Pixel Shader 2.0, and I don't believe you can render with hardware to an off-screen bitmap. WPF 4.0 supports PS 3.0 though, which is good news.

My preference is to support DirectCompute but also have a software fallback for XP and for bad/slow hardware. It's just not something I have the time to solve right at this moment. So if you find a software compiler for compute shaders, let me know. I haven't been able to find a good one.
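The "GPU compute first, software fallback otherwise" idea can be sketched as a simple priority list. This is plain C with stubbed-out probes, not real DirectCompute or OpenCL calls: `Backend`, `pick_backend`, `probe_gpu_compute` and `probe_software` are all hypothetical names, and the GPU probe just pretends no capable hardware was found.

```c
#include <assert.h>
#include <stddef.h>

/* A probe returns nonzero if its backend can be used on this machine. */
typedef int (*ProbeFn)(void);

/* A named rendering backend with its availability check (hypothetical type). */
typedef struct {
    const char *name;
    ProbeFn is_available;
} Backend;

/* Stub: a real probe would ask the driver; here we pretend it failed. */
static int probe_gpu_compute(void) { return 0; }

/* The CPU path needs no special hardware, so it is always available. */
static int probe_software(void) { return 1; }

/* Returns the first available backend in priority order, or NULL if none. */
static const Backend *pick_backend(const Backend *candidates, size_t count) {
    assert(candidates != NULL);
    for (size_t i = 0; i < count; i++) {
        if (candidates[i].is_available())
            return &candidates[i];
    }
    return NULL;
}
```

With the list ordered as { GPU compute, software }, the effect code only ever talks to whatever `pick_backend` returns, so the same plugin would run on XP or on bad/slow hardware.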


OpenCL automatically comes with the graphics driver in NVIDIA's case (only very recently; older drivers offer no support)

ATI, Intel etc. seem to be running behind. Wikipedia says ATI has their OpenCL support in beta and that they would supposedly support R700/R800 GPUs and SSE3-capable CPUs

So that's pretty much a loss.

Can we still use it in effects though?


For now, not for effects published to the main Plugins forum. I'm limiting that area to plugins that work with Paint.NET's minimum requirements (XP SP2, SSE1) and prerequisites (.NET 3.5 SP1).

However, feel free to post any experimental stuff here in the Plugin Dev Central.

You might look at using WPF and figure out if it can do hardware rendering to offscreen surfaces. That way you'll get software fallback for free, it'll work on XP, and it won't require SSE3. My guess is that it won't cooperate, unfortunately. It's designed as a presentation framework (hence the name), not a rendering framework.
