Segment Image 1.1

Neil Cassidy · November 12, 2009

Edit, 22-01-2010: Segment Image 1.1, a small update. It's now much easier to cancel the effect before it finishes running! See bottom of post for download.

Hi everyone,

Hot on the heels of #2, here's my third plugin, "Segment Image". I wasn't actually going to release this - it was originally a Code Lab experiment that I threw together a few days ago. But it's proven surprisingly useful, so I thought I'd port it to Visual Studio, clean it up, and post it here.

This effect performs what some might call "automatic image segmentation by clustering". It analyzes the pixels of the image and arranges them into several groups, trying to ensure that each group's pixels are relatively close together in the image and have roughly the same color. So the effect helps you to break up an image into its constituent regions. It also helps you to find a small color palette that matches an image.

The UI allows you to specify the number of groups that you'd like the effect to look for, the relative influence of the color variables and the location variables, and the type of output that you'd like to see. If you leave the checkbox unchecked, the effect will fill each pixel in a group with a representative color for that group. If you check it, the representative colors of all the groups will be displayed as a set of horizontal bars over the selection.

Image segmentation

Start out with a guess as to the number of clusters in the image. If adjacent parts of your image that should be distinct seem to be stuck together in one cluster, move the bias slider to the left by a small amount. If adjacent parts of your image that should be one continuous cluster are split into several clusters, move the bias slider right by a small amount. This is the preferred way to fix any issues with the clustering, but if there really do seem to be too few or too many different clusters, adjust "Number of clusters" up or down by 1 as appropriate. Repeat this until the result looks good. Then check the "Higher quality" box.

Around 10 clusters, bias somewhere near 0.5, "Higher quality". Some clusters were merged by selecting them with Magic Wand (global fill mode, tolerance 0%) and then running Average Color (HSL). Original Image

Palette extraction

Check the "Display palette instead of segmentation" box. Move the slider all the way to the left. If the palette doesn't seem to contain enough colors, increase "number of colors" by 1. If the colors in the palette are a little too distinct, you can also move the slider a little ways towards the right. Repeat this until the result looks good. I saw while writing this post that jxp has released a plugin that saves the colors in the image as a palette file, so you might want to use that as well!

32 clusters, bias 0.01, "Display palette...", "Higher quality". Original Image

General

Be careful with the sliders when you run this on large images. If you push the bias too far to the left or right, or increase the number of clusters too much, it could result in the effect running for a very long time. This was the only major issue that I encountered in my testing, so I incorporated two safeguards that should prevent the clustering algorithm from running on forever. They can be relaxed somewhat by checking the "high quality" box. If you tell it to find 32 clusters in a large image, you'll be waiting for a very long time. I tested a 39 megapixel image with two clusters, and somehow, it did actually manage to return a result. It took 2.8 GB of memory to do it, and it sucked up an entire CPU core for about five minutes. You've been warned! Please remember to save your images. Use this effect at your own risk.

As usual, thanks to Rick Brewster, Tom Jackson, and Boltbait. Without Paint.NET and Code Lab, none of my plugins would exist. To install, simply unzip to your Effects folder. I'd be glad to clean the source up for release, if anyone would like to see it. Feel free to use Reflector to reverse/adapt it until then. License (MIT) is included in the archive. Enjoy!

You can find Segment Image in the Effects > Color menu.

Version 1.0 was downloaded 414 times.

SegmentImage.zip

yy10 · November 12, 2009

Nice one. I'll try it out sometime.

Patarien · November 12, 2009

Hello everybody ...

Thank you for this interesting plugin.

It's just a little bit too slow (time increasing a lot with size of image, sometimes freezing PDN with huge ones), but the result is fine.

May I suggest you to add to your message the way to find it ...... :?:

Effects / Color / Segment image

Just because I suppose people like me, installing quite all the existing plugins, may need time to find it ... :oops:

Have a good day ...

LJXD · November 13, 2009

This will be good for pallettes and other things, Thank you very much. Your a welcome addition to the Paint.NET plugin creators IMO.

pyrochild · November 13, 2009

Your a welcome addition to the Paint.NET plugin creators IMO.

QFT, except for "your" vs. "you're."

Neil Cassidy · November 14, 2009

Thanks for your kind comments everyone, I really appreciate them!

I'll put some effort into making this effect faster. I've identified a few potential improvements:

[*:2xo79k34]Adding additional dimensions to help the effect to distinguish different types of texture should make the clusters easier to separate, possibly improving performance as well as accuracy.
[*:2xo79k34]I'm currently using Lloyd's k-means algorithm, with the k-means++ improvement due to Arthur and Vassilvitskii (2007). There are some other k-means algorithms out there that might provide better performance without requiring too much extra memory.
[*:2xo79k34]I'm currently stopping the clustering if no points were swapped to another centroid, or if the total number of iterations becomes too large, or if the maximum amount that any centroid was shifted is too small. Then the effect returns the current best result. I intend to add a stopwatch-time stopping criterion that will optionally cut off the improvements after a few seconds or so, to make previews easier.
[*:2xo79k34]I can parallelize the most important bottleneck - calculating squared distances to assign points to centroids, which currently takes up 80% of the effect's running time on my Intel T8300 machine.

zarathoustra · November 14, 2009

Nice use of k-means.

KrisVDM · December 28, 2009

Be careful with the sliders when you run this on large images. If you push the bias too far to the left or right, or increase the number of clusters too much, it could result in the effect running for a very long time. This was the only major issue that I encountered in my testing, so I incorporated two safeguards that should prevent the clustering algorithm from running on forever. They can be relaxed somewhat by checking the "high quality" box. If you tell it to find 32 clusters in a large image, you'll be waiting for a very long time. I tested a 39 megapixel image with two clusters, and somehow, it did actually manage to return a result. It took 2.8 GB of memory to do it, and it sucked up an entire CPU core for about five minutes.

Hi Neil,

I had a quick look at the code, and it seems that there is plenty of room for optimization. Here's a few tips, for your information:

1. You are using type Datum for two different things, one is input, the other is an accumulator for the means. Separate them. Actually, you don't need a type for an input Datum. You only need one bit to store whether the pixel is selected or not. You could use a single byte to store this (or use bitwise arthmetic to pack 32 pixels in a single uint). The actual color can be read from the source surface. The scaled location can be calculated on the fly, as it is just one floating point multiplication for each coordinate dimension. Calculate the multiplication factor just once for each dimension though, not for every pixel as you do now. Whatever you do, do not use a class for Datum, use a struct instead.

2. Get rid of Datum.UpdateAverage. You should just add vectors together, and divide by the count just once after adding. UpdateCentroids is already calculating the count for each mean, but not using this count afterwards.

3. R, G, B and A are bytes. To calculate the mean, you can add them in a long. Don't use floats to add bytes together. Certainly don't use floats to store your input colors as you do now.

4. SquaredDistance will be much faster if R, G, B and A are just bytes. Create a function SquaredDistance that takes the two colors, and the two int x and int y coordinates as input. Calculate the scaled location on the fly as described above.

5. Assignments is an int[]. If you limit the number of clusters to 256, you can use a byte[]. If you feel that's not enough, consider a ushort[]. Again less memory. Bringing down the amount of memory used will also improve cache efficiency, and therefore speed.

This is just the low-hanging fruit. I hope you appreciate these comments. They're meant to help you, and I hope they do.

zarathoustra · December 29, 2009

Also, why do you cluster in the {x, y, r, g, b, a} space, instead of the {r, g, b, a} space?

I can't think of a case where it would make a big difference.

By droping the x and y dimensions, you'll just don't get 2 clusters of nearly identical colors, just because they are spatialy far from each others.

If you do it in the {r, g, b, a} space, you can simplify your distance function, but you can also reduce the number of points to cluster, since there are usualy less different colors than pixel count in a picture.

You could also use a tolerance, such that you blend 2 colors instead of adding a new one, if they are too close.

This could significantly reduce the number of points to cluster.

Frontcannon · December 29, 2009

Woah, that's going to be very useful for colour gradients regarding my fractals.. thanks!

Neil Cassidy · December 29, 2009

Hi Kris and zarathoustra! Thanks for the tips, I appreciate that you took the time to look at the code and write a post.

Kris, some of the optimizations you mentioned are definitely low-hanging fruit, namely #5 and parts of #1. The method that you propose in #2 is dangerous in general, though... it's a classic example of a numerically unstable algorithm, and you can get an arbitrarily bad result when running this on large input sets.

Regarding #3 and #4, I'm not sure whether it will actually improve cache utilization much - the working set can't really be expected to fit in L2 or L3, and the various intermediaries should be detecting this trivial access pattern and prefetching. Actually I'm currently saving roughly 100-1000 unnecessary conversions and multiplications per pixel by caching the results of that small computation in main memory. I think that most of my gains here are going to come from storing the input data in structs, but I'll certainly re-evaluate this after I've made some changes.

zarathoustra, using the location information seems to be a pretty standard thing to do - see Forsyth & Ponce 2003, pp. 313-317. I actually intend to add texture information when I get a chance, but using a bank of Gabor filters is probably too heavy, so I'm still thinking about other ways. Combining two cluster centers when they become too close is not really going to be helpful because the algorithm is constantly trying to separate them. It is a clustering algorithm, after all.

You did allude to something else that I've been looking at, though... I can't find any papers which discuss or evaluate a technique like this, but I think that I might dramatically speed up the overall clustering by first clustering a small proportion of the pixels (say, the top left corner of every 4-pixel square) and then propagating the labels to nearby ones, a multiscale approach of sorts.

Frontcannon, no problem. I hope it helps. Looking forward to seeing the results!

zarathoustra · December 29, 2009

zarathoustra, using the location information seems to be a pretty standard thing to do - see Forsyth & Ponce 2003, pp. 313-317. I actually intend to add texture information when I get a chance, but using a bank of Gabor filters is probably too heavy, so I'm still thinking about other ways. Combining two cluster centers when they become too close is not really going to be helpful because the algorithm is constantly trying to separate them. It is a clustering algorithm, after all.
You did allude to something else that I've been looking at, though... I can't find any papers which discuss or evaluate a technique like this, but I think that I might dramatically speed up the overall clustering by first clustering a small proportion of the pixels (say, the top left corner of every 4-pixel square) and then propagating the labels to nearby ones, a multiscale approach of sorts.

hmm, i don't know that paper (didn't find), but those guys seems to be in computer vision/artificial intelligence.

So, it can make a lot of sense for them to consider the spatial dimensions, because in a picture with 2 red balls, a red ball on the left is not the same object than a red ball in the right of the picture, so you'd like to have 2 different clusters to identify 2 different objects.

But, in your case, in the end, you're just turning a 16 million colors in a 20 colors image, you never exploit the information that the right palm tree in not the same thing than the left palm tree for example (considering your sample picture). So, having more than 1 cluster for the palm trees doesn't help.

Right now, you get 6 dimensions centroids, but then, you only use 4 dimensions of each centroid to get the color of the cluster, the 2 other spatial dimensions are discarded.

I might be wrong, but i really think, you're clustering on too many dimensions here.

Neil Cassidy · December 30, 2009

It's a textbook actually! Imaging and computer vision stuff. I'd highly recommend borrowing it if you're interested in this sort of thing.

This effect is not really intended to isolate objects... as I mentioned in the first post, it's supposed to cluster similar pixels together. The similarity measure is squared distance in a six-dimensional space that consists of color and location variables (and I might add texture descriptors too). There are much better ways to isolate objects, actually. Wikipedia has a pretty good description of the overall goal of image segmentation (and of many techniques) on this page.

If I were to just use the color variables, this effect would perform color quantization, which is what you're thinking of. If I were to just use the location variables, the effect would return what's called a centroidal Voronoi tessellation (a tiling of the plane into cells based upon a set of points called generators, such that all points in any given cell are closer to their own cell's generator than to any other cell's generator, and each cell's generator is also its centroid).

You can see something very close to either result by biasing the effect all the way towards color or location, respectively. Try biasing all the way towards location when operating on a manual selection - it will partition the shape into cells and fill the cell with the average color of its members. The color information still has a bit of influence at 0.99, so there will be a little "noise" in this version. Example:

If you play around with the number of clusters when you run it with bias 0.99 on geometric shapes, you can get some neat results:

At intermediate settings, it essentially interpolates between color quantization and construction of a centroidal Voronoi tesselation. With moderately high bias towards location, you get a tesselation with fuzzy edges that are sensitive to the original colors. With moderately high bias towards color, you get a somewhat location-sensitive color quantization. This is by clustering in all dimensions simultaneously, weighting some more heavily than others by an amount that depends upon the bias selected by the user.

You might be onto something here, zarathoustra... although I don't intend to remove the ability to cluster in all dimensions, I'm not using the 0.0 or 1.0 ends of that bias slider right now... I could write some fast code based on small integers to perform strictly-color or strictly-location segmentation, and occupy the extreme ends of the slider with that.

Or better yet, I could fork that stuff off into two different effects and put all of them in a Segmentation sub-menu... one based on color, one based on location. That would allow easy access to a couple of faster effects for what some people might think of as the most interesting uses, and I'd still be able to chip away on a "deluxe" version, which ideally is going to automatically select the bias. Probably by some simple method like standardizing the dimensions, which would give me a decent excuse to find/write some C# matrix inversion code.

Anyone have any thoughts? Do you make use of the fine-grained control offered by the bias slider, or can I replace it with an automatic method? Do you ever use the palette creation option with the bias set to anything other than 0.01? Would you like to see a few specialized segmentation effects instead of a single catch-all effect?

david.atwell · December 30, 2009

Someone should update the "cutting out images" tut. :-) Awesome plugin.

KrisVDM · December 30, 2009

The method that you propose in #2 is dangerous in general, though... it's a classic example of a numerically unstable algorithm, and you can get an arbitrarily bad result when running this on large input sets.

If you use floating point, indeed. But not if you add bytes together in a long (ints actually, the sum of squared bytes). In fact, in that case there is no loss of precision, and it will be infinitely more numerically stable than what you do now. (In fact, if you were to accumulate a billion floats in a double, it would still be numerically stable.)

Regarding #3 and #4, I'm not sure whether it will actually improve cache utilization much - the working set can't really be expected to fit in L2 or L3, and the various intermediaries should be detecting this trivial access pattern and prefetching.

Oh yes it will make a big difference, believe me. First of all, not every image is 50 MB; most I use are a lot smaller. In fact, most images I use are smaller than my L2 cache. Secondly, if you reduce memory usage by a factor x, you reduce the number of cache misses by a factor x too (in this case). I wouldn't count on prefetching too much; since you're actually just scanning linearly through memory, the memory bandwith can become the bottleneck. Have you considered what happens when you start paging? Or if you run into the limits of a 32bit memory space?

Also, don't forget you're calculating a distance here. Integer addition and multiplication may be faster than floating point on some CPU's.

Actually I'm currently saving roughly 100-1000 unnecessary conversions and multiplications per pixel by caching the results of that small computation in main memory.

A 100-1000 what? Seriously, you don't need the color values to be floats, and the scaled coordinates are just two multiplications (one for x and one for y). You're trading two multiplications for a memory problem.

Oh, and BTW, you're absolutely correct that you need to cluster on six dimensions, not four.

but I think that I might dramatically speed up the overall clustering by first clustering a small proportion of the pixels (say, the top left corner of every 4-pixel square) and then propagating the labels to nearby ones, a multiscale approach of sorts.

It will, but I would suggest you get the basic algorithm correct first, before you make it iterative. This is one of those things that I did not consider _low_ hanging fruit. Big fruit, but hanging high.

zarathoustra · December 30, 2009

Yes, right, what i was thinking about would be different, i see what you mean now

Neil Cassidy · December 31, 2009

Kris, I'm definitely not going to optimize this effect for the 800x600 images that I might be able to fit into a decent-sized L2 or L3 cache. I can't control what images users want to run it on, and so I should probably be making performance acceptable for the largest reasonable size, which is roughly on the order of the 40MB images that my camera shoots. Anything around that size can't possibly fit in L2 or L3, and it's pointless to try to make it happen. I'm quite interested in using the effect on images that large, actually. I'm never going to achieve that if I spend my time on tedious low-level optimizations.

Thanks for getting me thinking about cache utilization though. I believe that I've come up with a suitable change to the underlying algorithm that will allow this effect to make full use of any cache, just have to implement it now.

Major update coming soon!

Rick Brewster · January 20, 2010

So when do we get a version that responds to IsCancelRequested?

Neil Cassidy · January 22, 2010

Sorry to anyone who's been waiting! At long last, here's a version that polls IsCancelRequested. I'll put the download in the first post in a second. Just a few lines of code added, and it should cancel much faster. I haven't tested it exhaustively, but I was able to cancel a 10-cluster run on a 6 megapixel image within about 7 seconds. The previous version takes far longer to cancel (at least 40 seconds) because it has to complete the entire clustering run before it gets a chance to "quit". This makes me very happy, and I hope it makes others happy too. Thanks very much for adding the property, Rick!

I could probably improve the cancellation granularity even further by more careful placement of the polling code, but I'll save those improvements for the rewrite that I'm working on, which should have significantly better performance anyways... some tests that I ran a couple weeks ago were very encouraging. I'll try my best to get it done this weekend - most of the difficulties that I'm experiencing have to do with design stuff rather than the algorithm itself.

Rick Brewster · January 23, 2010

If you have code which doesn't have access to the property (outside of your Effect class), then just pass it a delegate which it can poll.

void OnSetRenderInfo(...)
{
   ...
   for (y = ...)
   {
       if (IsCancelRequested) return;

       HelperClass.ComputeSomethingExpensiveOmg(y, ..., () => IsCancelRequested);
   }
}

...
class HelperClass
{
   bool ComputeSomethingExpensiveOmg(int y, ..., Func cancelPollFn)
   {
       ...
       if (cancelPollFn()) return false;
       ...
       return true;
   }
}

This is what Auto-Levels does, for instance.

Neil Cassidy · January 24, 2010

I knew that anonymous functions existed, but could never really find a good excuse to use one in an object-oriented context. This is definitely a case where it makes a lot of sense! Thanks for the tip.

dung83qt · December 28, 2010

I knew that anonymous functions existed, but could never really find a good excuse to use one in an object-oriented context. This is definitely a case where it makes a lot of sense! Thanks for the tip.

Hi, how to use SegmentImageEffectPlugin class.

Segment Image 1.1

Recommended Posts

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Join the conversation