Jump to content

An Approach to Multipass Effects.


Julian Beaumont

Recommended Posts

I've been working on effects which require multiple passes a lot recently, for example, Gaussian Blur -> Edge Detect -> Combine edges and original image. As a result of this I've developed what I think is a reasonable approach for doing multiple passes in the current version of Paint.Net. This approach overcomes many of the limitations in other approaches that I've seen. I thought this may be of use/interest to someone else, so feel free to use, abuse, modify, or ignore any of the ideas or code presented below.

One of the limitations of the current effect system is the difficulty of implementing multipass effects. The two approaches which I have seen used on these forums involve computing the entire output in OnSetRenderInfo or in the first call to OnRender and then copying the precomputed result to the output surface on subsequent calls to OnRender. This has a number of disadvantages. In particular, this method often results in a significant pause while the result is precomputed. During the period the Pant.Net progress indicator will remain at 0% and the user is unable to see the parts of the output which have been computed. The inability to see the output as it is computed can lead to significant increases in the time required to fine tune parameter values. In addition, these two approaches are unable to take advantages of the underlying multithreading in Paint.Net on multicore machines, leading to additional work for effect developers if they wish to optimize for multicore machines.

Below I present code for an alternative approach to multipass effects under the current version of Paint.Net. This approach avoids many of the disadvantages of the approaches mentioned above. In particular, it performs the minimum amount of work required to compute the requested output values for the current invocation of OnRender without performing redundant computation across passes. This allows it to play nicely with Paint.Net and take advantage of the interactive output and multithreading. The disadvantage of this approach is that it is not suitable for multipass effects where one pass depends on a global property of the previous pass. For example, if one pass depends on the average of all of the output of the previous pass then this approach is not applicable (although it will still work).

The key assumption behind this approach is that given some region, rout, for which we want to compute the output of the current pass there exists some corresponding region of the input pass, rin, such that the output inside rin depends only on values of the input (aka the output of the previous pass) which lie inside rin. The region rin may be larger or smaller then rout and does not have to include or overlap rout (although in most cases it does). In the worst case we may set rin to be the entire input, however, in this case our approach degenerates down to the case of the previously mentioned approaches.

We represent a surface (2D array) as a function which given a rectangular region on computes values on the in that region. We then define the notion of a lazy surface which keeps track of which regions of the surface have been previously computed and avoids recomputing these values when regions overlap. This allows us to define a multipass algorithm as a series of lazy surfaces with each pass computing a new lazy surface from the previous lazy surface. We may then implement OnRender by requesting the current region from the lazy surface and copying the result to the output surface. In order to keep track of which regions of a lazy surface have been computed we use a balanced binary tree which ensures an overhead of at most O(log(WH)) where W and H are the width and height of the image respectively. In practice this overhead is usually not noticeable in comparison to the cost of the actual image processing operations.

Below is an example of this method used to implement a Gaussian blur in two passes. The first pass computes the blur in the X-direction, and the second pass computes the blur in the Y-direction. Everything up to the GaussianFilterX class is essentially infrastructure that can be reused across all (multipass) effects.

Hidden Content:
// This should work as is in CodeLab. If you want to use it in a non-codelab effect
// some minimal modifications may be required.

// Represents a region of a surface. The representation uses a balanced binary tree
// to keep track of which locations are in the region and which are not. The region
// is constrained to lie inside of bounds.
public class SurfaceRegion
{
   // Higher values of Balance force the binary tree to be more balanced,
   // however, they may result in more small rectanglular regions then lower
   // values. In all cases we must have Balance > 2.
   private const int Balance = 5;

   // If isFull then the entire area within bounds is in the region.
   private bool isFull;
   // If isEmpty then none of the area within bounds is in the region.
   private bool isEmpty;
   private Rectangle bounds;
   // If !isEmpty && !isFull then the area within bounds is split into two
   // sub-regions, left and right. The sub-regions are such that 
   // bounds = Union(left.bounds, right.bounds) and left.bounds and right.bounds
   // are disjoint.
   private SurfaceRegion left;
   private SurfaceRegion right;

   private SurfaceRegion
       ( Rectangle bounds, bool isFull, bool isEmpty
       , SurfaceRegion left, SurfaceRegion right
       )
   {
       this.bounds  = bounds; 
       this.isFull  = isFull; 
       this.isEmpty = isEmpty; 
       this.left    = left; 
       this.right   = right; 
   }

   private static bool isDisjoint(Rectangle lhs, Rectangle rhs)
   {
       return
           lhs.Right  <= rhs.Left ||
           rhs.Right  <= lhs.Left ||
           lhs.Bottom <= rhs.Top  ||
           rhs.Bottom <= lhs.Top  ;
   }

   private static bool isSubset(Rectangle lhs, Rectangle rhs)
   {
       return
           rhs.Left <= lhs.Left && lhs.Right  <= rhs.Right &&
           rhs.Top  <= lhs.Top  && lhs.Bottom <= rhs.Bottom;
   }

   public static SurfaceRegion Full(Rectangle bounds)
   {
       return new SurfaceRegion(bounds, true, false, null, null);
   }

   public static SurfaceRegion Empty(Rectangle bounds)
   {
       return new SurfaceRegion(bounds, false, true, null, null);
   }

   public static SurfaceRegion Compound(Rectangle bounds, SurfaceRegion left, SurfaceRegion right)
   {
       if (left.isFull && right.isFull)
       {
           return SurfaceRegion.Full(bounds);
       }

       if (left.isEmpty && right.isEmpty)
       {
           return SurfaceRegion.Empty(bounds);
       }

       return new SurfaceRegion(bounds, false, false, left, right);
   }

   // Construct a SurfaceRegion represention the rectangular region given by
   // region. The resulting SurfaceRegion represents only that part of region
   // which lies inside bounds.
   public static SurfaceRegion FromRectangle(Rectangle bounds, Rectangle region)
   {
       // If bounds is a subset of region of bounds and region are disjoint
       // then constructing the SurfaceRegion is trivial.
       if (isSubset(bounds, region))
       {
           return SurfaceRegion.Full(bounds);
       }

       if (isDisjoint(bounds, region))
       {
           return SurfaceRegion.Empty(bounds);
       }

       // If not, we split the SurfaceRegion along the largest dimension of bounds
       // and construct two sub-regions which compose the final SurfaceRegion.
       // In order to minimize the depth of the resulting tree we split on one of
       // the edges of region provided it is close enough to the centre of bounds.
       // The requirement that the split point be "close enough" to the centre
       // ensures that the resulting tree is balanced, and hence has maximum
       // depth O(log(WH)) where W and H are the width and height of bounds.
       if (bounds.Width > bounds.Height)
       {
           int centreX = bounds.Left + bounds.Width / 2;
           int offsetX = bounds.Width / SurfaceRegion.Balance;
           int splitX  = centreX;
           if (centreX - offsetX <= region.Left && region.Left <= centreX + offsetX)
           {
               splitX = region.Left;
           }
           if (centreX - offsetX <= region.Right && region.Right <= centreX + offsetX)
           {
               splitX = region.Right;
           }
           int leftWidth  = splitX - bounds.Left;
           int rightWidth = bounds.Right - splitX;
           Rectangle leftBounds  = new Rectangle(bounds.Left, bounds.Top, leftWidth, bounds.Height);
           Rectangle rightBounds = new Rectangle(splitX, bounds.Top, rightWidth, bounds.Height);

           SurfaceRegion left  = SurfaceRegion.FromRectangle(leftBounds, region);
           SurfaceRegion right = SurfaceRegion.FromRectangle(rightBounds, region);
           return SurfaceRegion.Compound(bounds, left, right);
       }
       else
       {
           int centreY = bounds.Top + bounds.Height / 2;
           int offsetY = bounds.Height / SurfaceRegion.Balance;
           int splitY  = centreY;
           if (centreY - offsetY <= region.Top && region.Top <= centreY + offsetY)
           {
               splitY = region.Top;
           }
           if (centreY - offsetY <= region.Bottom && region.Bottom <= centreY + offsetY)
           {
               splitY = region.Bottom;
           }
           int leftHeight  = splitY - bounds.Top;
           int rightHeight = bounds.Bottom - splitY;
           Rectangle leftBounds  = new Rectangle(bounds.Left, bounds.Top, bounds.Width, leftHeight);
           Rectangle rightBounds = new Rectangle(bounds.Left, splitY, bounds.Width, rightHeight);

           SurfaceRegion left  = SurfaceRegion.FromRectangle(leftBounds, region);
           SurfaceRegion right = SurfaceRegion.FromRectangle(rightBounds, region);
           return SurfaceRegion.Compound(bounds, left, right);
       }
   }
   // The union of lhs with rhs restricted to lhs.bounds. That is, the
   // returned region is the union of lhs and rhs ignoring any parts of
   // rhs which fall outside lhs.bounds.
   public static SurfaceRegion Union(SurfaceRegion lhs, Rectangle rhs)
   {
       if (lhs.isFull || isDisjoint(lhs.bounds, rhs))
       {
           return lhs;
       }

       if (lhs.isEmpty || isSubset(lhs.bounds, rhs))
       {
           return SurfaceRegion.FromRectangle(lhs.bounds, rhs);
       }

       SurfaceRegion left  = SurfaceRegion.Union(lhs.left, rhs);
       SurfaceRegion right = SurfaceRegion.Union(lhs.right, rhs);
       return SurfaceRegion.Compound(lhs.bounds, left, right);
   }

   // The region resulting from removing the region rhs from the rectangle lhs.
   // As with FromRectangle and Union this result is restricted to lie inside 
   // rhs.bounds.
   public static SurfaceRegion Difference(Rectangle lhs, SurfaceRegion rhs)
   {
       if (rhs.isFull || isDisjoint(lhs, rhs.bounds))
       {
           return SurfaceRegion.Empty(rhs.bounds);
       }

       if (rhs.isEmpty)
       {
           return SurfaceRegion.FromRectangle(rhs.bounds, lhs);
       }

       SurfaceRegion left  = SurfaceRegion.Difference(lhs, rhs.left);
       SurfaceRegion right = SurfaceRegion.Difference(lhs, rhs.right);
       return SurfaceRegion.Compound(rhs.bounds, left, right);
   }

   // Apply the function action to rectangular subregions of lhs such that
   // all subregions are disjoint and their union is exactly lhs.
   public static void ForEach(SurfaceRegion lhs, Action action)
   {
       if (lhs.isFull)
       {
           action(lhs.bounds);
       }
       else if (!lhs.isEmpty)
       {
           SurfaceRegion.ForEach(lhs.left, action);
           SurfaceRegion.ForEach(lhs.right, action);
       }
   }
}

// A StrictSurface is a 2D surface containing values of type T
// computed in a rectangular region.
public class StrictSurface
{
   private T[,]      values;
   private Rectangle region;

   public StrictSurface(T[,] values, Rectangle region)
   {
       this.values = values;
       this.region = region;
   }

   public T this[int x, int y]
   {
       get { return values[x - region.Left, y - region.Top]; }
       set { values[x - region.Left, y - region.Top] = value; }
   }

   public int Left   { get { return region.Left; } }
   public int Right  { get { return region.Right; } }
   public int Top    { get { return region.Top; } }
   public int Bottom { get { return region.Bottom; } }

   public int Width  { get { return region.Width; } }
   public int Height { get { return region.Height; } }

   public Rectangle Region { get { return this.region; } }
}

// A DelayedSurface represents a rectangular region for which
// we can compute the T values of any sub-region. Multiple calls
// to Request(region) with overlapping regions may result in
// redundent computations.
public abstract class DelayedSurface
{
   // Request(region) returns a StrictSurface containing the
   // values of the DelaySurface computed for all points in region.
   public abstract StrictSurface Request(Rectangle region);
   public abstract Rectangle Region { get; }
}

// A LazySurface is a DelayedSurface that caches results so that
// multiple calls to Request(region) with overlapping regions will not
// result in multiple computations results for the overlapping regions.
public abstract class LazySurface : DelayedSurface
{
   // Region is the subset of the surface (image) for which values may
   // be requested. Values is the cached results in this region.
   private T[,]      values;
   private Rectangle region;
   // Computed is the region of this surface for which the result has
   // already been computed. Computing is the region for which the result
   // has been or is currently being computed. In the case of multiple
   // thread/cores these regions may differ.
   private SurfaceRegion computed;
   private SurfaceRegion computing;

   public LazySurface(Rectangle region)
   {
       this.values    = new T[region.Width, region.Height];
       this.region    = region;
       this.computed  = SurfaceRegion.Empty(region);
       this.computing = SurfaceRegion.Empty(region);
   }

   public override StrictSurface Request(Rectangle region)
   {
       // When the values in region are requested we first determine
       // which parts of region have not been computed and extend
       // computing to cover the requested region. If region overlapps
       // the part of computing which is not also in computed then this
       // may lead to some redundent computation, however, experimental
       // results show this is *extremly* rare. This could be avoided
       // by checking if toCompute and computing overlap and waiting
       // until computed == computing if so, however, this is probably
       // more difficult and costly then it's worth given the rarity.
       // Disclaimer: Tests where on a dual core machine. More cores may
       // make this issue more important.
       SurfaceRegion toCompute;
       lock (computing)
       {
           toCompute = SurfaceRegion.Difference(region, computed);
           computing = SurfaceRegion.Union(computing, region);
       }

       // Compute the region specified by toCompute.
       StrictSurface outSurface = new StrictSurface(this.values, this.region);
       OnRender(outSurface, toCompute, region);

       // Update computed to reflect the fact that region has been computed.
       lock (computing)
       {
           computed = SurfaceRegion.Union(computed, region);
       }

       return outSurface;
   }

   public int Left   { get { return region.Left; } }
   public int Right  { get { return region.Right; } }
   public int Top    { get { return region.Top; } }
   public int Bottom { get { return region.Bottom; } }

   public int Width  { get { return region.Width; } }
   public int Height { get { return region.Height; } }

   public override Rectangle Region { get { return this.region; } }

   // OnRender(dst, region, bounds) should be implemented in derived classes
   // to compute the values in region and store the results in dst. Bounds is
   // a rectangular region containing region which can be used to compute the
   // region needed from previous passes.
   protected abstract void OnRender(StrictSurface destination, SurfaceRegion region, Rectangle bounds);
}

// BgraSurface is a simple wrapper around a Surface.
public class BgraSurface : LazySurface
{
   private Surface source;

   public BgraSurface(Surface source, Rectangle region)
       : base(region)
   {
       this.source = source;
   }

   protected override void OnRender(StrictSurface dst, SurfaceRegion region, Rectangle bounds)
   {
       SurfaceRegion.ForEach(region, delegate (Rectangle current) {
           for (int y = current.Top; y < current.Bottom; ++y)
           {
               for (int x = current.Left; x < current.Right; ++x)
               {
                   dst[x, y] = source[x, y];
               }
           }
       });
   }
}

// Below is an example of implimenting a Gaussian blur as two passes.
// In the first pass we blur along the x-direction, then in the second
// pass we blur the result of the first pass in the y-direction.

// GaussianFilterX is a LazySurface which computes the gaussian blur
// of an input surface in the x-direction.
public class GaussianFilterX : LazySurface
{
   // The source surface which is to be blurred, the weights for the filter,
   // and the radius of the filter are stored as members of the class.
   private DelayedSurface source;
   private int[] weights;
   private int   radius;

   private static int[] Weights(double sigma, int radius)
   {
       int[] weights = new int[2 * radius + 1];
       for (int x = -radius; x <= radius; ++x)
       {
           weights[x + radius] = (int)(512.0 * Math.Exp(-x * x / (sigma * sigma)) + 0.5);
       }
       return weights;
   }

   // SourceRegion(bounds, region, radius) computes the region of the source surface
   // required to compute the filtered values for all points in region, given that
   // bounds is the boundaries of the source surface (usually the image bounds).
   private static Rectangle SourceRegion(Rectangle bounds, Rectangle region, int radius)
   {
       int xl = Math.Max(region.Left  - radius, bounds.Left);
       int xr = Math.Min(region.Right + radius, bounds.Right);
       return new Rectangle(xl, region.Top, xr - xl, region.Height);
   }

   public static Rectangle SourceRegion(Rectangle bounds, Rectangle region, double sigma)
   {
       int radius = (int)(sigma * Math.Sqrt(-Math.Log(1.0 / 512.0)));
       return SourceRegion(bounds, region, radius);
   }

   public GaussianFilterX(DelayedSurface source, Rectangle region, double sigma)
       : base(region)
   {
       this.source  = source;
       this.radius  = (int)(sigma * Math.Sqrt(-Math.Log(1.0 / 512.0)));
       this.weights = Weights(sigma, radius);
   }

   protected override void OnRender(StrictSurface dst, SurfaceRegion region, Rectangle bounds)
   {
       ColorBgra c = new ColorBgra();

       // To render the are in region we first compute the required part of the
       // source surface.
       Rectangle srcRegion = SourceRegion(source.Region, bounds, radius);
       StrictSurface src = source.Request(srcRegion);
       // Then iterate over the output region and compute the filtered result.
       SurfaceRegion.ForEach(region, delegate (Rectangle current) {
           for (int y = current.Top; y < current.Bottom; ++y)
           {
               for (int x = current.Left; x < current.Right; ++x)
               {
                   int xl = Math.Max(x - radius, src.Left);
                   int xr = Math.Min(x + radius, src.Right - 1);
                   int wt = 0; int rt = 0; int gt = 0; int bt = 0; int at = 0;
                   for (int cx = xl; cx <= xr; ++cx)
                   {
                       int w = weights[cx - x + radius]; c = src[cx, y];
                       wt += w; rt += w * c.R; gt += w * c.G; bt += w * c.B; at += w * c.A;
                   }
                   c.R = (byte)(rt / wt);
                   c.G = (byte)(gt / wt);
                   c.B = (byte)(bt / wt);
                   c.A = (byte)(at / wt);
                   dst[x, y] = c;
               }
           }
       });
   }
}

// GaussianFilterY is a LazySurface which computes the gaussian blur
// of an input surface in the y-direction. This is essentially the same
// as GaussianFilterX so see it for more details.
public class GaussianFilterY : LazySurface
{
   private DelayedSurface source;
   private int[] weights;
   private int   radius;

   private static int[] Weights(double sigma, int radius)
   {
       int[] weights = new int[2 * radius + 1];
       for (int y = -radius; y <= radius; ++y)
       {
           weights[y + radius] = (int)(512.0 * Math.Exp(-y * y / (sigma * sigma)) + 0.5);
       }
       return weights;
   }

   private static Rectangle SourceRegion(Rectangle bounds, Rectangle region, int radius)
   {
       int yl = Math.Max(region.Top    - radius, bounds.Top);
       int yr = Math.Min(region.Bottom + radius, bounds.Bottom);
       return new Rectangle(region.Left, yl, region.Width, yr - yl);
   }

   public static Rectangle SourceRegion(Rectangle bounds, Rectangle region, double sigma)
   {
       int radius = (int)(sigma * Math.Sqrt(-Math.Log(1.0 / 512.0)));
       return SourceRegion(bounds, region, radius);
   }

   public GaussianFilterY(DelayedSurface source, Rectangle region, double sigma)
       : base(region)
   {
       this.source  = source;
       this.radius  = (int)(sigma * Math.Sqrt(-Math.Log(1.0 / 512.0)));
       this.weights = Weights(sigma, radius);
   }

   protected override void OnRender(StrictSurface dst, SurfaceRegion region, Rectangle bounds)
   {
       ColorBgra c = new ColorBgra();
       StrictSurface src = source.Request(SourceRegion(source.Region, bounds, radius));
       SurfaceRegion.ForEach(region, delegate (Rectangle current) {
           for (int y = current.Top; y < current.Bottom; ++y)
           {
               for (int x = current.Left; x < current.Right; ++x)
               {
                   int yl = Math.Max(y - radius, src.Top);
                   int yr = Math.Min(y + radius, src.Bottom - 1);
                   int wt = 0; int rt = 0; int gt = 0; int bt = 0; int at = 0;
                   for (int cy = yl; cy <= yr; ++cy)
                   {
                       int w = weights[cy - y + radius]; c = src[x, cy];
                       wt += w; rt += w * c.R; gt += w * c.G; bt += w * c.B; at += w * c.A;
                   }
                   c.R = (byte)(rt / wt);
                   c.G = (byte)(gt / wt);
                   c.B = (byte)(bt / wt);
                   c.A = (byte)(at / wt);
                   dst[x, y] = c;
               }
           }
       });
   }
}

// GaussianFilter is a DelayedSurface which applies a Gaussian blur
// to the input surface using GaussianFilterX and GaussianFilterY.
// It does not need to be a LazySurface as it doesn't do any computation
// of it's own and GaussianFilterX is already cached.
public class GaussianFilter : DelayedSurface
{
   private DelayedSurface source;
   private GaussianFilterX gaussianX;
   private GaussianFilterY gaussianY;

   public static Rectangle SourceRegion(Rectangle bounds, Rectangle region, double sigma)
   {
       Rectangle filteredXRegion = GaussianFilterX.SourceRegion(bounds, region, sigma);
       return GaussianFilterY.SourceRegion(bounds, filteredXRegion, sigma);
   }

   public GaussianFilter(DelayedSurface source, Rectangle region, double sigma)
   {
       Rectangle filterXRegion = GaussianFilterY.SourceRegion(source.Region, region, sigma);
       this.source = source;
       this.gaussianX = new GaussianFilterX(source, filterXRegion, sigma);
       this.gaussianY = new GaussianFilterY(gaussianX, region, sigma);
   }

   public override StrictSurface Request(Rectangle region)
   {
       return gaussianY.Request(region);
   }

   public override Rectangle Region { get { return gaussianY.Region; } }
}

// Copy a region of a StrictSurface to a Surface.
static void TransferToSurface(Surface dst, StrictSurface src, Rectangle region)
{
   for (int y = region.Top; y < region.Bottom; ++y)
   {
       for (int x = region.Left; x < region.Right; ++x)
       {
           dst[x, y] = src[x, y];
       }
   }
}

Object sync = new Object();
bool isInitialized = false;
DelayedSurface source;
DelayedSurface output;

void Render(Surface dst, Surface src, Rectangle rect)
{
   double sigma = 150.0;

   // We only create the surfaces once, the first time Render is called.
   lock (sync)
   {
       if (!isInitialized)
       {
           Rectangle imageBounds  = src.Bounds;
           Rectangle outputRegion = this.EnvironmentParameters.GetSelection(src.Bounds).GetBoundsInt();
           Rectangle sourceRegion = GaussianFilter.SourceRegion(imageBounds, outputRegion, sigma);

           source = new BgraSurface(src, sourceRegion);
           output = new GaussianFilter(source, outputRegion, sigma);
       }
       isInitialized = true;
   }

   // Then we just need to Request the region we are rendering and
   // write it to the destination surface.
   StrictSurface outputSurface = output.Request(rect);
   TransferToSurface(dst, outputSurface, rect);
}

Link to comment
Share on other sites

Paint.NET v3.5, internally*, has a notion of lazy-evaluation surfaces via the IRenderer interface. It's inspired by LINQ, and lets you do something like ...

// s0 and s1 are surfaces 
var s2 = s0.Resample(s0.Width / 2, s0.Height / 2, Bicubic).BlendWith(s1, OverlayBlendOp); // similar to using LINQ operators, Select() Where() etc.
// At this point no pixels have been computed, and no bitmaps have been created
s2.RenderTo(s3, new Int32Point(x, y)); // region of interest is (x,y) with s3.Width,s3.Height for size
s2.ToSurface(...); // analagous to LINQ's ToArray()

Basically, IRenderer is analagous to IEnumerable, and Surface is analagous to Array.

* Technically these are public and exported from PDN.Core. However, they are not meant for plugin's use. At least not yet.

The Paint.NET Blog: https://blog.getpaint.net/

Donations are always appreciated! https://www.getpaint.net/donate.html

forumSig_bmwE60.jpg

Link to comment
Share on other sites

Thanks for posting this, Julian. I've read it from start to finish. It's a fantastic contribution, and must have involved a lot of work. In particular, your management of the "computing" and "computed" sets with a kd-tree is very interesting! I hadn't even considered something like this.

I don't think that the worst-case overhead is in O(log n), though. The running time of the Union operation can be described by the recurrence T(n) = T(floor(n/2)) + T(ceiling(n/2)) + f(n), with f(n) in O(1). So it follows from case 1 of the Master Theorem that the running time is in Θ(n). Likewise for the Difference operation. Both take stack space in O(log n) and the tree occupies O(n log n) space in memory.

I'm sure that the average-case running time of the tree solution is good, though. I can't come up with a sensible average case to test. But logarithmic seems like a safe bet.

Hope this helps.

Segment Image : Average Color (HSL) : Eigen Blur : []

Cool, new forum!

Link to comment
Share on other sites

Thanks for the analysis. It is indeed correct that the worst case for Union and Difference is O(n) where n = Size(bounds). What I had intended to say was that the amortized cost per pixel of the Rectangle is O(log(n)). This also appears to be wrong, however. In particular, for Union, let m = Size(Intersect(lhs.bounds, rhs)). The number of leaf nodes visited that overlap rhs is bounded by m. Additionally, the depth of the tree is bounded by O(log(n)). Hence, the total number of nodes overlapping rhs which are visited is bounded by O(m log(n)). Unfortunately, the total cost contains an additional term which I had previously neglected, the number of nodes visited which don't overlap rhs. The best bound I can obtain for this is just the naive, O((n-m) log(n))*. Without this term, the total cost is O(m log(n)), and since m <= Size(rhs), the amortized cost is O(log(n)). If this were correct it isn't hard to show that the amortized cost for k passes is also O(log(n)) provided Size(rin) < C * Size(rout) for some C.

It's unfortunate that the amortized cost isn't bounded by O(log(n)) IMO. When I have some more spare time I may try to devise an alternative implementation/data structure which does satisfy this bound. I suspect something based on a BVH may work, however, the problem with that is that [in the naive implementation] the cost and size of the tree depends on the number of Union and Difference operations performed.

*Actually, I'm not totally convinced I can't get a better bound then O((n-m)log(n)), I'll have to think about this more.

Link to comment
Share on other sites

No problem Julian, glad it helped. I considered an amortized analysis but wagered that I couldn't get a better lower-bound for the problem than Ω(n) unless the sets were guaranteed to be disjoint. If you can reformulate this as a disjoint-set union problem, you might be able to use a union-find data structure to achieve amortized O(1) for a sequence of set-creation, union, and find operations. The disadvantage of this approach would be that the results wouldn't be easy to traverse in any sensible order. I thought about formulating Union as a 2D range-search query on the kd-tree, but that gave worse performance than Θ(n).

By the way, if you construct a checkerboard sort of thing, where the computed/computing region is either the set of white squares or the set of black squares, that seems to hit all the upper bounds.

Segment Image : Average Color (HSL) : Eigen Blur : []

Cool, new forum!

Link to comment
Share on other sites

Unfortunately, the total cost contains an additional term which I had previously neglected, the number of nodes visited which don't overlap rhs. The best bound I can obtain for this is just the naive, O((n-m) log(n))*

Actually, the number of nodes visited which don't overlap rhs can be bounded by O(log(n) * k) where k = Perimeter(Intersect(lhs.bounds, rhs)). Since k <= 4 * m this gives O(m log(n)) total cost, and the amortized cost does indeed turn out to be O(log(n)).

Sketch of Proof:

Let r = Intersect(lhs.bounds, rhs). Consider those points on the edge of r. There are k such points. Due to the fact that subtrees are disjoint there can be at most O(log(n)) nodes which overlap each of these k points. Every node visited which doesn't overlap r must be a child of one of these nodes. Since no child of a node that doesn't overlap r is visited this implies the total number of nodes visited which don't overlap r is O(k log(n)).

Link to comment
Share on other sites

If the region is not convex, k > 4 * m in some cases... possibly proportional to m^2 :wink:

The region under consideration isn't an arbitrary region however, it's Intersect(lhs.bounds, rhs). Since lhs.bounds and rhs are both rectangular the intersection will also be rectangular, hence, k <= 4 * m in all cases. :D

Link to comment
Share on other sites

OK, I took another look at this. I agree that the amortized cost per pixel is in O(log n). But this actually yields a looser upper bound than analysis of the recurrence, because if the amortized cost per pixel is O(log n) and the size of the rectangle is in O(n) *, one obtains an O(n log n) upper bound on cost of the whole Union or Difference.

* The rectangle is of size in O(W). Depending on the shape of the input this could be in either O(sqrt n) or O(n), so it's in O(n).

Segment Image : Average Color (HSL) : Eigen Blur : []

Cool, new forum!

Link to comment
Share on other sites

Your right of coarse, however, the nice part about the amortized analysis is when you get to examining the performance of multiple calls to Union. In particular, if we start by considering just a single pass effect, Paint.Net will typically subdivide the input surface into O(sqrt(n)) rectangles. Each of these rectangles is disjoint and together cover the surface, hence, having total area n. By the amortized analysis, this means that the total cost of the Union applied to each of these rectangles is O(n log(n)). By the worst case analysis based on the recurrence relation, however, the total cost would be O(n sqrt(n)). This can be extended to show that for multiple passes (which results in overlapping rectangles), under some reasonable assumptions, the cost is still O(n log(n)).

Link to comment
Share on other sites

Julian, this is quite correct, but it seems to only apply to a square image and a selection that contains the whole image! Which is perhaps close to best-case. In particular, choose log n regions, with width W, for a total area of W * log n. Our selection was only possible if H >= log n, so n >= W * log n, and W <= n / log n, and W is in O(n / log n).

Then the upper bound from the recurrence is O(n log n), and the upper bound from amortized analysis is O(W log^2 n) = O(n log n), and so the bounds concur in this case!

Intuitively I think amortized analysis would only help if, over the course of the run, the organization of the data structure was being improved in some way that made future operations significantly faster. By the way, this is the best mental workout I've had in a while, I hadn't used amortized analysis for at least a year. Thanks for the very stimulating discussion!

Segment Image : Average Color (HSL) : Eigen Blur : []

Cool, new forum!

Link to comment
Share on other sites

Julian, this is quite correct, but it seems to only apply to a square image and a selection that contains the whole image! Which is perhaps close to best-case.

Actually, it can be generalized to any sequence of Union operations. Consider an initial region, r, and a set of k potentially overlapping rectangles, rs. The computation

Fold(Union, r, rs) = Union(Union(...Union(r, rs[0])..., rs[k-2]), rs[k-1])

requires at most O(S log(n)) operators where S = Sum(Area(rs), i = 0 .. k-1). This is easily shown by noting that, from the amortized analysis, each individual union requires O(Area(rs) log(n)) time. Summing the times for the individual unions then gives the desired result.

Hence, for any partitioning of the selected region, the complexity is O(A log(n)) where A = Area(selection). Additionally, we know that A <= n, so the worst case is O(n log(n)). This follows from the fact that a partitioning of the selected region will consist of disjoint rectangles, hence,

A = Area(selection) = Sum(Area(rs), i = 0 .. k - 1) = S.

The best bound that can be obtained from the recurrence relation in this general setting is O(k n). The advantage of the recurrence relation here is that it didn't assume the rectangles are disjoint. If they are disjoint then the worst case has k = O(n) and this becomes O(n^2). Because of this, if we maintain the assumption k = O(n) but don't insist on disjoint rectangles the recurrence relation still gives a worst case of O(n^2). The amortized version, however, results in a bound of O(n^2 log(n)) since, in the worst case, Area(rs) = n and hence S = O(n^2).

For the case of multipass effects usually we can show that S = O(n), however, the best bound we can obtain on k is O(n) since we have no control over how Paint.Net partitions the selection. This means that in most cases the amortized version gives the tighter bound.

Conclusion: The amortized analysis and recurrence relation provide complementary information, each one being advantageous in different situations.

Thanks for the very stimulating discussion!

Your welcome and thank you.

Link to comment
Share on other sites

Very interesting! I agree with your conclusion, and I'll be sure to consider the amortized analysis more carefully in the future. Thanks for the very helpful explanation!

Can you provide any more information regarding how performance generalizes to d-pass filters, particularly when d becomes large? It seems to me that for filters consisting of a number of small 2D convolutions in series, computation of a given rectangle at stage d requires an incrementally larger rectangle of results from stage d-1, and so rectangles at that stage may overlap. Considering all passes of the filter, S would be in O(d * n^2). But I'd be interested to hear if S can be shown to be in O(n) or O(d * n) in this case.

Segment Image : Average Color (HSL) : Eigen Blur : []

Cool, new forum!

Link to comment
Share on other sites

Can you provide any more information regarding how performance generalizes to d-pass filters, particularly when d becomes large? It seems to me that for filters consisting of a number of small 2D convolutions in series, computation of a given rectangle at stage d requires an incrementally larger rectangle of results from stage d-1, and so rectangles at that stage may overlap. Considering all passes of the filter, S would be in O(d * n^2). But I'd be interested to hear if S can be shown to be in O(n) or O(d * n) in this case.

Sure, let Rout denote the rectangular region of the output that we wish to compute and Rin denote the rectangular region of the input required to compute that region. Assume there is some C such that for all Rin and Rout, Area(Rin) <= C * Area(Rout). This assumption seems to hold for most [all?] such filters, for example, in the case of a Gaussian blur C = (2 * radius + 1)^2. Let Routs be the complete set of possibly overlapping output rectangles requested from pass k. To compute the outputs in Routs, pass k will request inputs from the previous pass, Rins. By our assumption we know that,

S_{k-1} = Sum(Area(Rins)) <= Sum(C * Area(Routs)) = C * Sum(Area(Routs)) = C * S_{k}.

Additionally, we have that S_{d} = n. Hence, by induction, we may show that S_{k} <= C^(d-k) n. The complexity for the total computation is thus,

O(Sum(S_{k} log(n), k = 1 .. d)) = O(Sum(C^k, k = 0 .. d-1) n log(n)) = O(C^d n log(n)).

From the recurrence relation we also know that the total computation time is in O(d n^2), hence, the final bound would be O(n min{ d n , C^d log n }).

Edit: Fixed some mistakes.

Link to comment
Share on other sites

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...