
Posts posted by Neil Cassidy

  1. Thanks for pointing Joe's blog out, Rick! I read it shortly after you first mentioned it; good stuff. I ran a couple of benchmarks a while ago, and as I recall, using the volatile keyword was slightly (~5%) faster than using an explicit memory barrier. Both solutions were dramatically faster than single-checked locking (by roughly a factor of six).

    There was some discussion a while ago about averaging hue values. The average of 1° and 359° should intuitively be 0°, or equivalently 360°, but a naive average (summing the hues and dividing by n) returns 180°... an entirely different color. I came across a nice solution while working on my own effect, and thought that I'd put together some code to share with everyone. I didn't test it much, so use at your own risk.

    Hidden Content:
    [reason]Code[/reason]
    using System;
    
    namespace meanOfAngles
    {
       class Program
       {
           // @pre: angles are inside the interval [0, 360) and are in units of degrees
           static double meanOfAngles(double[] angles)
           {
               // Number of angles included in the average.
               int n = angles.Length;
    
               double sumOfCosines = 0.0;
               double sumOfSines = 0.0;
    
               for (int i = 0; i < n; i++)
               {
                   // Convert the current angle into radians.
                   double currentAngleInRadians = Deg2Rad(angles[i]);
    
                   sumOfCosines += Math.Cos(currentAngleInRadians);
                   sumOfSines += Math.Sin(currentAngleInRadians);
               }
    
               // Find the coordinates of the centroid (basically, a 2-dimensional mean).
               double meanOfCosines = sumOfCosines / n;
               double meanOfSines = sumOfSines / n;
    
               // Use the atan2 function to find the corresponding angle.
               double meanAngleInRadians = Math.Atan2(meanOfSines, meanOfCosines);
    
               // Atan2 returns results in (-π, π], so shift negative results by 2π to map into [0, 2π).
               if (meanAngleInRadians < 0.0)
                   meanAngleInRadians = 2 * Math.PI + meanAngleInRadians;
    
               // Convert angle back to degrees and return.
               return Rad2Deg(meanAngleInRadians);
           }
    
           static double Deg2Rad(double valueInDegrees)
           {
               return valueInDegrees * Math.PI / 180;
           }
    
           static double Rad2Deg(double valueInRadians)
           {
               return valueInRadians * 180 / Math.PI;
           }
    
           static void Main(string[] args)
           {
               double[] angles = { 127.0, 231.0, 281.0, 281.0, 281.0, 308.0, 0.0 };
    
               Console.WriteLine("Average hue is: " + meanOfAngles(angles));
               Console.ReadLine();
           }
       }
    }
    

    It's not exactly the same thing as the normal arithmetic mean, in that it minimizes a slightly different measure of dispersion. But it does the trick!

    phrebh, if you're interested in finishing this effect I'd be glad to leave it up to you, but I essentially have working code that averages the HSL color of a selection as a byproduct of an effect I'm working on, so I'll publish it in a couple of days if I don't hear anything. Edit: since you haven't been online for a month, I've published it (see link in sig). With credit to you for the great idea, of course!

  2. Illnab1024: The dark values in the spectrogram correspond to low-amplitude coefficients in a frequency-domain representation of a certain block (maybe window?) of the original (time-domain) signal. Precise reconstruction of the waveform requires all of the coefficients. If those values aren't explicitly defined, we might implicitly assume that they're zero... but that would be rather similar to a ubiquitous form of lossy compression :(

    Rick: Thanks for the tip! The BlendColorsWFP method is close to what I'm looking for.

    barkbark00: You've identified a very important issue here: how do we run a filter at points near the boundary when the signal isn't defined everywhere? My Eigen Blur effect essentially makes up values beyond the edge of the image. I treat the image as a periodic function and continue it a little way past the bounds so that I can keep the filter well-behaved. Apparently this is a common thing to do. Some other effects that I've seen assume that those values are zero, or use a nearby value from the image. Some avoid handling pixels near the boundary altogether. Others modify the filter near the edges to avoid having to make up values, as you propose. It's hard to read, but I think that Rick's Gaussian Blur does this. Each of these approaches probably makes sense in at least one interpretation :D
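
    For illustration, here's a minimal sketch of the periodic-extension idea (the helper names are mine, not taken from Eigen Blur):

     using System;

     static class EdgeHandling
     {
         // Treat the image as periodic: wrap an out-of-range coordinate back into [0, length).
         public static int Wrap(int i, int length)
         {
             int r = i % length;
             return r < 0 ? r + length : r;
         }

         // Alternative: clamp to the nearest edge pixel ("replicate" extension).
         public static int Clamp(int i, int length)
         {
             return Math.Min(Math.Max(i, 0), length - 1);
         }
     }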

    Edit: Found a better link for link #2.

  3. This is a really good idea! One of the problems with Eigen Blur is that it doesn't correctly handle pixels that "don't exist"... i.e. those which are white and fully transparent. I was thinking about this a week or two ago, and I presumed that the problem could be solved by weighting the contribution of each pixel by the product of the Gaussian filter's weight at that location and the pixel's alpha value, keeping track of the sum of all such weights over the window, and then dividing by the sum at the end. This would essentially be an extension of what Illnab1024 has done above. I haven't actually tested it though. I think I decided that it would yield poor results in some corner case, but I can't remember the details. You should keep experimenting with this!
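
    To make the idea concrete, here's a rough sketch of the alpha-weighted average for a single output pixel. It's untested, the array layout and names are placeholders of my own, and edge handling is omitted:

     static class AlphaWeightedBlur
     {
         // Weighted average of one channel over a (2r+1)x(2r+1) window, where each
         // neighbour contributes (Gaussian weight * its alpha). Dividing by the
         // accumulated weight renormalizes, so fully transparent pixels contribute nothing.
         // Edge handling is omitted: (x, y) must be at least r pixels from every border.
         public static double BlurChannelAt(double[,] channel, double[,] alpha,
                                            double[,] kernel, int x, int y, int r)
         {
             double weightedSum = 0.0;
             double weightSum = 0.0;

             for (int dy = -r; dy <= r; dy++)
             {
                 for (int dx = -r; dx <= r; dx++)
                 {
                     double w = kernel[dy + r, dx + r] * alpha[y + dy, x + dx];
                     weightedSum += w * channel[y + dy, x + dx];
                     weightSum += w;
                 }
             }

             // If every neighbour was fully transparent, there's nothing to average.
             return weightSum > 0.0 ? weightedSum / weightSum : 0.0;
         }
     }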

    By the way, a good size for 2D convolution kernels is 5x5. If you make them any larger, they will be very slow unless you separate them into a pair of 1D filters, which is a little difficult to do. Performance in the 5x5 case is roughly the same whether the filter is separated or not... I think that many things hardcode a 5x5 kernel size for this reason.
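
    A rough sketch of the separation trick, assuming a normalized 1D kernel (border pixels are left untouched for brevity; this is not production code):

     static class SeparableFilter
     {
         // Apply a 1D kernel along rows, then along columns of the intermediate
         // result. Two passes cost O(k) per pixel each, instead of O(k*k) for the
         // equivalent 2D kernel. Pixels within r of the border are left at zero.
         public static double[,] Convolve(double[,] src, double[] kernel1D)
         {
             int h = src.GetLength(0), w = src.GetLength(1), r = kernel1D.Length / 2;
             double[,] tmp = new double[h, w];
             double[,] dst = new double[h, w];

             for (int y = 0; y < h; y++)              // horizontal pass
                 for (int x = r; x < w - r; x++)
                     for (int k = -r; k <= r; k++)
                         tmp[y, x] += kernel1D[k + r] * src[y, x + k];

             for (int y = r; y < h - r; y++)          // vertical pass
                 for (int x = 0; x < w; x++)
                     for (int k = -r; k <= r; k++)
                         dst[y, x] += kernel1D[k + r] * tmp[y + k, x];

             return dst;
         }
     }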

    And if the size is fixed, and relatively small, you can unroll it, as Illnab1024 has done in the code above. The unrolled version will be very fast, and is still easy to understand. I'm not sure how much unrolling the compiler will do; it might depend on the machine.

  4. Hi APShredder,

    Often I want to compare two layers A and B, yielding a third layer C which contains the signed differences between the respective pixels of A and B. The Difference blend mode in Paint.NET only gives the absolute difference, so I lose information about whether the pixel from the top layer was brighter than the pixel from the bottom, or the other way around.

    Difference blend mode: C(x,y,channel) = |A(x,y,channel) - B(x,y,channel)|

    Signed difference: C(x,y,channel) = [A(x,y,channel) - B(x,y,channel)] / 2 + 127.5

    Technically this can be done with built-in blend modes, Invert Colors, and Levels. But it's a ten-step process :(
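
    For reference, a per-channel sketch of the formula above (the rounding and clamping choices are mine):

     using System;

     static class SignedDifferenceBlend
     {
         // (a - b) lies in [-255, 255], so (a - b) / 2 + 127.5 lies in [0, 255],
         // with "no difference" mapping to mid-grey instead of black.
         public static byte Blend(byte a, byte b)
         {
             double c = (a - b) / 2.0 + 127.5;
             return (byte)Math.Round(Math.Min(255.0, Math.Max(0.0, c)));
         }
     }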

    Could you potentially add this to BlendModes Plus? I'd be grateful! I can upload an example of the intended effect later, if you need one.

  5. Hi Tanel! This is a great pack, one of the first ones I downloaded after I cleared out my Effects folder tonight. I thought I might let you know that "Color to Alpha" and "Color Mixer" both crash on Paint.NET v3.5 Beta 4. They load without any errors when the program starts, but immediately crash when I attempt to bring them up. Everything else can be brought up without any problems!

    Hidden Content:
    [reason]Error Messages[/reason]
    File: C:\Program Files\Paint.NET\Effects\ColorToAlpha.dll
         Effect Name: ColorToAlpha.EffectPlugin
         Full error message: System.TypeLoadException: Could not load type 'PaintDotNet.ColorGradientControl' from assembly 'PaintDotNet.Core, Version=3.50.3591.32416, Culture=neutral, PublicKeyToken=null'.
      at ColorToAlpha.ColorToAlpha.InitializeComponent()
      at ColorToAlpha.EffectPlugin.CreateConfigDialog()
      at PaintDotNet.Menus.EffectMenuBase.RunEffectImpl(Type effectType) in D:\src\pdn\paintdotnet\src\PaintDotNet\Menus\EffectMenuBase.cs:line 707
    
    File: C:\Program Files\Paint.NET\Effects\ColorMixer.dll
         Effect Name: ColorMixer.EffectPlugin
         Full error message: System.TypeLoadException: Could not load type 'PaintDotNet.ColorGradientControl' from assembly 'PaintDotNet.Core, Version=3.50.3591.32416, Culture=neutral, PublicKeyToken=null'.
      at ColorMixer.EffectPluginConfigDialog.InitializeComponent()
      at ColorMixer.EffectPlugin.CreateConfigDialog()
      at PaintDotNet.Menus.EffectMenuBase.RunEffectImpl(Type effectType) in D:\src\pdn\paintdotnet\src\PaintDotNet\Menus\EffectMenuBase.cs:line 707

  6. Strange discovery: there's a subtle synchronization problem with the code that I recommended above! This particular pattern, which some call double-checked locking, is apparently quite unpredictable across different platforms; there's quite a bit of reading material out there.

    Synopsis: The problem is not that the computation can occur more than once - rather, it's that it might not finish on time. An out-of-order processor may reorder the writes in the inner if-block in a way that does not violate causal consistency, but which allows the flag to be cleared before the results of the computation are visible to other processors. Another concurrent thread may see the cleared flag and skip the critical section. That thread may then make use of the incomplete results.

    There are a few ways to solve this. One is to use Rick's recommendation, which is thread-safe, but not as fast. The solution that I like best uses a memory barrier. All writes which occur above the memory barrier instruction in the program must be committed to memory before any subsequent writes can proceed. This guarantees that the flag will never be cleared before the result has been computed and written out.

    Edit: Code segment removed, because it's probably incorrect. The CLR guarantees that these writes won't be reordered anyways, and the code is potentially susceptible to other problems.

  7. Very interesting! I agree with your conclusion, and I'll be sure to consider the amortized analysis more carefully in the future. Thanks for the very helpful explanation!

    Can you provide any more information regarding how performance generalizes to d-pass filters, particularly when d becomes large? It seems to me that for filters consisting of a number of small 2D convolutions in series, computation of a given rectangle at stage d requires an incrementally larger rectangle of results from stage d-1, and so rectangles at that stage may overlap. Considering all passes of the filter, S would be in O(d * n^2). But I'd be interested to hear if S can be shown to be in O(n) or O(d * n) in this case.

  8. Julian, this is quite correct, but it seems to apply only to a square image and a selection that contains the whole image, which is perhaps close to the best case! In particular, choose log n regions, each of width W, for a total area of W * log n. Our selection was only possible if H >= log n, so n >= W * log n, hence W <= n / log n, and W is in O(n / log n).

    Then the upper bound from the recurrence is O(n log n), and the upper bound from amortized analysis is O(W log^2 n) = O(n log n), and so the bounds concur in this case!

    Intuitively I think amortized analysis would only help if, over the course of the run, the organization of the data structure was being improved in some way that made future operations significantly faster. By the way, this is the best mental workout I've had in a while; I hadn't used amortized analysis for at least a year. Thanks for the very stimulating discussion!

  9. OK, I took another look at this. I agree that the amortized cost per pixel is in O(log n). But this actually yields a looser upper bound than analysis of the recurrence: if the amortized cost per pixel is in O(log n) and the size of the rectangle is in O(n) *, one obtains an O(n log n) upper bound on the cost of the whole Union or Difference.

    * The rectangle has size in O(W). Depending on the shape of the input, W could be in either O(sqrt n) or O(n), so in general it's in O(n).

  10. No problem Julian, glad it helped. I considered an amortized analysis but wagered that I couldn't get a better lower bound for the problem than Ω(n) unless the sets were guaranteed to be disjoint. If you can reformulate this as a disjoint-set union problem, you might be able to use a union-find data structure to achieve nearly-constant (inverse-Ackermann) amortized time for a sequence of set-creation, union, and find operations. The disadvantage of this approach would be that the results wouldn't be easy to traverse in any sensible order. I thought about formulating Union as a 2D range-search query on the kd-tree, but that gave worse performance than Θ(n).
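
    For anyone curious, a minimal union-find sketch with path compression and union by rank; this is generic textbook code, not something tied to the kd-tree implementation:

     class DisjointSets
     {
         // Disjoint-set forest: Find and Union run in near-constant amortized time
         // (inverse-Ackermann) when using path compression and union by rank.
         private int[] parent;
         private int[] rank;

         public DisjointSets(int n)
         {
             parent = new int[n];
             rank = new int[n];
             for (int i = 0; i < n; i++) parent[i] = i;
         }

         public int Find(int x)
         {
             // Path compression: point x directly at the root of its set.
             if (parent[x] != x) parent[x] = Find(parent[x]);
             return parent[x];
         }

         public void Union(int x, int y)
         {
             int rx = Find(x), ry = Find(y);
             if (rx == ry) return;
             // Union by rank: attach the shallower tree under the deeper one.
             if (rank[rx] < rank[ry]) { parent[rx] = ry; }
             else if (rank[rx] > rank[ry]) { parent[ry] = rx; }
             else { parent[ry] = rx; rank[rx]++; }
         }
     }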

    By the way, if you construct a checkerboard sort of thing, where the computed/computing region is either the set of white squares or the set of black squares, that seems to hit all the upper bounds.

  11. Say I put a method call in a loop test. I know that the method is referentially transparent (for all intents and purposes). So intuition tells me that the jitter will extensively analyze all relevant assemblies, determine that the method is side-effect free, and lift the call out of the loop *. But the bytecode tells me that my method will be called a few million times.

    To resolve the ambiguity, I must do one of the following:

    1. Accurately predict how the jitter will transform the loop on this machine, or

    2. Conduct careful performance tests, or

    3. Disassemble and read the machine code.

    So Tim's hypothetical optimization may or may not be an optimization, but hopefully we've learned something from this: namely, that if we set the upper bound by calling our referentially transparent method before entering the loop, as Tim suggests, the performance characteristics of the resulting program are substantially less surprising to someone who lacks up-to-date and detailed knowledge of the machine-dependent optimizations performed by the jitter.

    I must admit that it does introduce a problem: the loop's upper bound is now accessible outside of the loop's scope. Frankly, I'm more confident in my ability to read C# code than in my ability to mentally debug the jitter, so I will use Tim's solution from now on.
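
    A small illustration of the two shapes being discussed; the collection, the property, and the loop bodies are placeholders of my own:

     using System.Collections.Generic;

     static class LoopHoistingExample
     {
         // Version 1: the property get appears in the loop test on every iteration.
         // Whether the jitter hoists it out depends on analysis we can't easily predict.
         static long SumInLoopTest(List<int> items)
         {
             long total = 0;
             for (int i = 0; i < items.Count; i++)
                 total += items[i];
             return total;
         }

         // Version 2: read the upper bound once, before the loop. The get now runs
         // exactly once regardless of what the jitter decides, at the cost of
         // exposing 'count' outside the loop body.
         static long SumWithHoistedBound(List<int> items)
         {
             long total = 0;
             int count = items.Count;
             for (int i = 0; i < count; i++)
                 total += items[i];
             return total;
         }
     }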

    Hidden Content:
    [reason]*[/reason]Actually... the get is side-effect free, but definitely not referentially transparent. Another thread could intervene during the execution of the loop and change the property's value. Even if the jitter does do enough analysis to determine whether it is safe to lift the get, its willingness to do so will depend upon the precise circumstances under which every possible set might occur, in every related assembly. This would make it extremely hard for a programmer to reason about the optimizations performed by the jitter without being familiar with the entire code base. So intuition tells me that it will not lift. But I don't know much about the jitter, so please take this with a grain of salt.

  12. Thanks for posting this, Julian. I've read it from start to finish. It's a fantastic contribution, and must have involved a lot of work. In particular, your management of the "computing" and "computed" sets with a kd-tree is very interesting! I hadn't even considered something like this.

    I don't think that the worst-case overhead is in O(log n), though. The running time of the Union operation can be described by the recurrence T(n) = T(floor(n/2)) + T(ceiling(n/2)) + f(n), with f(n) in O(1). So it follows from case 1 of the Master Theorem that the running time is in Θ(n). Likewise for the Difference operation. Both take stack space in O(log n) and the tree occupies O(n log n) space in memory.

    I'm sure that the average-case running time of the tree solution is good, though. I can't come up with a sensible average case to test. But logarithmic seems like a safe bet.

    Hope this helps.

  13. In a sense. That corrects based on an actual color chart. The Bayesian algorithm corrects without needing a chart. It computes a posterior probability distribution for the color of the illuminant, based upon the observed intensities of the image pixels, a model of surface reflectance, and prior knowledge about what sort of color an illuminant might have. Then it estimates the true color of each pixel by "inverting" the illumination. I think the charts are just in the pictures to demonstrate its performance. The two plug-ins I mentioned also correct without a chart, but I imagine they use heuristics to do so. (edited for clarity)

  14. Hi, Rick. Thanks for writing. Just wanted to clear up a misconception regarding code segment 1: "However, when the work items first start out it's possible that the computation will be performed up to N times, where N is the # of threads being used."

    If the flag is not set, as many as N threads may queue up to enter the monitor. They will enter in turn. The first thread to enter the monitor will always compute the result and clear the flag before leaving *. After this, any threads that enter the monitor will immediately exit without performing the computation, because the flag has already been cleared *. So the inner if-statement prevents the second computation, and the outer one is just an optimization to avoid unnecessary acquisition of the lock.

    * Obviously these statements are not true if one admits break, goto, return, computations that don't terminate, tampering with the lock or flag, probably many other things.
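
    For concreteness, here's a minimal sketch of the pattern being described, using a volatile flag; the names and the placeholder computation are mine, not the original code segment:

     class LazyResult
     {
         private readonly object sync = new object();

         // The flag starts set and is cleared once the result is ready. Marking it
         // volatile gives the write release semantics: the result must become
         // visible before the cleared flag does, which addresses the reordering
         // problem discussed in an earlier post.
         private volatile bool needsCompute = true;
         private double[] result;

         public double[] Get()
         {
             // Outer check: cheap fast path that skips the lock once the work is done.
             if (needsCompute)
             {
                 lock (sync)
                 {
                     // Inner check: only the first thread into the monitor computes.
                     if (needsCompute)
                     {
                         result = ComputeExpensiveResult();
                         needsCompute = false;
                     }
                 }
             }
             return result;
         }

         private static double[] ComputeExpensiveResult()
         {
             // Placeholder for the real work.
             return new double[] { 1.0, 2.0, 3.0 };
         }
     }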

    Thanks for mentioning that the first slice is run by itself! It's quite helpful to know. In some cases I observe two threads entering MJW's critical section on my dual-core machine, which I can't seem to explain. This only occurs for wide, short regions.
