Issue 7307076: Fix a few bugs in both fast and slow blur; implementations now match visually. Also provide a way …

Humper

https://codereview.appspot.com/7307076/diff/2001/src/effects/SkBlurMask.cpp File src/effects/SkBlurMask.cpp (left): https://codereview.appspot.com/7307076/diff/2001/src/effects/SkBlurMask.cpp#oldcode32 src/effects/SkBlurMask.cpp:32: static int boxBlur(const uint8_t* src, int src_y_stride, uint8_t* dst, ...

12 years, 6 months ago (2013-02-08 19:02:32 UTC) #1

Stephen White

12 years, 6 months ago (2013-02-08 19:24:18 UTC) #2

https://codereview.appspot.com/7307076/diff/2001/src/effects/SkBlurMask.cpp
File src/effects/SkBlurMask.cpp (left):

https://codereview.appspot.com/7307076/diff/2001/src/effects/SkBlurMask.cpp#o...
src/effects/SkBlurMask.cpp:32: static int boxBlur(const uint8_t* src, int
src_y_stride, uint8_t* dst,
On 2013/02/08 19:02:32, Humper wrote:
> box blur functions changed to use 16 bit fixed point and read/write to temporary
> copies of the data for added precision.

Have you benchmarked this?  How does it affect performance?

https://codereview.appspot.com/7307076/diff/2001/src/effects/SkBlurMask.cpp
File src/effects/SkBlurMask.cpp (right):

https://codereview.appspot.com/7307076/diff/2001/src/effects/SkBlurMask.cpp#n...
src/effects/SkBlurMask.cpp:54: *dptr = (sum * scale + (1 << 15)) >> 16; \
On 2013/02/08 19:02:32, Humper wrote:
> effectively add 1/2; this prevents bias and makes sure that we blur an array of
> 255's to an array of 255's, not 249's.

This looks good, but could you benchmark all the rounding alone, separately from
the 16bit change?
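
To make the effect of the `+ (1 << 15)` term concrete, here is a minimal sketch of the 16-bit fixed-point divide from the line above, applied to an interior pixel of a constant run of 255s. The helper names (`averageFixed`, `repeatPasses`) and the choice of a 3-tap kernel run six times (three box passes in each of two directions) are assumptions for illustration, not taken from the patch.

```cpp
#include <cstdint>

// Hypothetical helper: divide `sum` by `kernelSize` using a truncated
// 16-bit reciprocal, optionally adding one half (1 << 15) before the shift.
inline uint32_t averageFixed(uint32_t sum, uint32_t kernelSize, bool roundToNearest) {
    const uint32_t scale = (1u << 16) / kernelSize;  // reciprocal, rounded down
    const uint32_t half = roundToNearest ? (1u << 15) : 0u;
    return (sum * scale + half) >> 16;
}

// Hypothetical helper: push one interior pixel of a constant-valued row
// through `passes` successive box passes. In the interior the window sum
// is simply value * kernelSize.
inline uint32_t repeatPasses(uint32_t value, uint32_t kernelSize, int passes,
                             bool roundToNearest) {
    for (int i = 0; i < passes; ++i) {
        value = averageFixed(value * kernelSize, kernelSize, roundToNearest);
    }
    return value;
}
```

Under these assumptions the truncating version loses one level per pass, so 255 falls to 249 after six passes, while the rounded version stays at 255. Whether that is exactly where the 249 quoted above came from is a guess.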

https://codereview.appspot.com/7307076/diff/2001/src/effects/SkBlurMask.cpp#n...
src/effects/SkBlurMask.cpp:255: for (; x < new_width - border - 16; x += 16) {
On 2013/02/08 19:02:32, Humper wrote:
> Make sure that we touch exactly the right pixels; before this was causing some
> weird issues in the output size of the blurred mask where for certain border
> values it would be off by one.  Very noticeable when I animated the blur radius
> from 1 to 20; artifacts now gone.

Could you give me an example of the input parameters where this would occur?
(Not saying you're wrong, I just want to understand.)

https://codereview.appspot.com/7307076/diff/2001/src/effects/SkBlurMask.cpp#n...
src/effects/SkBlurMask.cpp:873: SkScalar passRadius = (kHigh_Quality == quality)
? SkScalarMul( radius, kBlurRadiusFudgeFactor): radius;
This is definitely going to change behaviour:  before we were only forcing low
quality when the input radius was < 3; now we're only going to do it when the
*adjusted* radius is < 3.  (I suspect this might be where the animation
artifacts actually went away, although I could be wrong.)
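
A small sketch of the ordering difference described above. The names here (`Quality`, `kFudge`, `effectiveQualityOld`, `effectiveQualityNew`) are hypothetical stand-ins rather than the Skia identifiers, and the fudge factor is assumed to be 1/sqrt(3), as mentioned elsewhere in this review.

```cpp
#include <cmath>

// Hypothetical stand-ins for the names discussed in the review.
enum Quality { kLow_Quality, kHigh_Quality };
const float kFudge = 1.0f / std::sqrt(3.0f);  // assumed fudge factor

// Old ordering: the low-quality fallback is decided on the caller's radius.
inline Quality effectiveQualityOld(float radius, Quality quality) {
    if (radius < 3.0f) {
        quality = kLow_Quality;
    }
    return quality;
}

// Ordering in this patch set: the radius is scaled first, then tested.
inline Quality effectiveQualityNew(float radius, Quality quality) {
    const float passRadius = (quality == kHigh_Quality) ? radius * kFudge : radius;
    if (passRadius < 3.0f) {
        quality = kLow_Quality;
    }
    return quality;
}
```

With these assumptions, a high-quality request with radius 4 stays high quality under the old ordering but is forced to low quality under the new one, because 4 / sqrt(3) is about 2.31.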

robertphillips

nits https://codereview.appspot.com/7307076/diff/2001/bench/BlurRectBench.cpp File bench/BlurRectBench.cpp (right): https://codereview.appspot.com/7307076/diff/2001/bench/BlurRectBench.cpp#newcode60 bench/BlurRectBench.cpp:60: should be loopCount. Why not just set loopCount ...

12 years, 6 months ago (2013-02-08 19:33:25 UTC) #3

Stephen White

12 years, 6 months ago (2013-02-08 19:40:02 UTC) #4

If it helps, I've attached what the code looks like with the unrolling taken
out, and the two cases (width < diameter, width >= diameter) broken into
separate clauses.  Hopefully that makes it a bit clearer.

I think you can see from this version that both cases fill exactly width +
diameter = width + leftRadius + rightRadius pixels, which I believe is correct.

static int boxBlur(const uint8_t* src, int src_y_stride, uint8_t* dst,
                   int leftRadius, int rightRadius, int width, int height,
                   bool transpose)
{
    int diameter = leftRadius + rightRadius;
    int kernelSize = diameter + 1;
    uint32_t scale = (1 << 24) / kernelSize;
    int new_width = width + SkMax32(leftRadius, rightRadius) * 2;
    int dst_x_stride = transpose ? height : 1;
    int dst_y_stride = transpose ? 1 : new_width;
    if (width < diameter) {
        for (int y = 0; y < height; ++y) {
            int sum = 0;
            uint8_t* dptr = dst + y * dst_y_stride;
            const uint8_t* right = src + y * src_y_stride;
            const uint8_t* left = right;
            int x = 0;
            // Left border: the window grows until it covers the whole row.
            for (; x < width; ++x) {
                sum += *right++;
                *dptr = (sum * scale) >> 24;
                dptr += dst_x_stride;
            }
            // Interior: the window holds every source pixel, so the value is constant.
            for (; x < diameter; ++x) {
                *dptr = (sum * scale) >> 24;
                dptr += dst_x_stride;
            }
            // Right border: the window shrinks as source pixels leave it.
            x = 0;
            for (; x < width; ++x) {
                *dptr = (sum * scale) >> 24;
                sum -= *left++;
                dptr += dst_x_stride;
            }
            SkASSERT(sum == 0);
        }
    } else {
        for (int y = 0; y < height; ++y) {
            int sum = 0;
            uint8_t* dptr = dst + y * dst_y_stride;
            const uint8_t* right = src + y * src_y_stride;
            const uint8_t* left = right;
            int x = 0;
            // Left border: the window grows to "diameter" pixels.
            for (; x < diameter; ++x) {
                sum += *right++;
                *dptr = (sum * scale) >> 24;
                dptr += dst_x_stride;
            }
            // Blurred center: one pixel enters and one leaves per step.
            for (; x < width; ++x) {
                sum += *right++;
                *dptr = (sum * scale) >> 24;
                sum -= *left++;
                dptr += dst_x_stride;
            }
            // Right border: the window shrinks back to empty.
            x = 0;
            for (; x < diameter; ++x) {
                *dptr = (sum * scale) >> 24;
                sum -= *left++;
                dptr += dst_x_stride;
            }
            SkASSERT(sum == 0);
        }
    }
    return new_width;
}

Humper

12 years, 6 months ago (2013-02-08 19:51:21 UTC) #5

https://codereview.appspot.com/7307076/diff/2001/bench/BlurRectBench.cpp
File bench/BlurRectBench.cpp (right):

https://codereview.appspot.com/7307076/diff/2001/bench/BlurRectBench.cpp#newc...
bench/BlurRectBench.cpp:60: 
On 2013/02/08 19:33:25, robertphillips wrote:
> should be loopCount. Why not just set loopCount to 1000 and then only change it
> when > 25?

That was a very recent change and is definitely a bug.  I'll clean it up.

https://codereview.appspot.com/7307076/diff/2001/bench/BlurRectBench.cpp#newc...
bench/BlurRectBench.cpp:177: 
They got forgotten about :(  Will re-enable.

On 2013/02/08 19:33:25, robertphillips wrote:
> What's up with these guys?

https://codereview.appspot.com/7307076/diff/2001/src/effects/SkBlurMask.cpp
File src/effects/SkBlurMask.cpp (left):

https://codereview.appspot.com/7307076/diff/2001/src/effects/SkBlurMask.cpp#o...
src/effects/SkBlurMask.cpp:32: static int boxBlur(const uint8_t* src, int
src_y_stride, uint8_t* dst,
Not separately.  I'll do that now.

On 2013/02/08 19:24:18, Stephen White wrote:
> On 2013/02/08 19:02:32, Humper wrote:
> > box blur functions changed to use 16 bit fixed point and read/write to temporary
> > copies of the data for added precision.
> 
> Have you benchmarked this?  How does it affect performance?

https://codereview.appspot.com/7307076/diff/2001/src/effects/SkBlurMask.cpp
File src/effects/SkBlurMask.cpp (right):

https://codereview.appspot.com/7307076/diff/2001/src/effects/SkBlurMask.cpp#n...
src/effects/SkBlurMask.cpp:54: *dptr = (sum * scale + (1 << 15)) >> 16; \
Sorry, do you mean benchmark the code with and without the + (1 << 15)?

On 2013/02/08 19:24:18, Stephen White wrote:
> On 2013/02/08 19:02:32, Humper wrote:
> > effectively add 1/2; this prevents bias and makes sure that we blur an array of
> > 255's to an array of 255's, not 249's.
> 
> This looks good, but could you benchmark all the rounding alone, separately from
> the 16bit change?

https://codereview.appspot.com/7307076/diff/2001/src/effects/SkBlurMask.cpp#n...
src/effects/SkBlurMask.cpp:255: for (; x < new_width - border - 16; x += 16) {
I wish I had written them down :(  I'll see if I can reproduce it in the master
branch.

Basically what I would see is that as I increased the size of the blur radius,
the blur profile on the right hand side wasn't growing like you'd expect.  There
would be two or even sometimes three blur radii where the output mask was
exactly the same width, even though the blur was getting wider.  I initially
thought this had to do with floor'ing something instead of round'ing it, but in
the end I realized that if we're going to touch "border" pixels on the left hand
side of the output mask, we should definitely touch "border" pixels on the right
hand side, so I rewrote the loop this way and it went away.

I'll see if I can repro it and send you a specific example.

On 2013/02/08 19:24:18, Stephen White wrote:
> On 2013/02/08 19:02:32, Humper wrote:
> > Make sure that we touch exactly the right pixels; before this was causing some
> > weird issues in the output size of the blurred mask where for certain border
> > values it would be off by one.  Very noticeable when I animated the blur radius
> > from 1 to 20; artifacts now gone.
> 
> Could you give me an example of the input parameters where this would occur (not
> saying you're wrong, I just want to understand).
>

https://codereview.appspot.com/7307076/diff/2001/src/effects/SkBlurMask.cpp#n...
src/effects/SkBlurMask.cpp:873: SkScalar passRadius = (kHigh_Quality == quality)
? SkScalarMul( radius, kBlurRadiusFudgeFactor): radius;
Nuts, I didn't mean to move this above the low quality check, nice catch.

On 2013/02/08 19:24:18, Stephen White wrote:
> This is definitely going to change behaviour:  before we were only forcing low
> quality when the input radius was < 3; now we're only going to do it when the
> *adjusted* radius is < 3.  (I suspect this might be where the animation
> artifacts actually went away, although I could be wrong.)

https://codereview.appspot.com/7307076/diff/2001/src/effects/SkBlurMask.cpp#n...
src/effects/SkBlurMask.cpp:884: SkScalar box_width = SkScalarMul(passRadius,
radius_multiplier);
On 2013/02/08 19:33:25, robertphillips wrote:
> SK_ScalarHalf

hero!

Stephen White

12 years, 6 months ago (2013-02-08 20:07:11 UTC) #6

https://codereview.appspot.com/7307076/diff/2001/src/effects/SkBlurMask.cpp
File src/effects/SkBlurMask.cpp (right):

https://codereview.appspot.com/7307076/diff/2001/src/effects/SkBlurMask.cpp#n...
src/effects/SkBlurMask.cpp:255: for (; x < new_width - border - 16; x += 16) {
On 2013/02/08 19:51:21, Humper wrote:
> Basically what I would see is that as I increased the size of the blur radius,
> the blur profile on the right hand side wasn't growing like you'd expect.  There
> would be two or even sometimes three blur radii where the output mask was
> exactly the same width, even though the blur was getting wider.

Part of that may be due to the 1/sqrt(3) fudge factor:  the mask size will only
grow when the *adjusted* radius grows by a pixel.  (Sorry if this is obvious.)
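
As a rough illustration of that plateau effect, the sketch below counts how many distinct whole-pixel radii come out of scaling the input radii 1..maxRadius by 1/sqrt(3). The function name is hypothetical, and rounding up with `std::ceil` is an assumption; this thread does not say how the adjusted radius is converted to a pixel count.

```cpp
#include <cmath>
#include <set>

// Hypothetical helper: number of distinct integer radii produced when the
// input radii 1..maxRadius are scaled by 1/sqrt(3) and rounded up.
// The use of std::ceil is an assumption made for this sketch.
inline int distinctAdjustedRadii(int maxRadius) {
    const double fudge = 1.0 / std::sqrt(3.0);
    std::set<int> seen;
    for (int r = 1; r <= maxRadius; ++r) {
        seen.insert(static_cast<int>(std::ceil(r * fudge)));
    }
    return static_cast<int>(seen.size());
}
```

Under that assumption, animating the radius from 1 to 20 yields only 12 distinct adjusted radii, so several neighbouring input radii necessarily share a mask size.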

> I initially
> thought this had to do with floor'ing something instead of round'ing it, but in
> the end I realized that if we're going to touch "border" pixels on the left hand
> side of the output mask, we should definitely touch "border" pixels on the right
> hand side, so I rewrote the loop this way and it went away.

That's not what this code does.  In the width < diameter case, it fills "width"
pixels with the left border, then "diameter - width" with a constant, then it
fills "width" pixels with the right border.  In the width >= diameter case, it
fills "diameter" pixels with the left border, then "width - diameter" with the
blurred center, then it fills "diameter" pixels with the right border.  In both
cases, it should fill exactly width + LR + RR pixels.  (Note that each border
straddles the edge of the primitive, which is why they're "diameter", and not
just LR or RR in size.  In the first case, that makes them overlap, so they
cancel each other out and the interior is a constant.  I probably should hoist
the constant out of the loop, but I got lazy and am relying on the compiler to
do CSE.)
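
The pixel accounting above can be checked with a small single-row sketch. `RowResult` and `boxBlurRow` are hypothetical names; the loop structure follows the unrolled-free version posted earlier in this review, folded into one function, and it uses the same truncating 24-bit scale.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

struct RowResult {
    std::vector<uint8_t> pixels;  // blurred output row
    uint32_t leftoverSum;         // running sum after the last pixel; expected to be 0
};

// Hypothetical single-row box blur: emits width + leftRadius + rightRadius pixels.
inline RowResult boxBlurRow(const std::vector<uint8_t>& src, int leftRadius, int rightRadius) {
    const int width = static_cast<int>(src.size());
    const int diameter = leftRadius + rightRadius;
    const uint32_t scale = (1u << 24) / static_cast<uint32_t>(diameter + 1);
    const int border = std::min(width, diameter);  // size of each border ramp

    RowResult out{{}, 0};
    uint32_t sum = 0;
    int left = 0, right = 0;
    auto emit = [&]() { out.pixels.push_back(static_cast<uint8_t>((sum * scale) >> 24)); };

    // Left border: source pixels enter the window.
    for (int x = 0; x < border; ++x) { sum += src[right++]; emit(); }
    // width < diameter only: the window already holds the whole row.
    for (int x = width; x < diameter; ++x) { emit(); }
    // width >= diameter only: one pixel enters and one leaves per step.
    for (int x = diameter; x < width; ++x) { sum += src[right++]; emit(); sum -= src[left++]; }
    // Right border: source pixels leave the window.
    for (int x = 0; x < border; ++x) { emit(); sum -= src[left++]; }

    out.leftoverSum = sum;
    return out;
}
```

For a 5-pixel row with LR = RR = 1 this emits 7 pixels, and for a 2-pixel row with LR = RR = 2 it emits 6; in both cases the running sum returns to zero, matching the SkASSERT in the posted code.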

Humper

https://codereview.appspot.com/7307076/diff/2001/bench/BlurRectBench.cpp File bench/BlurRectBench.cpp (right): https://codereview.appspot.com/7307076/diff/2001/bench/BlurRectBench.cpp#newcode60 bench/BlurRectBench.cpp:60: haha loop_count isn't even used! :D On 2013/02/08 19:51:21, ...

12 years, 6 months ago (2013-02-08 20:13:15 UTC) #7

Stephen White

https://codereview.appspot.com/7307076/diff/2001/src/effects/SkBlurMask.cpp File src/effects/SkBlurMask.cpp (right): https://codereview.appspot.com/7307076/diff/2001/src/effects/SkBlurMask.cpp#newcode54 src/effects/SkBlurMask.cpp:54: *dptr = (sum * scale + (1 << 15)) ...

12 years, 6 months ago (2013-02-08 20:13:18 UTC) #8

Humper

Okay I've done all the style cleanups Robert mentioned. I'm going to benchmark the 16 ...

12 years, 6 months ago (2013-02-08 20:14:54 UTC) #9

Humper

https://codereview.appspot.com/7307076/diff/2001/src/effects/SkBlurMask.cpp File src/effects/SkBlurMask.cpp (right): https://codereview.appspot.com/7307076/diff/2001/src/effects/SkBlurMask.cpp#newcode54 src/effects/SkBlurMask.cpp:54: *dptr = (sum * scale + (1 << 15)) ...

12 years, 6 months ago (2013-02-08 21:36:29 UTC) #10

Stephen White

12 years, 6 months ago (2013-02-08 21:50:30 UTC) #11

https://codereview.appspot.com/7307076/diff/2001/src/effects/SkBlurMask.cpp
File src/effects/SkBlurMask.cpp (right):

https://codereview.appspot.com/7307076/diff/2001/src/effects/SkBlurMask.cpp#n...
src/effects/SkBlurMask.cpp:54: *dptr = (sum * scale + (1 << 15)) >> 16; \
On 2013/02/08 21:36:29, Humper wrote:
> After talking to Mike briefly about this, he agreed that the + (1 << 15) code
> was needed for correctness, so benching it separately wasn't important for
> landing this.

Yes, I agree that the rounding is good.  I'm not sure about the uint16_t change,
though.  All the other blur implementations in Skia and WebKit use same-sized (8
or 8888) intermediate buffers, and see no visible loss of quality AFAIK.  Even
the non-separable blur in this file used 8-bit temporary buffers between the 3
SAT passes, and it seemed to be OK.  You're also adding a conversion step that
wasn't there before, and possibly could be skipped by templating the blur loop
to go from 8 -> 16 on the first pass, if 16 bit is absolutely necessary.

> I'm submitting some bench trybots with the new code so we can compare against
> ToT performance.
> 
> On 2013/02/08 20:13:18, Stephen White wrote:
> > On 2013/02/08 19:51:21, Humper wrote:
> > > Sorry, do you mean benchmark the code with and without the + (1 << 15) 
> > 
> > Yes, I'd like to see how each change (16bit, rounding) separately affects
> > performance.  (I actually had a 32bit flavour of this at some point, and it
> > wasn't horribly slower on Intel, but I have no idea how ARM devices might
> > behave.)
> > 
> > > On 2013/02/08 19:24:18, Stephen White wrote:
> > > > On 2013/02/08 19:02:32, Humper wrote:
> > > > > effectively add 1/2; this prevents bias and makes sure that we blur an array of
> > > > > 255's to an array of 255's, not 249's.
> > > > 
> > > > This looks good, but could you benchmark all the rounding alone, separately from
> > > > the 16bit change?

Humper

12 years, 6 months ago (2013-02-08 21:51:50 UTC) #12

https://codereview.appspot.com/7307076/diff/2001/src/effects/SkBlurMask.cpp
File src/effects/SkBlurMask.cpp (right):

https://codereview.appspot.com/7307076/diff/2001/src/effects/SkBlurMask.cpp#n...
src/effects/SkBlurMask.cpp:54: *dptr = (sum * scale + (1 << 15)) >> 16; \
Yep, we're in agreement -- I'm just waiting for the bench trybots to finish
their thing before asking anyone to sign off.  Might not happen until Monday at
this rate :)

Humper

Bench results here: http://70.32.156.53:10117/builders/Skia_Shuttle_Ubuntu12_ATI5770_Float_Bench_64_Trybot/builds/12/steps/BenchWebpagePictures/logs/stdio Greg Humphreys | Software Engineer [skia, chrome] | humper@google.com | 434-260-0543 ...

12 years, 6 months ago (2013-02-09 14:13:36 UTC) #13

Bench results here:

http://70.32.156.53:10117/builders/Skia_Shuttle_Ubuntu12_ATI5770_Float_Bench_...

Greg Humphreys | Software Engineer [skia, chrome] | humper@google.com |
434-260-0543


Humper

12 years, 6 months ago (2013-02-09 14:14:06 UTC) #14

I also submitted bench runs for other platforms, but the system didn't send
me mail when they were done like I thought it would, so I have to hunt
those down.

reed1

12 years, 6 months ago (2013-02-11 14:36:33 UTC) #15

... what is the summary of the bench-results for your CL?


Humper

12 years, 6 months ago (2013-02-11 14:38:34 UTC) #16

Ah, the summary is that the bench trybots don't email me when they're done,
which limits their usefulness since it can take a pretty long time for the
results to finish.  Now I have to go hunt down all the results and compare
them to ToT.

I'll do that now.

Greg Humphreys | Software Engineer [skia, chrome] | humper@google.com |
434-260-0543


On Mon, Feb 11, 2013 at 9:36 AM, Mike Reed <reed@google.com> wrote:

> ... what is the summary of the bench-results for your CL?


Humper

Nevermind, the mails were just going to spam. The summary is that it seems some ...

12 years, 6 months ago (2013-02-11 15:29:17 UTC) #17

Nevermind, the mails were just going to spam.

The summary is that it seems some blur benches got a *little* faster, and
some got a *little* slower.

I'm going to do a before/after image comparison of 16 vs. 8 bit fixed point
intermediate calculations and see if we notice any appreciable difference,
both between the 16/8 versions themselves, and also between the analytic
solutions and the 16/8 versions.
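
A before/after comparison like this can be reduced to two numbers per image pair: the largest per-pixel difference and how many pixels differ at all. The following is a minimal sketch only; `MaskDiff` and `compare_masks` are hypothetical names, not Skia API, and it assumes both masks are tightly packed 8-bit buffers of the same size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Summary of the per-pixel differences between two 8-bit masks.
struct MaskDiff {
    int         maxAbsDiff;    // largest |a[i] - b[i]| over all pixels
    std::size_t numDiffering;  // number of pixels where a[i] != b[i]
};

// Hypothetical helper (not part of Skia): compares two equally sized A8 buffers.
static MaskDiff compare_masks(const std::vector<uint8_t>& a,
                              const std::vector<uint8_t>& b) {
    assert(a.size() == b.size());
    MaskDiff diff = {0, 0};
    for (std::size_t i = 0; i < a.size(); ++i) {
        // Widen to int before subtracting so the difference cannot wrap.
        int d = std::abs(static_cast<int>(a[i]) - static_cast<int>(b[i]));
        if (d != 0) {
            ++diff.numDiffering;
        }
        if (d > diff.maxAbsDiff) {
            diff.maxAbsDiff = d;
        }
    }
    return diff;
}
```

Running it once for the 8-bit vs. 16-bit outputs and once for each of those against the analytic result would give the three comparisons described above.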

Greg Humphreys | Software Engineer [skia, chrome] | humper@google.com |
434-260-0543



Humper

In these experiments I saw that in 8 bit mode we lack the precision to ...

12 years, 6 months ago (2013-02-11 17:23:56 UTC) #18

In these experiments I saw that in 8 bit mode we lack the precision to make
sure that averaging a bunch of 255's yields 255.  They don't quite dip down
as low as 249 like before, but in many cases I did see them go down to 253.

tl;dr version: the performance impact of making 16 bit fixed point copies for
intermediate calculations seems to be basically noise, and the results are
measurably higher quality.
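
To illustrate why extra bits between passes can matter, here is a simplified model, not the code in this CL. It assumes each box pass computes `(sum * scale) >> 16` with a truncating reciprocal `scale = (1 << 16) / n` and no rounding term, and it only looks at a constant all-white signal. All function names are made up for the sketch. It does not reproduce the 253 figure above, which depends on the real kernels.

```cpp
#include <cassert>
#include <cstdint>

// Truncating 16.16 reciprocal of the kernel width (assumed form, not the CL's code).
static uint32_t box_scale(int n) {
    assert(n > 0);
    return (1u << 16) / static_cast<uint32_t>(n);
}

// One box pass over a constant signal: every output is the mean of n copies of v,
// computed as (sum * scale) >> 16 with no rounding term.
static uint32_t box_pass_truncating(uint32_t v, int n) {
    uint64_t sum = static_cast<uint64_t>(v) * static_cast<uint64_t>(n);
    return static_cast<uint32_t>((sum * box_scale(n)) >> 16);
}

// Three passes, storing plain 8-bit values between passes.
static uint8_t three_passes_8bit(uint8_t v, int n) {
    uint32_t x = v;
    for (int pass = 0; pass < 3; ++pass) {
        x = box_pass_truncating(x, n);  // never exceeds the input, so it fits in 8 bits
    }
    return static_cast<uint8_t>(x);
}

// Three passes, storing 8.8 fixed-point values (16 bits) between passes,
// then rounding back to 8 bits once at the end.
static uint8_t three_passes_16bit(uint8_t v, int n) {
    uint32_t x = static_cast<uint32_t>(v) << 8;
    for (int pass = 0; pass < 3; ++pass) {
        x = box_pass_truncating(x, n);
    }
    return static_cast<uint8_t>((x + 128) >> 8);
}
```

With `n = 3` the 8-bit path loses one level per pass (255, 254, 253, 252), while the 8.8 path loses only one 1/256th step per pass and still rounds to 255 at the end. Under this model the loss comes from the truncating divide; the wider buffer only keeps it from compounding into whole levels.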

Greg Humphreys | Software Engineer [skia, chrome] | humper@google.com |
434-260-0543



Humper

CL updated with some style cleanups as well as support for rectangles that are thinner ...

12 years, 6 months ago (2013-02-12 14:56:28 UTC) #19

CL updated with some style cleanups as well as support for rectangles that are
thinner than the support of the blur kernel.


reed1

https://codereview.appspot.com/7307076/diff/13002/src/effects/SkBlurMask.cpp File src/effects/SkBlurMask.cpp (right): https://codereview.appspot.com/7307076/diff/13002/src/effects/SkBlurMask.cpp#newcode198 src/effects/SkBlurMask.cpp:198: { I know its not new to this CL, ...

12 years, 6 months ago (2013-02-12 15:14:35 UTC) #20

Humper

https://codereview.appspot.com/7307076/diff/13002/src/effects/SkBlurMask.cpp File src/effects/SkBlurMask.cpp (right): https://codereview.appspot.com/7307076/diff/13002/src/effects/SkBlurMask.cpp#newcode198 src/effects/SkBlurMask.cpp:198: { To be honest the details of boxBlurInterp are ...

12 years, 6 months ago (2013-02-12 20:34:04 UTC) #21

https://codereview.appspot.com/7307076/diff/13002/src/effects/SkBlurMask.cpp
File src/effects/SkBlurMask.cpp (right):

https://codereview.appspot.com/7307076/diff/13002/src/effects/SkBlurMask.cpp#...
src/effects/SkBlurMask.cpp:198: {
To be honest the details of boxBlurInterp are a little opaque to me, since I
didn't write it.  Maybe we can open a new issue for Stephen to document this code
a little?

On 2013/02/12 15:14:35, reed1 wrote:
> I know it's not new to this CL, but it is not at all clear what math-space we
> are dealing with in this function. I see byte >> 7 which is just adding the
> high-bit for some reason, and byte << 8 / kernel, etc. Can this be described
> in a short block-comment somewhere?

https://codereview.appspot.com/7307076/diff/13002/src/effects/SkBlurMask.cpp#...
src/effects/SkBlurMask.cpp:1161: static inline unsigned int profile_lookup(
unsigned int *profile, int loc, int blurred_width, int sharp_width ) {
Not sure -- I'll use the Skia one.

On 2013/02/12 15:14:35, reed1 wrote:
> is this abs() call getting inlined for sure? If not, we have SkAbs32() which
> does.

https://codereview.appspot.com/7307076/diff/13002/src/effects/SkBlurMask.cpp#...
src/effects/SkBlurMask.cpp:1163: int ox = dx >> 1;
On 2013/02/12 15:14:35, reed1 wrote:
> if (...) {
> }

Done.

https://codereview.appspot.com/7307076/diff/13002/src/effects/SkBlurMask.cpp#...
src/effects/SkBlurMask.cpp:1238: for (int y = 0 ; y < dstHeight ; ++y)
On 2013/02/12 15:14:35, reed1 wrote:
> for (...) {

Done.

https://codereview.appspot.com/7307076/diff/13002/src/effects/SkBlurMask.h
File src/effects/SkBlurMask.h (right):

https://codereview.appspot.com/7307076/diff/13002/src/effects/SkBlurMask.h#ne...
src/effects/SkBlurMask.h:33: SkIPoint *margin = NULL);
In this case it's the ground-truth Gaussian blur that I was using for debugging.
I don't think we should remove it from the interface, because it's really useful
to have around should things go wonky in the future.  I'll rename it to
something like BlurGroundTruth.

On 2013/02/12 15:14:35, reed1 wrote:
> // What does "Simple" mean here? Why no Quality param? Why is radius renamed
> to provided_radius?
> 
> In general, even though this header is "private", I think we should try to
> make its interface clean, consistent, and somewhat clear/documented. (not just
> limited to this CL).


Humper

New code uploaded. Incidentally, the 'quality' and 'style' parameters to BlurRect aren't used anywhere. They ...

12 years, 6 months ago (2013-02-12 20:38:11 UTC) #22

New code uploaded.

Incidentally, the 'quality' and 'style' parameters to BlurRect aren't used
anywhere.  They should probably be removed, since BlurRect doesn't really
support them.


Humper

Final draft adds support for blur styles to the fast analytic path; they match up ...

12 years, 6 months ago (2013-02-15 00:27:25 UTC) #23

Final draft adds support for blur styles to the fast analytic path; they match
up with their slow counterparts.


Stephen White

You're also going to need to wrap the SkBlurMask changes in an #ifdef, so that ...

12 years, 5 months ago (2013-02-15 16:41:33 UTC) #24

You're also going to need to wrap the SkBlurMask changes in an #ifdef, so that
we can safely rebaseline WebKit.

https://codereview.appspot.com/7307076/diff/13002/src/effects/SkBlurMask.cpp
File src/effects/SkBlurMask.cpp (right):

https://codereview.appspot.com/7307076/diff/13002/src/effects/SkBlurMask.cpp#...
src/effects/SkBlurMask.cpp:198: {
On 2013/02/12 20:34:04, Humper wrote:
> To be honest the details of boxBlurInterp are a little opaque to me, since I
> didn't write it. 

This is the reason your changes are rather concerning to me:  it feels like
you've hacked it until it works, rather than providing conclusive evidence that
the solutions are correct.

> Maybe we can open a new issue for Stephen to document this code
> a little?
> 
> On 2013/02/12 15:14:35, reed1 wrote:
> > I know it's not new to this CL, but it is not at all clear what math-space
> > we are dealing with in this function. I see byte >> 7 which is just adding
> > the high-bit for some reason, and byte << 8 / kernel, etc. Can this be
> > described in a short block-comment somewhere?
> 

Sure.  At a minimum, I could provide a commented-out, non-unrolled version of
the code broken out into the two cases (width < diameter, width >= diameter), as
I did for the non-interp case, which might make it clearer.

https://codereview.appspot.com/7307076/diff/13003/src/effects/SkBlurMask.cpp
File src/effects/SkBlurMask.cpp (right):

https://codereview.appspot.com/7307076/diff/13003/src/effects/SkBlurMask.cpp#...
src/effects/SkBlurMask.cpp:32: static int boxBlur(const uint16_t* src, int
src_y_stride, uint16_t* dst,
I'm still not convinced that 16 bit is required here.  If the box blur is not
producing all-white on a given pass, it won't be all-white just by increasing
the precision in the intermediate buffers (i.e., in 8-bit, between passes it's
either 255 or it's not).

For N white pixels, the box blur ends up doing (N * 255) / N, which should
always be 255 (up to overflow, which would show up as a much worse problem!).
However, the division is done with an approximation and, as you've pointed out
with the rounding fix, may not result in 255.  If rounding fixes that part, then
it should be fine to use 8-bit in the intermediate buffers.

We could try substituting an actual integer divide (temporarily), just to verify
that the division approximation is to blame.
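
As a rough sketch of that experiment (an illustration, not the code in SkBlurMask.cpp), the three variants can be compared on a window of N all-white pixels. The form of `scale` as `(1 << 16) / N` is an assumption, and `mean_exact` and `mean_reciprocal` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Mean of n pixels that are all 255, using a true integer divide.
static uint32_t mean_exact(int n) {
    assert(n > 0);
    uint32_t sum = 255u * static_cast<uint32_t>(n);
    return sum / static_cast<uint32_t>(n);
}

// Same mean via a 16.16 reciprocal multiply. When addHalf is true, the
// (1 << 15) rounding term is added before the shift.
static uint32_t mean_reciprocal(int n, bool addHalf) {
    assert(n > 0);
    uint32_t sum = 255u * static_cast<uint32_t>(n);
    uint32_t scale = (1u << 16) / static_cast<uint32_t>(n);  // assumed form of `scale`
    uint32_t bias = addHalf ? (1u << 15) : 0u;
    return (sum * scale + bias) >> 16;
}
```

Under these assumptions the exact divide always returns 255, the truncating reciprocal returns 254 for N = 3 (and 255 when N is a power of two, where the reciprocal is exact), and the rounded reciprocal returns 255 for every N from 1 to 129.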


reed1

Stephen, since you understand the current code better, do you have an alternate fix for ...

12 years, 5 months ago (2013-02-15 17:01:55 UTC) #26

Stephen White

On 2013/02/15 17:01:55, reed1 wrote: > Stephen, since you understand the current code better, do ...

12 years, 5 months ago (2013-02-15 18:22:18 UTC) #27

reed1

There seem to be a lot of changes in SkBlurMask.cpp that are just renamings (e.g. ...

12 years, 5 months ago (2013-02-20 14:28:23 UTC) #28

Humper

I'll pare down the tests / benches. As for the style changes, I made those ...

12 years, 5 months ago (2013-02-20 14:33:35 UTC) #29

reed1

https://codereview.appspot.com/7307076/diff/21001/src/effects/SkBlurMask.cpp File src/effects/SkBlurMask.cpp (right): https://codereview.appspot.com/7307076/diff/21001/src/effects/SkBlurMask.cpp#newcode533 src/effects/SkBlurMask.cpp:533: uint32_t tmp = sum[px+py] + sum[nx+ny] - sum[nx+py] - ...

12 years, 5 months ago (2013-02-20 14:40:56 UTC) #30

Humper

On 2013/02/20 14:40:56, reed1 wrote: > Ah. Perhaps we just add a // TODO comment ...

12 years, 5 months ago (2013-02-20 14:47:57 UTC) #31

Stephen White

https://codereview.appspot.com/7307076/diff/21001/src/effects/SkBlurMask.cpp File src/effects/SkBlurMask.cpp (right): https://codereview.appspot.com/7307076/diff/21001/src/effects/SkBlurMask.cpp#newcode533 src/effects/SkBlurMask.cpp:533: uint32_t tmp = sum[px+py] + sum[nx+ny] - sum[nx+py] - ...

12 years, 5 months ago (2013-02-20 14:56:52 UTC) #32

Humper

On 2013/02/20 14:47:57, Humper wrote: > On 2013/02/20 14:40:56, reed1 wrote: > > Ah. Perhaps ...

12 years, 5 months ago (2013-02-20 15:03:13 UTC) #33

reed1

You have a note in your description about needing a guard for layout tests. Should you ...

12 years, 5 months ago (2013-02-20 15:48:19 UTC) #34

reed1

12 years, 5 months ago (2013-02-20 15:49:12 UTC) #35

Soon we should consider splitting some of this into a separate impl file (as we
finally did for all the different backends for gradients). SkBlurRect.cpp is
mighty in stature...


Issue 7307076: Fix a few bugs in both fast and slow blur; implementations now match visually. Also provide a way … (Closed)

Description

Patch Set 1 #

Patch Set 2 : re-enable old GM tests #

Patch Set 3 : Style cleanups and restore old high->low quality trigger behavior #

Patch Set 4 : fix syntax error #

Patch Set 5 : style cleanups, and add support for rectangles that are thinner than the support of the blur kernel #

Patch Set 6 : style nitpicks and function rename. #

Patch Set 7 : Add support for blur styles, redo gm to exercise it. I cut out some of the blur radii from the gm … #

Patch Set 8 : Merge 8 bit rollback from stephen #

Patch Set 9 : fix issues with merge #

Patch Set 10 : Pare down the # of tests / GMs #

Messages