Description
This patch speeds up GPU blurs by clearing only when necessary. This gives a 1.8X speedup on the Blurs sample, and 2.3X on the BigBlur sample.
We don't need to clear while downsampling, since each step reads only the pixels written in the previous step. We can avoid destination clears before convolution by disabling blending. We also don't need to clear when upsampling, since the upsample step also only reads pixels written by the convolution. The only clears we then need to do are on each side of the srcRect used for convolution, and a 1-pixel border for bilinear upsampling. Since our srcRect is always offset to (0, 0), we only need to clear on the right and bottom.
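The remaining clears can be sketched as plain rectangle arithmetic. This is a minimal illustration, not the patch's actual code: `IRect`, `makeXYWH`, `convolutionClears`, and `upsampleClears` are hypothetical names, and the sketch assumes the convolution radius is given in whole pixels and that srcRect sits at (0, 0) as described above. It does not model when each clear can be skipped.

```cpp
#include <cassert>
#include <vector>

// Hypothetical integer rectangle, standing in for the real rect type.
struct IRect {
    int left, top, right, bottom;
    int width() const { return right - left; }
    int height() const { return bottom - top; }
    bool operator==(const IRect& o) const {
        return left == o.left && top == o.top &&
               right == o.right && bottom == o.bottom;
    }
};

static IRect makeXYWH(int x, int y, int w, int h) {
    return {x, y, x + w, y + h};
}

// Strips the convolution may read outside srcRect. Because srcRect is
// offset to (0, 0), only the right and bottom sides need clearing.
std::vector<IRect> convolutionClears(const IRect& src, int radiusX, int radiusY) {
    assert(src.left == 0 && src.top == 0);   // assumption taken from the description
    assert(radiusX >= 0 && radiusY >= 0);
    std::vector<IRect> clears;
    if (radiusX > 0) {
        // radiusX columns to the right of srcRect, for the X pass.
        clears.push_back(makeXYWH(src.right, src.top, radiusX, src.height()));
    }
    if (radiusY > 0) {
        // radiusY rows below srcRect, for the Y pass.
        clears.push_back(makeXYWH(src.left, src.bottom, src.width(), radiusY));
    }
    return clears;
}

// 1-pixel border on the right and bottom, read by bilinear upsampling.
std::vector<IRect> upsampleClears(const IRect& src) {
    return {
        makeXYWH(src.left, src.bottom, src.width() + 1, 1),  // bottom row plus corner
        makeXYWH(src.right, src.top, 1, src.height()),       // right column
    };
}
```

For a 64x32 srcRect with radii of 4 and 3 pixels, this yields a 4x32 strip on the right and a 64x3 strip below for the convolution, plus a 65x1 row and a 1x32 column for the upsample. Everything else in the destination is left uncleared.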
Patch Set 1
Patch Set 2 : Add 1-pixel clears for bilinear upsampling
Patch Set 3 : Move comment to the right place
Messages
Total messages: 5