
Issue 5700076: Implemented SSE version of ClampX_ClampY_{no}filter_affine (Closed)

Created:
12 years, 10 months ago by Jin A.Yang
Modified:
12 years, 10 months ago
Reviewers:
TomH, reed1
Base URL:
http://skia.googlecode.com/svn/trunk/
Visibility:
Public.

Description

This is the follow-up to the SSE version of ClampX_ClampY_{no}filter_scale. With this SSE patch, drawBitmap with rotation gets about 20+% faster. Benchmark data follows.

Original version:

out\Release\bench.exe -config 8888 -match bitmap -rotate -forceFilter 0 -repeat 20
skia bench: alpha=0xFF antialias=1 filter=0 rotate=1 scale=0 clip=0 dither=default strokeWidth=none scalar=float system=WIN32
running bench [640 480] bitmap_8888_update 8888: cmsecs = 52.26
running bench [640 480] bitmap_8888_update_volatile 8888: cmsecs = 49.92
running bench [640 480] bitmap_index8 8888: cmsecs = 54.60
running bench [640 480] bitmap_index8_A 8888: cmsecs = 67.86
running bench [640 480] bitmap_4444 8888: cmsecs = 66.30
running bench [640 480] bitmap_4444_A 8888: cmsecs = 82.68
running bench [640 480] bitmap_565 8888: cmsecs = 72.54
running bench [640 480] bitmap_8888 8888: cmsecs = 50.70
running bench [640 480] bitmap_8888_A 8888: cmsecs = 67.08

out\Release\bench.exe -config 8888 -match bitmap -rotate -forceFilter 1 -repeat 20
skia bench: alpha=0xFF antialias=1 filter=1 rotate=1 scale=0 clip=0 dither=default strokeWidth=none scalar=float system=WIN32
running bench [640 480] bitmap_8888_update 8888: cmsecs = 155.22
running bench [640 480] bitmap_8888_update_volatile 8888: cmsecs = 153.66
running bench [640 480] bitmap_index8 8888: cmsecs = 173.16
running bench [640 480] bitmap_index8_A 8888: cmsecs = 190.32
running bench [640 480] bitmap_4444 8888: cmsecs = 141.18
running bench [640 480] bitmap_4444_A 8888: cmsecs = 163.80
running bench [640 480] bitmap_565 8888: cmsecs = 145.86
running bench [640 480] bitmap_8888 8888: cmsecs = 153.66
running bench [640 480] bitmap_8888_A 8888: cmsecs = 173.94

The SSE2 version:

out\Release\bench.exe -config 8888 -match bitmap -rotate -forceFilter 0 -repeat 20
skia bench: alpha=0xFF antialias=1 filter=0 rotate=1 scale=0 clip=0 dither=default strokeWidth=none scalar=float system=WIN32
running bench [640 480] bitmap_8888_update 8888: cmsecs = 39.78
running bench [640 480] bitmap_8888_update_volatile 8888: cmsecs = 41.34
running bench [640 480] bitmap_index8 8888: cmsecs = 44.46
running bench [640 480] bitmap_index8_A 8888: cmsecs = 60.06
running bench [640 480] bitmap_4444 8888: cmsecs = 60.06
running bench [640 480] bitmap_4444_A 8888: cmsecs = 73.32
running bench [640 480] bitmap_565 8888: cmsecs = 67.08
running bench [640 480] bitmap_8888 8888: cmsecs = 43.68
running bench [640 480] bitmap_8888_A 8888: cmsecs = 58.50

out\Release\bench.exe -config 8888 -match bitmap -rotate -forceFilter 1 -repeat 20
skia bench: alpha=0xFF antialias=1 filter=1 rotate=1 scale=0 clip=0 dither=default strokeWidth=none scalar=float system=WIN32
running bench [640 480] bitmap_8888_update 8888: cmsecs = 122.46
running bench [640 480] bitmap_8888_update_volatile 8888: cmsecs = 121.68
running bench [640 480] bitmap_index8 8888: cmsecs = 136.50
running bench [640 480] bitmap_index8_A 8888: cmsecs = 156.00
running bench [640 480] bitmap_4444 8888: cmsecs = 113.88
running bench [640 480] bitmap_4444_A 8888: cmsecs = 126.36
running bench [640 480] bitmap_565 8888: cmsecs = 113.10
running bench [640 480] bitmap_8888 8888: cmsecs = 118.56
running bench [640 480] bitmap_8888_A 8888: cmsecs = 138.84
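
To make the size of the gain easier to read off the raw timings, the sketch below computes the share of run time removed for each benchmark, using only the cmsecs values quoted in the description. The struct and function names (BenchRow, timeReductionPercent, meanReductionPercent, printReport) are hypothetical helpers invented for this illustration; they are not part of Skia or its bench tool. On these numbers, the unweighted mean reduction comes out near 21% for the filter=1 runs and near 14% for the filter=0 runs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// One benchmark row; timings are the cmsecs values quoted in the description.
struct BenchRow {
    const char* name;
    double before;  // original version
    double after;   // SSE2 version
};

// Share of run time removed by the patch, in percent.
double timeReductionPercent(double before, double after) {
    assert(before > 0.0);
    return 100.0 * (before - after) / before;
}

// -forceFilter 0 runs.
const BenchRow kNoFilter[] = {
    {"bitmap_8888_update",          52.26, 39.78},
    {"bitmap_8888_update_volatile", 49.92, 41.34},
    {"bitmap_index8",               54.60, 44.46},
    {"bitmap_index8_A",             67.86, 60.06},
    {"bitmap_4444",                 66.30, 60.06},
    {"bitmap_4444_A",               82.68, 73.32},
    {"bitmap_565",                  72.54, 67.08},
    {"bitmap_8888",                 50.70, 43.68},
    {"bitmap_8888_A",               67.08, 58.50},
};

// -forceFilter 1 runs.
const BenchRow kFilter[] = {
    {"bitmap_8888_update",          155.22, 122.46},
    {"bitmap_8888_update_volatile", 153.66, 121.68},
    {"bitmap_index8",               173.16, 136.50},
    {"bitmap_index8_A",             190.32, 156.00},
    {"bitmap_4444",                 141.18, 113.88},
    {"bitmap_4444_A",               163.80, 126.36},
    {"bitmap_565",                  145.86, 113.10},
    {"bitmap_8888",                 153.66, 118.56},
    {"bitmap_8888_A",               173.94, 138.84},
};

// Unweighted mean of the per-benchmark reductions.
template <std::size_t N>
double meanReductionPercent(const BenchRow (&rows)[N]) {
    double sum = 0.0;
    for (std::size_t i = 0; i < N; ++i) {
        sum += timeReductionPercent(rows[i].before, rows[i].after);
    }
    return sum / static_cast<double>(N);
}

// Prints one line per benchmark, then the mean.
template <std::size_t N>
void printReport(const char* title, const BenchRow (&rows)[N]) {
    std::printf("%s\n", title);
    for (std::size_t i = 0; i < N; ++i) {
        std::printf("  %-28s %7.2f -> %7.2f  (%.1f%% less time)\n",
                    rows[i].name, rows[i].before, rows[i].after,
                    timeReductionPercent(rows[i].before, rows[i].after));
    }
    std::printf("  mean reduction: %.1f%%\n", meanReductionPercent(rows));
}
```

Calling printReport("filter=1", kFilter) and printReport("filter=0", kNoFilter) prints the per-benchmark breakdown. Note that this measures time saved relative to the original run, which is one of several reasonable ways to express a speedup.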

Patch Set 1

Stats: +163 lines, -0 lines
src/core/SkBitmapProcState.h: 1 chunk, +4 lines, -0 lines, 0 comments
src/opts/SkBitmapProcState_opts_SSE2.h: 1 chunk, +4 lines, -0 lines, 0 comments
src/opts/SkBitmapProcState_opts_SSE2.cpp: 1 chunk, +149 lines, -0 lines, 0 comments
src/opts/opts_check_SSE2.cpp: 1 chunk, +6 lines, -0 lines, 0 comments
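
The patch contents are not reproduced on this page. For readers unfamiliar with the kind of routine the issue title names, here is a minimal sketch of the general technique for the no-filter case only: step a source coordinate (fx, fy) in 16.16 fixed point along a row, clamp the integer part to the bitmap bounds, and pack each result as (y << 16) | x. This is an illustration under stated assumptions, not the code from this patch. The function names (clampMax, clampAffineScalar, clampAffineSSE2) are hypothetical, and the SSE2 path assumes both limits fit in a signed 16-bit lane and that the fixed-point stepping does not overflow.

```cpp
#include <cassert>
#include <cstdint>
#include <emmintrin.h>  // SSE2 intrinsics

// Hypothetical helper: clamp v to [0, max].
static inline int32_t clampMax(int32_t v, int32_t max) {
    return v < 0 ? 0 : (v > max ? max : v);
}

// Scalar reference. fx, fy, dx, dy are 16.16 fixed point; relies on
// arithmetic right shift for negative values, as mainstream compilers provide.
void clampAffineScalar(uint32_t* xy, int count, int32_t fx, int32_t fy,
                       int32_t dx, int32_t dy, int32_t maxX, int32_t maxY) {
    for (int i = 0; i < count; ++i) {
        uint32_t x = static_cast<uint32_t>(clampMax(fx >> 16, maxX));
        uint32_t y = static_cast<uint32_t>(clampMax(fy >> 16, maxY));
        xy[i] = (y << 16) | x;
        fx += dx;
        fy += dy;
    }
}

// SSE2 sketch: four coordinates per iteration, scalar loop for the tail.
void clampAffineSSE2(uint32_t* xy, int count, int32_t fx, int32_t fy,
                     int32_t dx, int32_t dy, int32_t maxX, int32_t maxY) {
    // The 16-bit min/max below is only valid while the limits fit in int16.
    assert(maxX >= 0 && maxX <= 0x7FFF && maxY >= 0 && maxY <= 0x7FFF);

    if (count >= 4) {
        // Lane 0 holds the first pixel (last argument of _mm_set_epi32).
        __m128i wfx = _mm_set_epi32(fx + 3 * dx, fx + 2 * dx, fx + dx, fx);
        __m128i wfy = _mm_set_epi32(fy + 3 * dy, fy + 2 * dy, fy + dy, fy);
        const __m128i stepX = _mm_set1_epi32(4 * dx);
        const __m128i stepY = _mm_set1_epi32(4 * dy);
        const __m128i zero = _mm_setzero_si128();
        const __m128i wMaxX = _mm_set1_epi32(maxX);
        const __m128i wMaxY = _mm_set1_epi32(maxY);

        while (count >= 4) {
            // Integer part, then clamp to [0, max]. After the shift each value
            // fits in int16, so the upper 16-bit half of every 32-bit lane is
            // pure sign extension and becomes 0 after the max with zero.
            __m128i ix = _mm_srai_epi32(wfx, 16);
            ix = _mm_min_epi16(_mm_max_epi16(ix, zero), wMaxX);
            __m128i iy = _mm_srai_epi32(wfy, 16);
            iy = _mm_min_epi16(_mm_max_epi16(iy, zero), wMaxY);

            // Pack as (y << 16) | x and store four results.
            __m128i packed = _mm_or_si128(_mm_slli_epi32(iy, 16), ix);
            _mm_storeu_si128(reinterpret_cast<__m128i*>(xy), packed);

            wfx = _mm_add_epi32(wfx, stepX);
            wfy = _mm_add_epi32(wfy, stepY);
            fx += 4 * dx;
            fy += 4 * dy;
            xy += 4;
            count -= 4;
        }
    }
    // Remaining 0 to 3 pixels.
    clampAffineScalar(xy, count, fx, fy, dx, dy, maxX, maxY);
}
```

The two functions are meant to produce identical output for the same inputs, so the scalar version doubles as a reference when checking the vector path.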

Messages

Total messages: 5
Jin A.Yang
12 years, 10 months ago (2012-02-27 02:03:19 UTC) #1
Jin A.Yang
Hi Tom, Please help review this patch when you are available, thanks!
12 years, 10 months ago (2012-02-28 13:17:16 UTC) #2
TomH
Thank you! I'm seeing 25-30% speedup. Committed as r3272. Do you have a real-world use ...
12 years, 10 months ago (2012-02-28 15:43:59 UTC) #3
Jin A.Yang
On 2012/02/28 15:43:59, TomH wrote: > Thank you! I'm seeing 25-30% speedup. Committed as r3272. ...
12 years, 10 months ago (2012-02-29 01:21:51 UTC) #4
TomH
12 years, 10 months ago (2012-02-29 15:08:32 UTC) #5
On 2012/02/29 01:21:51, Jin Yang wrote:
> There is another bottleneck in this case, that is the
> S32_opaque_D32_filter_DXDY.
> I am working on the SSSE3 version of this now.

Sounds great!
Please close this issue when you get a chance.