Rietveld Code Review Tool
Help | Bug tracker | Discussion group | Source code | Sign in
(1)

Issue 5790043: make S4444_opaque_D32_nofilter_DX faster with SSE2 support

Can't Edit
Can't Publish+Mail
Start Review
Created:
13 years, 10 months ago by guanqun
Modified:
13 years, 10 months ago
Reviewers:
Jin A.Yang, TomH
CC:
skia-review_googlegroups.com
Base URL:
http://skia.googlecode.com/svn/trunk
Visibility:
Public.

Description

make S4444_opaque_D32_nofilter_DX faster with SSE2 support On an Atom based netbook, before this patch: [skia.git]$ ./out/Release/bench -config 8888 -repeat 100 -match repeatTile_4444 skia bench: alpha=0xFF antialias=1 filter=0 rotate=0 scale=0 clip=0 dither=default strokeWidth=none scalar=float system=UNIX running bench [640 480] repeatTile_4444 8888: cmsecs = 113.83 With this patch: [skia.git]$ ./out/Release/bench -config 8888 -repeat 100 -match repeatTile_4444 skia bench: alpha=0xFF antialias=1 filter=0 rotate=0 scale=0 clip=0 dither=default strokeWidth=none scalar=float system=UNIX running bench [640 480] repeatTile_4444 8888: cmsecs = 82.71 So this is about 27% performance boost. I've also tested with gm tool and the result stays intact. BUG= TEST=

Patch Set 1 #

Total comments: 4

Patch Set 2 : comment fixes #

Total comments: 7

Patch Set 3 : add the blank line #

Unified diffs Side-by-side diffs Delta from patch set Stats (+117 lines, -0 lines) Patch
src/core/SkBitmapProcState.h View 1 1 chunk +2 lines, -0 lines 0 comments Download
src/opts/SkBitmapProcState_opts_SSE2.h View 1 1 chunk +3 lines, -0 lines 0 comments Download
src/opts/SkBitmapProcState_opts_SSE2.cpp View 1 2 2 chunks +106 lines, -0 lines 0 comments Download
src/opts/opts_check_SSE2.cpp View 1 1 chunk +6 lines, -0 lines 0 comments Download

Messages

Total messages: 7
guanqun
Please review. Thanks!
13 years, 10 months ago (2012-03-08 07:36:04 UTC) #1
TomH
We'll take a look. Is there an actual end-user use case for this? We weren't ...
13 years, 10 months ago (2012-03-08 14:01:02 UTC) #2
TomH
http://codereview.appspot.com/5790043/diff/1/src/opts/SkBitmapProcState_opts_SSE2.cpp File src/opts/SkBitmapProcState_opts_SSE2.cpp (right): http://codereview.appspot.com/5790043/diff/1/src/opts/SkBitmapProcState_opts_SSE2.cpp#newcode643 src/opts/SkBitmapProcState_opts_SSE2.cpp:643: * So this optimization is specific for such config. ...
13 years, 10 months ago (2012-03-08 19:40:18 UTC) #3
guanqun
I happened to notice this slowness when I was running bench on the netbook. So ...
13 years, 10 months ago (2012-03-09 01:33:34 UTC) #4
Jin A.Yang
http://codereview.appspot.com/5790043/diff/6001/src/opts/SkBitmapProcState_opts_SSE2.cpp File src/opts/SkBitmapProcState_opts_SSE2.cpp (right): http://codereview.appspot.com/5790043/diff/6001/src/opts/SkBitmapProcState_opts_SSE2.cpp#newcode637 src/opts/SkBitmapProcState_opts_SSE2.cpp:637: namespace { why do we need the namespace http://codereview.appspot.com/5790043/diff/6001/src/opts/SkBitmapProcState_opts_SSE2.cpp#newcode658 ...
13 years, 10 months ago (2012-03-09 04:38:36 UTC) #5
guanqun
http://codereview.appspot.com/5790043/diff/6001/src/opts/SkBitmapProcState_opts_SSE2.cpp File src/opts/SkBitmapProcState_opts_SSE2.cpp (right): http://codereview.appspot.com/5790043/diff/6001/src/opts/SkBitmapProcState_opts_SSE2.cpp#newcode637 src/opts/SkBitmapProcState_opts_SSE2.cpp:637: namespace { On 2012/03/09 04:38:37, Jin Yang wrote: > ...
13 years, 10 months ago (2012-03-09 05:05:06 UTC) #6
Jin A.Yang
13 years, 10 months ago (2012-03-09 08:37:32 UTC) #7
http://codereview.appspot.com/5790043/diff/6001/src/opts/SkBitmapProcState_op...
File src/opts/SkBitmapProcState_opts_SSE2.cpp (right):

http://codereview.appspot.com/5790043/diff/6001/src/opts/SkBitmapProcState_op...
src/opts/SkBitmapProcState_opts_SSE2.cpp:637: namespace {
On 2012/03/09 05:05:07, guanqun wrote:
> On 2012/03/09 04:38:37, Jin Yang wrote:
> > why do we need the namespace
> 
> I want to put all the auxiliary functions in the anonymous namespace just as
the
> way it's done in SkBitmapProcState_opts_SSE3.cpp.
that file used the namespace as a tricky to convince gcc to inline the template
function. I think you do not need the tricky there.
Sign in to reply to this message.

Powered by Google App Engine
RSS Feeds Recent Issues | This issue
This is Rietveld f62528b