Rietveld Code Review Tool
Help | Bug tracker | Discussion group | Source code | Sign in
(66)

Issue 206045: Pre-review: run state overhaul (Closed)

Can't Edit
Can't Publish+Mail
Start Review
Created:
12 years, 9 months ago by larrygritz
Modified:
12 years, 9 months ago
Reviewers:
Chris Foster, aconty, ckulla
CC:
dev-osl_imageworks.com, osl-dev_googlegroups.com
Base URL:
http://openshadinglanguage.googlecode.com/svn/trunk/
Visibility:
Public.

Description

We've done extensive analysis showing that for typical shading batches we see in the wild, our method of having an array of true/false runflags is very wasteful, and despite the extra indirection involved, having a flat list of the "on" points is much more efficient (5x or more speed gain on tight inner loops). We're not sure yet exactly what this will translate to in overall runtime gains, but it's got to help. This is not a final review, just showing progress along the way. The code in this pre-review adds the index list to the Runstate, and maintains it properly. I validate this by checking at every instruction that the runflags and indices, and all tests pass! So I know for sure that the indices are being set up properly. The whole point is that now I can convert the shadeops one by one, at all times keeping the renderer working, and then only when everything is converted rip out the old runflags entirely.

Patch Set 1 #

Total comments: 1

Patch Set 2 : Full implementation of compile-time selection of runflags, indices, or spans #

Patch Set 3 : Update patch set to sync with trunk #

Total comments: 2
Unified diffs Side-by-side diffs Delta from patch set Stats (+1317 lines, -1055 lines) Patch
src/include/osl_pvt.h View 2 3 chunks +8 lines, -6 lines 0 comments Download
src/include/oslconfig.h View 2 1 chunk +9 lines, -0 lines 0 comments Download
src/liboslexec/background.cpp View 2 1 chunk +1 line, -2 lines 0 comments Download
src/liboslexec/bsdf_ashikhmin_velvet.cpp View 2 1 chunk +1 line, -2 lines 0 comments Download
src/liboslexec/bsdf_cloth.cpp View 2 1 chunk +1 line, -2 lines 0 comments Download
src/liboslexec/bsdf_cloth_specular.cpp View 2 1 chunk +1 line, -2 lines 0 comments Download
src/liboslexec/bsdf_diffuse.cpp View 2 1 chunk +2 lines, -4 lines 0 comments Download
src/liboslexec/bsdf_hair.cpp View 2 1 chunk +2 lines, -4 lines 0 comments Download
src/liboslexec/bsdf_microfacet.cpp View 2 1 chunk +4 lines, -8 lines 0 comments Download
src/liboslexec/bsdf_phong.cpp View 2 2 chunks +2 lines, -4 lines 0 comments Download
src/liboslexec/bsdf_reflection.cpp View 2 1 chunk +2 lines, -4 lines 0 comments Download
src/liboslexec/bsdf_refraction.cpp View 2 1 chunk +2 lines, -4 lines 0 comments Download
src/liboslexec/bsdf_transparent.cpp View 2 1 chunk +1 line, -2 lines 0 comments Download
src/liboslexec/bsdf_ward.cpp View 2 1 chunk +1 line, -2 lines 0 comments Download
src/liboslexec/bsdf_westin.cpp View 2 1 chunk +2 lines, -4 lines 0 comments Download
src/liboslexec/bssrdf.cpp View 2 1 chunk +1 line, -2 lines 0 comments Download
src/liboslexec/context.cpp View 2 3 chunks +4 lines, -1 line 0 comments Download
src/liboslexec/emissive.cpp View 2 1 chunk +3 lines, -6 lines 0 comments Download
src/liboslexec/exec.cpp View 1 2 22 chunks +190 lines, -80 lines 1 comment Download
src/liboslexec/oparray.cpp View 2 5 chunks +41 lines, -49 lines 0 comments Download
src/liboslexec/opassign.cpp View 2 5 chunks +20 lines, -20 lines 0 comments Download
src/liboslexec/opattribute.cpp View 2 4 chunks +22 lines, -25 lines 0 comments Download
src/liboslexec/opcolor.cpp View 2 3 chunks +8 lines, -8 lines 0 comments Download
src/liboslexec/opcompare.cpp View 2 6 chunks +6 lines, -6 lines 0 comments Download
src/liboslexec/opcontrol.cpp View 1 2 9 chunks +201 lines, -70 lines 0 comments Download
src/liboslexec/opderivs.cpp View 2 7 chunks +21 lines, -21 lines 0 comments Download
src/liboslexec/opinteger.cpp View 2 9 chunks +12 lines, -18 lines 1 comment Download
src/liboslexec/opmath.cpp View 2 8 chunks +12 lines, -12 lines 0 comments Download
src/liboslexec/opmathfunc.cpp View 2 25 chunks +85 lines, -127 lines 0 comments Download
src/liboslexec/opmatrix.cpp View 2 8 chunks +52 lines, -59 lines 0 comments Download
src/liboslexec/opmessage.cpp View 2 2 chunks +58 lines, -62 lines 0 comments Download
src/liboslexec/opmisc.cpp View 1 chunk +0 lines, -2 lines 0 comments Download
src/liboslexec/opnoise.cpp View 2 4 chunks +8 lines, -13 lines 0 comments Download
src/liboslexec/opspline.cpp View 5 chunks +11 lines, -13 lines 0 comments Download
src/liboslexec/opstring.cpp View 2 14 chunks +98 lines, -74 lines 0 comments Download
src/liboslexec/optexture.cpp View 2 3 chunks +58 lines, -65 lines 0 comments Download
src/liboslexec/opvector.cpp View 2 17 chunks +108 lines, -132 lines 0 comments Download
src/liboslexec/oslexec_pvt.h View 1 2 9 chunks +86 lines, -30 lines 0 comments Download
src/liboslexec/oslops.h View 2 22 chunks +173 lines, -110 lines 0 comments Download

Messages

Total messages: 14
larrygritz
12 years, 9 months ago (2010-02-09 05:58:25 UTC) #1
larrygritz
Incidentally, how much slower do we render by keeping both the runflags and indices in ...
12 years, 9 months ago (2010-02-09 06:19:37 UTC) #2
aconty
LGTM
12 years, 9 months ago (2010-02-09 17:57:44 UTC) #3
ckulla
LGTM2 http://codereview.appspot.com/206045/diff/1/4 File src/liboslexec/exec.cpp (right): http://codereview.appspot.com/206045/diff/1/4#newcode583 src/liboslexec/exec.cpp:583: #if 1 Should this be #if 0'd ? ...
12 years, 9 months ago (2010-02-09 18:16:13 UTC) #4
lg_imageworks.com
No, it's 2% slower with it '#if 0'. I'm not going to commit in this ...
12 years, 9 months ago (2010-02-09 19:10:05 UTC) #5
larrygritz
Full implementation of compile-time selection of runflags, indices, or spans
12 years, 9 months ago (2010-02-14 20:12:09 UTC) #6
larrygritz
On 2010/02/14 20:12:09, larrygritz wrote: > Full implementation of compile-time selection of runflags, indices, or ...
12 years, 9 months ago (2010-02-14 20:14:18 UTC) #7
larrygritz
Update patch set to sync with trunk
12 years, 9 months ago (2010-02-23 02:17:26 UTC) #8
larrygritz
OK, I just updated the patch set again to keep up with the trunk, in ...
12 years, 9 months ago (2010-02-23 02:21:53 UTC) #9
ckulla
LGTM I agree that it would be good to commit this as it cleans things ...
12 years, 9 months ago (2010-02-23 02:39:04 UTC) #10
lg_imageworks.com
On Feb 22, 2010, at 6:39 PM, <ckulla@gmail.com> <ckulla@gmail.com> wrote: > http://codereview.appspot.com/206045/diff/2001/2018#newcode542 > src/liboslexec/exec.cpp:542: // ...
12 years, 9 months ago (2010-02-23 04:18:53 UTC) #11
lg_larrygritz.com
On Feb 22, 2010, at 6:39 PM, ckulla@gmail.com wrote: > Have you been able to ...
12 years, 9 months ago (2010-02-23 05:09:24 UTC) #12
Chris Foster
LGTM. Too bad you didn't see a performance improvement with this. Very perplexing. http://codereview.appspot.com/206045/diff/2001/2040 File ...
12 years, 9 months ago (2010-02-23 06:01:51 UTC) #13
ckulla
12 years, 9 months ago (2010-02-23 06:57:34 UTC) #14
> I replicated the results wherein spans worked slightly fastest when all points
> were on (which is what testshade does).
> 
> I didn't rig testshade to do sparse sets.  Then again, with your recent
change,

I was thinking you could do something like:
if (cellnoise(100*u,100*v) > 0.9) {
  // lots of arithmetic ops
}

to emulate the sparse points (you can vary the amount of "sparseness" by
changing the constants).

But it is true that we don't really have the sparse input batches anymore, so
perhaps the whole exercise is now only relevant for varying conditionals (which
may be noticeable someday, or for other types of applications with very large
batches).
Sign in to reply to this message.

Powered by Google App Engine
RSS Feeds Recent Issues | This issue
This is Rietveld f62528b