Rietveld Code Review Tool
Help | Bug tracker | Discussion group | Source code | Sign in
(63)

Issue 206045: Pre-review: run state overhaul (Closed)

Can't Edit
Can't Publish+Mail
Start Review
Created:
14 years, 1 month ago by larrygritz
Modified:
14 years, 1 month ago
Reviewers:
Chris Foster, aconty, ckulla
CC:
dev-osl_imageworks.com, osl-dev_googlegroups.com
Base URL:
http://openshadinglanguage.googlecode.com/svn/trunk/
Visibility:
Public.

Description

We've done extensive analysis showing that for typical shading batches we see in the wild, our method of having an array of true/false runflags is very wasteful, and despite the extra indirection involved, having a flat list of the "on" points is much more efficient (5x or more speed gain on tight inner loops). We're not sure yet exactly what this will translate to in overall runtime gains, but it's got to help. This is not a final review, just showing progress along the way. The code in this pre-review adds the index list to the Runstate, and maintains it properly. I validate this by checking at every instruction that the runflags and indices, and all tests pass! So I know for sure that the indices are being set up properly. The whole point is that now I can convert the shadeops one by one, at all times keeping the renderer working, and then only when everything is converted rip out the old runflags entirely.

Patch Set 1 #

Total comments: 1

Patch Set 2 : Full implementation of compile-time selection of runflags, indices, or spans #

Patch Set 3 : Update patch set to sync with trunk #

Total comments: 2
Unified diffs Side-by-side diffs Delta from patch set Stats (+1317 lines, -1055 lines) Patch
src/include/osl_pvt.h View 2 3 chunks +8 lines, -6 lines 0 comments Download
src/include/oslconfig.h View 2 1 chunk +9 lines, -0 lines 0 comments Download
src/liboslexec/background.cpp View 2 1 chunk +1 line, -2 lines 0 comments Download
src/liboslexec/bsdf_ashikhmin_velvet.cpp View 2 1 chunk +1 line, -2 lines 0 comments Download
src/liboslexec/bsdf_cloth.cpp View 2 1 chunk +1 line, -2 lines 0 comments Download
src/liboslexec/bsdf_cloth_specular.cpp View 2 1 chunk +1 line, -2 lines 0 comments Download
src/liboslexec/bsdf_diffuse.cpp View 2 1 chunk +2 lines, -4 lines 0 comments Download
src/liboslexec/bsdf_hair.cpp View 2 1 chunk +2 lines, -4 lines 0 comments Download
src/liboslexec/bsdf_microfacet.cpp View 2 1 chunk +4 lines, -8 lines 0 comments Download
src/liboslexec/bsdf_phong.cpp View 2 2 chunks +2 lines, -4 lines 0 comments Download
src/liboslexec/bsdf_reflection.cpp View 2 1 chunk +2 lines, -4 lines 0 comments Download
src/liboslexec/bsdf_refraction.cpp View 2 1 chunk +2 lines, -4 lines 0 comments Download
src/liboslexec/bsdf_transparent.cpp View 2 1 chunk +1 line, -2 lines 0 comments Download
src/liboslexec/bsdf_ward.cpp View 2 1 chunk +1 line, -2 lines 0 comments Download
src/liboslexec/bsdf_westin.cpp View 2 1 chunk +2 lines, -4 lines 0 comments Download
src/liboslexec/bssrdf.cpp View 2 1 chunk +1 line, -2 lines 0 comments Download
src/liboslexec/context.cpp View 2 3 chunks +4 lines, -1 line 0 comments Download
src/liboslexec/emissive.cpp View 2 1 chunk +3 lines, -6 lines 0 comments Download
src/liboslexec/exec.cpp View 1 2 22 chunks +190 lines, -80 lines 1 comment Download
src/liboslexec/oparray.cpp View 2 5 chunks +41 lines, -49 lines 0 comments Download
src/liboslexec/opassign.cpp View 2 5 chunks +20 lines, -20 lines 0 comments Download
src/liboslexec/opattribute.cpp View 2 4 chunks +22 lines, -25 lines 0 comments Download
src/liboslexec/opcolor.cpp View 2 3 chunks +8 lines, -8 lines 0 comments Download
src/liboslexec/opcompare.cpp View 2 6 chunks +6 lines, -6 lines 0 comments Download
src/liboslexec/opcontrol.cpp View 1 2 9 chunks +201 lines, -70 lines 0 comments Download
src/liboslexec/opderivs.cpp View 2 7 chunks +21 lines, -21 lines 0 comments Download
src/liboslexec/opinteger.cpp View 2 9 chunks +12 lines, -18 lines 1 comment Download
src/liboslexec/opmath.cpp View 2 8 chunks +12 lines, -12 lines 0 comments Download
src/liboslexec/opmathfunc.cpp View 2 25 chunks +85 lines, -127 lines 0 comments Download
src/liboslexec/opmatrix.cpp View 2 8 chunks +52 lines, -59 lines 0 comments Download
src/liboslexec/opmessage.cpp View 2 2 chunks +58 lines, -62 lines 0 comments Download
src/liboslexec/opmisc.cpp View 1 chunk +0 lines, -2 lines 0 comments Download
src/liboslexec/opnoise.cpp View 2 4 chunks +8 lines, -13 lines 0 comments Download
src/liboslexec/opspline.cpp View 5 chunks +11 lines, -13 lines 0 comments Download
src/liboslexec/opstring.cpp View 2 14 chunks +98 lines, -74 lines 0 comments Download
src/liboslexec/optexture.cpp View 2 3 chunks +58 lines, -65 lines 0 comments Download
src/liboslexec/opvector.cpp View 2 17 chunks +108 lines, -132 lines 0 comments Download
src/liboslexec/oslexec_pvt.h View 1 2 9 chunks +86 lines, -30 lines 0 comments Download
src/liboslexec/oslops.h View 2 22 chunks +173 lines, -110 lines 0 comments Download

Messages

Total messages: 14
larrygritz
14 years, 1 month ago (2010-02-09 05:58:25 UTC) #1
larrygritz
Incidentally, how much slower do we render by keeping both the runflags and indices in ...
14 years, 1 month ago (2010-02-09 06:19:37 UTC) #2
aconty
LGTM
14 years, 1 month ago (2010-02-09 17:57:44 UTC) #3
ckulla
LGTM2 http://codereview.appspot.com/206045/diff/1/4 File src/liboslexec/exec.cpp (right): http://codereview.appspot.com/206045/diff/1/4#newcode583 src/liboslexec/exec.cpp:583: #if 1 Should this be #if 0'd ? ...
14 years, 1 month ago (2010-02-09 18:16:13 UTC) #4
lg_imageworks.com
No, it's 2% slower with it '#if 0'. I'm not going to commit in this ...
14 years, 1 month ago (2010-02-09 19:10:05 UTC) #5
larrygritz
Full implementation of compile-time selection of runflags, indices, or spans
14 years, 1 month ago (2010-02-14 20:12:09 UTC) #6
larrygritz
On 2010/02/14 20:12:09, larrygritz wrote: > Full implementation of compile-time selection of runflags, indices, or ...
14 years, 1 month ago (2010-02-14 20:14:18 UTC) #7
larrygritz
Update patch set to sync with trunk
14 years, 1 month ago (2010-02-23 02:17:26 UTC) #8
larrygritz
OK, I just updated the patch set again to keep up with the trunk, in ...
14 years, 1 month ago (2010-02-23 02:21:53 UTC) #9
ckulla
LGTM I agree that it would be good to commit this as it cleans things ...
14 years, 1 month ago (2010-02-23 02:39:04 UTC) #10
lg_imageworks.com
On Feb 22, 2010, at 6:39 PM, <ckulla@gmail.com> <ckulla@gmail.com> wrote: > http://codereview.appspot.com/206045/diff/2001/2018#newcode542 > src/liboslexec/exec.cpp:542: // ...
14 years, 1 month ago (2010-02-23 04:18:53 UTC) #11
lg_larrygritz.com
On Feb 22, 2010, at 6:39 PM, ckulla@gmail.com wrote: > Have you been able to ...
14 years, 1 month ago (2010-02-23 05:09:24 UTC) #12
Chris Foster
LGTM. Too bad you didn't see a performance improvement with this. Very perplexing. http://codereview.appspot.com/206045/diff/2001/2040 File ...
14 years, 1 month ago (2010-02-23 06:01:51 UTC) #13
ckulla
14 years, 1 month ago (2010-02-23 06:57:34 UTC) #14
> I replicated the results wherein spans worked slightly fastest when all points
> were on (which is what testshade does).
> 
> I didn't rig testshade to do sparse sets.  Then again, with your recent
change,

I was thinking you could do something like:
if (cellnoise(100*u,100*v) > 0.9) {
  // lots of arithmetic ops
}

to emulate the sparse points (you can vary the amount of "sparseness" by
changing the constants).

But it is true that we don't really have the sparse input batches anymore, so
perhaps the whole exercise is now only relevant for varying conditionals (which
may be noticeable someday, or for other types of applications with very large
batches).
Sign in to reply to this message.

Powered by Google App Engine
RSS Feeds Recent Issues | This issue
This is Rietveld f62528b