Description
First stab at runtime optimization of OSL bytecode, based on the actual instance parameters. This is a pretty big refactor, in that it moves quite a bit of compiler infrastructure (like lifetime analysis and temporary coalescing) out of oslc so that it happens at runtime instead.
The basic idea is that we allow a hint in the shader parameter metadata, [[ int lockgeom=1|0 ]] that says whether or not that parameter is "locked" with respect to the geometry (if locked, its value cannot be overridden by interpolated geometric primitive variables). This means that by the time we shade, many of the parameters (that are not connected or overridden on the geom) have known values, and we can constant-fold the crap out of the code for each instance. For example, multiplication by 1 is eliminated, "if (foo)" can end up removing large swaths of code if foo's value is known, etc.
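To make the folding idea concrete, here is a minimal sketch in Python. It is not the code in this patch: the `Param` class, the tuple-based op format, and the `known_constants` and `fold` helpers are all hypothetical, invented for illustration. It assumes a parameter counts as a known constant when it is locked and not connected, then folds multiplications and prunes "if" blocks whose condition is known.

```python
from dataclasses import dataclass

Number = (int, float)

@dataclass
class Param:                      # hypothetical stand-in for an instance parameter
    name: str
    value: float
    lockgeom: bool = True         # mirrors the [[ int lockgeom=1|0 ]] hint
    connected: bool = False       # True if an upstream layer feeds this param

def known_constants(params):
    """Params whose value is fixed by shade time: locked and not connected."""
    return {p.name: p.value for p in params if p.lockgeom and not p.connected}

def assigned(ops):
    """Names written anywhere in an op list, including nested 'if' branches."""
    names = set()
    for opcode, x, y, z in ops:
        if opcode == "if":
            names |= assigned(y) | assigned(z)
        else:
            names.add(x)
    return names

def fold(ops, consts):
    """Fold one op list. Ops are (opcode, result, arg1, arg2), or
    ("if", cond, then_ops, else_ops). Updates `consts` in place."""
    out = []
    for opcode, x, y, z in ops:
        if opcode == "if":
            cond = consts.get(x, x)
            if isinstance(cond, Number):
                # Known condition: keep only the branch that would run.
                out.extend(fold(y if cond else z, consts))
            else:
                # Unknown condition: fold each branch on its own copy, then
                # forget anything either branch may have overwritten.
                out.append(("if", x, fold(y, dict(consts)), fold(z, dict(consts))))
                for name in assigned(y) | assigned(z):
                    consts.pop(name, None)
            continue
        a, b = consts.get(y, y), consts.get(z, z)
        if opcode == "mul" and isinstance(a, Number) and isinstance(b, Number):
            consts[x] = a * b                     # result is itself a constant
            out.append(("assign", x, a * b, None))
            continue
        consts.pop(x, None)                       # result is no longer known
        if opcode == "mul" and b == 1:
            out.append(("assign", x, a, None))    # x * 1 -> x
        elif opcode == "mul" and a == 1:
            out.append(("assign", x, b, None))    # 1 * x -> x
        else:
            out.append((opcode, x, a, b))
    return out

params = [Param("scale", 1.0), Param("use_tint", 0), Param("Kd", 0.5, lockgeom=False)]
ops = [
    ("mul", "t0", "Kd", "scale"),                          # Kd may vary per geom
    ("if", "use_tint", [("mul", "t0", "t0", "tint")], []),
    ("mul", "t1", "scale", "scale"),
]
folded = fold(ops, known_constants(params))
```

With these inputs the multiply by `scale` becomes a plain copy of `Kd`, the whole `use_tint` block disappears, and `t1` becomes a literal. `Kd` is left alone because it is not locked, so the geometry could still override it.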
I've implemented only a very few obvious optimizations, but it's already showing a 25% speedup overall in our renderer (even counting the extra runtime we spend optimizing shader code). I expect this to improve speed even more dramatically as I continue to add other optimizations as well as make the analysis more sophisticated (for example, it currently doesn't consider connections at all, even though it could often know the connected value and treat that, too, as a constant).
I'm aware that many parts of this are rough around the edges. It's a work in progress. But the basic infrastructure is working, and there's a definite speed gain, so I'd like to commit this even though it's an ongoing project. Then subsequent additional optimizations and refactorings will be smaller changes that we can do incrementally.
There are two new ShadingSystem options: "lockgeom", which gives the default value for the "lockgeom" hint on parameters that don't specify it (so you can "opt in" or "opt out", depending on your taste), and one that sets the optimization level to perform at runtime.
Patch Set 1
Patch Set 2: Minor bug fix: ShaderGroup::clear needed to set m_optimized=false
Total comments: 10
Messages
Total messages: 11