|
|
Created:
13 years ago by John Bauman Modified:
12 years, 8 months ago CC:
angleproject-review_googlegroups.com Base URL:
https://angleproject.googlecode.com/svn/trunk Visibility:
Public. |
DescriptionAvoid recreating constant vertex buffers unnecessarily.
To avoid continually recreating constant vertex buffers instead use streaming buffers to hold that data.
BUG=
TEST=
Committed: http://code.google.com/p/angleproject/source/detail?r=744
Patch Set 1 #Patch Set 2 : Fix initialization #
Total comments: 2
Patch Set 3 : check for null, move a bit around #Patch Set 4 : Only handle constant vertex data #
Total comments: 3
MessagesTotal messages: 17
This offers a pretty big speedup on certain apps. I suspect NVIDIA drivers are more affected by this, because they're slower to create resources.
Sign in to reply to this message.
Overall looks reasonable. However, other than doing the same type of optimizations, there doesn't really seem to be any good reason for the index buffer and constant buffer changes to be in the same patch. At 4K per buffer that is potentially 68K (16 vertex, 1 line loop) you are adding per-context, assuming all the buffers are used (which they probably aren't). Not necessarily a problem, but just wanted that to be clear. Still would like Nicolas to look at this. http://codereview.appspot.com/4437072/diff/2001/src/libGLESv2/Context.h File src/libGLESv2/Context.h (right): http://codereview.appspot.com/4437072/diff/2001/src/libGLESv2/Context.h#newco... src/libGLESv2/Context.h:498: StreamingIndexBuffer *mIB; maybe move this up a couple lines? (after mBlit) http://codereview.appspot.com/4437072/diff/2001/src/libGLESv2/VertexDataManag... File src/libGLESv2/VertexDataManager.cpp (right): http://codereview.appspot.com/4437072/diff/2001/src/libGLESv2/VertexDataManag... src/libGLESv2/VertexDataManager.cpp:242: float *data = static_cast<float*>(buffer->map(VertexAttribute(), requiredSpace, &mCurrentValueOffsets[i])); the map() call can return NULL which is not handled safely here.
Sign in to reply to this message.
On 2011/04/26 16:31:39, dgkoch wrote: > Overall looks reasonable. However, other than doing the same type of > optimizations, there doesn't really seem to be any good reason for the index > buffer and constant buffer changes to be in the same patch. > > At 4K per buffer that is potentially 68K (16 vertex, 1 line loop) you are adding > per-context, assuming all the buffers are used (which they probably aren't). > Not necessarily a problem, but just wanted that to be clear. Yeah, it could waste some space, but in WDDM the minimum allocation size is 1 page, so unless the driver does some suballocation (which would also help avoid the performance problems we're seeing) we'd just be wasting space anyway. > > Still would like Nicolas to look at this. > > http://codereview.appspot.com/4437072/diff/2001/src/libGLESv2/Context.h > File src/libGLESv2/Context.h (right): > > http://codereview.appspot.com/4437072/diff/2001/src/libGLESv2/Context.h#newco... > src/libGLESv2/Context.h:498: StreamingIndexBuffer *mIB; > maybe move this up a couple lines? (after mBlit) Good idea. > > http://codereview.appspot.com/4437072/diff/2001/src/libGLESv2/VertexDataManag... > File src/libGLESv2/VertexDataManager.cpp (right): > > http://codereview.appspot.com/4437072/diff/2001/src/libGLESv2/VertexDataManag... > src/libGLESv2/VertexDataManager.cpp:242: float *data = > static_cast<float*>(buffer->map(VertexAttribute(), requiredSpace, > &mCurrentValueOffsets[i])); > the map() call can return NULL which is not handled safely here. Good point.
Sign in to reply to this message.
Looks great to me. My only nitpick is that it's probably overkill to optimize the line loops. It's a very rarely used draw type, and applications that do use it are not likely to be performance critical. So I doubt it's worth the trouble. Since you optimized it anyway, I'd recommend to at least make it a separate patch so that if it ever caused any kind of issue it can be rolled back independently. Another suggestion is to rename Context::mIB to mClosingIndexBuffer or mClosingIB just to make it a bit clearer what it's used for (and what it shouldn't be used for). Nice work!
Sign in to reply to this message.
On 2011/04/27 12:14:02, nicolas wrote: > Looks great to me. > > My only nitpick is that it's probably overkill to optimize the line loops. It's > a very rarely used draw type, and applications that do use it are not likely to > be performance critical. So I doubt it's worth the trouble. Since you optimized > it anyway, I'd recommend to at least make it a separate patch so that if it ever > caused any kind of issue it can be rolled back independently. > > Another suggestion is to rename Context::mIB to mClosingIndexBuffer or > mClosingIB just to make it a bit clearer what it's used for (and what it > shouldn't be used for). > Sure, I'll separate it out. I have seen several apps (mostly doing canvas emulation) where closing loops has been by far the performance bottleneck, so I'm pretty sure it's worthwhile.
Sign in to reply to this message.
Remove the index-buffer related part.
Sign in to reply to this message.
Hi John, One tiny nit re initializing the new member. LGTM after that is fixed. http://codereview.appspot.com/4437072/diff/3006/src/libGLESv2/VertexDataManag... File src/libGLESv2/VertexDataManager.cpp (right): http://codereview.appspot.com/4437072/diff/3006/src/libGLESv2/VertexDataManag... src/libGLESv2/VertexDataManager.cpp:35: mCurrentValueBuffer[i] = NULL; lets initialize mCurrentValueOffsets[i] = 0; here as well
Sign in to reply to this message.
There might be a severe issue with this implementation. We discovered a while ago that several ATI devices don't allow stride 0 for a vertex buffer created with the D3DUSAGE_DYNAMIC flag. Daniel, are you able to confirm that this patch does not work on your ATI system?
Sign in to reply to this message.
Just to clarify, the StreamingVertexBuffer uses D3DUSAGE_DYNAMIC, which is necessary for efficient 'streaming' updates. So using this class for storing constant attributes, with a stride 0 in the vertex declaration, might be an issue on ATI devices. We could use a cache of ConstantVertexBuffer's, but if the attribute value is frequently updated and rarely the same, this wouldn't be any faster than the current implemenation. John, can you tell us a bit more about the typical application behavior you're seeing? We could also keep the implementation that uses streaming buffers, for devices that do support D3DUSAGE_DYNAMIC with stride 0. Probably drawing something off-screen can tell us deterministically if it's supported or not. This could be done in separate patches.
Sign in to reply to this message.
On 2011/04/28 10:09:08, nicolas wrote: > Just to clarify, the StreamingVertexBuffer uses D3DUSAGE_DYNAMIC, which is > necessary for efficient 'streaming' updates. So using this class for storing > constant attributes, with a stride 0 in the vertex declaration, might be an > issue on ATI devices. Wow, that's pretty bad. I guess the DCTs/DTMs aren't extensive enough. > > We could use a cache of ConstantVertexBuffer's, but if the attribute value is > frequently updated and rarely the same, this wouldn't be any faster than the > current implemenation. John, can you tell us a bit more about the typical > application behavior you're seeing? Actually, I haven't really investigated that yet. It may even be that not dirtying the constant value when it's being set to what it already is would be enough. I'll have to look some more. > > We could also keep the implementation that uses streaming buffers, for devices > that do support D3DUSAGE_DYNAMIC with stride 0. Probably drawing something > off-screen can tell us deterministically if it's supported or not. This could be > done in separate patches. As long as we can be sure the test won't accidentally succeed, this might work.
Sign in to reply to this message.
On 2011/04/28 10:31:22, John Bauman wrote: > On 2011/04/28 10:09:08, nicolas wrote: > > Just to clarify, the StreamingVertexBuffer uses D3DUSAGE_DYNAMIC, which is > > necessary for efficient 'streaming' updates. So using this class for storing > > constant attributes, with a stride 0 in the vertex declaration, might be an > > issue on ATI devices. > > Wow, that's pretty bad. I guess the DCTs/DTMs aren't extensive enough. Right. Good catch Nicolas, I had forgotten about that. I'll try to give this patch a test here on my ATI system to see how it handles it. Regardless, we know there are problems zero stride + dynamic, and we'll probably have to avoid it. > > > > We could use a cache of ConstantVertexBuffer's, but if the attribute value is > > frequently updated and rarely the same, this wouldn't be any faster than the > > current implemenation. John, can you tell us a bit more about the typical > > application behavior you're seeing? > Actually, I haven't really investigated that yet. It may even be that not > dirtying the constant value when it's being set to what it already is would be > enough. I'll have to look some more. Yes, lets put in checks to filter out redundant vertex attrib value updates first (as a separate patch) and then re-evaluate if we need to do constant buffer caching or change it to using streaming buffers. > > > > We could also keep the implementation that uses streaming buffers, for devices > > that do support D3DUSAGE_DYNAMIC with stride 0. Probably drawing something > > off-screen can tell us deterministically if it's supported or not. This could > be > > done in separate patches. > As long as we can be sure the test won't accidentally succeed, this might work.
Sign in to reply to this message.
I just got a chance to play around with the test app you wrote, and it turns out that this works perfectly if the vertex buffer is at least 4096 bytes and fails if the buffer is smaller than that, which explains why this patch happens to work. That seems like a somewhat dangerous thing to rely on, but this patch helps the performance of some workloads enough that I think it may be worth it. On 2011/04/28 13:45:11, dgkoch wrote: > On 2011/04/28 10:31:22, John Bauman wrote: > > On 2011/04/28 10:09:08, nicolas wrote: > > > Just to clarify, the StreamingVertexBuffer uses D3DUSAGE_DYNAMIC, which is > > > necessary for efficient 'streaming' updates. So using this class for storing > > > constant attributes, with a stride 0 in the vertex declaration, might be an > > > issue on ATI devices. > > > > Wow, that's pretty bad. I guess the DCTs/DTMs aren't extensive enough. > > Right. Good catch Nicolas, I had forgotten about that. I'll try to give this > patch a test here on my ATI system to see how it handles it. Regardless, we > know there are problems zero stride + dynamic, and we'll probably have to avoid > it. > > > > > > > We could use a cache of ConstantVertexBuffer's, but if the attribute value > is > > > frequently updated and rarely the same, this wouldn't be any faster than the > > > current implemenation. John, can you tell us a bit more about the typical > > > application behavior you're seeing? > > Actually, I haven't really investigated that yet. It may even be that not > > dirtying the constant value when it's being set to what it already is would be > > enough. I'll have to look some more. > > Yes, lets put in checks to filter out redundant vertex attrib value updates > first (as a separate patch) and then re-evaluate if we need to do constant > buffer caching or change it to using streaming buffers. > > > > > > > We could also keep the implementation that uses streaming buffers, for > devices > > > that do support D3DUSAGE_DYNAMIC with stride 0. Probably drawing something > > > off-screen can tell us deterministically if it's supported or not. This > could > > be > > > done in separate patches. > > As long as we can be sure the test won't accidentally succeed, this might > work.
Sign in to reply to this message.
Would this be ok to check in? It seems to work completely consistently with the given buffer size, and I can't think of any other good way to speed up these operations.
Sign in to reply to this message.
A couple of minor comments. LGTM after those are handled. I believe we also still haven't filtered out redundant current attrib updates which could potentially help this even more. http://codereview.appspot.com/4437072/diff/3006/src/libGLESv2/VertexDataManag... File src/libGLESv2/VertexDataManager.cpp (right): http://codereview.appspot.com/4437072/diff/3006/src/libGLESv2/VertexDataManag... src/libGLESv2/VertexDataManager.cpp:24: enum { CONSTANT_VERTEX_BUFFER_SIZE = 4096 }; you might want to include a comment about this size needing to be at least 4k to avoid zero stride issues http://codereview.appspot.com/4437072/diff/3006/src/libGLESv2/VertexDataManag... src/libGLESv2/VertexDataManager.cpp:35: mCurrentValueBuffer[i] = NULL; On 2011/04/28 03:02:20, dgkoch wrote: > lets initialize mCurrentValueOffsets[i] = 0; here as well This is still out-standing..
Sign in to reply to this message.
On Thu, Sep 1, 2011 at 8:12 PM, <daniel@transgaming.com> wrote: > A couple of minor comments. LGTM after those are handled. > > I believe we also still haven't filtered out redundant current attrib > updates which could potentially help this even more. > > > I tried a patch with that and it didn't help with my use-case at all. Of course, it could help with with other apps. > > > http://codereview.appspot.com/**4437072/diff/3006/src/** > libGLESv2/VertexDataManager.**cpp<http://codereview.appspot.com/4437072/diff/3006/src/libGLESv2/VertexDataManager.cpp> > File src/libGLESv2/**VertexDataManager.cpp (right): > > http://codereview.appspot.com/**4437072/diff/3006/src/** > libGLESv2/VertexDataManager.**cpp#newcode24<http://codereview.appspot.com/4437072/diff/3006/src/libGLESv2/VertexDataManager.cpp#newcode24> > src/libGLESv2/**VertexDataManager.cpp:24: enum { > CONSTANT_VERTEX_BUFFER_SIZE = 4096 }; > you might want to include a comment about this size needing to be at > least 4k to avoid zero stride issues > Good point > > http://codereview.appspot.com/**4437072/diff/3006/src/** > libGLESv2/VertexDataManager.**cpp#newcode35<http://codereview.appspot.com/4437072/diff/3006/src/libGLESv2/VertexDataManager.cpp#newcode35> > src/libGLESv2/**VertexDataManager.cpp:35: mCurrentValueBuffer[i] = NULL; > On 2011/04/28 03:02:20, dgkoch wrote: > >> lets initialize mCurrentValueOffsets[i] = 0; here as well >> > > This is still out-standing.. Oh, sorry. Forgot. > > > http://codereview.appspot.com/**4437072/<http://codereview.appspot.com/4437072/> >
Sign in to reply to this message.
Did you have an updated patchset for this?
Sign in to reply to this message.
On 2011/09/11 02:46:10, dgkoch wrote: > Did you have an updated patchset for this? nm. you checked the updated version in.
Sign in to reply to this message.
|