Created: 10 years, 9 months ago by Peter Barnes
Modified: 10 years, 8 months ago
CC: ns-developers_isi.edu, ns-3-reviews_googlegroups.com
Visibility: Public.
Description: There are many cases where you want to use several different rng distributions,
but you only want to manage one seed when doing replicated runs.
This patch allows several RandomVariableStreams to use the same underlying RngStream.
Patch Set 1 (total comments: 2)
Patch Set 2: Implement ShareStream. Hide RngStream from public API. (total comments: 3)
Messages
Total messages: 11
There are many cases where you want to use several different rng distributions, but you only want to manage one seed when doing replicated runs.

This patch allows several RandomVariableStreams to use the same underlying RngStream.

Code review: https://codereview.appspot.com/114060043/

Affected files:
M src/core/model/random-variable-stream.h
M src/core/model/random-variable-stream.cc
M src/core/model/rng-stream.h
M src/core/model/rng-stream.cc
On 2014/07/21 17:41:08, barnes26_llnl.gov wrote:
> There are many cases where you want to use several different rng distributions,
> but you only want to manage one seed when doing replicated runs.
>
> This patch allows several RandomVariableStreams to use the same underlying
> RngStream.
>
> Code review: https://codereview.appspot.com/114060043/
>
> Affected files:
> M src/core/model/random-variable-stream.h
> M src/core/model/random-variable-stream.cc
> M src/core/model/rng-stream.h
> M src/core/model/rng-stream.cc

I'm not against it, but I think we need to provide a more detailed set of examples showing how to set the rng streams, i.e., how to use the new functionality, and its pros and cons with respect to the global seed.

As a side note, the docs could also be improved for the helpers' AssignStreams functions, i.e., who calls them and so on.
https://codereview.appspot.com/114060043/diff/1/src/core/model/rng-stream.h
File src/core/model/rng-stream.h (right):

https://codereview.appspot.com/114060043/diff/1/src/core/model/rng-stream.h#n...
src/core/model/rng-stream.h:71: * \param [in] o Other RngStream
This should be "\param [in] r Other RngStream".
I don't see the need for this patch, can you elaborate?

Each RandomVariableStream object has a SetStream method that can be used to assign it to whatever substream is desired, including assigning multiple RandomVariableStream instances to the same underlying RNG.

The main addition that I see with your proposed patch is to expose a handle to the underlying RngStream object so one can set the seed on each one on an individual basis. But I'm not sure of the use case for this, and it seems to go against the guidance for this generator: use one seed for the simulation and manage independent replications by incrementing run numbers.

If anything, I would suggest that we modify src/core/wscript to remove rng-stream.h from the publicly exported headers (I don't think it was intended to be accessed by user programs, and my guess is that it was accidentally added to the wscript at some point).
On Jul 21, 2014, at 2:54 PM, tomh.org@gmail.com wrote:

> I don't see the need for this patch, can you elaborate?

Sure. The basic issue is to control what things vary across different executions of a model. One example is testing with a fixed mobility pattern and several different error models (or vice versa). Usually you would want to repeat the mobility pattern *exactly*, despite the change in error model.

Our current support for this is the AssignStreams functions everywhere. Typical use of AssignStreams goes like this:

  int64_t stream = baseStream;

  stream += foo.AssignStreams (stream);
  stream += bar.AssignStreams (stream);

What happens when I change foo, and it uses an extra stream? Then bar gets different streams than before, which is not what I wanted.

What I want is to use the same underlying rng for the invariant parts of my model, and use a separate rng (or set of rng streams) for the varying parts of my model. The proposed patch makes this straightforward.

(An alternative would be to predetermine the maximum number of streams that I'll ever want to use in the in/variant parts per model element, then increment by that number:

  const int64_t spacing = <max number of streams per node>;
  const int64_t invOffset = <max number of variant streams>;

  int64_t stream = baseStream;
  stream += foo0.AssignStreams (stream + invOffset);  // variant part
  stream += bar0.AssignStreams (stream);              // invariant

  stream = baseStream + spacing;
  stream += foo1.AssignStreams (stream + invOffset);
  stream += bar1.AssignStreams (stream);

This requires me to decide at the outset those two constants. If I find out later I need more, I have to redo all the prior executions, to have the invariant parts of my model always execute from the correct streams.)

While you might think that giving everything an independent stream is always the right thing, there are cases where it's not what you want. Think of a traffic generator that uses two random distributions, one to decide whether to send traffic this event, the other to decide where. If they have independent streams, then the destination sequence will always be the same, even if the first distribution decided not to send. Instead, I might want them to draw from the same underlying RngStream, so the consumption of a value (deciding whether to send) alters the subsequent sequence of destinations.

This is even more of an issue in distributed simulation. I don't want each parallel process to use the same set of auto-assigned streams, as happens now by default. I can't use the parallel process SystemId, either, because that depends on how many processes I have. (It's an absolute requirement that the parallelization give the exact same results as the sequential execution, which implies that the parallel execution can't depend on how the model was parallelized, or how many parallel processes are involved.) Instead, one solution is to choose streams based on the node id.

> Each RandomVariableStream object has a SetStream method that can be used to assign it to whatever substream is desired, including assigning multiple RandomVariableStream instances to the same underlying RNG.

Well I definitely don't want to do that. Then the two instances will be highly correlated, since they will have the same starting point on the ring. The proposed patch doesn't use a *replica* of the RngStream, which is what SetStream would do; it uses the exact same rng state.

> The main addition that I see with your proposed patch is to expose a handle to the underlying RngStream object so one can set the seed on each one on an individual basis.

Actually I think of it as exactly the opposite: set the seed once for several different distributions, rather than having to set the seed for each one independently.

> But I'm not sure of the use case for this and it seems to go against the guidance of this generator to use one seed for the simulation manage independent replications by incrementing run numbers.

This isn't about independent replications, but about controlling (repeating) the randomness in parts of a model.

> If anything, I would suggest that we modify src/core/wscript to remove rng-stream.h from the publicly exported headers (I don't think it was intended to be accessed by user programs and my guess is that it was accidentally added to the wscript at some point).

If we don't want to leave rng-stream.h public, in RandomVariableStream instead of the proposed Get/SetRngStream we could have a single public function

  void ShareRng (RandomVariableStream &rngStream)

This would keep RngStream inaccessible, yet enable distributions to draw from the same underlying RngStream.

> https://codereview.appspot.com/114060043/

_______________________________________________________________________
Dr. Peter D. Barnes, Jr.
Physics Division, Physical and Life Sciences
Lawrence Livermore National Laboratory
7000 East Avenue, L-50
P. O. Box 808
Livermore, California 94550
email: pdbarnes@llnl.gov
Voice: (925) 422-3384
Fax: (925) 423-3371
On 2014/07/22 22:46:54, barnes26_llnl.gov wrote:
> On Jul 21, 2014, at 2:54 PM, mailto:tomh.org@gmail.com wrote:
>
> > I don't see the need for this patch, can you elaborate?
>
> Sure. The basic issue is to control what things vary across different executions of a model. One example is testing with a fixed mobility pattern and several different error models (or vice versa). Usually you would want to repeat the mobility pattern *exactly*, despite the change in error model.
>
> Our current support for this is the AssignStreams functions everywhere. Typical use of AssignStreams goes like this:
>
>   int64_t stream = baseStream;
>
>   stream += foo.AssignStreams (stream);
>   stream += bar.AssignStreams (stream);
>
> What happens when I change foo, and it uses an extra stream? Then bar gets different streams than before, which is not what I wanted.

You would want to increment (pad) the stream value passed to bar sufficiently to account for possible variability in foo.

> What I want is to use the same underlying rng for the invariant parts of my model, and use a separate rng (or set of rng streams) for the varying parts of my model. The proposed patch makes this straightforward.
>
> (An alternative would be to predetermine the maximum number of streams that I'll ever want to use in the in/variant parts per model element, then increment by that number:
>
>   const int64_t spacing = <max number of streams per node>;
>   const int64_t invOffset = <max number of variant streams>;
>
>   int64_t stream = baseStream;
>   stream += foo0.AssignStreams (stream + invOffset);  // variant part
>   stream += bar0.AssignStreams (stream);              // invariant
>
>   stream = baseStream + spacing;
>   stream += foo1.AssignStreams (stream + invOffset);
>   stream += bar1.AssignStreams (stream);
>
> This requires me to decide at the outset those two constants. If I find out later I need more, I have to redo all the prior executions, to have the invariant parts of my model always execute from the correct streams.)

I agree that this is fragile in the way you describe; I am not sure that a foolproof solution is easy, though.

I'm not seeing the leap here to sharing RngStream state as a solution to the above invariance problem. Sharing leads to coupling between random variables, not invariance.

> While you might think that giving everything an independent stream is always the right thing, there are cases where it's not what you want. Think of a traffic generator that uses two random distributions, one to decide if to send traffic this event, the other to decide where. If they have independent streams, then the destination sequence will always be the same, even if the first distribution decided not to send. Instead, I might want them to draw from the same underlying RngStream, so the consumption of a value (deciding if to send) alters the subsequent sequence of destinations.

Under what conditions will the destination sequence "always be the same" despite the send sequence varying? How is the send sequence varying? Are you talking about varying the distribution itself (e.g. Normal to Uniform)?

> This is even more of an issue in distributed simulation. I don't want each parallel process to use the same set of auto-assigned streams, as happens now by default. I can't use the parallel process SystemId, either, because that depends on how many processes I have. (It's an absolute requirement that the parallelization give the exact same results as the sequential execution, which implies that the parallel execution can't depend on how the model was parallelized, or how many parallel processes are involved.) Instead, one solution is to choose streams based on the node id.

I hadn't considered this, but it seems to me that our current auto-assignment will fail this requirement. Does this mean that parallel simulations need to explicitly assign all RngStreams somehow? That we need to document this more explicitly, or think about ways to improve the auto-assignment?

> > Each RandomVariableStream object has a SetStream method that can be used to assign it to whatever substream is desired, including assigning multiple RandomVariableStream instances to the same underlying RNG.
>
> Well I definitely don't want to do that. Then the two instances will be highly correlated, since they will have the same starting point on the ring. The proposed patch doesn't use a *replica* of the RngStream, which is what SetStream would do; it uses the exact same rng state.
>
> > The main addition that I see with your proposed patch is to expose a handle to the underlying RngStream object so one can set the seed on each one on an individual basis.
>
> Actually I think of it as exactly the opposite: set the seed once for several different distributions, rather than having to set the seed for each one independently.
>
> > But I'm not sure of the use case for this and it seems to go against the guidance of this generator to use one seed for the simulation manage independent replications by incrementing run numbers.
>
> This isn't about independent replications, but about controlling (repeating) the randomness in parts of a model.
>
> > If anything, I would suggest that we modify src/core/wscript to remove rng-stream.h from the publicly exported headers (I don't think it was intended to be accessed by user programs and my guess is that it was accidentally added to the wscript at some point).
>
> If we don't want to leave rng-stream.h public, in RandomVariableStream instead of the proposed Get/SetRngStream we could have a single public function
>
>   void ShareRng (RandomVariableStream &rngStream)
>
> This would keep RngStream inaccessible, yet enable distributions to draw from the same underlying RngStream.

I would be OK with exporting RngStream as in your patch if there is some value in it; my comment was that it wasn't really intended to be exported as public API and isn't adding anything right now. But as you can probably gather from my comments above, I'm still trying to better understand the problem you are trying to solve, and why shared Rng streams solve it.
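The padding idea discussed in this exchange can be illustrated with a toy model. This is a minimal sketch, not ns-3 code: `ToyElement`, `BarStartSequential`, and `BarStartPadded` are hypothetical names, and the only thing borrowed from ns-3 is the helper convention that an AssignStreams call takes a starting stream index and returns the number of streams it consumed.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a model element or helper. It only records
// which stream index it was handed; it does not generate random numbers.
struct ToyElement
{
  int64_t streamsUsed;  // how many consecutive streams this element needs
  int64_t firstStream;  // first stream index it received (-1 until assigned)

  explicit ToyElement (int64_t used) : streamsUsed (used), firstStream (-1) {}

  // Same calling convention as the AssignStreams helpers quoted in the
  // thread: take a starting index, return the number of streams consumed.
  int64_t AssignStreams (int64_t stream)
  {
    firstStream = stream;
    return streamsUsed;
  }
};

// Sequential assignment: bar's first stream depends on how many streams
// foo happened to consume.
int64_t BarStartSequential (int64_t baseStream, int64_t fooStreams)
{
  ToyElement foo (fooStreams);
  ToyElement bar (1);
  int64_t stream = baseStream;
  stream += foo.AssignStreams (stream);
  stream += bar.AssignStreams (stream);
  return bar.firstStream;
}

// Padded assignment: bar always starts at a fixed offset (fooBudget) past
// the base, so it is unaffected as long as foo stays within its budget.
int64_t BarStartPadded (int64_t baseStream, int64_t fooStreams, int64_t fooBudget)
{
  ToyElement foo (fooStreams);
  ToyElement bar (1);
  foo.AssignStreams (baseStream);
  bar.AssignStreams (baseStream + fooBudget);
  return bar.firstStream;
}
```

With a base of 100, the sequential version hands bar stream 102 when foo uses two streams and 103 when foo uses three. The padded version hands bar stream 108 in both cases for a budget of 8, which is exactly the constant that has to be chosen up front.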
In thinking about this more, clearly this provides more granular control over RngStream assignment than setting the stream number. The seed and run number can also be controlled, and made invariant even if the global seed and run number change. I suppose that one example use case would be to lock down the mobility pattern for a scenario even across global seed and run number changes. Is this the idea?

If so, is the sharing aspect mainly an enhancement to allow users to reduce the number of stream objects that need to be manually configured, in situations where there is enough invariance in the use of the stream that the sharing doesn't matter, or is the use case you explained about the coupling of two distributions the main driver for this? I'm just trying to understand (more from a perspective of how this is explained to users) and having more trouble seeing the latter use case.

Is the idea then that the very granular control will admit some new higher level solutions to the problem of making parallel and sequential simulations match? And possibly new higher level helpers in addition to AssignStreams() (such as, e.g., AssignSharedStream())?
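The parallel-versus-sequential concern raised in the thread, and the suggestion to choose streams based on the node id, can be sketched as follows. Everything here is an assumption made for illustration: `NodeStream`, `PerRankCounterStreams`, and `NodeDerivedStreams` are hypothetical names, nodes are assumed to be placed on rank `nodeId % ranks`, and the "auto-assignment" is modeled simply as each rank counting up from zero, which is a simplification rather than a description of what ns-3 actually does.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical derivation of a stream index from a globally unique node id.
// streamsPerNode is the per-node budget; slot selects a stream within it.
int64_t NodeStream (int64_t baseStream, uint32_t nodeId,
                    int64_t streamsPerNode, int64_t slot)
{
  return baseStream + static_cast<int64_t> (nodeId) * streamsPerNode + slot;
}

// Toy model of per-process automatic assignment: every rank hands out
// 0, 1, 2, ... to its local nodes. Returns the stream given to each node.
std::vector<int64_t> PerRankCounterStreams (uint32_t nodes, uint32_t ranks)
{
  std::vector<int64_t> nextOnRank (ranks, int64_t (0));
  std::vector<int64_t> streamOfNode (nodes);
  for (uint32_t n = 0; n < nodes; ++n)
    {
      streamOfNode[n] = nextOnRank[n % ranks]++;
    }
  return streamOfNode;
}

// Node-derived assignment: the number of ranks is not an input at all,
// so the result cannot depend on how the model was partitioned.
std::vector<int64_t> NodeDerivedStreams (uint32_t nodes, int64_t baseStream,
                                         int64_t streamsPerNode)
{
  std::vector<int64_t> streamOfNode (nodes);
  for (uint32_t n = 0; n < nodes; ++n)
    {
      streamOfNode[n] = NodeStream (baseStream, n, streamsPerNode, 0);
    }
  return streamOfNode;
}
```

In this toy model the per-rank counters give different streams to the same node when the rank count changes, while the node-derived scheme does not. It still inherits the open questions from the thread: the per-node budget must be chosen in advance, and objects such as channels that belong to several nodes have no obvious node id.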
On Jul 23, 2014, at 12:07 AM, tomh.org@gmail.com wrote:

> On 2014/07/22 22:46:54, barnes26_llnl.gov wrote:
>> On Jul 21, 2014, at 2:54 PM, mailto:tomh.org@gmail.com wrote:
>> What happens when I change foo, and it uses an extra stream? Then bar gets different streams than before, which is not what I wanted.
>
> You would want to increment (pad) the stream value passed to bar sufficiently to account for possible variability in foo.

>> (An alternative would be to predetermine the maximum number of streams that I'll ever want to use in the in/variant parts per model element, then increment by that number:
>>
>> This requires me to decide at the outset those two constants. If I find out later I need more, I have to redo all the prior executions, to have the invariant parts of my model always execute from the correct streams.)
>
> I agree that this is fragile in the way you describe; I am not sure that a foolproof solution is easy, though.

I agree this is a challenging problem, not for the foolhardy.

> I'm not seeing the leap here to sharing RngStream state as a solution to the above invariance problem. Sharing leads to coupling between random variables, not invariance.

Right, but I get to choose which variables are coupled, and which aren't. That lets me choose the balance between ease of setup (via shared rng), and invariance/independence (by independent streams).

>> While you might think that giving everything an independent stream is always the right thing, there are cases where it's not what you want. Think of a traffic generator that uses two random distributions, one to decide if to send traffic this event, the other to decide where. If they have independent streams, then the destination sequence will always be the same, even if the first distribution decided not to send. Instead, I might want them to draw from the same underlying RngStream, so the consumption of a value (deciding if to send) alters the subsequent sequence of destinations.
>
> Under what conditions will the destination sequence "always be the same" despite the send sequence varying? How is the send sequence varying?

The notional example code I had in mind was like:

  if (m_sendRng.GetValue () < m_sendProbability)
    {
      // send to self
    }
  else
    {
      dest = m_destRng.GetValue ();
      ...

No matter how I set the sendRng, the destination sequence will be the same (first send will always go to A, second always to B, etc). However, I might consider the overall traffic pattern to be what's important: self, self, A, self, B, ... In which case I want a change in self to result in a change in the remote destination sequence at the same time.

> Are you talking about varying the distribution itself (e.g. Normal to Uniform)?

In the abstract, this is the general use case: I want one stream (to configure) but need variables from different distributions at different times.

>> This is even more of an issue in distributed simulation. I don't want each parallel process to use the same set of auto-assigned streams, as happens now by default. I can't use the parallel process SystemId, either, because that depends on how many processes I have. (It's an absolute requirement that the parallelization give the exact same results as the sequential execution, which implies that the parallel execution can't depend on how the model was parallelized, or how many parallel processes are involved.) Instead, one solution is to choose streams based on the node id.
>
> I hadn't considered this, but it seems to me that our current auto-assignment will fail this requirement.

Exactly.

> Does this mean that parallel simulations need to explicitly assign somehow all RngStreams? That we need to document this more explicitly, or think about ways to improve the auto-assignment?

Yes, something along these lines. What we're leaning toward at the moment is to add a globally unique node id, then use that in deriving the stream numbers used by higher level things, but this has its own problems. (For starters, the "how many (consecutive) streams is a node likely to need" problem above. And how do we assign streams to channels, which are associated with several nodes?)

>>> Each RandomVariableStream object has a SetStream method that can be used to assign it to whatever substream is desired, including assigning multiple RandomVariableStream instances to the same underlying RNG.
>>
>> Well I definitely don't want to do that. Then the two instances will be highly correlated, since they will have the same starting point on the ring. The proposed patch doesn't use a *replica* of the RngStream, which is what SetStream would do; it uses the exact same rng state.
>>
>>> The main addition that I see with your proposed patch is to expose a handle to the underlying RngStream object so one can set the seed on each one on an individual basis.
>>
>> Actually I think of it as exactly the opposite: set the seed once for several different distributions, rather than having to set the seed for each one independently.
>>
>>> But I'm not sure of the use case for this and it seems to go against the guidance of this generator to use one seed for the simulation manage independent replications by incrementing run numbers.
>>
>> This isn't about independent replications, but about controlling (repeating) the randomness in parts of a model.
>>
>>> If anything, I would suggest that we modify src/core/wscript to remove rng-stream.h from the publicly exported headers (I don't think it was intended to be accessed by user programs and my guess is that it was accidentally added to the wscript at some point).
>>
>> If we don't want to leave rng-stream.h public, in RandomVariableStream instead of the proposed Get/SetRngStream we could have a single public function
>>
>>   void ShareRng (RandomVariableStream &rngStream)
>>
>> This would keep RngStream inaccessible, yet enable distributions to draw from the same underlying RngStream.
>
> I would be OK with exporting RngStream such as your patch if there is some value in it; my comment was that it wasn't really intended to be exported as public API and isn't adding anything right now.

The more I think about it, the more I think this should be done with ShareRng() or ShareStream(), rather than making explicit use of RngStream.

> But as you probably gather from my above comments, I'm still trying to understand better the problem you are trying to solve, and why shared Rng streams solves it.
>
> https://codereview.appspot.com/114060043/

On Jul 23, 2014, at 6:58 AM, tomh.org@gmail.com wrote:

> In thinking about this more, clearly this provides more granular control on RngStream assignment than setting the stream number. The seed and run number can also be controlled, and made invariant even if the global seed and run number change. I suppose that one example use case would be to lock down the mobility pattern for a scenario even across global seed and run number changes. Is this the idea?

Actually, I think of this as enabling coarser control, by joining RandomVariables to a common stream. I don't think it directly addresses the mobility lock-down case.

> If so, is the sharing aspect mainly an enhancement to allow users to reduce the number of stream objects that need to be manually configured, in situations where there is enough invariance in the use of the stream that the sharing doesn't matter, or is the use case you explained about the coupling of two distributions the main driver for this?

Yes, both of these are correct: enable a reduction in the number of streams which have to be configured, and enable two distributions to draw from the same stream.

> I'm just trying to understand (more from a perspective of how this is explained to users) and having more trouble seeing the latter use case.
>
> Is the idea then that the very granular control will admit some new higher level solutions to the problem of making parallel and sequential simulations match? And possibly new higher level helpers in addition to the AssignStreams()? (such as e.g. AssignSharedStream()?)

That's the goal.

> https://codereview.appspot.com/114060043/
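The traffic generator argument can be made concrete with a toy generator. This is a sketch under stated assumptions, not the ns-3 generator: `ToyRng` is a plain 64-bit linear congruential generator standing in for a stream, and `RemoteDestinations` and `IsPrefix` are hypothetical helpers. Destinations are drawn from four possible remote nodes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy stream: a 64-bit LCG. Stands in for an RngStream; not the ns-3 one.
struct ToyRng
{
  uint64_t state;
  explicit ToyRng (uint64_t seed) : state (seed) {}

  uint32_t Next ()
  {
    state = state * 6364136223846793005ULL + 1442695040888963407ULL;
    return static_cast<uint32_t> (state >> 32);
  }

  // Uniform value in [0, 1).
  double Uniform () { return Next () / 4294967296.0; }
};

// Runs 'events' send decisions and returns the remote destinations chosen.
// With shareStream == false the destination draws come from their own
// stream; with shareStream == true they come from the send-decision stream.
std::vector<uint32_t> RemoteDestinations (double selfProbability,
                                          bool shareStream, int events)
{
  ToyRng sendRng (1);
  ToyRng ownDestRng (2);
  ToyRng &destRng = shareStream ? sendRng : ownDestRng;

  std::vector<uint32_t> destinations;
  for (int i = 0; i < events; ++i)
    {
      if (sendRng.Uniform () < selfProbability)
        {
          continue;  // send to self: no destination draw
        }
      destinations.push_back (destRng.Next () % 4);
    }
  return destinations;
}

// True if 'a' is a prefix of 'b'.
bool IsPrefix (const std::vector<uint32_t> &a, const std::vector<uint32_t> &b)
{
  if (a.size () > b.size ())
    {
      return false;
    }
  for (std::size_t i = 0; i < a.size (); ++i)
    {
      if (a[i] != b[i])
        {
          return false;
        }
    }
  return true;
}
```

With independent streams, changing `selfProbability` changes only how many remote sends occur: the k-th remote destination is always the k-th value of the destination stream, so one run's destination list is a prefix of the other's. With a shared stream, every send decision consumes a value, so the destinations that follow depend on the whole decision history, which is the coupling described in the message above.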
Implement ShareStream; hide RngStream from public API.
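The distinction drawn earlier in the thread between a replica of a stream and the same rng state can be sketched with a toy class. The patch itself is not reproduced on this page, so nothing below is the actual ns-3 implementation: `ToyRng`, `ToyVariable`, and the signature of `ShareStream` here are assumptions chosen for illustration, with a shared pointer modeling "the exact same rng state".

```cpp
#include <cassert>
#include <cstdint>
#include <memory>

// Toy stream: a 64-bit LCG seeded by a stream number. Not the ns-3 RngStream.
struct ToyRng
{
  uint64_t state;
  explicit ToyRng (uint64_t stream) : state (stream) {}

  uint32_t Next ()
  {
    state = state * 6364136223846793005ULL + 1442695040888963407ULL;
    return static_cast<uint32_t> (state >> 32);
  }
};

// Hypothetical random variable that owns, or shares, a stream.
class ToyVariable
{
public:
  // Each variable starts with its own generator for the given stream number.
  explicit ToyVariable (uint64_t stream)
    : m_rng (std::make_shared<ToyRng> (stream)) {}

  // Assumed semantics: after this call both variables draw from one state.
  void ShareStream (const ToyVariable &other) { m_rng = other.m_rng; }

  uint32_t Draw () { return m_rng->Next (); }

private:
  std::shared_ptr<ToyRng> m_rng;
};

// Two variables given the same stream number hold replicas: each starts at
// the same point, so their first draws coincide.
bool ReplicasAgree (uint64_t stream)
{
  ToyVariable a (stream);
  ToyVariable b (stream);
  return a.Draw () == b.Draw ();
}

// Two variables sharing one state interleave draws from a single sequence:
// a's draw and then b's draw equal the first two values of that sequence.
bool SharedFollowsOneSequence (uint64_t stream)
{
  ToyVariable a (stream);
  ToyVariable b (stream);
  b.ShareStream (a);
  uint32_t first = a.Draw ();
  uint32_t second = b.Draw ();

  ToyRng reference (stream);
  uint32_t expectedFirst = reference.Next ();
  uint32_t expectedSecond = reference.Next ();
  return first == expectedFirst && second == expectedSecond;
}
```

Keeping the generator behind a private pointer mirrors the stated goal of Patch Set 2: variables can be joined to a common stream without user code ever holding the stream object itself.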
On 2014/08/11 23:00:51, Peter Barnes wrote:
> Implement ShareStream; hide RngStream from public API.

Bump. (Apparently the cc's didn't go out when I posted the new patch.)
New API is fine with me, documentation is improved. Example code and test could follow.

https://codereview.appspot.com/114060043/diff/1/src/core/model/rng-stream.h
File src/core/model/rng-stream.h (right):

https://codereview.appspot.com/114060043/diff/1/src/core/model/rng-stream.h#n...
src/core/model/rng-stream.h:38: * For a give seed the period of the entire generator is approximately
s/give/given

https://codereview.appspot.com/114060043/diff/20001/doc/manual/source/random-...
File doc/manual/source/random-variables.rst (right):

https://codereview.appspot.com/114060043/diff/20001/doc/manual/source/random-...
doc/manual/source/random-variables.rst:351: stream numbers.
Add to the end of this sentence: "or assigned automatically by the simulator when AssignStreams() is not used."

https://codereview.appspot.com/114060043/diff/20001/src/core/model/random-var...
File src/core/model/random-variable-stream.h (right):

https://codereview.appspot.com/114060043/diff/20001/src/core/model/random-var...
src/core/model/random-variable-stream.h:136: * configured stream number.
s/stream number/stream index

https://codereview.appspot.com/114060043/diff/20001/src/core/model/random-var...
src/core/model/random-variable-stream.h:153: Ptr<RngStream> m_rng;
Is this just a style (non-technical) change? Just curious.