Issue 114060043: Shared RngStream

Created:
10 years, 9 months ago by Peter Barnes

Modified:
10 years, 8 months ago

Reviewers:
Tom Henderson, barnes26, Tommaso Pecorella

CC:
ns-developers_isi.edu, ns-3-reviews_googlegroups.com

Visibility:
Public.

Description

There are many cases where you want to use several different rng distributions, but you only want to manage one seed when doing replicated runs. This patch allows several RandomVariableStreams to use the same underlying RngStream.

Patch Set 1 #

Total comments: 2

Patch Set 2 : Implement ShareStream. Hide RngStream from public API. #

Total comments: 3

Created: 10 years, 8 months ago

Stats (+160 lines, -103 lines)

M  CHANGES.html                              1 chunk    +2 lines, -0 lines      0 comments
M  RELEASE_NOTES                             1 chunk    +2 lines, -0 lines      0 comments
M  doc/manual/source/random-variables.rst    12 chunks  +123 lines, -48 lines   1 comment
M  src/core/model/random-variable-stream.h   3 chunks   +12 lines, -18 lines    2 comments
M  src/core/model/random-variable-stream.cc  15 chunks  +20 lines, -35 lines    0 comments
M  src/core/model/rng-stream.h               1 chunk    +1 line, -1 line        0 comments
M  src/core/wscript                          1 chunk    +0 lines, -1 line       0 comments

Messages

Total messages: 11

barnes26_llnl.gov

There are many cases where you want to use several different rng distributions, but you ...

10 years, 9 months ago (2014-07-21 17:41:08 UTC) #1

Tommaso Pecorella

On 2014/07/21 17:41:08, barnes26_llnl.gov wrote: > There are many cases where you want to use ...

10 years, 9 months ago (2014-07-21 18:00:04 UTC) #2

Tommaso Pecorella

https://codereview.appspot.com/114060043/diff/1/src/core/model/rng-stream.h File src/core/model/rng-stream.h (right): https://codereview.appspot.com/114060043/diff/1/src/core/model/rng-stream.h#newcode71 src/core/model/rng-stream.h:71: * \param [in] o Other RngStream this is "\param ...

10 years, 9 months ago (2014-07-21 18:00:27 UTC) #3

Tom Henderson

I don't see the need for this patch, can you elaborate? Each RandomVariableStream object has ...

10 years, 9 months ago (2014-07-21 21:54:04 UTC) #4

barnes26_llnl.gov

10 years, 9 months ago (2014-07-22 22:46:54 UTC) #5

On Jul 21, 2014, at 2:54 PM, tomh.org@gmail.com wrote:

> I don't see the need for this patch, can you elaborate?

Sure.  The basic issue is to control what things vary across different
executions of a model.  One example is testing with a fixed mobility pattern and
several different error models (or vice versa).  Usually you would want to
repeat the mobility pattern *exactly*, despite the change in error model.

Our current support for this is the AssignStreams functions everywhere.  Typical
use of AssignStreams goes like this:

  int64_t stream = baseStream;

  stream += foo.AssignStreams (stream);
  stream += bar.AssignStreams (stream);

What happens when I change foo, and it uses an extra stream?  Then bar gets
different streams than before, which is not what I wanted.
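The shift described above can be reproduced with a toy stand-in. This is only a sketch: `ToyHelper` and `BarFirstStream` are hypothetical names, not ns-3 classes, and the only behavior borrowed from ns-3 is the convention that `AssignStreams` returns the number of streams it consumed.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for a helper with an AssignStreams-style method (hypothetical,
// not an ns-3 class). It claims `streamsUsed` consecutive stream indices
// starting at `first` and reports how many it took.
struct ToyHelper
{
  int64_t streamsUsed;  // how many streams this model element consumes
  int64_t firstStream;  // first index it was given (-1 means unassigned)

  int64_t AssignStreams (int64_t first)
  {
    assert (first >= 0);
    firstStream = first;
    return streamsUsed;
  }
};

// Chain foo and bar in the usual way and report the first index bar receives.
int64_t BarFirstStream (int64_t baseStream, int64_t fooStreams, int64_t barStreams)
{
  ToyHelper foo{fooStreams, -1};
  ToyHelper bar{barStreams, -1};

  int64_t stream = baseStream;
  stream += foo.AssignStreams (stream);
  stream += bar.AssignStreams (stream);
  return bar.firstStream;
}
```

With `baseStream = 100`, bar starts at index 102 when foo uses two streams and at 103 when foo uses three, so bar's random sequence changes even though bar itself did not.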

What I want is to use the same underlying rng for the invariant parts of my
model, and use a separate rng (or set of rng streams) for the varying parts of
my model.  The proposed patch makes this straightforward.

(An alternative would be to predetermine the maximum number of streams that I'll
ever want to use in the in/variant parts per model element, then increment by
that number:

  const int64_t spacing   = <max number of streams per node>;
  const int64_t invOffset = <max number of variant streams>;

  int64_t stream = baseStream;
  stream += foo0.AssignStreams (stream + invOffset);  // variant part
  stream += bar0.AssignStreams (stream);              // invariant

  stream = baseStream + spacing;
  stream += foo1.AssignStreams (stream + invOffset);
  stream += bar1.AssignStreams (stream);

This requires me to decide at the outset those two constants.  If I find out
later I need more, I have to redo all the prior executions, to have the
invariant parts of my model always execute from the correct streams.)

While you might think that giving everything an independent stream is always the
right thing, there are cases where it's not what you want.  Think of a traffic
generator that uses two random distributions, one to decide if to send traffic
this event, the other to decide where.  If they have independent streams, then
the destination sequence will always be the same, even if the first distribution
decided not to send.  Instead, I might want them to draw from the same
underlying RngStream, so the consumption of a value (deciding if to send) alters
the subsequent sequence of destinations.

This is even more of an issue in distributed simulation.  I don't want each
parallel process to use the same set of auto-assigned streams, as happens now by
default.  I can't use the parallel process SystemId, either, because that
depends on how many processes I have.  (It's an absolute requirement that the
parallelization give the exact same results as the sequential execution, which
implies that the parallel execution can't depend on how the model was
parallelized, or how many parallel processes are involved.)  Instead, one
solution is to choose streams based on the node id.
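One way to picture choosing streams by node id is the sketch below. `StreamForNode`, `spacing`, and `slot` are hypothetical names introduced for illustration; nothing like this exists in ns-3 as far as this thread shows.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of ns-3): derive a stream index from a
// globally unique node id, reserving `spacing` consecutive indices per node.
// `slot` selects one of the streams reserved for that node.
int64_t StreamForNode (int64_t baseStream, uint32_t nodeId, int64_t spacing, int64_t slot)
{
  assert (spacing > 0 && slot >= 0 && slot < spacing);
  return baseStream + static_cast<int64_t> (nodeId) * spacing + slot;
}
```

The result depends only on the node id, so it does not change with the number of parallel processes or with how nodes are partitioned among them. It still requires fixing `spacing` in advance, which is the same weakness as the fixed-spacing alternative above.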

> Each RandomVariableStream object has a SetStream method that can be used to
> assign it to whatever substream is desired, including assigning multiple
> RandomVariableStream instances to the same underlying RNG.

Well I definitely don't want to do that.  Then the two instances will be highly
correlated, since they will have the same starting point on the ring.  The
proposed patch doesn't use a *replica* of the RngStream, which is what SetStream
would do; it uses the exact same rng state.
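The difference between a replica and shared state can be sketched with the standard library. Note the substitution: `std::mt19937_64` stands in for ns-3's RngStream here, and `Draw`, `ReplicasAgree`, and `SharedHandlesAgree` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <random>
#include <vector>

// Draw n raw values from an engine.
std::vector<uint64_t> Draw (std::mt19937_64 &engine, int n)
{
  assert (n >= 0);
  std::vector<uint64_t> out;
  for (int i = 0; i < n; ++i)
    {
      out.push_back (engine ());
    }
  return out;
}

// Replicas: two separate engines started from the same point.
// Both produce the identical sequence, so the two consumers are correlated.
bool ReplicasAgree (uint64_t seed, int n)
{
  std::mt19937_64 a (seed);
  std::mt19937_64 b (seed);
  return Draw (a, n) == Draw (b, n);
}

// Shared state: two handles to one engine.
// Each draw advances the common state, so the two consumers see
// different, interleaved parts of a single sequence.
bool SharedHandlesAgree (uint64_t seed, int n)
{
  auto engine = std::make_shared<std::mt19937_64> (seed);
  std::shared_ptr<std::mt19937_64> a = engine;
  std::shared_ptr<std::mt19937_64> b = engine;
  return Draw (*a, n) == Draw (*b, n);
}
```

Replicas agree value for value; shared handles do not, because every draw through one handle moves the state seen by the other.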

> The main addition that I see with your proposed patch is to expose a handle to
> the underlying RngStream object so one can set the seed on each one on an
> individual basis.

Actually I think of it as exactly the opposite:  set the seed once for several
different distributions, rather than having to set the seed for each one
independently.

> But I'm not sure of the use case for this and it seems to go against the
> guidance of this generator to use one seed for the simulation and manage
> independent replications by incrementing run numbers.

This isn't about independent replications, but about controlling (repeating) the
randomness in parts of a model. 

> If anything, I would suggest that we modify src/core/wscript to remove
> rng-stream.h from the publicly exported headers (I don't think it was intended
> to be accessed by user programs and my guess is that it was accidentally added
> to the wscript at some point).

If we don't want to leave rng-stream.h public, in RandomVariableStream instead
of the proposed Get/SetRngStream we could have a single public function

  void ShareRng (RandomVariableStream &rngStream)

This would keep RngStream inaccessible, yet enable distributions to draw from
the same underlying RngStream.
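A minimal sketch of how such a function could keep the generator type out of the public interface follows. It assumes a reference-counted pointer to the engine; `ToyVariable` is a hypothetical class, `std::mt19937` stands in for RngStream, and none of this is taken from the actual patch.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <random>

// Hypothetical random variable class; the engine is a private detail.
class ToyVariable
{
public:
  explicit ToyVariable (uint32_t seed)
    : m_rng (std::make_shared<std::mt19937> (seed))
  {
  }

  // From now on, draw from the same engine state as `other`.
  // The engine type never appears in the signature.
  void ShareStream (const ToyVariable &other)
  {
    assert (other.m_rng != nullptr);
    m_rng = other.m_rng;
  }

  // Draw the next raw value from whichever engine this variable uses.
  std::mt19937::result_type GetInteger ()
  {
    return (*m_rng) ();
  }

  // True if both variables draw from one engine.
  bool SharesWith (const ToyVariable &other) const
  {
    return m_rng == other.m_rng;
  }

private:
  std::shared_ptr<std::mt19937> m_rng;  // shared, reference-counted engine
};
```

After `b.ShareStream (a)`, draws from `a` and `b` consume successive values of one sequence, and callers never name the engine type.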

> https://codereview.appspot.com/114060043/

_______________________________________________________________________
Dr. Peter D. Barnes, Jr.                Physics Division
Lawrence Livermore National Laboratory  Physical and Life Sciences
7000 East Avenue, L-50                  email:  pdbarnes@llnl.gov
P. O. Box 808                           Voice:  (925) 422-3384
Livermore, California 94550             Fax:    (925) 423-3371

Tom Henderson

10 years, 9 months ago (2014-07-23 07:07:30 UTC) #6

On 2014/07/22 22:46:54, barnes26_llnl.gov wrote:
> On Jul 21, 2014, at 2:54 PM, mailto:tomh.org@gmail.com wrote:
> 
> > I don't see the need for this patch, can you elaborate?
> 
> Sure.  The basic issue is to control what things vary across different
> executions of a model.  One example is testing with a fixed mobility pattern
> and several different error models (or vice versa).  Usually you would want to
> repeat the mobility pattern *exactly*, despite the change in error model.
> 
> Our current support for this is the AssignStreams functions everywhere.
> Typical use of AssignStreams goes like this:
> 
>   int64_t stream = baseStream;
>   
>   stream += foo.AssignStreams (stream);
>   stream += bar.AssignStreams (stream);
> 
> What happens when I change foo, and it uses an extra stream?  Then bar gets
> different streams than before, which is not what I wanted.

You would want to increment (pad) the stream value passed to bar sufficiently to
account for possible variability in foo.

> 
> What I want is to use the same underlying rng for the invariant parts of my
> model, and use a separate rng (or set of rng streams) for the varying parts of
> my model.  The proposed patch makes this straightforward.
> 
> (An alternative would be to predetermine the maximum number of streams that
> I'll ever want to use in the in/variant parts per model element, then
> increment by that number:
>  
>   const int64_t spacing   = <max number of streams per node>;
>   const int64_t invOffset = <max number of variant streams>;
> 
>   int64_t stream = baseStream;
>   stream += foo0.AssignStreams (stream + invOffset);  // variant part
>   stream += bar0.AssignStreams (stream);              // invariant
> 
>   stream = baseStream + spacing;
>   stream += foo1.AssignStreams (stream + invOffset);
>   stream += bar1.AssignStreams (stream);
> 
> This requires me to decide at the outset those two constants.  If I find out
> later I need more, I have to redo all the prior executions, to have the
> invariant parts of my model always execute from the correct streams.)

I agree that this is fragile in the way you describe; I am not sure that a
foolproof solution is easy, though.  

I'm not seeing the leap here to sharing RngStream state as a solution to the
above invariance problem.  Sharing leads to coupling between random variables,
not invariance.

> 
> While you might think that giving everything an independent stream is always
> the right thing, there are cases where it's not what you want.  Think of a
> traffic generator that uses two random distributions, one to decide if to send
> traffic this event, the other to decide where.  If they have independent
> streams, then the destination sequence will always be the same, even if the
> first distribution decided not to send.  Instead, I might want them to draw
> from the same underlying RngStream, so the consumption of a value (deciding if
> to send) alters the subsequent sequence of destinations.

Under what conditions will the destination sequence "always be the same" despite
the send sequence varying?  How is the send sequence varying?  Are you talking
about varying the distribution itself (e.g. Normal to Uniform)?

> 
> This is even more of an issue in distributed simulation.  I don't want each
> parallel process to use the same set of auto-assigned streams, as happens now
> by default.  I can't use the parallel process SystemId, either, because that
> depends on how many processes I have.  (It's an absolute requirement that the
> parallelization give the exact same results as the sequential execution, which
> implies that the parallel execution can't depend on how the model was
> parallelized, or how many parallel processes are involved.)  Instead, one
> solution is to choose streams based on the node id.

I hadn't considered this, but it seems to me that our current auto-assignment
will fail this requirement.  Does this mean that parallel simulations need to
explicitly assign somehow all RngStreams?    That we need to document this more
explicitly, or think about ways to improve the auto-assignment?

> 
> > Each RandomVariableStream object has a SetStream method that can be used to
> > assign it to whatever substream is desired, including assigning multiple
> > RandomVariableStream instances to the same underlying RNG.
> 
> Well I definitely don't want to do that.  Then the two instances will be
> highly correlated, since they will have the same starting point on the ring.
> The proposed patch doesn't use a *replica* of the RngStream, which is what
> SetStream would do; it uses the exact same rng state.
> 
> > The main addition that I see with your proposed patch is to expose a handle
> > to the underlying RngStream object so one can set the seed on each one on an
> > individual basis.
> 
> Actually I think of it as exactly the opposite:  set the seed once for several
> different distributions, rather than having to set the seed for each one
> independently.
> 
> > But I'm not sure of the use case for this and it seems to go against the
> > guidance of this generator to use one seed for the simulation and manage
> > independent replications by incrementing run numbers.
> 
> This isn't about independent replications, but about controlling (repeating)
> the randomness in parts of a model.
> 
> 
> > If anything, I would suggest that we modify src/core/wscript to remove
> > rng-stream.h from the publicly exported headers (I don't think it was intended
> > to be accessed by user programs and my guess is that it was accidentally added
> > to the wscript at some point).
> 
> If we don't want to leave rng-stream.h public, in RandomVariableStream instead
> of the proposed Get/SetRngStream we could have a single public function
> 
>   void ShareRng (RandomVariableStream &rngStream)
> 
> This would keep RngStream inaccessible, yet enable distributions to draw from
> the same underlying RngStream.

I would be OK with exporting RngStream, as your patch does, if there is some
value in it; my comment was that it wasn't really intended to be exported as
public API and isn't adding anything right now.  But as you probably gather from
my above comments, I'm still trying to understand better the problem you are
trying to solve, and why shared Rng streams solve it.

Tom Henderson

In thinking about this more, clearly this provides more granular control on RngStream assignment than ...

10 years, 9 months ago (2014-07-23 13:58:10 UTC) #7

barnes26_llnl.gov

10 years, 9 months ago (2014-07-28 19:39:59 UTC) #8

On Jul 23, 2014, at 12:07 AM, tomh.org@gmail.com wrote:

> On 2014/07/22 22:46:54, barnes26_llnl.gov wrote:
>> On Jul 21, 2014, at 2:54 PM, mailto:tomh.org@gmail.com wrote:
>> What happens when I change foo, and it uses an extra stream?  Then bar gets
different streams than before, which is not what I wanted.
> 
> You would want to increment (pad) the stream value passed to bar
> sufficiently to account for possible variability in foo.

>> (An alternative would be to predetermine the maximum number of streams that
I'll ever want to use in the in/variant parts per model element, then increment
by that number:
>> 
>> This requires me to decide at the outset those two constants.  If I find out
later I need more, I have to redo all the prior executions, to have the
invariant parts of my model always execute from the correct streams.)
> 
> I agree that this is fragile in the way you describe; I am not sure that a
foolproof solution is easy, though.

I agree this is a challenging problem, not for the foolhardy.

> I'm not seeing the leap here to sharing RngStream state as a solution to the
above invariance problem.  Sharing leads to coupling between random variables,
not invariance.

Right, but I get to choose which variables are coupled, and which aren't.  That
lets me choose the balance between ease of setup (via shared rng), and
invariance/independence (by independent streams).

>> While you might think that giving everything an independent stream is always
the right thing, there are cases where it's not what you want.  Think of a
traffic generator that uses two random distributions, one to decide if to send
traffic this event, the other to decide where.  If they have independent
streams, then the destination sequence will always be the same, even if the
first distribution decided not to send.  Instead, I might want them to draw from
the same underlying RngStream, so the consumption of a value (deciding if to
send) alters the subsequent sequence of destinations. 
> 
> Under what conditions will the destination sequence "always be the same"
despite the send sequence varying?  How is the send sequence varying? 

The notional example code I had in mind was like:

  if (m_sendRng.GetValue () < m_sendProbability)
    {
      // send to self
    }
  else
    {
      dest = m_destRng.GetValue ();
      ...


No matter how I set the sendRng, the destination sequence will be the same
(first send will always go to A, second always to B, etc).  However, I might
consider the overall traffic pattern to be what's important:  self, self, A,
self, B, ... In which case I want a change in self to result in a change in the
remote destination sequence at the same time.
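The two behaviors can be compared with a small stand-alone sketch. It assumes `std::mt19937` in place of RngStream and uses fixed seeds; `RemoteDestinations` and `SamePrefix` are hypothetical names, and `selfProbability` plays the role of m_sendProbability in the notional code above.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Simulate `events` send decisions and return the remote destinations chosen.
// With shared == false the two distributions use independent engines;
// with shared == true both draw from one engine.
std::vector<int> RemoteDestinations (double selfProbability, bool shared, int events)
{
  assert (selfProbability >= 0.0 && selfProbability <= 1.0);
  std::mt19937 sendEngine (1);
  std::mt19937 destEngine (2);
  std::mt19937 &destSource = shared ? sendEngine : destEngine;

  std::uniform_real_distribution<double> sendDist (0.0, 1.0);
  std::uniform_int_distribution<int> destDist (0, 999);

  std::vector<int> remote;
  for (int i = 0; i < events; ++i)
    {
      if (sendDist (sendEngine) < selfProbability)
        {
          continue;  // send to self; no destination is drawn
        }
      remote.push_back (destDist (destSource));
    }
  return remote;
}

// True if the shorter sequence is a prefix of the longer one.
bool SamePrefix (const std::vector<int> &a, const std::vector<int> &b)
{
  std::size_t n = a.size () < b.size () ? a.size () : b.size ();
  for (std::size_t i = 0; i < n; ++i)
    {
      if (a[i] != b[i])
        {
          return false;
        }
    }
  return true;
}
```

With independent engines, changing `selfProbability` changes only how many remote sends occur; the destinations still come out in the same order. With a shared engine, each send decision consumes a value, so the destination sequence changes along with it.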

> Are you talking about varying the distribution itself (e.g. Normal to
Uniform)?

In the abstract, this is the general use case:  I want one stream (to configure)
but need variables from different distributions at different times.

>> This is even more of an issue in distributed simulation.  I don't want each
parallel process to use the same set of auto-assigned streams, as happens now by
default.  I can't use the parallel process SystemId, either, because that
depends on how many processes I have.  (It's an absolute requirement that the
parallelization give the exact same results as the sequential execution, which
implies that the parallel execution can't depend on how the model was
parallelized, or how many parallel processes are involved.)  Instead, one
solution is to choose streams based on the node id.
> 
> I hadn't considered this, but it seems to me that our current auto-assignment
will fail this requirement.  

Exactly. 

> Does this mean that parallel simulations need to explicitly assign somehow all
RngStreams?  That we need to document this more explicitly, or think about ways
to improve the auto-assignment?

Yes, something along these lines.  What we're leaning toward at the moment is to
add a globally unique node id, then use that in deriving the stream numbers used
by higher level things, but this has its own problems.  (For starters, the "how
many (consecutive) streams is a node likely to need" problem above.  How to
assign streams to channels, which are associated with several nodes?)

>>> Each RandomVariableStream object has a SetStream method that can be used to
assign it to whatever substream is desired, including assigning multiple
RandomVariableStream instances to the same underlying RNG. 
>> 
>> Well I definitely don't want to do that.  Then the two instances will be
highly correlated, since they will have the same starting point on the ring. 
The proposed patch doesn't use a *replica* of the RngStream, which is what
SetStream would do; it uses the exact same rng state.
>> 
>>> The main addition that I see with your proposed patch is to expose a handle
to the underlying RngStream object so one can set the seed on each one on an
individual basis.
>> 
>> Actually I think of it as exactly the opposite:  set the seed once for
several different distributions, rather than having to set the seed for each one
independently.
>> 
>>> But I'm not sure of the use case for this and it seems to go against the
>>> guidance of this generator to use one seed for the simulation and manage
>>> independent replications by incrementing run numbers.
> 
>> This isn't about independent replications, but about controlling (repeating)
the randomness in parts of a model.
>> 
>>> If anything, I would suggest that we modify src/core/wscript to remove
rng-stream.h from the publicly exported headers (I don't think it was intended
to be accessed by user programs and my guess is that it was accidentally added
to the wscript at some point).
>> 
>> If we don't want to leave rng-stream.h public, in RandomVariableStream
instead of the proposed Get/SetRngStream we could have a single public function
>> 
>>   void ShareRng (RandomVariableStream &rngStream)
>> 
>> This would keep RngStream inaccessible, yet enable distributions to draw from
the same underlying RngStream.
> 
> I would be OK with exporting RngStream such as your patch if there is
> some value in it; my comment was that it wasn't really intended to be
> exported as public API and isn't adding anything right now.  

The more I think about it the more I think this should be done with ShareRng()
or ShareStream(), rather than making explicit use of RngStream.

> But as you probably gather from my above comments, I'm still trying to
understand better the problem you are trying to solve, and why shared Rng
streams solves it.
> 
> https://codereview.appspot.com/114060043/


On Jul 23, 2014, at 6:58 AM, tomh.org@gmail.com wrote:
> In thinking about this more, clearly this provides more granular control on
RngStream assignment than setting the stream number.  The seed and run number
can also be controlled, and made invariant even if the global seed and run
number change.  I suppose that one example use case would be to lock down the
mobility pattern for a scenario even across global seed and run number changes. 
Is this the idea?

Actually, I think of this as enabling coarser control, by joining
RandomVariables to a common stream.  I don't think it directly addresses the
mobility lock-down case.

> If so, is the sharing aspect mainly an enhancement to allow users to reduce
the number of stream objects that need to be manually configured, in situations
where there is enough invariance in the use of the stream that the sharing
doesn't matter, or is the use case you explained about the coupling of two
distributions the main driver for this?  

Yes, both of these are correct:  enable a reduction in the number of streams
which have to be configured, enable two distributions from the same stream.

> I'm just trying to understand (more from a perspective of how this is
explained to users) and having more trouble seeing the latter use case. 
> 
> Is the idea then that the very granular control will admit some new higher
level solutions to the problem of making parallel and sequential simulations
match?  And possibly new higher level helpers in addition to the
AssignStreams()?  (such as e.g. AssignSharedStream()?) 

That's the goal.  

> 
> https://codereview.appspot.com/114060043/

_______________________________________________________________________
Dr. Peter D. Barnes, Jr.                Physics Division
Lawrence Livermore National Laboratory  Physical and Life Sciences
7000 East Avenue, L-50                  email:  pdbarnes@llnl.gov
P. O. Box 808                           Voice:  (925) 422-3384
Livermore, California 94550             Fax:    (925) 423-3371

Peter Barnes

Implement ShareStream; hide RngStream from public API.

10 years, 8 months ago (2014-08-11 23:00:51 UTC) #9

Peter Barnes

On 2014/08/11 23:00:51, Peter Barnes wrote: > Implement ShareStream; hide RngStream from public API. Bump ...

10 years, 8 months ago (2014-08-13 21:34:42 UTC) #10

Tom Henderson

10 years, 8 months ago (2014-08-15 14:13:42 UTC) #11

New API is fine with me, documentation is improved.  Example code and test could
follow.

https://codereview.appspot.com/114060043/diff/1/src/core/model/rng-stream.h
File src/core/model/rng-stream.h (right):

https://codereview.appspot.com/114060043/diff/1/src/core/model/rng-stream.h#n...
src/core/model/rng-stream.h:38: * For a give seed the period of the entire
generator is approximately
s/give/given

https://codereview.appspot.com/114060043/diff/20001/doc/manual/source/random-...
File doc/manual/source/random-variables.rst (right):

https://codereview.appspot.com/114060043/diff/20001/doc/manual/source/random-...
doc/manual/source/random-variables.rst:351: stream numbers.
add to end of this sentence:  "or assigned automatically by the simulator when
AssignStreams() is not used."

https://codereview.appspot.com/114060043/diff/20001/src/core/model/random-var...
File src/core/model/random-variable-stream.h (right):

https://codereview.appspot.com/114060043/diff/20001/src/core/model/random-var...
src/core/model/random-variable-stream.h:136: * configured stream number.
s/stream number/stream index

https://codereview.appspot.com/114060043/diff/20001/src/core/model/random-var...
src/core/model/random-variable-stream.h:153: Ptr<RngStream> m_rng;
Is this just a style (non-technical) change?  Just curious.
