Rietveld Code Review Tool

Issue 14234043: Null Message Parallel Scheduler

Created:
8 years, 3 months ago by Steve Smith
Modified:
8 years, 1 month ago
CC:
ns-3-reviews_googlegroups.com
Visibility:
Public.

Description

Null Message Parallel Scheduler

At LLNL we have been working on additions to NS-3 to improve parallel performance using a null-message-based scheduler. A patch is enclosed, and general overview comments are included below to provide context for what was changed. I’ve tested with the existing parallel examples and some large problem sets we are running locally. On some of our larger tests we have observed 5x speedups.

This is a patch to add a new parallel scheduling algorithm based on null messages, a common parallel DES scheduling algorithm. The null message scheduler has better scaling properties when running on large numbers of nodes since it does not require a global communication. Null message replaces the global time synchronization with one based only on communication with neighbors. Note that null message will not be faster for all network topologies; for example, in a fully coupled topology (all MPI tasks have parallel P2P links with all other MPI tasks) null message will perform slower than the existing algorithm. Keeping both algorithms available is desirable.

The design of the changes was based on the existing set of classes, with modifications to allow more than one parallel scheduler implementation. The original design used a singleton-like object, the MpiInterface class, to provide the interface from NS3 packages to the communication layer. In order to add a second algorithm option, MpiInterface was refactored along the lines of the Strategy pattern. A new singleton, the ParallelController, was introduced to delegate the calls to the concrete algorithm selection. An abstract base class, ParallelCommunicationsInterface, was added for concrete implementations to inherit from. Class diagram is shown below.
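Separately from the class diagram, the neighbor-only synchronization mentioned in this overview can be illustrated with a small, self-contained sketch. This is not code from the patch: the `NeighborLink` struct and the `SafeTime` and `NullMessageTime` functions are hypothetical names, and times are plain integers rather than ns-3 `Time` values. It assumes the usual conservative null message scheme, in which each task advances only up to the smallest time guarantee received from its neighbors and sends each neighbor a lower bound on its own future events.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical per-neighbor bookkeeping (illustration only, not from the patch).
struct NeighborLink {
  std::uint64_t guaranteeTime;  // latest time guarantee received from this neighbor
  std::uint64_t lookahead;      // minimum delay of the link toward this neighbor
};

// Earliest time at which any neighbor could still deliver an event.
// Events up to this time can be processed without a global synchronization.
std::uint64_t SafeTime(const std::vector<NeighborLink>& links) {
  std::uint64_t safe = std::numeric_limits<std::uint64_t>::max();
  for (const NeighborLink& link : links) {
    safe = std::min(safe, link.guaranteeTime);
  }
  return safe;
}

// Timestamp carried by a null message to one neighbor: a lower bound on the
// time of any event this task may still send over that link.
std::uint64_t NullMessageTime(std::uint64_t nextLocalEventTime,
                              std::uint64_t safeTime,
                              std::uint64_t lookahead) {
  const std::uint64_t kMax = std::numeric_limits<std::uint64_t>::max();
  const std::uint64_t base = std::min(nextLocalEventTime, safeTime);
  // Saturate instead of overflowing when no bound is known yet.
  return (base > kMax - lookahead) ? kMax : base + lookahead;
}
```

In this sketch, a task with many neighbors must track and exchange guarantees with every one of them, which is consistent with the note above that a fully coupled topology does not benefit from null messages.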
The existing parallel scheduler classes were renamed to “GrantedTimeWindowSimulationImpl” and “GrantedTimeWindowMpiInterface” from the more generic “DistributedSimulationImpl” and “MpiInterface” in an attempt to reflect the choice of synchronization algorithm more clearly. NullMessageSimulationImpl and NullMessageMpiInterface are the corresponding null message additions.

The selection of which ParallelController strategy to use is made in the Enable method of the ParallelController and is based on the SimulatorImplementationType global value, so that the communications interface used corresponds to the Simulator being used. The parallel examples have been modified to allow this selection. If this patch is accepted, parallel user application codes will have to do some minor class renaming and ensure that the SimulatorImplementationType value is set before the Enable call.

The design should be sufficiently generic to allow for additional communication layers/algorithms. A new Scheduler implementation and a new ParallelCommunicationsInterface implementation are required for each new algorithm. The existing MpiReceiver class was retained as it was general enough to use for both implementations.
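The delegation described in this overview can be sketched roughly as follows. This is a minimal, self-contained illustration of the Strategy pattern, not the patch's actual code: every class name carries a `Sketch` suffix to mark it as hypothetical, the `Name()` method is invented for the example, and the string keys passed to `Enable` are assumed stand-ins for the SimulatorImplementationType global value.

```cpp
#include <memory>
#include <stdexcept>
#include <string>

// Abstract strategy that concrete communication layers inherit from.
class ParallelCommunicationInterfaceSketch {
public:
  virtual ~ParallelCommunicationInterfaceSketch() = default;
  virtual std::string Name() const = 0;  // hypothetical method for illustration
};

// Stand-in for the existing granted-time-window algorithm.
class GrantedTimeWindowSketch : public ParallelCommunicationInterfaceSketch {
public:
  std::string Name() const override { return "granted-time-window"; }
};

// Stand-in for the new null message algorithm.
class NullMessageSketch : public ParallelCommunicationInterfaceSketch {
public:
  std::string Name() const override { return "null-message"; }
};

// Singleton that picks a concrete strategy in Enable and delegates to it.
class ParallelControllerSketch {
public:
  static ParallelControllerSketch& Instance() {
    static ParallelControllerSketch controller;
    return controller;
  }

  // The key is an assumed stand-in for the SimulatorImplementationType value,
  // which therefore has to be known before Enable is called.
  void Enable(const std::string& simulatorImplKey) {
    if (simulatorImplKey == "null-message") {
      m_impl = std::make_unique<NullMessageSketch>();
    } else if (simulatorImplKey == "granted-time-window") {
      m_impl = std::make_unique<GrantedTimeWindowSketch>();
    } else {
      throw std::invalid_argument("unknown simulator implementation: " + simulatorImplKey);
    }
  }

  // Delegates to whichever strategy Enable selected.
  std::string ActiveName() const {
    if (!m_impl) {
      throw std::logic_error("Enable must be called first");
    }
    return m_impl->Name();
  }

private:
  ParallelControllerSketch() = default;
  std::unique_ptr<ParallelCommunicationInterfaceSketch> m_impl;
};
```

Under these assumptions, adding a further algorithm would mean adding one more subclass and one more branch in `Enable`, which mirrors the statement above that each new algorithm needs its own communications interface implementation.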

Patch Set 1 #

Total comments: 14

Patch Set 2 : Null message scheduler with code changes based on review comments. #

Stats (+2793 lines, -454 lines)
M src/internet/model/global-route-manager-impl.cc View 1 1 chunk +2 lines, -1 line 0 comments Download
M src/mpi/doc/distributed.rst View 1 3 chunks +65 lines, -4 lines 0 comments Download
M src/mpi/examples/nms-p2p-nix-distributed.cc View 1 3 chunks +3 lines, -6 lines 0 comments Download
M src/mpi/examples/simple-distributed.cc View 1 3 chunks +40 lines, -9 lines 0 comments Download
A src/mpi/examples/simple-distributed-empty-node.cc View 1 1 chunk +309 lines, -0 lines 0 comments Download
M src/mpi/examples/third-distributed.cc View 1 2 chunks +28 lines, -15 lines 0 comments Download
M src/mpi/examples/wscript View 1 chunk +4 lines, -0 lines 0 comments Download
M src/mpi/model/distributed-simulator-impl.h View 1 4 chunks +6 lines, -5 lines 0 comments Download
M src/mpi/model/distributed-simulator-impl.cc View 1 19 chunks +43 lines, -11 lines 0 comments Download
M src/mpi/model/granted-time-window-mpi-interface.h View 1 7 chunks +27 lines, -23 lines 0 comments Download
M src/mpi/model/granted-time-window-mpi-interface.cc View 1 15 chunks +56 lines, -28 lines 0 comments Download
M src/mpi/model/mpi-interface.h View 1 3 chunks +33 lines, -97 lines 0 comments Download
M src/mpi/model/mpi-interface.cc View 1 2 chunks +66 lines, -246 lines 0 comments Download
M src/mpi/model/mpi-receiver.h View 2 chunks +3 lines, -3 lines 0 comments Download
M src/mpi/model/mpi-receiver.cc View 1 chunk +2 lines, -2 lines 0 comments Download
A src/mpi/model/null-message-mpi-interface.h View 1 chunk +228 lines, -0 lines 0 comments Download
A src/mpi/model/null-message-mpi-interface.cc View 1 1 chunk +459 lines, -0 lines 0 comments Download
A src/mpi/model/null-message-simulator-impl.h View 1 chunk +212 lines, -0 lines 0 comments Download
A src/mpi/model/null-message-simulator-impl.cc View 1 1 chunk +603 lines, -0 lines 0 comments Download
A src/mpi/model/parallel-communication-interface.h View 1 1 chunk +95 lines, -0 lines 0 comments Download
A src/mpi/model/remote-channel-bundle.h View 1 1 chunk +151 lines, -0 lines 0 comments Download
A src/mpi/model/remote-channel-bundle.cc View 1 1 chunk +130 lines, -0 lines 0 comments Download
A src/mpi/model/remote-channel-bundle-manager.h View 1 chunk +107 lines, -0 lines 0 comments Download
A src/mpi/model/remote-channel-bundle-manager.cc View 1 chunk +112 lines, -0 lines 0 comments Download
M src/mpi/wscript View 1 1 chunk +8 lines, -3 lines 0 comments Download
M src/point-to-point/helper/point-to-point-helper.cc View 1 1 chunk +1 line, -0 lines 0 comments Download
M src/point-to-point/model/point-to-point-net-device.cc View 1 chunk +0 lines, -1 line 0 comments Download

Messages

Total messages: 4
Tom Henderson
Just an initial comment-- I'd like to see src/mpi/doc/distributed.rst updated (leveraging the explanatory text that ...
8 years, 3 months ago (2013-10-08 13:28:04 UTC) #1
bpswenson
This is great. I really like the refactoring done. I made a few minor comments, ...
8 years, 3 months ago (2013-10-18 20:00:54 UTC) #2
Steve Smith
I believe my latest submission fixes all the issues identified. Documentation was added; a couple ...
8 years, 1 month ago (2013-11-23 00:24:47 UTC) #3
Steve Smith
8 years, 1 month ago (2013-11-23 00:24:49 UTC) #4
I believe my latest submission fixes all the issues identified. 

Documentation was added; a couple of hints on known use cases were added to help
guide user algorithm selection.   

The class name changes have been reverted to their original names and examples
require no changes to run with the existing synchronization algorithm.